Quantum Physics - Eisberg & Resnick

Rafaela Pere



Useful Constants and Conversion Factors

Quoted to a useful number of significant figures.

Speed of light in vacuum: $c = 2.998 \times 10^{8}$ m/sec
Electron charge magnitude: $e = 1.602 \times 10^{-19}$ coul
Planck's constant: $h = 6.626 \times 10^{-34}$ joule-sec; $\hbar = h/2\pi = 1.055 \times 10^{-34}$ joule-sec $= 0.6582 \times 10^{-15}$ eV-sec
Boltzmann's constant: $k = 1.381 \times 10^{-23}$ joule/°K $= 8.617 \times 10^{-5}$ eV/°K
Avogadro's number: $N_0 = 6.023 \times 10^{23}$/mole
Coulomb's law constant: $1/4\pi\epsilon_0 = 8.988 \times 10^{9}$ nt-m$^2$/coul$^2$
Electron rest mass: $m_e = 9.109 \times 10^{-31}$ kg $= 0.5110$ MeV/$c^2$
Proton rest mass: $m_p = 1.672 \times 10^{-27}$ kg $= 938.3$ MeV/$c^2$
Neutron rest mass: $m_n = 1.675 \times 10^{-27}$ kg $= 939.6$ MeV/$c^2$
Atomic mass unit ($C^{12} = 12$): $u = 1.661 \times 10^{-27}$ kg $= 931.5$ MeV/$c^2$
Bohr magneton: $\mu_b = e\hbar/2m_e = 9.27 \times 10^{-24}$ amp-m$^2$ (or joule/tesla)
Nuclear magneton: $\mu_n = e\hbar/2m_p = 5.05 \times 10^{-27}$ amp-m$^2$ (or joule/tesla)
Bohr radius: $a_0 = 4\pi\epsilon_0\hbar^2/m_e e^2 = 5.29 \times 10^{-11}$ m $= 0.529$ Å
Bohr energy: $E_1 = -m_e e^4/(4\pi\epsilon_0)^2 2\hbar^2 = -2.17 \times 10^{-18}$ joule $= -13.6$ eV
Electron Compton wavelength: $\lambda_C = h/m_e c = 2.43 \times 10^{-12}$ m $= 0.0243$ Å
Fine-structure constant: $\alpha = e^2/4\pi\epsilon_0\hbar c = 7.30 \times 10^{-3} \simeq 1/137$
$kT$ at room temperature: $k \cdot 300$°K $= 0.0258$ eV $\simeq 1/40$ eV

1 eV $= 1.602 \times 10^{-19}$ joule
1 joule $= 6.242 \times 10^{18}$ eV
1 Å $= 10^{-10}$ m
1 F $= 10^{-15}$ m
1 barn (bn) $= 10^{-28}$ m$^2$

QUANTUM PHYSICS

Assisted by David O. Caldwell, University of California, Santa Barbara, and Richard Christman, United States Coast Guard Academy

The figure on the cover is from Section 9-4, where it is used to show the tendency for two identical spin 1/2 particles (such as electrons) to avoid each other if their spins are essentially parallel. This tendency, or its inverse for the antiparallel case, is one of the recurring themes in quantum physics explanations of the properties of atoms, molecules, solids, nuclei, and particles.
QUANTUM PHYSICS of Atoms, Molecules, Solids, Nuclei, and Particles

Second Edition

ROBERT EISBERG, University of California, Santa Barbara

JOHN WILEY & SONS
New York Chichester Brisbane Toronto Singapore

Copyright © 1974, 1985, by John Wiley & Sons, Inc. All rights reserved. Published simultaneously in Canada. Reproduction or translation of any part of this work beyond that permitted by Sections 107 and 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to the Permissions Department, John Wiley & Sons.

Library of Congress Cataloging in Publication Data: Eisberg, Robert Martin. Quantum physics of atoms, molecules, solids, nuclei, and particles. Includes index. 1. Quantum theory. I. Resnick, Robert, 1923- II. Title. QC174.12.E34 1985 530.1'2 84-10444 ISBN 0-471-87373-X

Printed in the United States of America. Printed and bound by the Hamilton Printing Company.

PREFACE TO THE SECOND EDITION

The many developments that have occurred in the physics of quantum systems since the publication of the first edition of this book, particularly in the field of elementary particles, have made apparent the need for a second edition. In preparing it, we solicited suggestions from the instructors that we knew to be using the book in their courses (and also from some that we knew were not, in order to determine their objections to the book). The wide acceptance of the first edition made it possible for us to obtain a broad sampling of thought concerning ways to make the second edition more useful. We were not able to act on all the suggestions that were received, because some were in conflict with others or were impossible to carry out for technical reasons. But we certainly did respond to the general consensus of these suggestions.
Many users of the first edition felt that new topics, typically more sophisticated aspects of quantum mechanics such as perturbation theory, should be added to the book. Yet others said that the level of the first edition was well suited to the course they teach and that it should not be changed. We decided to try to satisfy both groups by adding material to the new edition in the form of new appendices, but to do it in such a way as to maintain the decoupling of the appendices and the text that characterized the original edition. The more advanced appendices are well integrated in the text but it is a one-way, not two-way, integration. A student reading one of these appendices will find numerous references to places in the text where the development is motivated and where its results are used. On the other hand, a student who does not read the appendix because he is in a lower level course will not be frustrated by many references in the text to material contained in an appendix he does not use. Instead, he will find only one or two brief parenthetical statements in the text advising him of the existence of an optional appendix that has a bearing on the subject dealt with in the text. 
The appendices in the second edition that are new or are significantly changed are: Appendix A, The Special Theory of Relativity (a number of worked-out examples added and an important calculation simplified); Appendix D, Fourier Integral Description of a Wave Group (new); Appendix G, Numerical Solution of the Time-Independent Schroedinger Equation for a Square Well Potential (completely rewritten to include a universal program in BASIC for solving second-order differential equations on microcomputers); Appendix J, Time-Independent Perturbation Theory (new); Appendix K, Time-Dependent Perturbation Theory (new); Appendix L, The Born Approximation (new); Appendix N, Series Solutions of the Angular and Radial Equations for a One-Electron Atom (new); Appendix Q, Crystallography (new); Appendix R, Gauge Invariance in Classical and Quantum Mechanical Electromagnetism (new). Problem sets have been added to the ends of many of the appendices, both old and new. In particular, Appendix A now contains a brief but comprehensive set of problems for use by instructors who begin their "modern physics" course with a treatment of relativity.

A large number of small changes and additions have been made to the text to improve and update it. There are also several quite substantial pieces of new material, including: the new Section 13-8 on electron-positron annihilation in solids; the additions to Section 16-6 on the Mössbauer effect; the extensive modernization of the last half of the introduction to elementary particles in Chapter 17; and the entirely new Chapter 18 treating the developments that have occurred in particle physics since the first edition was written. We were very fortunate to have secured the services of Professor David Caldwell of the University of California, Santa Barbara, to write the new material in Chapters 17 and 18, as well as Appendix R.
Only a person who has been totally immersed in research in particle physics could have done what had to be done to produce a brief but understandable treatment of what has happened in that field in recent years. Furthermore, since Caldwell is a colleague of the senior author, it was easy to have the interaction required to be sure that this new material was closely integrated into the earlier parts of the book, both in style and in content. Prepublication reviews have made it clear that Caldwell's material is a very strong addition to the book. Professor Richard Christman, of the U.S. Coast Guard Academy, wrote the new material in Section 13-8, Section 16-6, and Appendix Q, receiving significant input from the authors. We are very pleased with the results. The answers to selected problems, found in Appendix S, were prepared by Professor Edward Derringh, of the Wentworth Institute of Technology. He also edited the new additions to the problem sets and prepared a manual giving detailed solutions to most of the problems. The solutions manual is available to instructors from the publisher. It is a pleasure to express our deep appreciation to the people mentioned above. We also thank Frank T. Avignone, III, University of South Carolina; Edward Cecil, Colorado School of Mines; L. Edward Millet, California State University, Chico; and James T. Tough, The Ohio State University, for their very useful prepublication reviews. The following people offered suggestions or comments which helped in the development of the second edition: Alan H. Barrett, Massachusetts Institute of Technology; Richard H. Behrman, Swarthmore College; George F. Bertsch, Michigan State University; Richard N. Boyd, The Ohio State University; Philip A. Casabella, Rensselaer Polytechnic Institute; C. Dewey Cooper, University of Georgia; James E. Draper, University of California at Davis; Arnold Engler, Carnegie-Mellon University; A. T. 
Fromhold, Jr., Auburn University; Ross Garrett, University of Auckland; Russell Hobbie, University of Minnesota; Bei-Lok Hu, University of Maryland; Hillard Huntington, Rensselaer Polytechnic Institute; Mario Iona, University of Denver; Ronald G. Johnson, Trent University; A. L. Laskar, Clemson University; Charles W. Leming, Henderson State University; Luc Leplae, University of Wisconsin-Milwaukee; Ralph D. Meeker, Illinois Benedictine College; Roger N. Metz, Colby College; Ichiro Miyagawa, University of Alabama; J. A. Moore, Brock University; John J. O'Dwyer, State University of New York at Oswego; Douglas M. Potter, Rutgers State University; Russell A. Schaffer, Lehigh University; John W. Watson, Kent State University; and Robert White, University of Auckland. We appreciate their contribution. Santa Barbara, California Troy, New York Robert Eisberg Robert Resnick PREFACE TO THE FIRST EDITION The basic purpose of this book is to present clear and valid treatments of the properties of almost all of the important quantum systems from the point of view of elementary quantum mechanics. Only as much quantum mechanics is developed as is required to accomplish the purpose. Thus we have chosen to emphasize the applications of the theory more than the theory itself. In so doing we hope that the book will be well adapted to the attitudes of contemporary students in a terminal course on the phenomena of quantum physics. As students obtain an insight into the tremendous explanatory power of quantum mechanics, they should be motivated to learn more about the theory. Hence we hope that the book will be equally well adapted to a course that is to be followed by a more advanced course in formal quantum mechanics. The book is intended primarily to be used in a one year course for students who have been through substantial treatments of elementary differential and integral calculus and of calculus level elementary classical physics. But it can also be used in shorter courses. 
Chapters 1 through 4 introduce the various phenomena of early quantum physics and develop the essential ideas of the old quantum theory. These chapters can be gone through fairly rapidly, particularly for students who have had some prior exposure to quantum physics. The basic core of quantum mechanics, and its application to one- and two-electron atoms, is contained in Chapters 5 through 8 and the first four sections of Chapter 9. This core can be covered well in appreciably less than half a year. Thus the instructor can construct a variety of shorter courses by adding to the core material from the chapters covering the essentially independent topics: multielectron atoms and molecules, quantum statistics and solids, nuclei and particles. Instructors who require a similar but more extensive and higher level treatment of quantum mechanics, and who can accept a much more restricted coverage of the applications of the theory, may want to use Fundamentals of Modern Physics by Robert Eisberg (John Wiley & Sons, 1961), instead of this book. For instructors requiring a more comprehensive treatment of special relativity than is given in Appendix A, but similar in level and pedagogic style to this book, we recommend using in addition Introduction to Special Relativity by Robert Resnick (John Wiley & Sons, 1968). Successive preliminary editions of this book were developed by us through a procedure involving intensive classroom testing in our home institutions and four other schools. Robert Eisberg then completed the writing by significantly revising and extending the last preliminary edition. He is consequently the senior author of this book. Robert Resnick has taken the lead in developing and revising the last preliminary edition so as to prepare the manuscript for a modern physics counterpart at a somewhat lower level. He will consequently be that book's senior author. 
The pedagogic features of the book, some of which are not usually found in books at this level, were proven in the classroom testing to be very successful. These features are: detailed outlines at the beginning of each chapter, numerous worked-out examples in each chapter, optional sections in the chapters and optional appendices, summary sections and tables, sets of questions at the end of each chapter, and long and varied sets of thoroughly tested problems at the end of each chapter, with subsets of answers at the end of the book. The writing is careful and expansive. Hence we believe that the book is well suited to self-learning and to self-paced courses. We have employed the MKS (or SI) system of units, but not slavishly so. Where general practice in a particular field involves the use of alternative units, they are used here. It is a pleasure to express our appreciation to Drs. Harriet Forster, Russell Hobbie, Stuart Meyer, Gerhard Salinger, and Paul Yergin for constructive reviews, to Dr. David Swedlow for assistance with the evaluation and solutions of the problems, to Dr. Benjamin Chi for assistance with the figures, to Mr. Donald Deneck for editorial and other assistance, and to Mrs. Cassie Young and Mrs. Carolyn Clemente for typing and other secretarial services.
Robert Eisberg, Santa Barbara, California
Robert Resnick, Troy, New York

CONTENTS

1 THERMAL RADIATION AND PLANCK'S POSTULATE
1-1 Introduction
1-2 Thermal Radiation
1-3 Classical Theory of Cavity Radiation
1-4 Planck's Theory of Cavity Radiation
1-5 The Use of Planck's Radiation Law in Thermometry
1-6 Planck's Postulate and Its Implications
1-7 A Bit of Quantum History

2 PHOTONS—PARTICLELIKE PROPERTIES OF RADIATION
2-1 Introduction
2-2 The Photoelectric Effect
2-3 Einstein's Quantum Theory of the Photoelectric Effect
2-4 The Compton Effect
2-5 The Dual Nature of Electromagnetic Radiation
2-6 Photons and X-Ray Production
2-7 Pair Production and Pair Annihilation
2-8 Cross Sections for Photon Absorption and Scattering

3 DE BROGLIE'S POSTULATE—WAVELIKE PROPERTIES OF PARTICLES
3-1 Matter Waves
3-2 The Wave-Particle Duality
3-3 The Uncertainty Principle
3-4 Properties of Matter Waves
3-5 Some Consequences of the Uncertainty Principle
3-6 The Philosophy of Quantum Theory

4 BOHR'S MODEL OF THE ATOM
4-1 Thomson's Model
4-2 Rutherford's Model
4-3 The Stability of the Nuclear Atom
4-4 Atomic Spectra
4-5 Bohr's Postulates
4-6 Bohr's Model
4-7 Correction for Finite Nuclear Mass
4-8 Atomic Energy States
4-9 Interpretation of the Quantization Rules
4-10 Sommerfeld's Model
4-11 The Correspondence Principle
4-12 A Critique of the Old Quantum Theory

5 SCHROEDINGER'S THEORY OF QUANTUM MECHANICS
5-1 Introduction
5-2 Plausibility Argument Leading to Schroedinger's Equation
5-3 Born's Interpretation of Wave Functions
5-4 Expectation Values
5-5 The Time-Independent Schroedinger Equation
5-6 Required Properties of Eigenfunctions
5-7 Energy Quantization in the Schroedinger Theory
5-8 Summary

6 SOLUTIONS OF TIME-INDEPENDENT SCHROEDINGER EQUATIONS
6-1 Introduction
6-2 The Zero Potential
6-3 The Step Potential (Energy Less Than Step Height)
6-4 The Step Potential (Energy Greater Than Step Height)
6-5 The Barrier Potential
6-6 Examples of Barrier Penetration by Particles
6-7 The Square Well Potential
6-8 The Infinite Square Well Potential
6-9 The Simple Harmonic Oscillator Potential
6-10 Summary

7 ONE-ELECTRON ATOMS
7-1 Introduction
7-2 Development of the Schroedinger Equation
7-3 Separation of the Time-Independent Equation
7-4 Solution of the Equations
7-5 Eigenvalues, Quantum Numbers, and Degeneracy
7-6 Eigenfunctions
7-7 Probability Densities
7-8 Orbital Angular Momentum
7-9 Eigenvalue Equations

8 MAGNETIC DIPOLE MOMENTS, SPIN, AND TRANSITION RATES
8-1 Introduction
8-2 Orbital Magnetic Dipole Moments
8-3 The Stern-Gerlach Experiment and Electron Spin
8-4 The Spin-Orbit Interaction
8-5 Total Angular Momentum
8-6 Spin-Orbit Interaction Energy and the Hydrogen Energy Levels
8-7 Transition Rates and Selection Rules
8-8 A Comparison of the Modern and Old Quantum Theories

9 MULTIELECTRON ATOMS—GROUND STATES AND X-RAY EXCITATIONS
9-1 Introduction
9-2 Identical Particles
9-3 The Exclusion Principle
9-4 Exchange Forces and the Helium Atom
9-5 The Hartree Theory
9-6 Results of the Hartree Theory
9-7 Ground States of Multielectron Atoms and the Periodic Table
9-8 X-Ray Line Spectra

10 MULTIELECTRON ATOMS—OPTICAL EXCITATIONS
10-1 Introduction
10-2 Alkali Atoms
10-3 Atoms with Several Optically Active Electrons
10-4 LS Coupling
10-5 Energy Levels of the Carbon Atom
10-6 The Zeeman Effect
10-7 Summary

11 QUANTUM STATISTICS
11-1 Introduction
11-2 Indistinguishability and Quantum Statistics
11-3 The Quantum Distribution Functions
11-4 Comparison of the Distribution Functions
11-5 The Specific Heat of a Crystalline Solid
11-6 The Boltzmann Distributions as an Approximation to Quantum Distributions
11-7 The Laser
11-8 The Photon Gas
11-9 The Phonon Gas
11-10 Bose Condensation and Liquid Helium
11-11 The Free Electron Gas
11-12 Contact Potential and Thermionic Emission
11-13 Classical and Quantum Descriptions of the State of a System

12 MOLECULES
12-1 Introduction
12-2 Ionic Bonds
12-3 Covalent Bonds
12-4 Molecular Spectra
12-5 Rotational Spectra
12-6 Vibration-Rotation Spectra
12-7 Electronic Spectra
12-8 The Raman Effect
12-9 Determination of Nuclear Spin and Symmetry Character

13 SOLIDS—CONDUCTORS AND SEMICONDUCTORS
13-1 Introduction
13-2 Types of Solids
13-3 Band Theory of Solids
13-4 Electrical Conduction in Metals
13-5 The Quantum Free-Electron Model
13-6 The Motion of Electrons in a Periodic Lattice
13-7 Effective Mass
13-8 Electron-Positron Annihilation in Solids
13-9 Semiconductors
13-10 Semiconductor Devices

14 SOLIDS—SUPERCONDUCTORS AND MAGNETIC PROPERTIES
14-1 Superconductivity
14-2 Magnetic Properties of Solids
14-3 Paramagnetism
14-4 Ferromagnetism
14-5 Antiferromagnetism and Ferrimagnetism

15 NUCLEAR MODELS
15-1 Introduction
15-2 A Survey of Some Nuclear Properties
15-3 Nuclear Sizes and Densities
15-4 Nuclear Masses and Abundances
15-5 The Liquid Drop Model
15-6 Magic Numbers
15-7 The Fermi Gas Model
15-8 The Shell Model
15-9 Predictions of the Shell Model
15-10 The Collective Model
15-11 Summary

16 NUCLEAR DECAY AND NUCLEAR REACTIONS
16-1 Introduction
16-2 Alpha Decay
16-3 Beta Decay
16-4 The Beta-Decay Interaction
16-5 Gamma Decay
16-6 The Mössbauer Effect
16-7 Nuclear Reactions
16-8 Excited States of Nuclei
16-9 Fission and Reactors
16-10 Fusion and the Origin of the Elements

17 INTRODUCTION TO ELEMENTARY PARTICLES
17-1 Introduction
17-2 Nucleon Forces
17-3 Isospin
17-4 Pions
17-5 Leptons
17-6 Strangeness
17-7 Families of Elementary Particles
17-8 Observed Interactions and Conservation Laws

18 MORE ELEMENTARY PARTICLES
18-1 Introduction
18-2 Evidence for Partons
18-3 Unitary Symmetry and Quarks
18-4 Extensions of SU(3)—More Quarks
18-5 Color and the Color Interaction
18-6 Introduction to Gauge Theories
18-7 Quantum Chromodynamics
18-8 Electroweak Theory
18-9 Grand Unification and the Fundamental Interactions

Appendix A The Special Theory of Relativity
Appendix B Radiation from an Accelerated Charge
Appendix C The Boltzmann Distribution
Appendix D Fourier Integral Description of a Wave Group
Appendix E Rutherford Scattering Trajectories
Appendix F Complex Quantities
Appendix G Numerical Solution of the Time-Independent Schroedinger Equation for a Square Well Potential
Appendix H Analytical Solution of the Time-Independent Schroedinger Equation for a Square Well Potential
Appendix I Series Solution of the Time-Independent Schroedinger Equation for a Simple Harmonic Oscillator Potential
Appendix J Time-Independent Perturbation Theory
Appendix K Time-Dependent Perturbation Theory
Appendix L The Born Approximation
Appendix M The Laplacian and Angular Momentum Operators in Spherical Polar Coordinates
Appendix N Series Solutions of the Angular and Radial Equations for a One-Electron Atom
Appendix O The Thomas Precession
Appendix P The Exclusion Principle in LS Coupling
Appendix Q Crystallography
Appendix R Gauge Invariance in Classical and Quantum Mechanical Electromagnetism
Appendix S Answers to Selected Problems
Index

QUANTUM PHYSICS

1 THERMAL RADIATION AND PLANCK'S POSTULATE

1-1 INTRODUCTION
old quantum theory; relation of quantum physics to classical physics; role of Planck's constant

1-2 THERMAL RADIATION
properties of thermal radiation; blackbodies; spectral radiancy; distribution functions; radiancy; Stefan's law; Stefan-Boltzmann constant; Wien's law; cavity radiation; energy density; Kirchhoff's law

1-3 CLASSICAL THEORY OF CAVITY RADIATION
electromagnetic waves in a cavity; standing waves; count of allowed frequencies; equipartition of energy; Boltzmann's constant; Rayleigh-Jeans spectrum

1-4 PLANCK'S THEORY OF CAVITY RADIATION
Boltzmann distribution; discrete energies; violation of equipartition; Planck's constant; Planck's spectrum

1-5 THE USE OF PLANCK'S RADIATION LAW IN THERMOMETRY
optical pyrometers; universal 3°K radiation and the big bang

1-6 PLANCK'S POSTULATE AND ITS IMPLICATIONS
general statement of postulate; quantized energies; quantum states; quantum numbers; macroscopic pendulum

1-7 A BIT OF QUANTUM HISTORY
Planck's initial work; attempts to reconcile quantization with classical physics

QUESTIONS

PROBLEMS

1-1 INTRODUCTION

At a meeting of the German Physical Society on Dec. 14, 1900, Max Planck read his paper, "On the Theory of the Energy Distribution Law of the Normal Spectrum." This paper, which first attracted little attention, was the start of a revolution in physics. The date of its presentation is considered to be the birthday of quantum physics, although it was not until a quarter of a century later that modern quantum mechanics, the basis of our present understanding, was developed by Schroedinger and others. Many paths converged on this understanding, each showing another aspect of the breakdown of classical physics. In this and the following three chapters we shall examine the major milestones, of what is now called the old quantum theory, that led to modern quantum mechanics. The experimental phenomena which we shall discuss in connection with the old quantum theory span all the disciplines of classical physics: mechanics, thermodynamics, statistical mechanics, and electromagnetism.
Their repeated contradiction of classical laws, and the resolution of these conflicts on the basis of quantum ideas, will show us the need for quantum mechanics. And our study of the old quantum theory will allow us to more easily obtain a deeper understanding of quantum mechanics when we begin to consider it in the fifth chapter. As is true of relativity (which is treated briefly in Appendix A), quantum physics represents a generalization of classical physics that includes the classical laws as special cases. Just as relativity extends the range of application of physical laws to the region of high velocities, so quantum physics extends that range to the region of small dimensions. And just as a universal constant of fundamental significance, the velocity of light c, characterizes relativity, so a universal constant of fundamental significance, now called Planck's constant h, characterizes quantum physics. It was while trying to explain the observed properties of thermal radiation that Planck introduced this constant in his 1900 paper. Let us now begin to examine thermal radiation ourselves. We shall be led thereby to Planck's constant and the extremely significant related quantum concept of the discreteness of energy. We shall also find that thermal radiation has considerable importance and contemporary relevance in its own right. For instance, the phenomenon has recently helped astrophysicists decide among competing theories of the origin of the universe. Another example is given by the rapidly developing technology of solar heating, which depends on the thermal radiation received by the earth from the sun. 1-2 THERMAL RADIATION The radiation emitted by a body as a result of its temperature is called thermal radiation. All bodies emit such radiation to their surroundings and absorb such radiation from them. If a body is at first hotter than its surroundings, it will cool off because its rate of emitting energy exceeds its rate of absorbing energy. 
When thermal equilibrium is reached the rates of emission and absorption are equal.

Matter in a condensed state (i.e., solid or liquid) emits a continuous spectrum of radiation. The details of the spectrum are almost independent of the particular material of which a body is composed, but they depend strongly on the temperature. At ordinary temperatures most bodies are visible to us not by their emitted light but by the light they reflect. If no light shines on them we cannot see them. At very high temperatures, however, bodies are self-luminous. We can see them glow in a darkened room; but even at temperatures as high as several thousand degrees Kelvin well over 90% of the emitted thermal radiation is invisible to us, being in the infrared part of the electromagnetic spectrum. Therefore, self-luminous bodies are quite hot.

Consider, for example, heating an iron poker to higher and higher temperatures in a fire, periodically withdrawing the poker from the fire long enough to observe its properties. When the poker is still at a relatively low temperature it radiates heat, but it is not visibly hot. With increasing temperature the amount of radiation that the poker emits increases very rapidly and visible effects are noted. The poker assumes a dull red color, then a bright red color, and, at very high temperatures, an intense blue-white color. That is, with increasing temperature the body emits more thermal radiation and the frequency of the most intense radiation becomes higher.

The relation between the temperature of a body and the frequency spectrum of the emitted radiation is used in a device called an optical pyrometer. This is essentially a rudimentary spectrometer that allows the operator to estimate the temperature of a hot body, such as a star, by observing the color, or frequency composition, of the thermal radiation that it emits. There is a continuous spectrum of radiation emitted, the eye seeing chiefly the color corresponding to the most intense emission in the visible region. Familiar examples of objects which emit visible radiation include hot coals, lamp filaments, and the sun.

Generally speaking, the detailed form of the spectrum of the thermal radiation emitted by a hot body depends somewhat upon the composition of the body. However, experiment shows that there is one class of hot bodies that emits thermal spectra of a universal character. These are called blackbodies, that is, bodies that have surfaces which absorb all the thermal radiation incident upon them. The name is appropriate because such bodies do not reflect light and appear black when their temperatures are low enough that they are not self-luminous. One example of a (nearly) blackbody would be any object coated with a diffuse layer of black pigment, such as lamp black or bismuth black. Another, quite different, example will be described shortly. Independent of the details of their composition, it is found that all blackbodies at the same temperature emit thermal radiation with the same spectrum. This general fact can be understood on the basis of classical arguments involving thermodynamic equilibrium. The specific form of the spectrum, however, cannot be obtained from thermodynamic arguments alone. The universal properties of the radiation emitted by blackbodies make them of particular theoretical interest and physicists sought to explain the specific features of their spectrum.

The spectral distribution of blackbody radiation is specified by the quantity $R_T(\nu)$, called the spectral radiancy, which is defined so that $R_T(\nu)\,d\nu$ is equal to the energy emitted per unit time in radiation of frequency in the interval $\nu$ to $\nu + d\nu$ from a unit area of the surface at absolute temperature $T$. The earliest accurate measurements of this quantity were made by Lummer and Pringsheim in 1899. They used an instrument essentially similar to the prism spectrometers used in measuring optical spectra, except that special materials were required for the lenses, prisms, etc., so that they would be transparent to the relatively low frequency thermal radiation. The experimentally observed dependence of $R_T(\nu)$ on $\nu$ and $T$ is shown in Figure 1-1.

Figure 1-1. The spectral radiancy of a blackbody radiator as a function of the frequency of radiation, shown for temperatures of the radiator of 1000°K, 1500°K, and 2000°K. Note that the frequency at which the maximum radiancy occurs (dashed line) increases linearly with increasing temperature, and that the total power emitted per square meter of the radiator (area under curve) increases very rapidly with temperature. (Horizontal axis: frequency $\nu$ in units of $10^{14}$ Hz, from 0 to 3.)

Distribution functions, of which spectral radiancy is an example, are very common in physics. For example, the Maxwellian speed distribution function (which looks rather like one of the curves in Figure 1-1) tells us how the molecules in a gas at a fixed pressure and temperature are distributed according to their speed. Another distribution function that the student has probably already seen is the one (which has the form of a decreasing exponential) specifying the times of decay of radioactive nuclei in a sample containing nuclei of a given species, and he has certainly seen a distribution function for the grades received on a physics exam.

The spectral radiancy distribution function of Figure 1-1 for a blackbody of a given area and a particular temperature, say 1000°K, shows us that: (1) there is very little power radiated in a frequency interval of fixed size $d\nu$ if that interval is at a frequency $\nu$ which is very small compared to $10^{14}$ Hz. The power is zero for $\nu$ equal to zero. (2) The power radiated in the interval $d\nu$ increases rapidly as $\nu$ increases from very small values. (3) It maximizes for a value of $\nu \simeq 1.1 \times 10^{14}$ Hz. That is, the radiated power is most intense at that frequency. (4) Above $\simeq 1.1 \times 10^{14}$ Hz the radiated power drops slowly but continuously as $\nu$ increases. It is zero again when $\nu$ approaches infinitely large values. The two distribution functions for the higher values of temperature, 1500°K and 2000°K, displayed in the figure show us that (5) the frequency at which the radiated power is most intense increases with increasing temperature. Inspection will verify that this frequency increases linearly with temperature. (6) The total power radiated in all frequencies increases with increasing temperature, and it does so more rapidly than linearly. The total power radiated at a particular temperature is given simply by the area under the curve for that temperature, $\int_0^\infty R_T(\nu)\,d\nu$, since $R_T(\nu)\,d\nu$ is the power radiated in the frequency interval from $\nu$ to $\nu + d\nu$.

The integral of the spectral radiancy $R_T(\nu)$ over all $\nu$ is the total energy emitted per unit time per unit area from a blackbody at temperature $T$. It is called the radiancy $R_T$. That is

$$R_T = \int_0^\infty R_T(\nu)\,d\nu \tag{1-1}$$

As we have seen in the preceding discussion of Figure 1-1, $R_T$ increases rapidly with increasing temperature. In fact, this result is called Stefan's law, and it was first stated in 1879 in the form of an empirical equation

$$R_T = \sigma T^4 \tag{1-2}$$

where $\sigma = 5.67 \times 10^{-8}$ W/m$^2$-°K$^4$ is called the Stefan-Boltzmann constant. Figure 1-1 also shows us that the spectrum shifts toward higher frequencies as $T$ increases. This result is called Wien's displacement law

$$\nu_{max} \propto T \tag{1-3a}$$

where $\nu_{max}$ is the frequency $\nu$ at which $R_T(\nu)$ has its maximum value for a particular $T$. As $T$ increases, $\nu_{max}$ is displaced toward higher frequencies. All these results are in agreement with the familiar experiences discussed earlier, namely that the amount of thermal radiation emitted increases rapidly (the poker radiates much more heat energy at higher temperatures), and the principal frequency of the radiation becomes higher (the poker changes color from dull red to blue-white), with increasing temperature.

Figure 1-2. A cavity in a body connected by a small hole to the outside. Radiation incident on the hole is completely absorbed after successive reflections on the inner surface of the cavity. The hole absorbs like a blackbody. In the reverse process, in which radiation leaving the hole is built up of contributions emitted from the inner surface, the hole emits like a blackbody.

Another example of a blackbody, which we shall see to be particularly important, can be found by considering an object containing a cavity which is connected to the outside by a small hole, as in Figure 1-2. Radiation incident upon the hole from the outside enters the cavity and is reflected back and forth by the walls of the cavity, eventually being absorbed on these walls. If the area of the hole is very small compared to the area of the inner surface of the cavity, a negligible amount of the incident radiation will be reflected back through the hole. Essentially all the radiation incident upon the hole is absorbed; therefore, the hole must have the properties of the surface of a blackbody. Most blackbodies used in laboratory experiments are constructed along these lines.

Now assume that the walls of the cavity are uniformly heated to a temperature $T$. Then the walls will emit thermal radiation which will fill the cavity. The small fraction of this radiation incident from the inside upon the hole will pass through the hole. Thus the hole will act as an emitter of thermal radiation.
Since the hole must have the properties of the surface of a blackbody, the radiation emitted by the hole must have a blackbody spectrum; but since the hole is merely sampling the thermal radiation present inside the cavity, it is clear that the radiation in the cavity must also have a blackbody spectrum. In fact, it will have a blackbody spectrum characteristic of the temperature $T$ on the walls, since this is the only temperature defined for the system.

The spectrum emitted by the hole in the cavity is specified in terms of the energy flux $R_T(\nu)$. It is more useful, however, to specify the spectrum of radiation inside the cavity, called cavity radiation, in terms of an energy density, $\rho_T(\nu)$, which is defined as the energy contained in a unit volume of the cavity at temperature $T$ in the frequency interval $\nu$ to $\nu + d\nu$. It is evident that these quantities are proportional to one another; that is

$$\rho_T(\nu) \propto R_T(\nu) \tag{1-4}$$

Hence, the radiation inside a cavity whose walls are at temperature $T$ has the same character as the radiation emitted by the surface of a blackbody at temperature $T$. It is convenient experimentally to produce a blackbody spectrum by means of a cavity in a heated body with a hole to the outside, and it is convenient in theoretical work to study blackbody radiation by analyzing the cavity radiation because it is possible to apply very general arguments to predict the properties of cavity radiation.

Example 1-1. (a) Since $\lambda\nu = c$, the constant velocity of light, Wien's displacement law (1-3a) can also be put in the form

$$\lambda_{\max} T = \text{const} \tag{1-3b}$$

where $\lambda_{\max}$ is the wavelength at which the spectral radiancy has its maximum value for a particular temperature $T$. The experimentally determined value of Wien's constant is $2.898 \times 10^{-3}$ m-°K. If we assume that stellar surfaces behave like blackbodies we can get a good estimate of their temperature by measuring $\lambda_{\max}$. For the sun $\lambda_{\max} = 5100$ Å, whereas for the North Star $\lambda_{\max} = 3500$ Å.
Find the surface temperature of these stars. (One angstrom = 1 Å = $10^{-10}$ m.)

For the sun, $T = 2.898 \times 10^{-3}$ m-°K $/\,5100 \times 10^{-10}$ m $= 5700$°K. For the North Star, $T = 2.898 \times 10^{-3}$ m-°K $/\,3500 \times 10^{-10}$ m $= 8300$°K.

At 5700°K the sun's surface is near the temperature at which the greatest part of its radiation lies within the visible region of the spectrum. This suggests that over the ages of human evolution our eyes have adapted to the sun to become most sensitive to those wavelengths which it radiates most intensely.

(b) Using Stefan's law, (1-2), and the temperatures just obtained, determine the power radiated from 1 cm² of stellar surface.

For the sun

$$R_T = \sigma T^4 = 5.67 \times 10^{-8}\ \text{W/m}^2\text{-°K}^4 \times (5700\text{°K})^4 = 5.90 \times 10^{7}\ \text{W/m}^2 \simeq 6000\ \text{W/cm}^2$$

For the North Star

$$R_T = \sigma T^4 = 5.67 \times 10^{-8}\ \text{W/m}^2\text{-°K}^4 \times (8300\text{°K})^4 = 2.71 \times 10^{8}\ \text{W/m}^2 \simeq 27{,}000\ \text{W/cm}^2$$

Example 1-2. Assume we have two small opaque bodies a large distance from one another supported by fine threads in a large evacuated enclosure whose walls are opaque and kept at a constant temperature. In such a case the bodies and walls can exchange energy only by means of radiation. Let $e$ represent the rate of emission of radiant energy by a body and let $a$ represent the rate of absorption of radiant energy by a body. Show that at equilibrium

$$\frac{e_1}{a_1} = \frac{e_2}{a_2} = 1 \tag{1-5}$$

This relation, (1-5), is known as Kirchhoff's law for radiation.

The equilibrium state is one of constant temperature throughout the enclosed system, and in that state the emission rate necessarily equals the absorption rate for each body. Hence

$$e_1 = a_1 \qquad \text{and} \qquad e_2 = a_2$$

Therefore

$$\frac{e_1}{a_1} = 1 = \frac{e_2}{a_2}$$

If one body, say body 2, is a blackbody, then $a_2 > a_1$ because a blackbody is a better absorber than a non-blackbody. Hence, it follows from (1-5) that $e_2 > e_1$. The observed fact that good absorbers are also good emitters is thus predicted by Kirchhoff's law.
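The arithmetic of Example 1-1 can be retraced with a few lines of Python. This is only a sketch: the function and variable names are our own, and the constants are the rounded values quoted in the text.

```python
# Constants as quoted in the text; the names are illustrative, not from the book.
WIEN_CONSTANT = 2.898e-3      # m-K, the constant in lambda_max * T = const (1-3b)
STEFAN_BOLTZMANN = 5.67e-8    # W/m^2-K^4, sigma in Stefan's law (1-2)
ANGSTROM = 1e-10              # m


def surface_temperature(lambda_max_angstrom):
    """Temperature in K from Wien's displacement law, T = const / lambda_max."""
    return WIEN_CONSTANT / (lambda_max_angstrom * ANGSTROM)


def radiancy(temperature_k):
    """Radiancy R_T in W/m^2 from Stefan's law, R_T = sigma * T^4."""
    return STEFAN_BOLTZMANN * temperature_k ** 4


for name, lambda_max in [("sun", 5100.0), ("North Star", 3500.0)]:
    T = surface_temperature(lambda_max)
    R = radiancy(T)
    # 1 m^2 = 1e4 cm^2, so divide by 1e4 to get the power per cm^2.
    print(f"{name}: T = {T:.0f} K, R_T = {R:.3g} W/m^2 = {R / 1e4:.3g} W/cm^2")
```

Because the script carries the unrounded temperature into Stefan's law, its output can differ slightly from figures obtained with the rounded temperatures; the fourth power makes $R_T$ sensitive to that rounding.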
1-3 CLASSICAL THEORY OF CAVITY RADIATION

Shortly after the turn of the present century, Rayleigh, and also Jeans, made a calculation of the energy density of cavity (or blackbody) radiation that points up a serious conflict between classical physics and experimental results. This calculation is similar to calculations that arise in considering many other phenomena (e.g., specific heats of solids) to be treated later. We present the details here, but as an aid in guiding us through the calculations we first outline their general procedure.

Consider a cavity with metallic walls heated uniformly to temperature $T$. The walls emit electromagnetic radiation in the thermal range of frequencies. We know that this happens, basically, because of the accelerated motions of the electrons in the metallic walls that arise from thermal agitation (see Appendix B). However, it is not necessary to study the behavior of the electrons in the walls of the cavity in detail. Instead, attention is focused on the behavior of the electromagnetic waves in the interior of the cavity. Rayleigh and Jeans proceeded as follows. First, classical electromagnetic theory is used to show that the radiation inside the cavity must exist in the form of standing waves with nodes at the metallic surfaces. By using geometrical arguments, a count is made of the number of such standing waves in the frequency interval $\nu$ to $\nu + d\nu$, in order to determine how the number depends on $\nu$. Then a result of classical kinetic theory is used to calculate the average total energy of these waves when the system is in thermal equilibrium. The average total energy depends, in the classical theory, only on the temperature $T$.

Figure 1-3. A metallic walled cubical cavity filled with electromagnetic radiation, showing three noninterfering components of that radiation bouncing back and forth between the walls and forming standing waves with nodes at each wall.
The number of standing waves in the frequency interval times the average energy of the waves, divided by the volume of the cavity, gives the average energy content per unit volume in the frequency interval $\nu$ to $\nu + d\nu$. This is the required quantity, the energy density $\rho_T(\nu)$. Let us now do it ourselves.

We assume for simplicity that the metallic-walled cavity filled with electromagnetic radiation is in the form of a cube of edge length $a$, as shown in Figure 1-3. Then the radiation reflecting back and forth between the walls can be analyzed into three components along the three mutually perpendicular directions defined by the edges of the cavity. Since the opposing walls are parallel to each other, the three components of the radiation do not mix, and we may treat them separately. Consider first the $x$ component and the metallic wall at $x = 0$. All the radiation of this component which is incident upon the wall is reflected by it, and the incident and reflected waves combine to form a standing wave. Now, since electromagnetic radiation is a transverse vibration with the electric field vector $\mathbf{E}$ perpendicular to the propagation direction, and since the propagation direction for this component is perpendicular to the wall in question, its electric field vector $\mathbf{E}$ is parallel to the wall. A metallic wall cannot, however, support an electric field parallel to the surface, since charges can always flow in such a way as to neutralize the electric field. Therefore, $\mathbf{E}$ for this component must always be zero at the wall. That is, the standing wave associated with the $x$ component of the radiation must have a node (zero amplitude) at $x = 0$. The standing wave must also have a node at $x = a$ because there can be no parallel electric field in the corresponding wall.
Furthermore, similar conditions apply to the other two components; the standing wave associated with the $y$ component must have nodes at $y = 0$ and $y = a$, and the standing wave associated with the $z$ component must have nodes at $z = 0$ and $z = a$. These conditions put a limitation on the possible wavelengths, and therefore on the possible frequencies, of the electromagnetic radiation in the cavity.

Now we shall consider the question of counting the number of standing waves with nodes on the surfaces of the cavity, whose wavelengths lie in the interval $\lambda$ to $\lambda + d\lambda$ corresponding to the frequency interval $\nu$ to $\nu + d\nu$. To focus attention on the ideas involved in the calculation, we shall first treat the $x$ component alone; that is, we shall consider the simplified, but artificial, case of a "one-dimensional cavity" of length $a$. After we have worked through this case, we shall see that the procedure for generalizing to a real three-dimensional cavity is obvious.

The electric field for one-dimensional electromagnetic standing waves can be described mathematically by the function

$$E(x,t) = E_0 \sin(2\pi x/\lambda)\,\sin(2\pi\nu t) \tag{1-6}$$

where $\lambda$ is the wavelength of the wave, $\nu$ is its frequency, and $E_0$ is its maximum amplitude. The first two quantities are related by the equation

$$\nu = c/\lambda \tag{1-7}$$

where $c$ is the propagation velocity of electromagnetic waves. Equation (1-6) represents a wave whose amplitude has the sinusoidal space variation $\sin(2\pi x/\lambda)$ and which is oscillating in time sinusoidally with frequency $\nu$ like a simple harmonic oscillator. Since the amplitude is obviously zero, at all times $t$, for positions satisfying the relation

$$2x/\lambda = 0, 1, 2, 3, \ldots \tag{1-8}$$

the wave has fixed nodes; that is, it is a standing wave.
In order to satisfy the requirement that the waves have nodes at both ends of the one-dimensional cavity, we choose the origin of the $x$ axis to be at one end of the cavity ($x = 0$) and then require that at the other end ($x = a$)

$$2x/\lambda = n \qquad \text{for } x = a \tag{1-9}$$

where $n = 1, 2, 3, 4, \ldots$

This condition determines a set of allowed values of the wavelength $\lambda$. For these allowed values, the amplitude patterns of the standing waves have the appearance shown in Figure 1-4. These patterns may be recognized as the standing wave patterns for vibrations of a string fixed at both ends, a real physical system which also satisfies (1-6). In our case the patterns represent electromagnetic standing waves.

Figure 1-4. The amplitude patterns of standing waves in a one-dimensional cavity with walls at $x = 0$ and $x = a$, for the first three values of the index $n$.

It is convenient to continue the discussion in terms of the allowed frequencies instead of the allowed wavelengths. These frequencies are $\nu = c/\lambda$, where $2a/\lambda = n$. That is

$$\nu = cn/2a \qquad n = 1, 2, 3, 4, \ldots \tag{1-10}$$

We can represent these allowed values of frequency in terms of a diagram consisting of an axis on which we plot a point at every integral value of $n$. On such a diagram, the value of the allowed frequency $\nu$ corresponding to a particular value of $n$ is, by (1-10), equal to $c/2a$ times the distance $d$ from the origin to the appropriate point, or the distance $d$ is $2a/c$ times the frequency $\nu$. These relations are shown in Figure 1-5.

Figure 1-5. The allowed values of the index $n$, which determines the allowed values of the frequency, in a one-dimensional cavity of length $a$. The limits marked on the $n$ axis are $d = (2a/c)\nu$ and $d = (2a/c)(\nu + d\nu)$.

Such a diagram is useful in calculating the number of allowed values in the frequency range $\nu$ to $\nu + d\nu$, which we call $N(\nu)\,d\nu$. To evaluate this quantity we simply count the number of points on the $n$ axis which fall between two limits which are constructed so as to correspond to the frequencies $\nu$ and $\nu + d\nu$, respectively.
Since the points are distributed uniformly along the $n$ axis, it is apparent that the number of points falling between the two limits will be proportional to $d\nu$ but will not depend on $\nu$. In fact, it is easy to see that $N(\nu)\,d\nu = (2a/c)\,d\nu$. However, we must multiply this by an additional factor of 2 since, for each of the allowed frequencies, there are actually two independent waves corresponding to the two possible states of polarization of electromagnetic waves. Thus we have

$$N(\nu)\,d\nu = \frac{4a}{c}\,d\nu \tag{1-11}$$

This completes the calculation of the number of allowed standing waves for the artificial case of a one-dimensional cavity.

The above calculation makes apparent the procedures for extending the calculation to the real case of a three-dimensional cavity. This extension is indicated in Figure 1-6. Here the set of points uniformly distributed at integral values along a single $n$ axis is replaced by a uniform three-dimensional array of points whose three coordinates occur at integral values along each of three mutually perpendicular $n$ axes. Each point of the array corresponds to a particular allowed three-dimensional standing wave. The integral values of $n_x$, $n_y$, and $n_z$ specified by each point give the number of nodes of the $x$, $y$, and $z$ components, respectively, of the three-dimensional wave. The procedure is equivalent to analyzing a three-dimensional wave (i.e., one propagated in an arbitrary direction) into three one-dimensional component waves.

Figure 1-6. The allowed frequencies in a three-dimensional cavity in the form of a cube of edge length $a$ are determined by three indices $n_x$, $n_y$, $n_z$, which can each assume only integral values. For clarity, only a few of the very many points corresponding to sets of these indices are shown.
Here the number of allowed frequencies in the frequency interval $\nu$ to $\nu + d\nu$ is equal to the number of points contained between shells of radii corresponding to frequencies $\nu$ and $\nu + d\nu$, respectively. This will be proportional to the volume contained between these two shells, since the points are uniformly distributed. Thus it is apparent that $N(\nu)\,d\nu$ will be proportional to $\nu^2\,d\nu$, the first factor, $\nu^2$, being proportional to the area of the shells and the second factor, $d\nu$, being the distance between them. In the following example we shall work out the details and find

$$N(\nu)\,d\nu = \frac{8\pi V}{c^3}\,\nu^2\,d\nu \tag{1-12}$$

where $V = a^3$, the volume of the cavity.

Example 1-3. Derive (1-12), which gives the number of allowed electromagnetic standing waves in each frequency interval for the case of a three-dimensional cavity in the form of a metallic-walled cube of edge length $a$.

Consider radiation of wavelength $\lambda$ and frequency $\nu = c/\lambda$, propagating in the direction defined by the three angles $\alpha$, $\beta$, $\gamma$, as shown in Figure 1-7. The radiation must be a standing wave since all three of its components are standing waves. We have indicated the locations of some of the fixed nodes of this standing wave by a set of planes perpendicular to the propagation direction $\alpha$, $\beta$, $\gamma$. The distance between these nodal planes of the radiation is just $\lambda/2$, where $\lambda$ is its wavelength. We have also indicated the locations at the three axes of the nodes of the three components. The distances between these nodes are

$$\lambda_x/2 = \lambda/2\cos\alpha \qquad \lambda_y/2 = \lambda/2\cos\beta \qquad \lambda_z/2 = \lambda/2\cos\gamma \tag{1-13}$$

Figure 1-7. The nodal planes of a standing wave propagating in a certain direction in a cubical cavity.

Let us write expressions for the magnitudes at the three axes of the electric fields of the three components. They are

$$E(x,t) = E_{0x} \sin(2\pi x/\lambda_x)\,\sin(2\pi\nu t)$$
$$E(y,t) = E_{0y} \sin(2\pi y/\lambda_y)\,\sin(2\pi\nu t)$$
$$E(z,t) = E_{0z} \sin(2\pi z/\lambda_z)\,\sin(2\pi\nu t)$$

The expression for the $x$ component represents a wave with a maximum amplitude $E_{0x}$, with a space variation $\sin(2\pi x/\lambda_x)$, and which is oscillating with frequency $\nu$. As $\sin(2\pi x/\lambda_x)$ is zero for $2x/\lambda_x = 0, 1, 2, 3, \ldots$, the wave is a standing wave of wavelength $\lambda_x$ because it has fixed nodes separated by the distance $\Delta x = \lambda_x/2$. The expressions for the $y$ and $z$ components represent standing waves of maximum amplitudes $E_{0y}$ and $E_{0z}$ and wavelengths $\lambda_y$ and $\lambda_z$, but all three component standing waves oscillate with the frequency $\nu$ of the radiation. Note that these expressions automatically satisfy the requirement that the $x$ component have a node at $x = 0$, the $y$ component have a node at $y = 0$, and the $z$ component have a node at $z = 0$. To make them also satisfy the requirement that the $x$ component have a node at $x = a$, the $y$ component have a node at $y = a$, and the $z$ component have a node at $z = a$, set

$$2x/\lambda_x = n_x \quad \text{for } x = a$$
$$2y/\lambda_y = n_y \quad \text{for } y = a$$
$$2z/\lambda_z = n_z \quad \text{for } z = a$$

where $n_x = 1, 2, 3, \ldots$; $n_y = 1, 2, 3, \ldots$; $n_z = 1, 2, 3, \ldots$. Using (1-13), these conditions become

$$(2a/\lambda)\cos\alpha = n_x \qquad (2a/\lambda)\cos\beta = n_y \qquad (2a/\lambda)\cos\gamma = n_z$$

Squaring both sides of these equations and adding, we obtain

$$(2a/\lambda)^2(\cos^2\alpha + \cos^2\beta + \cos^2\gamma) = n_x^2 + n_y^2 + n_z^2$$

but the angles $\alpha$, $\beta$, $\gamma$ have the property

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$$

Thus

$$2a/\lambda = \sqrt{n_x^2 + n_y^2 + n_z^2}$$

where $n_x$, $n_y$, $n_z$ take on all possible integral values. This equation describes the limitation on the possible wavelengths of the electromagnetic radiation contained in the cavity.

We again continue the discussion in terms of the allowed frequencies instead of the allowed wavelengths. They are

$$\nu = \frac{c}{\lambda} = \frac{c}{2a}\sqrt{n_x^2 + n_y^2 + n_z^2} \tag{1-14a}$$

Now we shall count the number of allowed frequencies in a given frequency interval by constructing a uniform cubic lattice in one octant of a rectangular coordinate system in such a way that the three coordinates of each point of the lattice are equal to a possible set of the three integers $n_x$, $n_y$, $n_z$ (see Figure 1-6). By construction, each lattice point corresponds to an allowed frequency. Furthermore, $N(\nu)\,d\nu$, the number of allowed frequencies between $\nu$ and $\nu + d\nu$, is equal to $N(r)\,dr$, the number of points contained between concentric shells of radii $r$ and $r + dr$, where

$$r = \sqrt{n_x^2 + n_y^2 + n_z^2}$$

From (1-14a), this is

$$r = \frac{2a}{c}\,\nu \tag{1-14b}$$

Since $N(r)\,dr$ is equal to the volume enclosed by the shells times the density of lattice points, and since, by construction, the density is one, $N(r)\,dr$ is simply

$$N(r)\,dr = \frac{1}{8}\,4\pi r^2\,dr = \frac{\pi r^2\,dr}{2} \tag{1-15}$$

Setting this equal to $N(\nu)\,d\nu$, and evaluating $r^2\,dr$ from (1-14b), we have

$$N(\nu)\,d\nu = \frac{\pi}{2}\left(\frac{2a}{c}\right)^3 \nu^2\,d\nu$$

This completes the calculation except that we must multiply these results by a factor of 2 because, for each of the allowed frequencies we have enumerated, there are actually two independent waves corresponding to the two possible states of polarization of electromagnetic radiation. Thus we have derived (1-12). It can be shown that $N(\nu)$ is independent of the assumed shape of the cavity and depends only on its volume.

Note that there is a very significant difference between the results obtained for the case of a real three-dimensional cavity and the results we obtained earlier for the artificial case of a one-dimensional cavity. The factor of $\nu^2$ found in (1-12), but not in (1-11), will be seen to play a fundamental role in the arguments that follow. This factor arises, basically, because we live in a three-dimensional world, the power of $\nu$ being one less than the dimensionality. Although Planck, in ultimately resolving the serious discrepancies between classical theory and experiment, had to question certain points which had been considered to be obviously true, neither he nor others working on the problem questioned (1-12). It was, and remains, generally agreed that (1-12) is valid.

We now have a count of the number of standing waves.
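The lattice-point argument behind (1-11) and (1-12) can be checked by direct counting. The sketch below is an illustration under stated assumptions: the function names are ours, the radii are taken as integers so that exact integer arithmetic can be used, and the sample values of $r$ and $dr$ are arbitrary. It counts the points with positive integer indices in a shell and compares the result with the volume of one octant of that shell.

```python
import math


def count_modes_1d(a, c, nu, dnu):
    """Standing waves in a one-dimensional cavity of length a with frequency
    in [nu, nu + dnu): integers n >= 1 with (2a/c)nu <= n < (2a/c)(nu + dnu),
    doubled for the two polarization states."""
    n_low = max(math.ceil(2 * a * nu / c), 1)
    n_high = math.ceil(2 * a * (nu + dnu) / c)
    return 2 * max(n_high - n_low, 0)


def points_inside(radius):
    """Number of triples (nx, ny, nz) of positive integers with
    nx^2 + ny^2 + nz^2 < radius^2, for an integer radius."""
    r_squared = radius * radius
    total = 0
    for nx in range(1, radius):
        for ny in range(1, radius):
            remainder = r_squared - nx * nx - ny * ny
            if remainder < 1:
                break  # larger ny only makes the remainder smaller
            # nz^2 < remainder  <=>  nz <= isqrt(remainder - 1)
            total += math.isqrt(remainder - 1)
    return total


def shell_count(r, dr):
    """Lattice points with r <= sqrt(nx^2 + ny^2 + nz^2) < r + dr."""
    return points_inside(r + dr) - points_inside(r)


def octant_shell_volume(r, dr):
    """Volume of one octant of the shell; for small dr this is pi r^2 dr / 2."""
    return (math.pi / 6) * ((r + dr) ** 3 - r ** 3)


r, dr = 100, 5  # arbitrary sample values
counted = shell_count(r, dr)
estimated = octant_shell_volume(r, dr)
print(f"3D: counted {counted}, octant shell volume {estimated:.0f}, "
      f"ratio {counted / estimated:.3f}")

# 1D check in units where c = 1 and a = 1000: (4a/c) dnu = 4000 * 0.05 = 200.
print("1D:", count_modes_1d(a=1000.0, c=1.0, nu=3.0, dnu=0.05))
```

The three-dimensional count falls slightly below the shell volume because points lying on the coordinate planes are excluded; the relative difference shrinks as $r$ grows, which is why the continuum estimate (1-15) is adequate for the very large values of $r = 2a\nu/c$ that occur in a real cavity. Multiplying the count by 2 for polarization gives the quantity $N(\nu)\,d\nu$ of (1-12).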
The next step in the Rayleigh-Jeans classical theory of blackbody radiation is the evaluation of the average total energy contained in each standing wave of frequency $\nu$. According to classical physics, the energy of some particular wave can have any value from zero to infinity, the actual value being proportional to the square of the magnitude of its amplitude constant $E_0$. However, for a system containing a large number of physical entities of the same kind which are in thermal equilibrium with each other at temperature $T$, classical physics makes a very definite prediction about the average values of the energies of the entities. This applies to our case since the multitude of standing waves, which constitute the thermal radiation inside the cavity, are entities of the same kind which are in thermal equilibrium with each other at the temperature $T$ of the walls of the cavity. Thermal equilibrium is ensured by the fact that the walls of any real cavity will always absorb and reradiate, in different frequencies and directions, a small amount of the radiation incident upon them and, therefore, the different standing waves can gradually exchange energy as required to maintain equilibrium.

The prediction comes from classical kinetic theory, and it is called the law of equipartition of energy. This law states that for a system of gas molecules in thermal equilibrium at temperature $T$, the average kinetic energy of a molecule per degree of freedom is $kT/2$, where $k = 1.38 \times 10^{-23}$ joule/°K is called Boltzmann's constant. The law actually applies to any classical system containing, in equilibrium, a large number of entities of the same kind. For the case at hand the entities are standing waves which have one degree of freedom, their electric field amplitudes. Therefore, on the average their kinetic energies all have the same value, $kT/2$. However, each sinusoidally oscillating standing wave has a total energy which is twice its average kinetic energy.
This is a common property of physical systems which have a single degree of freedom that execute simple harmonic oscillations in time; familiar cases are a pendulum or a coil spring. Thus each standing wave in the cavity has, according to the classical equipartition law, an average total energy

$$\bar{\mathcal{E}} = kT \tag{1-16}$$

The most important point to note is that the average total energy $\bar{\mathcal{E}}$ is predicted to have the same value for all standing waves in the cavity, independent of their frequencies.

The energy per unit volume in the frequency interval $\nu$ to $\nu + d\nu$ of the blackbody spectrum of a cavity at temperature $T$ is just the product of the average energy per standing wave times the number of standing waves in the frequency interval, divided by the volume of the cavity. From (1-12) and (1-16) we therefore finally obtain the result

$$\rho_T(\nu)\,d\nu = \frac{8\pi\nu^2 kT}{c^3}\,d\nu \tag{1-17}$$

This is the Rayleigh-Jeans formula for blackbody radiation.

In Figure 1-8 we compare the predictions of this equation with experimental data. The discrepancy is apparent. In the limit of low frequencies, the classical spectrum approaches the experimental results, but, as the frequency becomes large, the theoretical prediction goes to infinity! Experiment shows that the energy density always remains finite, as it obviously must, and, in fact, that the energy density goes to zero at very high frequencies. The grossly unrealistic behavior of the prediction of classical theory at high frequencies is known in physics as the "ultraviolet catastrophe." This term is suggestive of the importance of the failure of the theory.

Figure 1-8. The Rayleigh-Jeans prediction (dashed line) compared with the experimental results (solid line) for the energy density of a blackbody cavity, showing the serious discrepancy called the ultraviolet catastrophe. The horizontal axis is $\nu$ in units of $10^{14}$ Hz.
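The divergence of the Rayleigh-Jeans formula can be made concrete with a short numerical sketch. The function names and the sample temperature and cutoff frequencies below are our own choices, not values from the text; the constants are the rounded ones quoted in the book. Integrating (1-17) from zero up to a cutoff frequency gives $8\pi kT\nu^3/3c^3$, which grows without limit as the cutoff is raised.

```python
import math

K_BOLTZMANN = 1.38e-23   # joule/K, as quoted in the text
C_LIGHT = 2.998e8        # m/sec


def rayleigh_jeans_density(nu, temperature):
    """Energy density per unit frequency from (1-17): 8 pi nu^2 k T / c^3."""
    return 8 * math.pi * nu ** 2 * K_BOLTZMANN * temperature / C_LIGHT ** 3


def rayleigh_jeans_total(nu_cutoff, temperature):
    """Integral of (1-17) from 0 to nu_cutoff, evaluated analytically:
    8 pi k T nu_cutoff^3 / (3 c^3)."""
    return (8 * math.pi * K_BOLTZMANN * temperature * nu_cutoff ** 3
            / (3 * C_LIGHT ** 3))


T = 1500.0  # K, an illustrative temperature
for nu_cutoff in (1e14, 1e15, 1e16):
    total = rayleigh_jeans_total(nu_cutoff, T)
    print(f"cutoff {nu_cutoff:.0e} Hz: classical energy density up to cutoff "
          f"= {total:.3e} joule/m^3")
```

Each tenfold increase in the cutoff multiplies the classical energy density by a thousand, so no finite total is ever reached. This is the ultraviolet catastrophe expressed numerically.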
1-4 PLANCK'S THEORY OF CAVITY RADIATION

In trying to resolve the discrepancy between theory and experiment, Planck was led to consider the possibility of a violation of the law of equipartition of energy on which the theory was based. From Figure 1-8 it is clear that the law gives satisfactory results for small frequencies. Thus we can assume

$$\bar{\mathcal{E}} \xrightarrow[\nu \to 0]{} kT \tag{1-18}$$

that is, the average total energy approaches $kT$ as the frequency approaches zero. The discrepancy at high frequencies could be eliminated if there is, for some reason, a cutoff, so that

$$\bar{\mathcal{E}} \xrightarrow[\nu \to \infty]{} 0 \tag{1-19}$$

that is, if the average total energy approaches zero as the frequency approaches infinity. In other words, Planck realized that, in the circumstances that prevail for the case of blackbody radiation, the average energy of the standing waves is a function of frequency $\bar{\mathcal{E}}(\nu)$ having the properties indicated by (1-18) and (1-19). This is in contrast to the law of equipartition of energy which assigns to the average energy $\bar{\mathcal{E}}$ a value independent of frequency.

Let us look at the origin of the equipartition law. It arises, basically, from a more comprehensive result of classical statistical mechanics called the Boltzmann distribution. (Arguments leading to the Boltzmann distribution are given in Appendix C for students not already familiar with it.) Here we shall use a special form of the Boltzmann distribution

$$P(\mathcal{E}) = \frac{e^{-\mathcal{E}/kT}}{kT} \tag{1-20}$$

in which $P(\mathcal{E})\,d\mathcal{E}$ is the probability of finding a given entity of a system with energy in the interval between $\mathcal{E}$ and $\mathcal{E} + d\mathcal{E}$, when the number of energy states for the entity in that interval is independent of $\mathcal{E}$. The system is supposed to contain a large number of entities of the same kind in thermal equilibrium at temperature $T$, and $k$ represents Boltzmann's constant.
The energies of the entities in the system we are considering, a set of simple harmonic oscillating standing waves in thermal equilibrium in a blackbody cavity, are governed by (1-20).

The Boltzmann distribution function is intimately related to Maxwell's distribution function for the energy of a molecule in a system of molecules in thermal equilibrium. In fact, the exponential in the Boltzmann distribution is responsible for the exponential factor in the Maxwell distribution. The factor of $\mathcal{E}^{1/2}$ that some students may know is also present in the Maxwell distribution results from the circumstance that the number of energy states for a molecule in the interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$ is not independent of $\mathcal{E}$ but instead increases in proportion to $\mathcal{E}^{1/2}$.

The Boltzmann distribution function provides complete information about the energies of the entities in our system, including, of course, the average value $\bar{\mathcal{E}}$ of the energies. The latter quantity can be obtained from $P(\mathcal{E})$ by using (1-20) to evaluate the integrals in the ratio

$$\bar{\mathcal{E}} = \frac{\int_0^\infty \mathcal{E}\,P(\mathcal{E})\,d\mathcal{E}}{\int_0^\infty P(\mathcal{E})\,d\mathcal{E}} \tag{1-21}$$

The integrand in the numerator is the energy, $\mathcal{E}$, weighted by the probability that the entity will be found with this energy. By integrating over all possible energies, the average value of the energy is obtained. The denominator is the probability of finding the entity with any energy and so should have the value one; it does. The integral in the numerator can be evaluated, and the result is just the law of equipartition of energy

$$\bar{\mathcal{E}} = kT \tag{1-22}$$

Instead of actually carrying through the evaluation here, it will be better, for the purpose of arguments to follow, to look at the graphical presentation of $P(\mathcal{E})$ and $\bar{\mathcal{E}}$ shown in the top half of Figure 1-9. There $P(\mathcal{E})$ is plotted as a function of $\mathcal{E}$. Its maximum value, $1/kT$, occurs at $\mathcal{E} = 0$, and the value of $P(\mathcal{E})$ decreases smoothly with increasing $\mathcal{E}$ to approach zero as $\mathcal{E} \to \infty$. That is, the result that would most probably be found in a measurement of $\mathcal{E}$ is zero.
But the average $\bar{\mathcal{E}}$ of the results that would be found in a number of measurements of $\mathcal{E}$ is greater than zero, as is shown on the abscissa of the top figure, since many measurements of $\mathcal{E}$ will lead to values greater than zero. The bottom half of Figure 1-9 indicates the evaluation of $\bar{\mathcal{E}}$ from $P(\mathcal{E})$.

Figure 1-9. Top: A plot of the Boltzmann probability distribution $P(\mathcal{E}) = e^{-\mathcal{E}/kT}/kT$. The average value of the energy $\mathcal{E}$ for this distribution is $\bar{\mathcal{E}} = kT$, which is the classical law of equipartition of energy. To calculate this value of $\bar{\mathcal{E}}$, we integrate $\mathcal{E}P(\mathcal{E})$ from zero to infinity. This is just the quantity that is being averaged, $\mathcal{E}$, multiplied by the relative probability $P(\mathcal{E})$ that the value of $\mathcal{E}$ will be found in a measurement of the energy. Bottom: A plot of $\mathcal{E}P(\mathcal{E})$. The area under this curve gives the value of $\bar{\mathcal{E}}$.

Planck's great contribution came when he realized that he could obtain the required cutoff, indicated in (1-19), if he modified the calculation leading from $P(\mathcal{E})$ to $\bar{\mathcal{E}}$ by treating the energy $\mathcal{E}$ as if it were a discrete variable instead of as the continuous variable that it definitely is from the point of view of classical physics. Quantitatively, this can be done by rewriting (1-21) in terms of a sum instead of an integral. We shall soon see that this is not too hard to do, but it will be much more instructive for us to study the graphical presentation in Figure 1-10 first.

Planck assumed that the energy $\mathcal{E}$ could take on only certain discrete values, rather than any value, and that the discrete values of the energy were uniformly distributed; that is, he took

$$\mathcal{E} = 0,\ \Delta\mathcal{E},\ 2\Delta\mathcal{E},\ 3\Delta\mathcal{E},\ 4\Delta\mathcal{E},\ \ldots \tag{1-23}$$

as the set of allowed values of the energy. Here $\Delta\mathcal{E}$ is the uniform interval between successive allowed values of the energy. The top part of Figure 1-10 illustrates an evaluation of $\bar{\mathcal{E}}$ from $P(\mathcal{E})$, for a case in which $\Delta\mathcal{E} \ll kT$. In this case the result obtained is $\bar{\mathcal{E}} \simeq kT$.
That is, a value essentially equal to the classical result is obtained here since the discreteness $\Delta\mathcal{E}$ is very small compared to the energy range $kT$ in which $P(\mathcal{E})$ changes by a significant amount; it makes no essential difference in this case whether $\mathcal{E}$ is continuous or discrete. The middle part of Figure 1-10 illustrates the case in which $\Delta\mathcal{E} \simeq kT$. Here we find $\bar{\mathcal{E}} < kT$, because most of the entities have energy $\mathcal{E} = 0$: $P(\mathcal{E})$ has a rather small value at the first allowed nonzero value $\Delta\mathcal{E}$, so $\mathcal{E} = 0$ dominates the calculation of the average value of $\mathcal{E}$ and a smaller result is obtained. The effect of the discreteness is seen most clearly, however, in the lower part of Figure 1-10, which illustrates a case in which $\Delta\mathcal{E} \gg kT$. In this case the probability of finding an entity with any of the allowed energy values greater than zero is negligible, since $P(\mathcal{E})$ is extremely small for all these values, and the result obtained is $\bar{\mathcal{E}} \ll kT$.

Recapitulating, Planck discovered that he could obtain $\bar{\mathcal{E}} \simeq kT$ when the difference in adjacent energies $\Delta\mathcal{E}$ is small, and $\bar{\mathcal{E}} \simeq 0$ when $\Delta\mathcal{E}$ is large. Since he needed to obtain the first result for small values of the frequency $\nu$, and the second result for large values of $\nu$, he clearly needed to make $\Delta\mathcal{E}$ an increasing function of $\nu$. Numerical work showed him that he could take the simplest possible relation between $\Delta\mathcal{E}$ and $\nu$ having this property. That is, he assumed these quantities to be proportional

$$\Delta\mathcal{E} \propto \nu \qquad \text{(1-24)}$$

Written as an equation instead of a proportionality, this is

$$\Delta\mathcal{E} = h\nu \qquad \text{(1-25)}$$

where $h$ is the proportionality constant.

Figure 1-10. Top: If the energy $\mathcal{E}$ is not a continuous variable but is instead restricted to discrete values $0, \Delta\mathcal{E}, 2\Delta\mathcal{E}, 3\Delta\mathcal{E}, \ldots$, as indicated by the ticks on the $\mathcal{E}$ axis of the figure, the integral used to calculate the average value $\bar{\mathcal{E}}$ must be replaced by a summation. The average value is thus a sum of areas of rectangles, each of width $\Delta\mathcal{E}$, and with heights given by the allowed values of $\mathcal{E}$ times $P(\mathcal{E})$ at the beginning of each interval. In this figure $\Delta\mathcal{E} \ll kT$, and the allowed energies being closely spaced, the area of all the rectangles differs but little from the area under the smooth curve. Thus the average value $\bar{\mathcal{E}}$ is nearly equal to $kT$, the value found in Figure 1-9. Middle: $\Delta\mathcal{E} \simeq kT$, and $\bar{\mathcal{E}}$ has a smaller value than it has in the case of the top figure. Bottom: $\Delta\mathcal{E} \gg kT$, and $\bar{\mathcal{E}}$ is further reduced. In all three figures the rectangles show the contribution to the total area of $\mathcal{E}P(\mathcal{E})$ for each allowed energy. The rectangle for $\mathcal{E} = 0$ of course is always of zero height. This will make a large effect on the total area if the widths of the rectangles are large.

Further numerical work allowed Planck to determine the value of the constant $h$ by finding the value which produced the best fit of his theory with the experimental data. The value he obtained was very close to the currently accepted value

$$h = 6.63 \times 10^{-34}\ \text{joule-sec}$$

This very famous constant is now called Planck's constant.

The formula Planck obtained for $\bar{\mathcal{E}}$ by evaluating the summation analogous to the integral in (1-21), and that we shall obtain in Example 1-4, is

$$\bar{\mathcal{E}}(\nu) = \frac{h\nu}{e^{h\nu/kT} - 1} \qquad \text{(1-26)}$$

Since $e^{h\nu/kT} \to 1 + h\nu/kT$ for $h\nu/kT \to 0$, we see that $\bar{\mathcal{E}}(\nu) \to kT$ in this limit, as predicted by (1-18). In the limit $h\nu/kT \to \infty$, $e^{h\nu/kT} \to \infty$, and $\bar{\mathcal{E}}(\nu) \to 0$, in agreement with the prediction of (1-19).

The formula which he then immediately obtained for the energy density in the blackbody spectrum, using his result for $\bar{\mathcal{E}}(\nu)$ rather than the classical value $\bar{\mathcal{E}} = kT$, was

$$\rho_T(\nu)\,d\nu = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/kT} - 1}\,d\nu \qquad \text{(1-27)}$$

This is Planck's blackbody spectrum. Figure 1-11 shows a comparison of this result of Planck's theory (expressed in terms of wavelength) with experimental results for a temperature $T = 1595$°K. The experimental results are in complete agreement with Planck's formula at all temperatures.
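The graphical argument of Figure 1-10 can also be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the ratio of sums directly for a small, a comparable, and a large spacing $\Delta\mathcal{E}/kT$, and compares each with the closed form (1-26). The function names and the truncation of the sums at a finite number of terms are assumptions made for illustration.

```python
import math


def average_energy_discrete(spacing_over_kt, n_terms=5000):
    """Average energy, in units of kT, when only energies n * spacing are allowed.

    Evaluates sum(E * P(E)) / sum(P(E)) with Boltzmann weights exp(-E/kT),
    truncating the infinite sums after n_terms terms (an assumption that is
    harmless once exp(-n_terms * spacing_over_kt) is negligible).
    """
    a = spacing_over_kt
    weights = [math.exp(-n * a) for n in range(n_terms)]
    numerator = sum(n * a * w for n, w in enumerate(weights))
    return numerator / sum(weights)


def average_energy_planck(spacing_over_kt):
    """Closed form (1-26) in units of kT: a / (exp(a) - 1), with a = h nu / kT."""
    return spacing_over_kt / math.expm1(spacing_over_kt)


# Small, comparable, and large spacing relative to kT (top, middle, bottom of Figure 1-10).
for a in (0.01, 1.0, 10.0):
    print(a, average_energy_discrete(a), average_energy_planck(a))
```

For the smallest spacing the average stays within one percent of $kT$, and for the largest it is well below a thousandth of $kT$, which is the cutoff behavior the text describes.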
We should remember that Planck did not alter the Boltzmann distribution. "All" he did was to treat the energy of the electromagnetic standing waves, oscillating sinusoidally in time, as a discrete instead of a continuous quantity.

Figure 1-11. Planck's energy density prediction (solid line) compared to the experimental results (circles) for the energy density of a blackbody, plotted against wavelength $\lambda$ in units of $10^4$ Å. The data were reported by Coblentz in 1916 and apply to a temperature of 1595°K. The author remarked in his paper that after drawing the spectral energy curves resulting from his measurements, "owing to eye fatigue it was impossible for months thereafter to give attention to the reduction of the data." The data, when finally reduced, led to a value for Planck's constant of $6.57 \times 10^{-34}$ joule-sec.

Example 1-4. Derive Planck's expression for the average energy $\bar{\mathcal{E}}$ and also his blackbody spectrum.

The quantity $\bar{\mathcal{E}}$ is evaluated from the ratio of sums

$$\bar{\mathcal{E}} = \frac{\sum_{n=0}^{\infty} \mathcal{E}P(\mathcal{E})}{\sum_{n=0}^{\infty} P(\mathcal{E})}$$

analogous to the ratio of integrals in (1-21). Sums must be used because with Planck's postulate the energy becomes a discrete variable that takes on only the values $\mathcal{E} = 0, h\nu, 2h\nu, 3h\nu, \ldots$. That is, $\mathcal{E} = nh\nu$ where $n = 0, 1, 2, 3, \ldots$. Evaluating the Boltzmann distribution $P(\mathcal{E}) = e^{-\mathcal{E}/kT}/kT$, we have

$$\bar{\mathcal{E}} = \frac{\sum_{n=0}^{\infty} \frac{nh\nu}{kT}\, e^{-nh\nu/kT}}{\sum_{n=0}^{\infty} \frac{1}{kT}\, e^{-nh\nu/kT}} = kT\,\frac{\sum_{n=0}^{\infty} n\alpha\, e^{-n\alpha}}{\sum_{n=0}^{\infty} e^{-n\alpha}} \qquad \text{where } \alpha = \frac{h\nu}{kT}$$

This, in turn, can be evaluated most easily by noting that

$$-\alpha \frac{d}{d\alpha} \ln \sum_{n=0}^{\infty} e^{-n\alpha} = \frac{-\alpha \frac{d}{d\alpha} \sum_{n=0}^{\infty} e^{-n\alpha}}{\sum_{n=0}^{\infty} e^{-n\alpha}} = \frac{-\sum_{n=0}^{\infty} \alpha \frac{d}{d\alpha} e^{-n\alpha}}{\sum_{n=0}^{\infty} e^{-n\alpha}} = \frac{\sum_{n=0}^{\infty} n\alpha\, e^{-n\alpha}}{\sum_{n=0}^{\infty} e^{-n\alpha}}$$

so that

$$\bar{\mathcal{E}} = kT\left(-\alpha \frac{d}{d\alpha} \ln \sum_{n=0}^{\infty} e^{-n\alpha}\right) = -h\nu \frac{d}{d\alpha} \ln \sum_{n=0}^{\infty} e^{-n\alpha}$$

Now

$$\sum_{n=0}^{\infty} e^{-n\alpha} = 1 + e^{-\alpha} + e^{-2\alpha} + e^{-3\alpha} + \cdots = 1 + X + X^2 + X^3 + \cdots \qquad \text{where } X = e^{-\alpha}$$

but

$$(1 - X)^{-1} = 1 + X + X^2 + X^3 + \cdots$$

so

$$\bar{\mathcal{E}} = -h\nu \frac{d}{d\alpha} \ln (1 - e^{-\alpha})^{-1} = -h\nu\,(1 - e^{-\alpha})(-1)(1 - e^{-\alpha})^{-2} e^{-\alpha} = \frac{h\nu\, e^{-\alpha}}{1 - e^{-\alpha}} = \frac{h\nu}{e^{\alpha} - 1} = \frac{h\nu}{e^{h\nu/kT} - 1}$$

We have derived (1-26) for the average energy of an electromagnetic standing wave of frequency $\nu$. Multiplying this by (1-12), the number $N(\nu)\,d\nu$ of waves having this frequency derived in Example 1-3, we immediately obtain the Planck blackbody spectrum, (1-27).

Example 1-5. It is convenient in analyzing experimental results, as in Figure 1-11, to express the Planck blackbody spectrum in terms of wavelength $\lambda$ rather than frequency $\nu$. Obtain $\rho_T(\lambda)$, the wavelength form of Planck's spectrum, from $\rho_T(\nu)$, the frequency form of the spectrum. The quantity $\rho_T(\lambda)$ is defined from the equality $\rho_T(\lambda)\,d\lambda = -\rho_T(\nu)\,d\nu$. The minus sign indicates that, though $\rho_T(\lambda)$ and $\rho_T(\nu)$ are both positive, $d\lambda$ and $d\nu$ have opposite signs. (An increase in frequency gives rise to a corresponding decrease in wavelength.)

From the relation $\nu = c/\lambda$ we have $d\nu = -(c/\lambda^2)\,d\lambda$, or $d\nu/d\lambda = -(c/\lambda^2)$, so that

$$\rho_T(\lambda) = -\rho_T(\nu)\frac{d\nu}{d\lambda} = \rho_T(\nu)\frac{c}{\lambda^2}$$

If now we set $\nu = c/\lambda$ in (1-27) for $\rho_T(\nu)$ we obtain

$$\rho_T(\lambda)\,d\lambda = \frac{8\pi hc}{\lambda^5}\,\frac{d\lambda}{e^{hc/\lambda kT} - 1} \qquad \text{(1-28)}$$

In Figure 1-12 we show $\rho_T(\lambda)$ versus $\lambda$ for several different temperatures. The trend from "red heat" to "white heat" to "blue heat" radiation with rising temperatures becomes clear as the distribution of radiant energy with wavelength is studied for increasing temperatures.

Figure 1-12. Planck's energy density of blackbody radiation at various temperatures as a function of wavelength $\lambda$ (in units of $10^4$ Å). Note that the wavelength at which the curve is a maximum decreases as the temperature increases.

Stefan's law, (1-2), and Wien's displacement law, (1-3), can be derived from the Planck formula. By fitting them to the experimental results we can determine values of the constants $h$ and $k$. Stefan's law is obtained by integrating Planck's law over the entire spectrum of wavelengths.
The radiancy is found to be proportional to the fourth power of the temperature, the proportionality constant $2\pi^5 k^4/15c^2h^3$ being identified with $\sigma$, Stefan's constant, which has the experimentally determined value $5.67 \times 10^{-8}$ W/m²-°K⁴. Wien's displacement law is obtained by setting $d\rho(\lambda)/d\lambda = 0$. We find $\lambda_{max}T = 0.2014\,hc/k$ and identify the right-hand side of the equation with Wien's experimentally determined constant $2.898 \times 10^{-3}$ m-°K. Using these two measured values and assuming a value for the speed of light $c$, we can calculate the values of $h$ and $k$. Indeed, this was done by Planck, his values agreeing very well with those obtained subsequently by other methods.

1-5 THE USE OF PLANCK'S RADIATION LAW IN THERMOMETRY

The radiation emitted from a hot body can be used to measure its temperature. If total radiation is used, then, from the Stefan-Boltzmann law, we know that the energies emitted by two sources are in the ratio of the fourth power of the temperature. However, it is difficult to measure total radiation from most sources, so we measure instead the radiancy over a finite wavelength band. Here we use the Planck radiation law, which gives the radiancy as a function of temperature and wavelength. For monochromatic radiation of wavelength $\lambda$ the ratio of the spectral intensities emitted by sources at $T_2$ °K and $T_1$ °K is given from Planck's law as

$$\frac{e^{hc/\lambda kT_1} - 1}{e^{hc/\lambda kT_2} - 1}$$

If $T_1$ is taken as a standard reference temperature, then $T_2$ can be determined relative to the standard from this expression by measuring the ratio experimentally. This procedure is used in the International Practical Temperature Scale, where the normal melting point of gold is taken as the standard fixed point, 1068°C. That is, the primary standard optical pyrometer is arranged to compare the spectral radiancy from a blackbody at an unknown temperature $T > 1068$°C with a blackbody at the gold point.
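The relations quoted above lend themselves to a quick numerical check. The sketch below is an illustration added here, not the book's procedure: it solves $e^{-x} + x/5 = 1$ by bisection to recover the factor $0.2014 = 1/x$, evaluates Wien's and Stefan's constants from $h$, $k$, and $c$, and inverts the spectral-intensity ratio for an unknown temperature. The function names, the 650 nm wavelength, and the 2000°K test source are assumptions chosen only for the example.

```python
import math

# Constants as quoted to four figures in the book's table of constants (SI units).
H = 6.626e-34   # Planck's constant, joule-sec
K = 1.381e-23   # Boltzmann's constant, joule/K
C = 2.998e8     # speed of light, m/sec


def solve_wien_x(tol=1e-12):
    """Nonzero root of exp(-x) + x/5 = 1, found by bisection on [1, 10]."""
    def f(x):
        return math.exp(-x) + x / 5.0 - 1.0
    lo, hi = 1.0, 10.0          # f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wien_constant():
    """lambda_max * T = hc / (x k), in m-K."""
    return H * C / (solve_wien_x() * K)


def stefan_constant():
    """sigma = 2 pi^5 k^4 / (15 c^2 h^3), in W/m^2-K^4."""
    return 2.0 * math.pi**5 * K**4 / (15.0 * C**2 * H**3)


def spectral_ratio(t2, t1, lam):
    """Ratio of spectral intensities at wavelength lam for sources at t2 and t1 (kelvin)."""
    a = H * C / (lam * K)
    return math.expm1(a / t1) / math.expm1(a / t2)


def temperature_from_ratio(ratio, t1, lam):
    """Invert spectral_ratio for t2, given a measured ratio and the reference t1."""
    a = H * C / (lam * K)
    return a / math.log1p(math.expm1(a / t1) / ratio)


T_GOLD = 1068.0 + 273.15      # gold point as quoted in the text, converted to kelvin
LAM = 650e-9                  # hypothetical pyrometer wavelength, m

print("x =", solve_wien_x(), " 1/x =", 1.0 / solve_wien_x())
print("Wien constant (m-K):", wien_constant())
print("Stefan constant (W/m^2-K^4):", stefan_constant())
ratio = spectral_ratio(2000.0, T_GOLD, LAM)
print("recovered temperature (K):", temperature_from_ratio(ratio, T_GOLD, LAM))
```

With four-figure constants the computed values should land close to the measured $2.898 \times 10^{-3}$ m-°K and $5.67 \times 10^{-8}$ W/m²-°K⁴, and the last line simply confirms that the ratio can be inverted to recover the source temperature.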
Procedures must be adopted, and the theory developed, to allow for the practical circumstances that most sources are not blackbodies and that a finite spectral band is used instead of monochromatic radiation. Most optical pyrometers use the eye as a detector and call for a large spectral bandwidth so that there will be enough energy for the eye to see. The simplest and most accurate type of instrument used above the gold point is the disappearing filament optical pyrometer (see Figure 1-13). The source whose temperature is to be measured is imaged on the filament of the pyrometer lamp, and the current in the lamp is varied until the filament seems to disappear into the background of the source image. Careful calibration and precision potentiometers ensure accurate measurement of temperature.

Figure 1-13. Schematic diagram of an optical pyrometer, showing the source of radiation, the objective lens, the pyrometer lamp, and the microscope.

A particularly interesting example in the general category of thermometry using blackbody radiation was discovered by Dicke, Penzias, and Wilson in the 1960s. Using a radio telescope operating in the several millimeter to several centimeter wavelength range, they found that a blackbody spectrum of electromagnetic radiation, with a characteristic temperature of about 3°K, is impinging on the earth with equal intensity from all directions. The uniformity in direction indicates that the radiation fills the universe uniformly. Astrophysicists consider these measurements as strong evidence in favor of the so-called big-bang theory, in which the universe was in the form of a very dense, and hot, fireball of particles and radiation around $10^{10}$ years ago. Due to subsequent expansion and the resulting Doppler shift, the temperature of the radiation would be expected to drop by now to something like the observed value of 3°K.
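As a rough consistency check on the wavelength range mentioned above, one can locate the maximum of the wavelength form of Planck's spectrum, (1-28), for a 3°K blackbody. The sketch below is illustrative only: it scans a grid of wavelengths rather than solving $d\rho(\lambda)/d\lambda = 0$ analytically, and the grid limits, the number of points, and the function names are assumptions.

```python
import math

H = 6.626e-34   # Planck's constant, joule-sec
K = 1.381e-23   # Boltzmann's constant, joule/K
C = 2.998e8     # speed of light, m/sec


def energy_density(lam, temp):
    """Planck energy density per unit wavelength, (1-28), in joule/m^4."""
    x = H * C / (lam * K * temp)
    return 8.0 * math.pi * H * C / lam**5 / math.expm1(x)


def peak_wavelength(temp, lam_min, lam_max, n_points=20000):
    """Wavelength on a uniform grid at which energy_density is largest."""
    step = (lam_max - lam_min) / (n_points - 1)
    grid = [lam_min + i * step for i in range(n_points)]
    return max(grid, key=lambda lam: energy_density(lam, temp))


T_BACKGROUND = 3.0   # characteristic temperature quoted in the text, kelvin
lam_peak = peak_wavelength(T_BACKGROUND, 1e-4, 1e-2)   # scan 0.1 mm to 1 cm
print("peak wavelength (mm):", lam_peak * 1e3)
print("Wien prediction (mm):", 2.898e-3 / T_BACKGROUND * 1e3)
```

The peak falls near one millimeter, at the short-wavelength end of the range in which the measurements were made.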
1-6 PLANCK'S POSTULATE AND ITS IMPLICATIONS

Planck's contribution can be stated as a postulate, as follows:

Any physical entity with one degree of freedom whose "coordinate" is a sinusoidal function of time (i.e., executes simple harmonic oscillations) can possess only total energies $\mathcal{E}$ which satisfy the relation

$$\mathcal{E} = nh\nu \qquad n = 0, 1, 2, 3, \ldots$$

where $\nu$ is the frequency of the oscillation, and $h$ is a universal constant.

The word coordinate is used in its general sense to mean any quantity which describes the instantaneous condition of the entity. Examples are the length of a coil spring, the angular position of a pendulum bob, and the amplitude of a wave. All these examples happen also to be sinusoidal functions of time.

An energy-level diagram, as shown in Figure 1-14, provides a convenient way of illustrating the behavior of an entity governed by this postulate, and it is also useful in contrasting this behavior with what would be expected on the basis of classical physics. In such a diagram we indicate each of the possible energy states of the entity with a horizontal line. The distance from the line to the zero energy line is proportional to the total energy to which it corresponds. Since the entity may have any energy from zero to infinity according to classical physics, the classical energy-level diagram consists of a continuum of lines extending from zero up. However, the entity executing simple harmonic oscillations can have only one of the discrete total energies $\mathcal{E} = 0, h\nu, 2h\nu, 3h\nu, \ldots$ if it obeys Planck's postulate. This is indicated by the discrete set of lines in its energy-level diagram. The energy of the entity obeying Planck's postulate is said to be quantized, the allowed energy states are called quantum states, and the integer $n$ is called the quantum number.

It may have occurred to the student that there are physical systems whose behavior seems to be obviously in disagreement with Planck's postulate.
For instance, an ordinary pendulum executes simple harmonic oscillations, and yet this system certainly appears to be capable of possessing a continuous range of energies. Before we accept this argument, however, we should make some simple numerical calculations concerning such a system.

Figure 1-14. Left: The allowed energies in a classical system, oscillating sinusoidally with frequency $\nu$, are continuously distributed. Right: The allowed energies according to Planck's postulate are discretely distributed since they can only assume the values $nh\nu$ (the levels $\mathcal{E} = 0, h\nu, 2h\nu, 3h\nu, 4h\nu, 5h\nu$ are shown). We say that the energy is quantized, $n$ being the quantum number of an allowed quantum state.

Example 1-6. A pendulum consisting of a 0.01 kg mass is suspended from a string 0.1 m in length. Let the amplitude of its oscillation be such that the string in its extreme positions makes an angle of 0.1 rad with the vertical. The energy of the pendulum decreases due, for instance, to frictional effects. Is the energy decrease observed to be continuous or discontinuous?

The oscillation frequency of the pendulum is

$$\nu = \frac{1}{2\pi}\sqrt{\frac{g}{l}} = \frac{1}{2\pi}\sqrt{\frac{9.8\ \text{m/sec}^2}{0.1\ \text{m}}} = 1.6/\text{sec}$$

The energy of the pendulum is its maximum potential energy

$$mgh = mgl(1 - \cos\theta) = 0.01\ \text{kg} \times 9.8\ \text{m/sec}^2 \times 0.1\ \text{m} \times (1 - \cos 0.1) = 5 \times 10^{-5}\ \text{joule}$$

The energy of the pendulum is quantized so that changes in energy take place in discontinuous jumps of magnitude $\Delta E = h\nu$, but

$$\Delta E = h\nu = 6.63 \times 10^{-34}\ \text{joule-sec} \times 1.6/\text{sec} = 10^{-33}\ \text{joule}$$

whereas $E = 5 \times 10^{-5}$ joule. Therefore, $\Delta E/E = 2 \times 10^{-29}$. Hence, to measure the discreteness in the energy decrease we need to measure the energy to better than two parts in $10^{29}$. It is apparent that even the most sensitive experimental equipment is totally incapable of this energy resolution.

We conclude that experiments involving an ordinary pendulum cannot determine whether Planck's postulate is valid or not. The same is true of experiments on all other macroscopic mechanical systems. The smallness of $h$ makes the graininess in the energy too fine to be distinguished from an energy continuum. Indeed, $h$ might as well be zero for classical systems and, in fact, one way to reduce quantum formulas to their classical limits would be to let $h \to 0$ in these formulas. Only where we consider systems in which $\nu$ is so large and/or $\mathcal{E}$ is so small that $\Delta\mathcal{E} = h\nu$ is of the order of $\mathcal{E}$ are we in a position to test Planck's postulate. One example is, of course, the high-frequency standing waves in blackbody radiation. Many other examples will be considered in following chapters.

1-7 A BIT OF QUANTUM HISTORY

In its original form, Planck's postulate was not so far reaching as it is in the form we have given. Planck's initial work was done by treating, in detail, the behavior of the electrons in the walls of the blackbody and their coupling to the electromagnetic radiation within the cavity. This coupling leads to the same factor $\nu^2$ we obtained in (1-12) from the more general arguments due to Rayleigh and Jeans. Through this coupling, Planck related the energy in a particular frequency component of the blackbody radiation to the energy of an electron in the wall oscillating sinusoidally at the same frequency, and he postulated only that the energy of the oscillating particle is quantized. It was not until later that Planck accepted the idea that the oscillating electromagnetic waves were themselves quantized, and the postulate was broadened to include any entity whose single coordinate oscillates sinusoidally.

At first Planck was unsure whether his introduction of the constant $h$ was only a mathematical device or a matter of deep physical significance. In a letter to R. W. Wood, Planck called his limited postulate "an act of desperation." "I knew," he wrote, "that the problem (of the equilibrium of matter and radiation) is of fundamental significance for physics; I knew the formula that reproduces the energy distribution in the normal spectrum; a theoretical interpretation had to be found at any cost, no matter how high." For more than a decade Planck tried to fit the quantum idea into classical theory. With each attempt he appeared to retreat from his original boldness, but always he generated new ideas and techniques that quantum theory later adopted. What appears to have finally convinced him of the correctness and deep significance of his quantum hypothesis was its support of the definiteness of the statistical concept of entropy and the third law of thermodynamics.

It was during this period of doubt that Planck was editor of the German research journal Annalen der Physik. In 1905 he received Einstein's first relativity paper and stoutly defended Einstein's work. Thereafter he became one of young Einstein's patrons in scientific circles, but he resisted for some time the very ideas on the quantum theory of radiation advanced by Einstein that subsequently confirmed and extended Planck's own work. Einstein, whose deep insight into electromagnetism and statistical mechanics was perhaps unequalled by anyone at the time, saw as a result of Planck's work the need for a sweeping change in classical statistics and electromagnetism. He advanced predictions and interpretations of many physical phenomena which were later strikingly confirmed by experiment. In the next chapter we turn to one of these phenomena and follow another road on the way to quantum mechanics.

QUESTIONS

1. Does a blackbody always appear black? Explain the term blackbody.
2. Pockets formed by coals in a coal fire seem brighter than the coals themselves. Is the temperature in such pockets appreciably higher than the surface temperature of an exposed glowing coal?
3.
If we look into a cavity whose walls are kept at a constant temperature no details of the interior are visible. Explain.
4. The relation $R_T = \sigma T^4$ is exact for blackbodies and holds for all temperatures. Why is this relation not used as the basis of a definition of temperature at, for instance, 100°C?
5. A piece of metal glows with a bright red color at 1100°K. At this temperature, however, a piece of quartz does not glow at all. Explain. (Hint: Quartz is transparent to visible light.)
6. Make a list of distribution functions commonly used in the social sciences (e.g., distribution of families with respect to income). In each case, state whether the variable whose distribution is described is discrete or continuous.
7. In (1-4) relating spectral radiancy and energy density, what dimensions would a proportionality constant need to have?
8. What is the origin of the ultraviolet catastrophe?
9. The law of equipartition of energy requires that the specific heat of gases be independent of the temperature, in disagreement with experiment. Here we have seen that it leads to the Rayleigh-Jeans radiation law, also in disagreement with experiment. How can you relate these two failures of the equipartition law?
10. Compare the definitions and dimensions of spectral radiancy $R_T(\nu)$, radiancy $R_T$, and energy density $\rho_T(\nu)$.
11. Why is optical pyrometry commonly used above the gold point and not below it? What objects typically have their temperatures measured in this way?
12. Are there quantized quantities in classical physics? Is energy quantized in classical physics?
13. Does it make sense to speak of charge quantization in physics? How is this different from energy quantization?
14. Elementary particles seem to have a discrete set of rest masses. Can this be regarded as quantization of mass?
15. In many classical systems the allowed frequencies are quantized. Name some of the systems. Is energy quantized there too?
16.
Show that Planck's constant has the dimensions of angular momentum. Does this necessarily suggest that angular momentum is a quantized quantity?
17. For quantum effects to be everyday phenomena in our lives, what would be the minimum order of magnitude of $h$?
18. What, if anything, does the 3°K universal blackbody radiation tell us about the temperature of outer space?
19. Does Planck's theory suggest quantized atomic energy states?
20. Discuss the remarkable fact that discreteness in energy was first found in analyzing a continuous spectrum emitted by interacting atoms in a solid, rather than in analyzing a discrete spectrum such as is emitted by an isolated atom in a gas.

PROBLEMS

1. At what wavelength does a cavity at 6000°K radiate most per unit wavelength?
2. Show that the proportionality constant in (1-4) is $4/c$. That is, show that the relation between spectral radiancy $R_T(\nu)$ and energy density $\rho_T(\nu)$ is $R_T(\nu)\,d\nu = (c/4)\rho_T(\nu)\,d\nu$.
3. Consider two cavities of arbitrary shape and material, each at the same temperature $T$, connected by a narrow tube in which can be placed color filters (assumed ideal) which will allow only radiation of a specified frequency $\nu$ to pass through. (a) Suppose at a certain frequency $\nu'$, $\rho_T(\nu')\,d\nu$ for cavity 1 was greater than $\rho_T(\nu')\,d\nu$ for cavity 2. A color filter which passes only the frequency $\nu'$ is placed in the connecting tube. Discuss what will happen in terms of energy flow. (b) What will happen to their respective temperatures? (c) Show that this would violate the second law of thermodynamics; hence prove that all blackbodies at the same temperature must emit thermal radiation with the same spectrum independent of the details of their composition.
4. A cavity radiator at 6000°K has a hole 10.0 mm in diameter drilled in its wall. Find the power radiated through the hole in the range 5500-5510 Å. (Hint: See Problem 2.)
5. (a) Assuming the surface temperature of the sun to be 5700°K, use Stefan's law, (1-2), to determine the rest mass lost per second to radiation by the sun. Take the sun's diameter to be $1.4 \times 10^9$ m. (b) What fraction of the sun's rest mass is lost each year from electromagnetic radiation? Take the sun's rest mass to be $2.0 \times 10^{30}$ kg.
6. In a thermonuclear explosion the temperature in the fireball is momentarily $10^7$ °K. Find the wavelength at which the radiation emitted is a maximum.
7. At a given temperature, $\lambda_{max} = 6500$ Å for a blackbody cavity. What will $\lambda_{max}$ be if the temperature of the cavity walls is increased so that the rate of emission of spectral radiation is doubled?
8. At what wavelength does the human body emit its maximum temperature radiation? List assumptions you make in arriving at an answer.
9. Assuming that $\lambda_{max}$ is in the near infrared for red heat and in the near ultraviolet for blue heat, approximately what temperature in Wien's displacement law corresponds to red heat? To blue heat?
10. The average rate of solar radiation incident per unit area on the earth is 0.485 cal/cm²-min (or 338 W/m²). (a) Explain the consistency of this number with the solar constant (the solar energy falling per unit time at normal incidence on a unit area) whose value is 1.94 cal/cm²-min (or 1353 W/m²). (b) Consider the earth to be a blackbody radiating energy into space at this same rate. What surface temperature would the earth have under these circumstances?
11. Attached to the roof of a house are three solar panels, each 1 m × 2 m. Assume the equivalent of 4 hrs of normally incident sunlight each day, and that all the incident light is absorbed and converted to heat. How many gallons of water can be heated from 40°C to 120°C each day?
12. Show that the Rayleigh-Jeans radiation law, (1-17), is not consistent with the Wien displacement law $\nu_{max} \propto T$, (1-3a), or $\lambda_{max}T = \text{const}$, (1-3b).
13. We obtain $\nu_{max}$ in the blackbody spectrum by setting $d\rho_T(\nu)/d\nu = 0$ and $\lambda_{max}$ by setting $d\rho_T(\lambda)/d\lambda = 0$. Why is it not possible to get from $\lambda_{max}T = \text{const}$ to $\nu_{max} = \text{const} \times T$ simply by using $\lambda_{max} = c/\nu_{max}$? That is, why is it wrong to assume that $\nu_{max}\lambda_{max} = c$, where $c$ is the speed of light?
14. Consider the following numbers: 2, 3, 3, 4, 1, 2, 2, 1, 0 representing the number of hits garnered by each member of the Baltimore Orioles in a recent outing. (a) Calculate directly the average number of hits per man. (b) Let $x$ be a variable signifying the number of hits obtained by a man, and let $f(x)$ be the number of times the number $x$ appears. Show that the average number of hits per man can be written as

$$\bar{x} = \frac{\sum_{x=0}^{4} x f(x)}{\sum_{x=0}^{4} f(x)}$$

(c) Let $p(x)$ be the probability of the number $x$ being attained. Show that $\bar{x}$ is given by

$$\bar{x} = \sum_{x=0}^{4} x\,p(x)$$

15. Consider the function

$$f(x) = 10(10 - x)^2 \quad 0 \le x \le 10; \qquad f(x) = 0 \quad \text{all other } x$$

(a) From

$$\bar{x} = \frac{\int_{-\infty}^{\infty} x f(x)\,dx}{\int_{-\infty}^{\infty} f(x)\,dx}$$

find the average value of $x$. (b) Suppose the variable $x$ were discrete rather than continuous. Assume $\Delta x = 1$ so that $x$ takes on only integral values 0, 1, 2, ..., 10. Compute $\bar{x}$ and compare to the result of part (a). (Hint: It may be easier to compute the appropriate sum directly rather than working with general summation formulas.) (c) Compute $\bar{x}$ for $\Delta x = 5$, i.e. $x = 0, 5, 10$. Compare to the result of part (a). (d) Draw analogies between the results obtained in this problem and the discussion of Section 1-4. Be sure you understand the roles played by $\mathcal{E}$, $\Delta\mathcal{E}$, and $P(\mathcal{E})$.
16. Using the relations $P(\mathcal{E}) = e^{-\mathcal{E}/kT}/kT$ and $\int_0^{\infty} P(\mathcal{E})\,d\mathcal{E} = 1$, evaluate the integral of (1-21) to deduce (1-22), $\bar{\mathcal{E}} = kT$.
17. Use the relation $R_T(\nu)\,d\nu = (c/4)\rho_T(\nu)\,d\nu$ between spectral radiancy and energy density, together with Planck's radiation law, to derive Stefan's law. That is, show that

$$R_T = \int_0^{\infty} \frac{2\pi h}{c^2}\,\frac{\nu^3\,d\nu}{e^{h\nu/kT} - 1} = \sigma T^4$$

where $\sigma = 2\pi^5 k^4/15c^2h^3$.

$$\text{Hint:}\quad \int_0^{\infty} \frac{q^3\,dq}{e^q - 1} = \frac{\pi^4}{15}$$

18. Derive the Wien displacement law, $\lambda_{max}T = 0.2014\,hc/k$, by solving the equation $d\rho(\lambda)/d\lambda = 0$.
(Hint: Set $hc/\lambda kT = x$ and show that the equation quoted leads to $e^{-x} + x/5 = 1$. Then show that $x = 4.965$ is the solution.)
19. To verify experimentally that the 3°K universal background radiation accurately fits a blackbody spectrum, it is decided to measure $R_T(\lambda)$ from a wavelength below $\lambda_{max}$ where its value is $0.2R_T(\lambda_{max})$ to a wavelength above $\lambda_{max}$ where its value is again $0.2R_T(\lambda_{max})$. Over what range of wavelength must the measurements be made?
20. Show that, at the wavelength $\lambda_{max}$, where $\rho_T(\lambda)$ has its maximum,

$$\rho_T(\lambda_{max}) = 170\pi (kT)^5/(hc)^4$$

21. Use the result of the preceding problem to find the two wavelengths at which $\rho_T(\lambda)$ has a value one-half the value at $\lambda_{max}$. Give answers in terms of $\lambda_{max}$.
22. A tungsten sphere 2.30 cm in diameter is heated to 2000°C. At this temperature tungsten radiates only about 30% of the energy radiated by a blackbody of the same size and temperature. (a) Calculate the temperature of a perfectly black spherical body of the same size that radiates at the same rate as the tungsten sphere. (b) Calculate the diameter of a perfectly black spherical body at the same temperature as the tungsten sphere that radiates at the same rate.
23. (a) Show that about 25% of the radiant energy in a cavity is contained within wavelengths zero and $\lambda_{max}$; i.e., show that

$$\frac{\int_0^{\lambda_{max}} \rho_T(\lambda)\,d\lambda}{\int_0^{\infty} \rho_T(\lambda)\,d\lambda} \simeq \frac{1}{4}$$

(Hint: $hc/\lambda_{max}kT = 4.965$; hence Wien's approximation is fairly accurate in evaluating the integral in the numerator above.) (b) By what percent does Wien's approximation used over the entire wavelength range overestimate or underestimate the integrated energy density?
24. Find the temperature of a cavity having a radiant energy density at 2000 Å that is 3.82 times the energy density at 4000 Å.

2 PHOTONS - PARTICLELIKE PROPERTIES OF RADIATION

2-1 INTRODUCTION
interaction of radiation with matter
2-2 THE PHOTOELECTRIC EFFECT
stopping potential; cutoff frequency; absence of time lag
2-3 EINSTEIN'S QUANTUM THEORY OF THE PHOTOELECTRIC EFFECT
photons; photon energy quantization; work function; re-evaluation of Planck's constant; electromagnetic spectrum; momentum conservation
2-4 THE COMPTON EFFECT
Compton shift; derivation of Compton's equation; Compton wavelength; Rayleigh scattering; competition between Rayleigh and Compton scattering
2-5 THE DUAL NATURE OF ELECTROMAGNETIC RADIATION
diffraction; split personality of electromagnetic radiation; contemporary attitude of physicists
2-6 PHOTONS AND X-RAY PRODUCTION
production of x rays; bremsstrahlung; relation of bremsstrahlung to photoelectric effect
2-7 PAIR PRODUCTION AND PAIR ANNIHILATION
positrons; production of electron-positron pairs; pair annihilation; positronium; Dirac theory of positrons
2-8 CROSS SECTIONS FOR PHOTON ABSORPTION AND SCATTERING
definition of cross section; energy dependence of scattering, photoelectric, pair production, and total cross sections; exponential attenuation; attenuation coefficients and lengths
QUESTIONS
PROBLEMS

2-1 INTRODUCTION

In this chapter we shall examine processes in which radiation interacts with matter. Three processes (the photoelectric effect, the Compton effect, and pair production) involve the scattering or absorption of radiation in matter. Two processes (bremsstrahlung and pair annihilation) involve the production of radiation. In each case we shall obtain experimental evidence that radiation is particlelike in its interaction with matter, as distinguished from the wavelike nature of radiation when it propagates.
In the following chapter we shall study a generalization of this result, due to de Broglie, which leads directly into quantum mechanics. Some of the material of these two chapters may be a review of topics the student has already come across in studying elementary physics.

2-2 THE PHOTOELECTRIC EFFECT

It was in 1886 and 1887 that Heinrich Hertz performed the experiments that first confirmed the existence of electromagnetic waves and Maxwell's electromagnetic theory of light propagation. It is one of those fascinating and paradoxical facts in the history of science that in the course of his experiments Hertz noted the effect that Einstein later used to contradict other aspects of the classical electromagnetic theory. Hertz discovered that an electric discharge between two electrodes occurs more readily when ultraviolet light falls on one of the electrodes. Lenard, following up some experiments of Hallwachs, showed soon after that the ultraviolet light facilitates the discharge by causing electrons to be emitted from the cathode surface. The ejection of electrons from a surface by the action of light is called the photoelectric effect. It is the phenomenon underlying the operation of the solar cells being developed to convert thermal energy received from the sun directly into electrical energy.

Figure 2-1 shows an apparatus used to study the photoelectric effect. A glass envelope encloses the apparatus in an evacuated space. Monochromatic light, incident through a quartz window, falls on the metal plate A and liberates electrons, called photoelectrons. The electrons can be detected as a current if they are attracted to the metal cup B by means of a potential difference $V$ applied between A and B. The sensitive ammeter G serves to measure this photoelectric current.

Figure 2-1. An apparatus used to study the photoelectric effect. The potential difference $V$ can be varied continuously in magnitude, and also reversed in sign by the switching arrangement. If the same metal is used to make plate A and cup B then the potential difference between them equals the value of $V$ measured with a voltmeter between the points indicated in the figure. But if this is not the case then the measured value of $V$ must be corrected by adding to it the contact potential acting between the two metals in order to obtain the quantity of interest, the potential difference between A and B. The phenomenon of contact potential is explained in Chapter 11.

Figure 2-2. Graphs of current $i$ as a function of applied potential difference $V$ from data taken with the apparatus of Figure 2-1. The applied potential difference $V$ is called positive when the cup B in Figure 2-1 is positive with respect to the photoelectric surface A. In curve b the incident light intensity has been reduced to one-half that of curve a. The stopping potential $V_0$ is independent of light intensity, but the saturation currents $i_a$ and $i_b$ are directly proportional to it.

Curve a of Figure 2-2 is a plot of the photoelectric current, in an apparatus like that of Figure 2-1, as a function of the potential difference $V$. If $V$ is made large enough, the photoelectric current reaches a certain limiting (saturation) value at which all photoelectrons ejected from A are collected by cup B. If $V$ is reversed in sign, the photoelectric current does not immediately drop to zero, which suggests that the electrons are emitted from A with kinetic energy. Some will reach cup B in spite of the fact that the electric field opposes their motion. However, if this reversed potential difference is made large enough, a value $V_0$ called the stopping potential is reached at which the photoelectric current does drop to zero.
This potential difference $V_0$, multiplied by the electron charge, measures the kinetic energy $K_{\max}$ of the fastest ejected photoelectron. That is,

$$K_{\max} = eV_0 \tag{2-1}$$

The quantity $K_{\max}$ turns out experimentally to be independent of the intensity of the light, as is shown by curve b in Figure 2-2, in which the light intensity has been reduced to one-half the value used in obtaining curve a.

Figure 2-3 shows the stopping potential $V_0$ as a function of the frequency of the light incident on sodium. Note that there is a definite cutoff frequency $\nu_0$, below which no photoelectric effect occurs. These data were taken in 1914 by Millikan, whose painstaking work on the photoelectric effect won him the Nobel prize in 1923. Because the photoelectric effect for visible or near-visible light is largely a surface phenomenon, it is necessary in the experiments to avoid oxide films, grease, or other surface contaminants.

Figure 2-3 The stopping potential at various frequencies for sodium (horizontal axis: frequency in units of $10^{14}$/sec). The points show Millikan's data, except that the correction mentioned in the caption to Figure 2-1 has been recalculated using a recent measurement of the contact potential. The cutoff frequency $\nu_0$ is $5.6 \times 10^{14}$ Hz.

There are three major features of the photoelectric effect that cannot be explained in terms of the classical wave theory of light:

1. Wave theory requires that the oscillating electric vector $\mathbf{E}$ of the light wave increase in amplitude as the intensity of the light beam is increased. Since the force applied to the electron is $e\mathbf{E}$, this suggests that the kinetic energy of the photoelectrons should also increase as the light beam is made more intense. However, Figure 2-2 shows that $K_{\max}$, which equals $eV_0$, is independent of the light intensity. This has been tested over a range of intensities of $10^7$.

2. According to the wave theory the photoelectric effect should occur for any frequency of the light, provided only that the light is intense enough to give the energy needed to eject the photoelectrons. However, Figure 2-3 shows that there exists, for each surface, a characteristic cutoff frequency $\nu_0$. For frequencies less than $\nu_0$, the photoelectric effect does not occur, no matter how intense the illumination.

3. If the energy acquired by a photoelectron is absorbed from the wave incident on the metal plate, the "effective target area" for an electron in the metal is limited, and probably not much more than that of a circle about one atomic diameter across. In the classical theory the light energy is uniformly distributed over the wave front. Thus, if the light is feeble enough, there should be a measurable time lag, which we shall estimate in Example 2-1, between the time when light starts to impinge on the surface and the ejection of the photoelectron. During this interval the electron should be absorbing energy from the beam until it has accumulated enough to escape. However, no detectable time lag has ever been measured. This disagreement is particularly striking when the photoelectric substance is a gas; under these circumstances collective absorption mechanisms can be ruled out and the energy of the emitted photoelectron must certainly be soaked out of the light beam by a single atom or molecule.

Example 2-1. A potassium plate is placed 1 m from a feeble light source whose power is 1 W = 1 joule/sec. Assume that an ejected photoelectron may collect its energy from a circular area of the plate whose radius $r$ is, say, one atomic radius: $r \simeq 1 \times 10^{-10}$ m. The energy required to remove an electron through the potassium surface is about 2.1 eV = $3.4 \times 10^{-19}$ joule. (One electron volt = 1 eV = $1.60 \times 10^{-19}$ joule is the energy gained by an electron, of charge $1.60 \times 10^{-19}$ coul, in falling through a potential drop of 1 V.) How long would it take for such a target to absorb this much energy from the light source? Assume the light energy to be spread uniformly over the wave front.

The target area is $\pi r^2 = \pi \times 10^{-20}$ m$^2$. The area of a sphere of radius 1 m centered on the source is $4\pi(1\ \text{m})^2 = 4\pi$ m$^2$. Thus if the source radiates uniformly in all directions (i.e., if the energy is uniformly distributed over spherical wave fronts spreading out from the source, in agreement with classical theory) the rate $R$ at which energy falls on the target is given by

$$R = 1\ \text{joule/sec} \times \frac{\pi \times 10^{-20}\ \text{m}^2}{4\pi\ \text{m}^2} = 2.5 \times 10^{-21}\ \text{joule/sec}$$

Assuming that all this power is absorbed, we may calculate the time required for the electron to acquire enough energy to escape; we find

$$t = \frac{3.4 \times 10^{-19}\ \text{joule}}{2.5 \times 10^{-21}\ \text{joule/sec}} = 1.4 \times 10^{2}\ \text{sec} \simeq 2\ \text{min}$$

Of course, we could modify the preceding picture to reduce the calculated time by assuming a larger effective target area. The most favorable assumption, that energy is transferred by a resonance process from light wave to electron, leads to a target area of $\lambda^2$, where $\lambda$ is the wavelength of the light, but we would still obtain a finite time lag which is well within our ability to measure experimentally. (For ultraviolet light of $\lambda = 100$ Å, for example, $t \simeq 10^{-2}$ sec.) However, no time lag has been detected under any circumstances, the early experiments setting an upper limit of $10^{-9}$ sec on any such possible delay!

2-3 EINSTEIN'S QUANTUM THEORY OF THE PHOTOELECTRIC EFFECT

In 1905 Einstein called into question the classical theory of light, proposed a new theory, and cited the photoelectric effect as one application that could test which theory was correct. This was many years before Millikan's work, but Einstein was influenced by Lenard's experiment. As we have mentioned, Planck originally restricted his concept of energy quantization to the radiating electron in the walls of a blackbody cavity. Planck believed that electromagnetic energy, once radiated, spreads through space like water waves spread through water. Einstein proposed instead that radiant energy is quantized into concentrated bundles which later came to be called photons.

Einstein argued that the well-known optical experiments on interference and diffraction of electromagnetic radiation had been performed only in situations involving very large numbers of photons. These experiments yield results which are averages of the behaviors of the individual photons. The presence of the photons is not apparent in them any more than the presence of individual droplets of water is apparent in a fine spray from a garden hose, if the number of droplets is very high. Of course the interference and diffraction experiments definitely show that photons do not travel from where they are emitted to where they are absorbed in the simple ways that classical particles, like water droplets, do. They travel like classical waves, in the sense that calculations based on the way such waves propagate (and in particular the way two component waves reinforce or nullify each other depending on their relative phases) correctly explain measurements of the average way photons travel.

Einstein focused his attention not on the familiar wavelike way radiation propagates, but on what he first realized is the particlelike way it is emitted and absorbed.
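Looking back at Example 2-1, the classical time-lag estimate is simple enough to reproduce numerically. The following Python sketch is an illustration only and not part of the original text: the function name `classical_time_lag` and its parameters are hypothetical, and it assumes an isotropic point source whose power is completely absorbed over the chosen target area.

```python
import math

EV_IN_JOULE = 1.60e-19  # joule per electron volt, as quoted in Example 2-1


def classical_time_lag(power_w, distance_m, target_area_m2, escape_energy_j):
    """Time (sec) for a target to soak up escape_energy_j from a classical wave.

    Assumes the source radiates uniformly over a sphere of radius distance_m
    and that all energy falling on the target area is absorbed.
    """
    sphere_area = 4.0 * math.pi * distance_m ** 2
    rate = power_w * target_area_m2 / sphere_area  # joule/sec on the target
    return escape_energy_j / rate


work = 2.1 * EV_IN_JOULE  # energy needed to leave the potassium surface

# Target of one atomic radius (r = 1e-10 m), as in Example 2-1.
t_atomic = classical_time_lag(1.0, 1.0, math.pi * (1e-10) ** 2, work)

# Most favorable classical case: target area lambda**2 with lambda = 100 angstrom.
t_resonant = classical_time_lag(1.0, 1.0, (100e-10) ** 2, work)

print(f"atomic target:   {t_atomic:.3g} sec")
print(f"resonant target: {t_resonant:.3g} sec")
```

With the atomic target the lag comes out on the order of two minutes, and with the resonant target on the order of $10^{-2}$ sec; both are far above the experimental upper limit of $10^{-9}$ sec quoted in the example.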
Einstein reasoned that Planck's requirement that the energy content of the electromagnetic waves of frequency $\nu$ in a radiant source (e.g., an ultraviolet light source in a photoelectric experiment) can only be 0, or $h\nu$, or $2h\nu$, ..., or $nh\nu$, ... implies that in the process of going from energy state $nh\nu$ to energy state $(n-1)h\nu$ the source would emit a discrete burst of electromagnetic energy of energy content $h\nu$.

Einstein assumed that such a bundle of energy is initially localized in a small volume of space, and that it remains localized as it moves away from the source with velocity $c$. He assumed that the energy content $E$ of the bundle, or photon, is related to its frequency $\nu$ by the equation

$$E = h\nu \tag{2-2}$$

He also assumed that in the photoelectric process one photon is completely absorbed by one electron in the photocathode.

When the electron is emitted from the surface of the metal, its kinetic energy will be

$$K = h\nu - w \tag{2-3}$$

where $h\nu$ is the energy of the absorbed incident photon and $w$ is the work required to remove the electron from the metal. This work is needed to overcome the attractive fields of the atoms in the surface and losses of kinetic energy due to internal collisions of the electron. Some electrons are bound more tightly than others; some lose energy in collisions on the way out. In the case of loosest binding and no internal losses, the photoelectron will emerge with the maximum kinetic energy, $K_{\max}$. Hence

$$K_{\max} = h\nu - w_0 \tag{2-4}$$

where $w_0$, a characteristic energy of the metal called the work function, is the minimum energy needed by an electron to pass through the metal surface and escape the attractive forces that normally bind the electron to the metal.

Consider now how Einstein's photon hypothesis meets the three objections raised against the wave theory interpretation of the photoelectric effect. As for objection 1 (the lack of dependence of $K_{\max}$ on the intensity of illumination), there is complete agreement of the photon theory with experiment. Doubling the light intensity merely doubles the number of photons and thus doubles the photoelectric current; it does not change the energy $h\nu$ of the individual photons or the nature of the individual photoelectric process described by (2-3).

Objection 2 (the existence of a cutoff frequency) is removed at once by (2-4). If $K_{\max}$ equals zero we have

$$h\nu_0 = w_0 \tag{2-5}$$

which asserts that a photon of frequency $\nu_0$ has just enough energy to eject the photoelectrons and none extra to appear as kinetic energy. If the frequency is reduced below $\nu_0$, the individual photons, no matter how many of them there are (that is, no matter how intense the illumination), will not have enough energy individually to eject photoelectrons.

Objection 3 (the absence of a time lag) is eliminated in the photon theory because the required energy is supplied in concentrated bundles. It is not spread uniformly over a large area, as we assumed in Example 2-1, which is based on the assumption that the classical wave theory is true. If there is any illumination at all incident on the cathode, then there will be at least one photon that hits it; this photon will be immediately absorbed by some atom, leading to the immediate emission of a photoelectron.

Let us rewrite Einstein's photoelectric equation, (2-4), by substituting $eV_0$ for $K_{\max}$ from (2-1). This yields

$$V_0 = \frac{h\nu}{e} - \frac{w_0}{e}$$

Thus Einstein's theory predicts a linear relationship between the stopping potential $V_0$ and the frequency $\nu$, in complete agreement with experimental results as shown in Figure 2-3. The slope of the experimental curve in the figure should be $h/e$ or, using data from the figure,

$$\frac{h}{e} = \frac{2.1\ \text{V} - 0.1\ \text{V}}{11.0 \times 10^{14}/\text{sec} - 6.0 \times 10^{14}/\text{sec}} = 4.0 \times 10^{-15}\ \text{V-sec}$$

We can find $h$ by multiplying this ratio by the electronic charge $e$. Thus $h = 4.0 \times 10^{-15}$ V-sec $\times\ 1.6 \times 10^{-19}$ coul $= 6.4 \times 10^{-34}$ joule-sec. From a much more careful analysis of these and other data, including data taken with lithium surfaces, Millikan found the value $h = 6.57 \times 10^{-34}$ joule-sec, with an accuracy of about 0.5%. This early measurement was in good agreement with the value of $h$ derived from Planck's radiation formula. The numerical agreement in two determinations of $h$, using completely different phenomena and theories, is striking. A modern value of $h$, deduced from diverse experiments, is

$$h = 6.6262 \times 10^{-34}\ \text{joule-sec}$$

To quote Millikan: "The photoelectric effect ... furnishes a proof which is quite independent of the facts of blackbody radiation of the correctness of the fundamental assumption of the quantum theory, namely, the assumption of a discontinuous or explosive emission of the energy absorbed by the electronic constituents of atoms from ... waves. It materializes, so to speak, the quantity h discovered by Planck through the study of blackbody radiation and gives us a confidence inspired by no other type of phenomenon that the primary physical conception underlying Planck's work corresponds to reality."

Example 2-2. Deduce the work function for sodium from Figure 2-3.

The intersection of the straight line in Figure 2-3 with the horizontal axis is the cutoff frequency, $\nu_0 = 5.6 \times 10^{14}$/sec. Substituting this into (2-5) gives us

$$w_0 = h\nu_0 = 6.63 \times 10^{-34}\ \text{joule-sec} \times 5.6 \times 10^{14}/\text{sec} = 3.7 \times 10^{-19}\ \text{joule} \times \frac{1\ \text{eV}}{1.60 \times 10^{-19}\ \text{joule}} = 2.3\ \text{eV}$$

The same value is obtained from Figure 2-3 as the magnitude of the intercept of the extended line with the vertical axis. For most conducting metals the value of the work function is of the order of a few electron volts. It is the same as the work function for thermionic emission from these metals.

Example 2-3. At what rate per unit area do photons strike the metal plate in Example 2-1? Assume that the light is monochromatic, of wavelength 5890 Å (yellow light).
The rate per unit area at which energy falls on a metal plate 1 m from a 1-W light source (see Example 2-1) is

$$R = \frac{1\ \text{joule/sec}}{4\pi(1\ \text{m})^2} = 8.0 \times 10^{-2}\ \text{joule/m}^2\text{-sec} = 5.0 \times 10^{17}\ \text{eV/m}^2\text{-sec}$$

Each photon has an energy of

$$E = h\nu = \frac{hc}{\lambda} = \frac{6.63 \times 10^{-34}\ \text{joule-sec} \times 3.00 \times 10^{8}\ \text{m/sec}}{5.89 \times 10^{-7}\ \text{m}} = 3.4 \times 10^{-19}\ \text{joule} = 2.1\ \text{eV}$$

Thus the rate $R$ at which photons strike a unit area of the plate is

$$R = 5.0 \times 10^{17}\ \frac{\text{eV}}{\text{m}^2\text{-sec}} \times \frac{1\ \text{photon}}{2.1\ \text{eV}} = 2.4 \times 10^{17}\ \frac{\text{photons}}{\text{m}^2\text{-sec}}$$

The photoelectric effect is just able to occur because the photon energy just equals the 2.1 eV work function for the potassium surface (see Example 2-1). Note that if the wavelength is slightly increased (that is, if $\nu$ is slightly decreased) the photoelectric effect will not occur, no matter how large the rate $R$ might be.

This example suggests that the intensity of light $I$ can be regarded as the product of $N$, the number of photons per unit area per unit time, and $h\nu$, the energy of a single photon. We see that even at the relatively low intensity here ($\simeq 10^{-1}$ W/m$^2$) the number $N$ is extremely large ($\simeq 10^{17}$ photons/m$^2$-sec), so that the energy of any one photon is very small. This accounts for the extreme fineness of the granularity of radiation and suggests why ordinarily it is difficult to detect at all. It is analogous to detecting the atomic structure of bulk matter, which for most purposes can be regarded as continuous, the discreteness being revealed only under special circumstances.

In 1921 Einstein received the Nobel Prize for predicting theoretically the law of the photoelectric effect. Before Millikan's complete experimental validation of this law in 1914, Einstein was recommended to membership in the Prussian Academy of Sciences by Planck and others. Their early negative attitude toward the photon hypothesis is revealed in their signed affidavit, praising Einstein, in which they wrote: "Summing up, we may say that there is hardly one among the great problems, in which modern physics is so rich, to which Einstein has not made an important contribution. That he may have sometimes missed the target in his speculations, as, for example, in his hypothesis of light quanta (photons), cannot really be held too much against him, for it is not possible to introduce fundamentally new ideas, even in the most exact sciences, without occasionally taking a risk."

Today the photon hypothesis is used throughout the electromagnetic spectrum, not only in the light region (see Figure 2-4). A microwave cavity, for example, can be said to contain photons. At $\lambda = 10$ cm, a typical microwave wavelength, the photon energy can be computed as above to be $1.24 \times 10^{-5}$ eV. This energy is much too low to eject photoelectrons from metal surfaces. For x rays, or for energetic $\gamma$ rays such as are emitted from radioactive nuclei, the photon energy may be $10^6$ eV or higher. Such photons can eject electrons bound deep in heavy atoms by energies of the order of $10^5$ eV. The photons in the visible region of the electromagnetic spectrum are not energetic enough to do this, the photoelectrons which they eject being the so-called conduction electrons which are bound to the metal by energies of only a few electron volts.

Figure 2-4 The electromagnetic spectrum, showing wavelength, frequency, and energy per photon on a logarithmic scale. (Regions labeled in the figure include $\gamma$ rays, x rays, ultraviolet, visible light, infrared, the radar bands, TV and FM, standard broadcast radio, and power frequencies.)

Notice that the photons are absorbed in the photoelectric process.
This requires the electrons to be bound to atoms, or solids, for a truly free electron cannot absorb a photon and conserve both total relativistic energy and momentum in the process. We must have a bound electron, therefore, the binding forces serving to transmit momentum to the atom or solid. Due to the large mass of an atom, or solid, compared to the electron, the system can absorb a large amount of momentum without acquiring a significant amount of energy. Our photoelectric energy equation remains valid, the effect being possible only because there is a heavy recoiling particle in addition to an ejected electron. The photoelectric effect is one important way in which photons, of energy up to and including x-ray energies, are absorbed by matter. At higher energies other photon absorption processes, soon to be discussed, become more important.

Finally, it should be emphasized here that in the Einstein picture a photon of frequency $\nu$ has exactly the energy $h\nu$; it does not have energies that are integral multiples of $h\nu$. Of course, there can be $n$ photons of frequency $\nu$ so that the energy at that frequency can be $nh\nu$. In treating blackbody cavity radiation in the Einstein picture, we deal with a "photon gas," because the radiant energy is localized in space in bundles rather than extended through space in standing waves. Years after the Planck deduction of the cavity radiation formula, Bose and Einstein derived the same formula on the basis of a photon gas.

2-4 THE COMPTON EFFECT

The corpuscular (particlelike) nature of radiation received dramatic confirmation in 1923 from the experiments of Compton. He allowed a beam of x rays of sharply defined wavelength $\lambda$ to fall on a graphite target, as shown in Figure 2-5.
For various angles of scattering, he measured the intensity of the scattered x rays as a function of their wavelength. Figure 2-6 shows his experimental results. We see that, although the incident beam consists essentially of a single wavelength $\lambda$, the scattered x rays have intensity peaks at two wavelengths; one of them is the same as the incident wavelength, the other, $\lambda'$, being larger by an amount $\Delta\lambda$. This so-called Compton shift $\Delta\lambda = \lambda' - \lambda$ varies with the angle at which the scattered x rays are observed.

Figure 2-5 Compton's experimental arrangement. Monochromatic x rays of wavelength $\lambda$ fall on a graphite scatterer. The distribution of intensity with wavelength is measured for x rays scattered at any scattering angle $\theta$. The scattered wavelengths are measured by observing Bragg reflections from a crystal (see Figure 3-3). Their intensities are measured by a detector such as an ionization chamber. (Labels in the figure: x-ray source, lead collimating slits, detector.)

Figure 2-6 Compton's experimental results (intensity versus wavelength $\lambda$ in Å, in panels for $\theta = 0°$ (primary), $45°$, $90°$, and $135°$). The solid vertical line on the left corresponds to the wavelength $\lambda$, that on the right to $\lambda'$. Results are shown for four different angles of scattering $\theta$. Note that the Compton shift, $\Delta\lambda = \lambda' - \lambda$, for $\theta = 90°$, agrees well with the theoretical prediction $h/m_0c = 0.0243$ Å.

The presence of scattered wavelength $\lambda'$ cannot be understood if the incident x radiation is regarded as a classical electromagnetic wave. In the classical model the oscillating electric field vector in the incident wave of frequency $\nu$ acts on the free electrons in the scattering target and sets them oscillating at that same frequency. These oscillating electrons, like charges surging back and forth in a small radio transmitting antenna, radiate electromagnetic waves that again have this same frequency $\nu$. Hence, in the classical picture the scattered wave should have the same frequency $\nu$ and the same wavelength $\lambda$ as the incident wave.

Compton (and independently Debye) interpreted his experimental results by postulating that the incoming x-ray beam was not a wave of frequency $\nu$ but a collection of photons, each of energy $E = h\nu$, and that these photons collided with free electrons in the scattering target as in a collision between billiard balls. In this view, the "recoil" photons emerging from the target make up the scattered radiation. Since the incident photon transfers some of its energy to the electron with which it collides, the scattered photon must have a lower energy $E'$; it must therefore have a lower frequency $\nu' = E'/h$, which implies a longer wavelength $\lambda' = c/\nu'$. This point of view accounts qualitatively for the wavelength shift, $\Delta\lambda = \lambda' - \lambda$. Notice that in the interaction the x rays are regarded as particles, not as waves, and that, as distinguished from their behavior in the photoelectric process, the x-ray photons are scattered rather than absorbed.

Let us now analyze a single photon-electron collision quantitatively. For x radiation of frequency $\nu$, the energy of a photon in the incident beam is

$$E = h\nu$$

Taking the idea of a photon as a localized bundle of energy quite literally, we shall consider it to be a particle of energy $E$ and momentum $p$. Such a particle must, however, have certain quite specialized properties. Consider the equation (see Appendix A) giving the total relativistic energy of a particle in terms of its rest mass $m_0$ and its velocity $v$

$$E = \frac{m_0c^2}{\sqrt{1 - v^2/c^2}}$$

Since the velocity of a photon equals $c$, and since its energy content $E = h\nu$ is finite, it is apparent that the rest mass of a photon must be zero. Thus a photon can be considered to be a particle of zero rest mass, and of total relativistic energy $E$ which is entirely kinetic.
The momentum of a photon can be evaluated from the general relation between the total relativistic energy $E$, momentum $p$, and rest mass $m_0$. This is

$$E^2 = c^2p^2 + (m_0c^2)^2 \tag{2-6}$$

For a photon the second term on the right is zero, and we have

$$p = E/c = h\nu/c \tag{2-7}$$

or

$$p = h/\lambda \tag{2-8}$$

where $\lambda = c/\nu$ is the wavelength of the electromagnetic radiation that the photon comprises. It is quite interesting to note that Maxwell's classical wave theory of electromagnetic radiation also leads to an equation $p = E/c$, with $p$ representing the momentum content per unit volume of radiation and $E$ representing its energy content per unit volume.

Now the frequency $\nu'$ of the scattered radiation was observed to be independent of the material in the foil. This implies that the scattering does not involve entire atoms. Compton assumed that the scattering was due to collisions between the photon and an individual electron in the target. He also assumed that the electrons participating in this scattering process are free and initially stationary. Some a priori justification of these assumptions can be found from considering the fact that the energy of an x-ray photon is several orders of magnitude greater than the energy of an ultraviolet photon, and from our discussion of the photoelectric effect it is apparent that the energy of an ultraviolet photon is comparable to the minimum energy with which an electron is bound in a metal.

Consider, then, a collision between a photon and a free stationary electron, as in Figure 2-7. In the diagram on the left, a photon of total relativistic energy $E_0$ and momentum $p_0$ is incident on a stationary electron of rest mass energy $m_0c^2$. In the diagram on the right, the photon is scattered at an angle $\theta$ and moves off with total relativistic energy $E_1$ and momentum $p_1$, while the electron recoils at an angle $\varphi$ with kinetic energy $K$ and momentum $p$.
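The order-of-magnitude argument for treating the target electrons as free can be checked with (2-7) and (2-8). The sketch below is illustrative only: the helper names are hypothetical, the constants are rounded values, and the 2000 Å ultraviolet wavelength is an assumed representative value that does not appear in the text.

```python
# Rounded physical constants (SI units).
H = 6.626e-34    # Planck's constant, joule-sec
C = 2.998e8      # speed of light, m/sec
EV = 1.602e-19   # joule per electron volt


def photon_energy_ev(wavelength_m):
    """Photon energy E = h*c/lambda, returned in eV."""
    return H * C / wavelength_m / EV


def photon_momentum(wavelength_m):
    """Photon momentum p = h/lambda, as in (2-8), in kg-m/sec."""
    return H / wavelength_m


xray_ev = photon_energy_ev(1.00e-10)   # 1.00 angstrom x ray
uv_ev = photon_energy_ev(2000e-10)     # assumed 2000 angstrom ultraviolet light

print(f"x-ray photon: {xray_ev:.3g} eV")
print(f"UV photon:    {uv_ev:.3g} eV")
print(f"ratio:        {xray_ev / uv_ev:.3g}")

# Consistency of (2-7) with (2-8): p*c should reproduce the photon energy.
pc_ev = photon_momentum(1.00e-10) * C / EV
print(f"p*c for the x ray: {pc_ev:.3g} eV")
```

The x-ray photon carries roughly 12.4 keV, about three orders of magnitude more than the few electron volts of the ultraviolet photon, which is the comparison the argument relies on.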
Compton applied the conservation of momentum and total relativistic energy to this collision problem. Relativistic equations were used since the photon always moves at relativistic velocities, and the recoiling electron does too under most circumstances.

Figure 2-7 Compton's interpretation. A photon of wavelength $\lambda$ is incident on a free electron at rest. On collision, the photon is scattered at an angle $\theta$ with increased wavelength $\lambda'$, while the electron moves off at angle $\varphi$.

Momentum conservation requires

$$p_0 = p_1\cos\theta + p\cos\varphi \qquad \text{and} \qquad p_1\sin\theta = p\sin\varphi$$

Squaring these equations, we obtain

$$(p_0 - p_1\cos\theta)^2 = p^2\cos^2\varphi \qquad \text{and} \qquad p_1^2\sin^2\theta = p^2\sin^2\varphi$$

Adding, we find

$$p_0^2 + p_1^2 - 2p_0p_1\cos\theta = p^2 \tag{2-9}$$

Conservation of total relativistic energy requires

$$E_0 + m_0c^2 = E_1 + K + m_0c^2$$

Thus

$$E_0 - E_1 = K$$

According to (2-7), this is

$$c(p_0 - p_1) = K \tag{2-10}$$

Writing $K + m_0c^2$ for $E$ in (2-6), that equation becomes

$$(K + m_0c^2)^2 = c^2p^2 + (m_0c^2)^2$$

which simplifies to

$$K^2 + 2Km_0c^2 = c^2p^2$$

or

$$K^2/c^2 + 2Km_0 = p^2$$

Evaluating $p^2$ from (2-9) and $K$ from (2-10), we have

$$(p_0 - p_1)^2 + 2m_0c(p_0 - p_1) = p_0^2 + p_1^2 - 2p_0p_1\cos\theta$$

which reduces to

$$m_0c(p_0 - p_1) = p_0p_1(1 - \cos\theta)$$

or

$$\frac{1}{p_1} - \frac{1}{p_0} = \frac{1}{m_0c}(1 - \cos\theta)$$

Multiplying through by $h$, and applying (2-8), we obtain the Compton equation

$$\Delta\lambda = \lambda_1 - \lambda_0 = \lambda_C(1 - \cos\theta) \tag{2-11}$$

where

$$\lambda_C = h/m_0c = 2.43 \times 10^{-12}\ \text{m} = 0.0243\ \text{Å} \tag{2-12}$$

is the so-called Compton wavelength. Notice that $\Delta\lambda$, the Compton shift, depends only on the scattering angle $\theta$, and not on the initial wavelength $\lambda$. Equation (2-11) predicts the experimentally observed Compton shifts of Figure 2-6 to within the experimental limits of accuracy. In (2-11) we see that $\Delta\lambda$ varies from zero (for $\theta = 0$, corresponding to a "grazing" collision with the incident photon being scarcely deflected) to $2h/m_0c = 0.049$ Å (for $\theta = 180°$, corresponding to a "head-on" collision, the incident photon being reversed in direction). Figure 2-8 is a plot of $\Delta\lambda$ versus $\theta$.

Figure 2-8 Compton's result $\Delta\lambda = (h/m_0c)(1 - \cos\theta)$.

Subsequent experiments (by Compton, Simon, Wilson, Bothe, Geiger, and Blass) detected the recoil electron in the process, showed that it appeared simultaneously with the scattered x ray, and confirmed quantitatively the predicted electron energy and direction of scattering.

The presence of the peak in Figure 2-6 for which the photon wavelength does not change on scattering must still be explained. We have assumed heretofore that the electron with which the photon collides is free. Even though the electron is initially bound, this assumption is justifiable if the kinetic energy acquired by the electron in the collision is much larger than its binding energy. If the electron is particularly strongly bound to an atom in the target, however, or if the incident photon energy is very small, there is some chance that the electron will not be ejected from the atom. In this case, the collision can be regarded as taking place between the photon and the whole atom. The ionic core, to which the electron is bound in the scattering target, recoils as a whole during the collision. Then the mass $M$ of the atom is the characteristic mass for the process, and it must be substituted in the Compton shift equations for the electron mass $m_0$. Since $M \gg m_0$ ($M \simeq 22{,}000\,m_0$ for carbon, for instance), the Compton shift for collisions with tightly bound electrons is seen, from (2-11) and (2-12), to be immeasurably small (one millionth of an angstrom for carbon), so that the scattered photon is essentially unmodified in wavelength.

To summarize, some photons are scattered from electrons which are freed by the collision; these photons are modified in wavelength.
Other photons are scattered from electrons which remain bound during the collision; these photons are not modified in wavelength.

The process that scatters photons without changing their wavelength is called Rayleigh scattering, after the physicist who developed a classical theory of the scattering of electromagnetic radiation by atoms around the year 1900. He considered a beam of electromagnetic waves whose oscillating electric field interacts with the charges of the atomic electrons in the target. This interaction produces forces on the electrons which cause oscillating accelerations. As a result of the accelerations, the electrons will radiate electromagnetic waves of the same frequency as, and in phase with, the incident waves. (See Appendix B.) Thus the atomic electrons absorb energy from the incident beam of x rays and scatter it in all directions, without modifying the wavelength. Although this classical explanation of Rayleigh scattering is different from the quantum explanation presented in the preceding paragraph, both explain the same feature observed in the measurements. Thus Rayleigh scattering is a case where classical and quantum results merge.

It is interesting to ask in what region of the electromagnetic spectrum Rayleigh scattering will be the dominant process, and in what region Compton scattering will dominate. If the incident radiation is in the visible, microwave, or radio part of the spectrum, then $\lambda$ is extremely large compared to the Compton shift $\Delta\lambda$, independent of whether an electron or an atomic mass is used in evaluating the Compton wavelength of (2-12). Thus the scattered radiation in this region of the spectrum will in all circumstances have a wavelength which is the same as the wavelength of the incident radiation within experimental accuracy. So, as $\lambda \to \infty$ the quantum results merge with the classical results, and Rayleigh scattering dominates.
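The comparison between the incident wavelength and the Compton shift can be made concrete. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, the constants are rounded, and the sample wavelengths are representative choices (the 1 m radio wavelength in particular is assumed here). It evaluates the largest possible shift from (2-11), $2h/mc$ at $\theta = 180°$, as a fraction of the incident wavelength, both for a free electron and for a recoiling carbon atom of mass $22{,}000\,m_0$.

```python
# Rounded physical constants (SI units).
H = 6.626e-34     # Planck's constant, joule-sec
C = 2.998e8       # speed of light, m/sec
M_E = 9.109e-31   # electron rest mass, kg
M_CARBON = 22000 * M_E  # approximate mass of a carbon atom, as quoted in the text


def compton_wavelength(mass_kg):
    """h / (m c) for a scatterer of the given mass, in meters."""
    return H / (mass_kg * C)


def max_fractional_shift(wavelength_m, mass_kg=M_E):
    """Largest Compton shift (theta = 180 degrees) divided by the incident wavelength."""
    return 2.0 * compton_wavelength(mass_kg) / wavelength_m


# Representative wavelengths in meters (illustrative choices).
SAMPLES = [
    ("radio (1 m)", 1.0),
    ("microwave (10 cm)", 0.10),
    ("visible (5500 A)", 5.5e-7),
    ("x ray (1.00 A)", 1.00e-10),
    ("gamma ray (0.0188 A)", 1.88e-12),
]

for label, lam in SAMPLES:
    free = max_fractional_shift(lam)
    bound = max_fractional_shift(lam, M_CARBON)
    print(f"{label:<22} free electron: {free:.2e}   whole atom: {bound:.2e}")
```

For visible light the ratio is of order $10^{-5}$ even for a free electron, while for the $\gamma$ ray it exceeds unity, which is the trend the discussion describes.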
Moving into the x-ray region of the spectrum, Compton scattering starts to become important, particularly for scattering targets of low atomic number where the atomic electrons are not very tightly bound, and the wavelength shift in scattering from an electron which Consider an x-ray beam, with = 1.00 A, and also a y-ray beam from a Cs 137 A = 1.88 x 10 -2 A. If the radiation scattered from free electrons is viewed at 90° sample,with to the incident beam: (a) What is the Compton wavelength shift in each case? (b) What kinetic energy is given to a recoiling electron in each case? (c) What percentage of the incident photon energy is lost in the collision in each case? ^ (a) The Compton shift, with 0 = 90°, is 6.63 x 10 -34 joule-sec h x (1 - cos 90°) AA = (1 - cos 0) = 31 moc kg x 3.00 x 108 m/sec 9.11 x 10 = 2.43 x 10 -12 m = 0.0243 A This result is independent of the incident wavelength, the same for the y rays as the x rays. (b) Equation (2-10) can be written as Example 2-4. he/.l = he/l' + K Then, since 2' = 2 + AA., we have hc/A = hc/(2 + A A) + K so that K = he A),/2O. + AA). For the x-ray beam, with 2 = 1.00 A, we have 6.63 x 10 -34 joule-sec x 3.00 x 108 m/sec x 2.43 x 10 -12 m K= = 4.73 x 10 1 joule lo lo m m x (1.00 + 0.024) x 10 1.00 x 10 = 295 eV = 0.295 keV For the y-ray beam, with 2 = 1.88 x 10 -2 A, we have 6.63 x 10 -34 joule-sec x 3.00 x 108 m/sec x 2.43 x 10 -12 m l4 joule = 5.98 x 10 1.88 x 10 la m x (0.0188 + 0.0243) x 10 -1° m = 378 keV. K= (c) The incident x-ray photon energy is he 6.63 x 10 -34 joule-sec x 3.00 x 108 m/sec s =1.99=10 1joule E=hv=-_ 1.00 x 10 -1Ô m 2 = 12.4 keV The energy lost by the photon equals that gained by the electron, or 0.295 keV, so the percentage loss in energy is 0.295 keV x 100% = 2.4% 12.4 keV The incident y-ray photon energy is he 6.63 x 10 -34 joule-sec x 3.00 x 108 m/sec = 1.06 x 10 -13 joule E = hv = = 1.88x10 -i2 m = 660 keV W CO Ci) CD o N 103dd3 NOldW OO 3H1 is freed in the process becomes easily measurable. 
For the $\gamma$-ray beam, the energy lost by the photon equals that gained by the electron, or 378 keV, so that the percentage loss in energy is

$$\frac{378\ \text{keV}}{660\ \text{keV}} \times 100\% = 57\%$$

Hence, the more energetic photons (which have small wavelengths) experience a larger percent loss in energy in Compton scattering. This corresponds to the fact that the photons of smaller wavelengths experience a larger percent increase in wavelength on being scattered. This becomes clear from the expression for fractional loss in energy, given simply by

$$\frac{K}{hc/\lambda} = \frac{hc\,\Delta\lambda/\lambda(\lambda + \Delta\lambda)}{hc/\lambda} = \frac{\Delta\lambda}{\lambda + \Delta\lambda}$$

From this it can be shown that at $\lambda = 5500$ Å, corresponding to visible photons, the percentage loss (for $\theta = 90^\circ$) is less than one-thousandth of 1%, whereas at $\lambda = 1.25 \times 10^{-2}$ Å, corresponding to 1 MeV $\gamma$-ray photons, the percentage loss (for $\theta = 90^\circ$) is 67%.

In the $\gamma$-ray region where $\lambda \to 0$, the photon energy becomes so large that an electron is always freed in a collision, and Compton scattering dominates. It is in the short wavelength region that the classical results fail to explain the scattering of radiation, just as in the ultraviolet catastrophe of classical physics where predictions concerning the radiation in a cavity diverged radically from experimental results at short wavelengths. These circumstances are due to the size of Planck's constant $h$. At long wavelengths the frequency $\nu$ is small, and since $h$ is also small the granularity in electromagnetic energy, $h\nu$, is so small as to be virtually indistinguishable from the continuum of classical physics. But at sufficiently short wavelengths, where $\nu$ is large enough, $h\nu$ is no longer small enough to be negligible and quantum effects abound.
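The fractional energy loss $\Delta\lambda/(\lambda + \Delta\lambda)$ used in Example 2-4 is easy to evaluate numerically. The following is a minimal sketch, assuming the constants from the table at the front of the book; the function names are illustrative and not part of the text. Because the constants are carried to four figures, the results can differ in the last digit from the rounded values quoted in the example.

```python
import math

H = 6.626e-34    # Planck's constant, joule-sec
C = 2.998e8      # speed of light in vacuum, m/sec
M0 = 9.109e-31   # electron rest mass, kg


def compton_shift(theta_deg):
    """Compton wavelength shift in metres at scattering angle theta."""
    return H / (M0 * C) * (1.0 - math.cos(math.radians(theta_deg)))


def fractional_energy_loss(wavelength_m, theta_deg=90.0):
    """Fraction of the photon energy given to the electron: dl / (lambda + dl)."""
    shift = compton_shift(theta_deg)
    return shift / (wavelength_m + shift)


# Wavelengths discussed in the text, converted from angstroms to metres.
for label, wavelength_angstrom in [
    ("x ray", 1.00),
    ("Cs-137 gamma ray", 1.88e-2),
    ("visible light", 5500.0),
    ("1 MeV gamma ray", 1.25e-2),
]:
    loss = fractional_energy_loss(wavelength_angstrom * 1e-10)
    print(f"{label}: {100.0 * loss:.4g}% of the photon energy is lost at 90 degrees")
```

The loss depends only on the ratio of the Compton shift to the incident wavelength, which is why it is negligible for visible light and large for $\gamma$ rays.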
2-5 THE DUAL NATURE OF ELECTROMAGNETIC RADIATION

In his paper, "A Quantum Theory of the Scattering of X-rays by Light Elements," Compton wrote: "The present theory depends essentially upon the assumption that each electron which is effective in the scattering scatters a complete quantum (photon). It involves also the hypothesis that the quanta of radiation are received from definite directions and are scattered in definite directions. The experimental support of the theory indicates very convincingly that a radiation quantum carries with it directed momentum as well as energy."

The need for a photon, or localized particle, interpretation of processes dealing with the interaction between radiation and matter is clear, but at the same time we need a wave theory of radiation to understand interference and diffraction phenomena. The idea that radiation is neither purely a wave phenomenon nor merely a stream of particles must therefore be taken seriously. Whatever radiation is, it behaves wavelike under some circumstances and particlelike under other circumstances. Indeed, the situation is revealed most forcefully in Compton's experimental work where (a) a crystal spectrometer is used to measure x-ray wavelengths, the measurement being interpreted by a wave theory of diffraction and (b) the scattering affects the wavelength in a way that can be understood only by treating the x rays as particles. It is in the very expressions $E = h\nu$ and $p = h/\lambda$ that the wave attributes ($\nu$ and $\lambda$) and the particle attributes ($E$ and $p$) are combined. Although many physicists felt at first very uncomfortable when contemplating the "split personality" of electromagnetic radiation, the broader point of view provided by the development of quantum mechanics has caused the contemporary attitude to be quite different.
The duality evident in the wave-particle nature of radiation is no longer considered at all unusual because it is now known to be a general characteristic of all physical entities. We shall see that electrons and protons, for example, have exactly the same dual nature as photons. We shall also see that it is possible to reconcile the existence of the wave aspects with the existence of the particle aspects, for any of these entities, with the aid of quantum mechanics.

2-6 PHOTONS AND X-RAY PRODUCTION

X rays, so named by their discoverer Roentgen because their nature was then unknown, are radiations in the electromagnetic spectrum of wavelength less than about 1.0 Å. They show the typical transverse wave behavior of polarization, interference, and diffraction that is found in light and all other electromagnetic radiation. X rays are produced in the target of an x-ray tube, illustrated in Figure 2-9, when a beam of energetic electrons, accelerated through a potential difference of thousands of volts, is stopped upon striking the target. According to classical physics (see Appendix B), the deceleration of the electrons, brought to rest in the target material, results in the emission of a continuous spectrum of electromagnetic radiation.

Figure 2-10 shows, for four different values of the incident electron energy, how the x rays emerging from a tungsten target are distributed in wavelength. (In addition to the continuous x-ray spectrum shown in the figure, x-ray lines characteristic of the target material are emitted. We shall discuss the lines in Chapter 9.) The most notable feature of these smooth curves is that, for a given electron energy, there exists a well-defined minimum wavelength $\lambda_{min}$; for 40 keV electrons, for instance, $\lambda_{min}$ is 0.311 Å.
Although the shape of the continuous x-ray distribution spectrum depends slightly on the choice of target material as well as on the electron accelerating potential $V$, the value of $\lambda_{min}$ depends only on $V$, being the same for all target materials. Classical electromagnetic theory cannot account for this fact, there being no reason why waves whose wavelength is less than a certain critical value should not emerge from the target.

A ready explanation appears, however, if we regard the x rays as photons. Figure 2-11 shows the elementary process that, on the photon view, is responsible for the continuous x-ray spectrum of Figure 2-10. An electron of initial kinetic energy $K$ is decelerated during an encounter with a heavy target nucleus, the energy it loses appearing in the form of radiation as an x-ray photon. The electron interacts with the charged nucleus via the Coulomb field, transferring momentum to the nucleus. The accompanying deceleration of the electron leads to photon emission. The target nucleus is so massive that the energy it acquires during the collision can safely be neglected.

Figure 2-9 An x-ray tube. Electrons are emitted thermally from the heated cathode C and are accelerated toward the anode target A by the applied potential $V$. X rays are emitted from the target when electrons are stopped by striking it.

Figure 2-10 The continuous x-ray spectrum emitted from a tungsten target for four different values of $eV$, the incident electron energy. (Axes: relative intensity versus wavelength in Å.)

Figure 2-11 The bremsstrahlung process responsible for the production of x rays in the continuous spectrum.
If $K'$ is the kinetic energy of the electron after the encounter, then the energy of the photon is

$$h\nu = K - K'$$

and the photon wavelength follows from

$$hc/\lambda = K - K' \tag{2-13}$$

Electrons in the incident beam can lose different amounts of energy in such encounters and typically a single electron will be brought to rest only after many encounters. The x rays thus produced by many electrons make up the continuous spectrum of Figure 2-10 and are very many discrete photons whose wavelengths vary from $\lambda_{min}$ to $\lambda \to \infty$, corresponding to the different energy losses in the individual encounters. The shortest wavelength photon would be emitted when an electron loses all its kinetic energy in one deceleration process; here $K' = 0$ so that $K = hc/\lambda_{min}$. Since $K$ equals $eV$, the energy acquired by the electron in being accelerated through the potential difference $V$ applied to the x-ray tube, we have

$$eV = hc/\lambda_{min}$$

or

$$\lambda_{min} = hc/eV \tag{2-14}$$

Thus the minimum wavelength cutoff represents the complete conversion of the electron's kinetic energy to x radiation. Equation (2-14) shows clearly that if $h \to 0$ then $\lambda_{min} \to 0$, which is the prediction of classical theory. This shows that the very existence of a minimum wavelength is a quantum phenomenon.

The continuous x radiation of Figure 2-10 is often called bremsstrahlung, from the German brems (= braking, i.e., decelerating) + strahlung (= radiation). The bremsstrahlung process occurs not only in x-ray tubes but wherever fast electrons collide with matter, as in cosmic rays, in the van Allen radiation belts which surround the earth, and in the stopping of electrons emerging from accelerators or radioactive nuclei. The bremsstrahlung process can be considered as an inverse photoelectric effect: in the photoelectric effect, a photon is absorbed, its energy and momentum going to an electron and a recoiling nucleus; in the bremsstrahlung process, a photon is created, its energy and momentum coming from a colliding electron and nucleus.
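The cutoff relation $\lambda_{min} = hc/eV$ of (2-14) can be evaluated directly, and it can be inverted to estimate $h$ from a measured cutoff. The sketch below assumes the constants from the table at the front of the book; the function names and the list of tube voltages are illustrative choices, not values taken from Figure 2-10.

```python
H = 6.626e-34         # Planck's constant, joule-sec
C = 2.998e8           # speed of light in vacuum, m/sec
E_CHARGE = 1.602e-19  # electron charge magnitude, coul


def min_wavelength(volts):
    """Minimum x-ray wavelength in metres for an accelerating potential, per (2-14)."""
    return H * C / (E_CHARGE * volts)


def planck_from_cutoff(volts, lambda_min_m):
    """Invert (2-14): h = e V lambda_min / c, in joule-sec."""
    return E_CHARGE * volts * lambda_min_m / C


# Illustrative tube voltages; only the 40 kV case is quoted in the text.
for kilovolts in (20, 30, 40, 50):
    cutoff = min_wavelength(kilovolts * 1e3)
    print(f"{kilovolts} kV: lambda_min = {cutoff / 1e-10:.3f} angstrom")

# A measured cutoff of 3.11e-11 m at 40.0 kV gives back a value close to h.
print(f"h from cutoff: {planck_from_cutoff(4.00e4, 3.11e-11):.3e} joule-sec")
```

Note that the target material appears nowhere in the calculation, which matches the observation that $\lambda_{min}$ depends only on $V$.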
We deal with the creation of photons in the bremsstrahlung process, rather than with their absorption or scattering by matter.

2-7 PAIR PRODUCTION AND PAIR ANNIHILATION

In addition to the photoelectric and Compton effects there is another process whereby photons lose their energy in interactions with matter, namely the process of pair production. Pair production is also an excellent example of the conversion of radiant energy into rest mass energy as well as into kinetic energy. In this process, illustrated schematically in Figure 2-12, a high energy photon loses all of its energy $h\nu$ in an encounter with a nucleus, creating an electron and a positron (the pair) and endowing them with kinetic energies. A positron is a particle which is identical in all of its properties with an electron, except that the sign of its charge (and of its magnetic moment) is opposite to that of an electron; a positron is a positively charged electron. In pair production the energy taken by the recoil of the nucleus is negligible because it is so massive, and thus the balance of total relativistic energy in the process is simply

$$h\nu = E_- + E_+ = (m_0c^2 + K_-) + (m_0c^2 + K_+) = K_- + K_+ + 2m_0c^2 \tag{2-15}$$

In this expression $E_-$ and $E_+$ are the total relativistic energies, and $K_-$ and $K_+$ are the kinetic energies of the electron and positron, respectively. Both particles have the same rest mass energy $m_0c^2$. The positron is produced with a slightly larger kinetic energy than the electron because the Coulomb interaction of the pair with the positively charged nucleus leads to an acceleration of the positron and a deceleration of the electron. In analyzing this process here we ignore the details of the interaction itself, considering only the situation before and after the interaction.

Example 2-5. Determine Planck's constant $h$ from the fact that the minimum x-ray wavelength produced by 40.0 keV electrons is $3.11 \times 10^{-11}$ m.

■ From (2-14), $\lambda_{min} = hc/eV$, we have $h = eV\lambda_{min}/c$.
With the data of Example 2-5,

$$h = \frac{eV\lambda_{min}}{c} = \frac{1.60 \times 10^{-19}\ \text{coul} \times 4.00 \times 10^{4}\ \text{V} \times 3.11 \times 10^{-11}\ \text{m}}{3.00 \times 10^{8}\ \text{m/sec}} = 6.64 \times 10^{-34}\ \text{joule-sec}$$

This agrees well with the value of $h$ deduced from the photoelectric effect and the Compton effect. Measurement of $V$, $\lambda_{min}$, and $c$ provides one of the most accurate methods for evaluating the ratio $h/e$. Bearden, Johnson, and Watts at the Johns Hopkins University found in 1951, using this procedure, $h/e = 1.37028 \times 10^{-15}$ joule-sec/coul. This ratio is combined with many other measured combinations of physical constants, the assembly of data being analyzed by elaborate statistical methods to find the "best" value for the various physical constants. The best values change (but usually only within the a priori estimates of accuracy) and become increasingly precise as new experimental data and higher precision methods are used.

Figure 2-12 The pair production process.

Our guiding principles are the conservation of total relativistic energy, conservation of momentum, and conservation of charge. From these conservation laws, it is not difficult to show that a photon cannot simply disappear in empty space, creating a pair as it vanishes. The presence of the massive nucleus (which can absorb momentum without appreciably affecting the energy balance) is necessary to allow both energy and momentum to be conserved in the process. Charge is automatically conserved, the photon having no charge and the created pair of particles having no net charge. From (2-15) we see that the minimum, or threshold, energy needed by a photon to create a pair is $2m_0c^2$ or 1.02 MeV (1 MeV = $10^6$ eV), which corresponds to a wavelength of 0.012 Å. If the wavelength is shorter than this, corresponding to an energy greater than the threshold value, the photon endows the pair with kinetic energy as well as rest mass energy.
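The threshold quoted above follows from (2-15) with $K_- = K_+ = 0$. As a rough check, the sketch below computes the threshold energy, the corresponding wavelength, and the kinetic energy left over for the pair when the photon is above threshold. The constants come from the table at the front of the book, and the function names are illustrative.

```python
H = 6.626e-34      # Planck's constant, joule-sec
C = 2.998e8        # speed of light in vacuum, m/sec
EV = 1.602e-19     # joule per eV
M0C2_MEV = 0.5110  # electron rest mass energy, MeV


def threshold_energy_mev():
    """Minimum photon energy for pair production, 2 m0 c^2, in MeV."""
    return 2.0 * M0C2_MEV


def photon_wavelength_angstrom(energy_mev):
    """Wavelength lambda = hc/E of a photon of the given energy, in angstroms."""
    return H * C / (energy_mev * 1e6 * EV) / 1e-10


def pair_kinetic_energy_mev(photon_energy_mev):
    """K_- + K_+ = h nu - 2 m0 c^2 from (2-15); nuclear recoil energy is neglected."""
    excess = photon_energy_mev - threshold_energy_mev()
    if excess < 0.0:
        raise ValueError("photon energy is below the pair production threshold")
    return excess


print(f"threshold energy: {threshold_energy_mev():.3f} MeV")
print(f"threshold wavelength: {photon_wavelength_angstrom(threshold_energy_mev()):.4f} angstrom")
print(f"kinetic energy shared by the pair at 3.2 MeV: {pair_kinetic_energy_mev(3.2):.3f} MeV")
```

How the excess divides between the electron and the positron is not fixed by energy conservation alone, so the function returns only the sum.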
The pair production phenomenon is a high-energy one, the photons being in the very short x-ray or $\gamma$-ray regions of the electromagnetic spectrum (see Figure 2-4), where their energies $h\nu$ are equal to or greater than $2m_0c^2$. As we shall see in the next section, experimental results demonstrate that the absorption of photons in interaction with matter occurs principally by the photoelectric process at low energies, by the Compton effect at medium energies, and by pair production at high energies.

Electron-positron pairs are produced in nature by cosmic-ray photons and in the laboratory by bremsstrahlung photons from particle accelerators. Other particle pairs, such as proton and antiproton, can be produced as well if the initiating photon has sufficient energy. Because the electron and positron have the smallest rest mass of known particles, the threshold energy of their production is the smallest. Experiment verifies the quantum picture of the pair production process. There is no satisfactory explanation whatever of this phenomenon in classical theory.

Example 2-6. Analysis of a bubble chamber photograph (as in Figure 2-13) reveals the creation of an electron-positron pair as photons pass through matter. The electron and positron tracks have opposite curvatures in the uniform magnetic field $B$ of 0.20 weber/m$^2$, their radii $r$ each being $2.5 \times 10^{-2}$ m. What were the energy and the wavelength of the pair-producing photon?

The momentum $p$ of the electron is given by

$$p = eBr = 1.6 \times 10^{-19}\ \text{coul} \times 2.0 \times 10^{-1}\ \text{weber/m}^2 \times 2.5 \times 10^{-2}\ \text{m} = 8.0 \times 10^{-22}\ \text{kg-m/sec}$$

Its total relativistic energy $E_-$ is given by

$$E_-^2 = c^2p^2 + (m_0c^2)^2$$

Since $m_0c^2 = 0.51$ MeV, and $pc = 8.0 \times 10^{-22}$ kg-m/sec $\times\ 3.0 \times 10^{8}$ m/sec $= 2.4 \times 10^{-13}$ joule $= 1.5$ MeV, we have

$$E_-^2 = (1.5\ \text{MeV})^2 + (0.51\ \text{MeV})^2$$

and $E_- = 1.6$ MeV. The positron's total relativistic energy had the same value since its track had the same radius, so the energy of the photon was

$$h\nu = E_- + E_+ = 3.2\ \text{MeV}$$

The photon's wavelength follows from
$E = h\nu = hc/\lambda$, or

$$\lambda = \frac{hc}{E} = \frac{6.6 \times 10^{-34}\ \text{joule-sec} \times 3.0 \times 10^{8}\ \text{m/sec}}{3.2 \times 10^{6}\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV}} = 3.9 \times 10^{-13}\ \text{m} = 0.0039\ \text{Å}$$

Figure 2-13 Electron pair production, as seen in a bubble chamber. The electron and positron tracks are the two spirals meeting at the point where the production took place in the liquid filling of the chamber. The student can determine which of the two spirals belongs to the positron by knowing that the long tracks are primarily positively charged deuterons which are incident from the left. (Courtesy of C. R. Sun, State University of New York at Albany)

Closely related to pair production is the inverse process called pair annihilation. An electron and a positron, which are essentially at rest near one another, unite and are annihilated. Matter disappears and in its place we get radiant energy. Since the initial momentum of the system is zero and momentum must be conserved in the process, we cannot have only one photon created because a single photon cannot have zero momentum. The most probable process is the creation of two photons moving with equal momenta in opposite directions. Less probable, but possible, is the creation of three photons.

In the two-photon process illustrated by Figure 2-14, momentum conservation gives $0 = \mathbf{p}_1 + \mathbf{p}_2$ or $\mathbf{p}_1 = -\mathbf{p}_2$, so that the photon momenta are oppositely directed but equal in magnitude. Hence, $p_1 = p_2$ or $h\nu_1/c = h\nu_2/c$ and $\nu_1 = \nu_2 = \nu$. Total relativistic energy conservation then requires that $m_0c^2 + m_0c^2 = h\nu + h\nu$, the positron and electron having no initial kinetic energy and the photon energies being the same. Hence, $h\nu = m_0c^2 = 0.51$ MeV, corresponding to a photon wavelength of 0.024 Å. If the initial pair had some kinetic energy then the photon energy would exceed 0.51 MeV and its wavelength could be less than 0.024 Å.

Positrons are created in the pair production process.
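The arithmetic of Example 2-6 and the annihilation photon wavelength quoted above can be reproduced in a few lines. This is a sketch under the same assumptions as the text (equal track radii, nuclear recoil neglected, pair at rest for annihilation), using constants from the table at the front of the book; the function names are illustrative.

```python
import math

H = 6.626e-34         # Planck's constant, joule-sec
C = 2.998e8           # speed of light in vacuum, m/sec
E_CHARGE = 1.602e-19  # electron charge magnitude, coul
MEV = 1.602e-13       # joule per MeV
M0C2_MEV = 0.5110     # electron rest mass energy, MeV


def momentum_from_track(b_field, radius_m):
    """Momentum p = eBr (kg-m/sec) of a singly charged particle; B in weber/m^2."""
    return E_CHARGE * b_field * radius_m


def total_energy_mev(momentum):
    """Total relativistic energy from E^2 = (pc)^2 + (m0 c^2)^2, in MeV."""
    pc_mev = momentum * C / MEV
    return math.hypot(pc_mev, M0C2_MEV)


def wavelength_angstrom(energy_mev):
    """Photon wavelength lambda = hc/E, in angstroms."""
    return H * C / (energy_mev * MEV) / 1e-10


# Example 2-6: both tracks have r = 2.5e-2 m in B = 0.20 weber/m^2.
p = momentum_from_track(0.20, 2.5e-2)
photon_energy = 2.0 * total_energy_mev(p)  # h nu = E_- + E_+
print(f"photon energy: {photon_energy:.2f} MeV")
print(f"photon wavelength: {wavelength_angstrom(photon_energy):.4f} angstrom")

# Two-photon annihilation of a pair at rest: each photon carries m0 c^2.
print(f"annihilation wavelength: {wavelength_angstrom(M0C2_MEV):.4f} angstrom")
```

The annihilation wavelength equals the electron Compton wavelength $h/m_0c$, as expected from $h\nu = m_0c^2$.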
On passing through matter a positron loses energy in successive collisions until it combines with an electron to form a bound system called positronium. The positronium "atom" is short lived, decaying into photons within about $10^{-10}$ sec of its formation. The electron and positron presumably move about their common center of mass in a kind of death dance before mutual annihilation.

Figure 2-14 Pair annihilation producing two photons.

Example 2-7. (a) Assume that Figure 2-14 represents the annihilation process in a reference frame $S$, the electron-positron pair being at rest there and the two annihilation photons moving along the $x$ axis. Find the wavelength $\lambda$ of these photons in terms of $m_0$, the rest mass of an electron or positron.

We saw that $p_1 = p_2$ and $h\nu_1 = h\nu_2$. Each photon has the same energy, the same frequency, and the same wavelength. We can drop the subscripts then and from the relation $h\nu = m_0c^2$ and $p = E/c$ we obtain

$$p = E/c = h\nu/c = m_0c^2/c = m_0c$$

But we also have the relation $p = h/\lambda$ so that

$$\lambda = h/p = h/m_0c$$

Hence, in the rest frame of the positronium atom each photon has the same wavelength, $\lambda = h/m_0c$.

(b) Now consider the same annihilation event to be observed in frame $S'$, moving relative to $S$ with a velocity $v$ to the left. What wavelength does this (moving) observer record for the annihilation photons?

Here, the pair has initial total relativistic energy $2mc^2$, where $m$ is relativistic mass, rather than merely the rest mass energy $2m_0c^2$, so that conservation of energy in the annihilation process gives us

$$2mc^2 = p_1'c + p_2'c$$

Also, the pair now moves with velocity $v$ along the positive $x'$ axis so that its initial momentum is $2mv$, rather than zero as before. Conservation of momentum now gives us

$$2mv = p_1' - p_2'$$

the photons moving in opposite directions along the $x'$ axis. Let us combine these two expressions.
We multiply the second by $c$ and add it to the first, obtaining, since $m = m_0/\sqrt{1 - v^2/c^2}$,

$$p_1' = m(c + v) = \frac{m_0(c + v)}{\sqrt{1 - v^2/c^2}} = m_0c\sqrt{\frac{c + v}{c - v}}$$

But $p_1' = h/\lambda_1'$, so that

$$\lambda_1' = \frac{h}{p_1'} = \frac{h}{m_0c}\sqrt{\frac{c - v}{c + v}} = \lambda\sqrt{\frac{c - v}{c + v}}$$

In a similar manner, by subtracting the second equation from the first, we obtain

$$\lambda_2' = \frac{h}{p_2'} = \frac{h}{m_0c}\sqrt{\frac{c + v}{c - v}} = \lambda\sqrt{\frac{c + v}{c - v}}$$

The photons do not have the same wavelength, but they are Doppler shifted from the wavelength $\lambda$ they had in the rest frame of the source (the positronium atom). If an observer is situated on the $x'$ axis so that the source moves toward him, he will receive photon 1, having a frequency higher than the "rest" frequency. If an observer is situated on the $x'$ axis so that the source moves away from him, he will receive photon 2, having a frequency lower than the rest frequency. This Example is actually a derivation of the longitudinal Doppler shift formula of relativity theory.

The first experimental evidence for the pair production process, and the existence of positrons, was obtained in 1933 by Anderson during an investigation of the cosmic radiation. This radiation consists of a flux of very high energy photons and charged particles incident upon the earth from extra-terrestrial sources. Anderson was using a cloud chamber containing a thin lead plate, with the entire apparatus in a magnetic field. Upon exposing this apparatus to the cosmic radiation, it was found that very infrequently a pair of charged particles was ejected from some point in the lead plate. These events were assumed to be the result of the interaction of a photon in the lead because no charged particle was seen to strike the point of ejection, whereas a photon, being uncharged, could strike the point of ejection without being seen. The two charged particles ejected in these events were bent in opposite directions by the magnetic field. Therefore their charges were of opposite sign.
From other considerations it could be shown that the magnitudes of these charges were equal to one electronic charge and that the masses of the particles were approximately equal to one electronic mass.

In the Dirac theory,

$$E = \pm\sqrt{c^2p^2 + (m_0c^2)^2} \tag{2-17}$$

where $m_0$ is the electron rest mass. These are simply the solutions for $E$ of (2-6), but the solution with the minus sign corresponds to a negative total relativistic energy, a concept as foreign to relativistic mechanics as a negative total energy is to classical mechanics. Instead of just throwing away the negative part on the grounds that it is not physically realistic, Dirac pursued the consequences of the entire equation. In doing this he was led to some very interesting conclusions.

Consider Figure 2-15, which is an energy-level diagram representing (2-17). If the indicated continuum of negative energy levels exists, all free electrons of positive energy should be able to make transitions into these levels, accompanied by the emission of photons of the appropriate energies. This obviously disagrees with experiment because free electrons are not generally observed to emit spontaneously photons of energy $h\nu > 2m_0c^2$. However, Dirac pointed out that this difficulty can be removed by assuming that all the negative energy levels are normally filled at all points in space. According to this assumption, a vacuum consists of a sea of electrons in negative energy levels. This does not disagree with experiment. For instance, the negative charge could not be detected, as it is assumed to be uniformly distributed and therefore exerts no force on a charged body. Similar considerations will demonstrate that all the "usual" properties of a sea of negative energy electrons are such that its presence would not be apparent in any of the usual experiments. However, Dirac's theory of the vacuum is not completely vacuous because it predicts certain new properties which can be tested by experiment.
The energy-level diagram for a free electron suggests the possibility of exciting an electron in a negative energy level by the absorption of a photon. Since all the negative energy levels are assumed to be fully occupied, the electron must be excited to one of the unoccupied positive energy levels. The minimum photon energy required for this process is obviously $h\nu = 2m_0c^2$, and the process results in the production of an electron in a positive energy level plus a hole in a negative energy level. We can demonstrate that a hole in a negative electron energy level has all the mechanical and electrical properties of a positron of positive energy. For instance, there is a positive charge $+e$ associated with the absence of an electron of negative charge $-e$. Consequently, this is the pair production process observed experimentally by Anderson three years after its theoretical prediction by Dirac.

Figure 2-15 The energy levels of a free electron according to Dirac. The lowest positive level, at $E = +m_0c^2$, and the highest negative level, at $E = -m_0c^2$, correspond to $p = 0$; the higher positive levels and the lower negative levels correspond to $p > 0$.

The discovery of the pair production process explained the origin of a discrepancy between the then current theory of x-ray attenuation and the measured attenuation coefficients of several materials for 2.6 MeV x rays ($\gamma$ rays obtained from a radioactive source). As the theory originally did not include pair production, the predicted attenuation was too small; with the inclusion of the pair production process, good agreement is now obtained between experiment and theory. However, the real importance of Anderson's discovery was in the beautiful confirmation which it provided for Dirac's relativistic quantum mechanical theory of the electron.
The Dirac theory leads to the prediction that the allowed values of total relativistic energy $E$ for a free electron are those given by (2-17).

2-8 CROSS SECTIONS FOR PHOTON ABSORPTION AND SCATTERING

Consider a parallel beam of photons passing through a slab of matter, as in Figure 2-16. The photons can interact with the atoms in the slab by four different processes: photoelectric, pair production, Rayleigh, and Compton. The first two absorb photons completely, while the last two only scatter them, but all the processes remove photons from the parallel beam. The question of what the chances of these processes happening are, in a given set of circumstances, is one of considerable theoretical and practical significance. For instance, it is very important to a medical physicist designing the shielding for an x-ray machine, or a nuclear engineer designing the shielding for a reactor. The answer to the question is expressed in terms of quantities called cross sections. We first meet cross sections here in connection with photons, but we shall encounter them again in other connections elsewhere in this book.

The probability that a photon of a given energy will be, for example, absorbed by the photoelectric process in passing an atom of the slab is specified by the value of the photoelectric cross section $\sigma_{PE}$. This measure of the likelihood of the photoelectric process occurring is defined so that the number $N_{PE}$ of photoelectric absorptions occurring is

$$N_{PE} = \sigma_{PE} I n \tag{2-18}$$

when a beam containing $I$ photons is incident on a slab containing $n$ atoms per unit area. It is assumed here that the slab is thin enough that the probability of a given photon being absorbed in passing through the slab is much smaller than one. The definition of (2-18), which is a prototype of the definitions of all cross sections, is sufficiently important to warrant careful physical interpretation.
First note that the number $N_{PE}$ of absorptions should certainly increase in proportion to the number $I$ of photons incident on the slab. Furthermore, if the slab is thin in the sense specified previously, the atoms in the slab will not appreciably "shadow" each other, as far as the incident photons are concerned. Then the number $N_{PE}$ of absorptions should also increase in proportion to the number $n$ of target atoms per unit area of the slab. Thus we should have

$$N_{PE} \propto In$$

If we write this proportionality as an equality, calling the proportionality constant $\sigma_{PE}$, we obtain the defining equation for that cross section. Thus we see that the cross section, which has a value depending on both the energy of the photon and the type of atom, measures how effective such atoms are in absorbing those photons by the photoelectric effect.

Figure 2-16 A beam of photons passing through a slab.

Figure 2-17 The scattering, photoelectric, pair production, and total cross sections for a lead atom.

Since the quantities $N_{PE}$ and $I$ in (2-18) are dimensionless, while $n$ has the dimensions of (area)$^{-1}$, it is clear that $\sigma_{PE}$ must have the dimensions of (area). Thus it is reasonable to use the name cross section for $\sigma_{PE}$. It is often given a geometrical interpretation by imagining that a circle of area $\sigma_{PE}$ is centered on each atom in the slab in the plane of the slab, with the property that any photon entering the circular area is absorbed by the atom through the photoelectric effect. This geometrical interpretation is convenient for visualization and even for calculation, but it definitely should not be taken to be literally true. A cross section is really just a way of expressing numerically the probability that a certain type of atom will cause a photon of a given energy to undergo a particular process.
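The defining relation (2-18) and its thin-slab condition can be put into a short function. This is only a sketch: the function name, the numerical cross section, the areal density, and the 0.1 cutoff used to flag a slab that is "not thin" are all hypothetical choices made for illustration, not values from the text.

```python
def expected_absorptions(sigma_m2, incident_photons, atoms_per_m2):
    """Number of absorptions N = sigma * I * n from (2-18), for a thin slab.

    sigma_m2 * atoms_per_m2 is the probability that one photon is absorbed.
    The relation assumes this is much smaller than one; the 0.1 cutoff
    below is an arbitrary illustrative threshold.
    """
    probability = sigma_m2 * atoms_per_m2
    if probability >= 0.1:
        raise ValueError("slab is not thin: atoms would shadow each other")
    return probability * incident_photons


# Hypothetical numbers: a cross section of 10 barns (1 barn = 1e-28 m^2),
# 1e24 atoms per m^2 of slab, and 1e6 incident photons.
sigma = 10 * 1e-28
count = expected_absorptions(sigma, 1e6, 1e24)
print(f"expected photoelectric absorptions: {count:.0f}")
```

The product $\sigma_{PE} n$ is dimensionless, which is the dimensional argument the text uses to show that a cross section must be an area.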
The definitions and interpretations of the cross sections for the other absorption or scattering processes are completely analogous to those for the example we have considered.

Figure 2-17 shows the measured scattering ($\sigma_S$), photoelectric ($\sigma_{PE}$), pair production ($\sigma_{PR}$), and total ($\sigma$) cross sections for a lead atom as a function of the photon energy $h\nu$. The scattering cross section specifies the probability of scattering occurring by either the Rayleigh or the Compton process. For lead, which has a high atomic number and thus tightly bound atomic electrons, Rayleigh scattering dominates Compton scattering when the photon energy is below about $h\nu = 10^5$ eV. The sharp breaks in the photoelectric cross section occur at the binding energies of the different electrons in the lead atom; when $h\nu$ drops below the binding energy of a particular electron a photoelectric process involving it is no longer energetically possible. The pair production cross section rises very rapidly from zero when $h\nu$ exceeds the threshold energy $2m_0c^2 \simeq 10^6$ eV required to materialize a pair.

The total cross section $\sigma$ in Figure 2-17 is the sum of the scattering, photoelectric, and pair production cross sections. This quantity specifies the probability that a photon will make any kind of interaction with the atom. We see from the figure that the energy ranges in which each of the three processes makes the most important contribution to $\sigma$ are approximately, for lead:

Photoelectric effect: $h\nu < 5 \times 10^5$ eV
Scattering: $5 \times 10^5$ eV $< h\nu < 5 \times 10^6$ eV
Pair production: $5 \times 10^6$ eV $< h\nu$

Because these processes have probabilities with different dependences on atomic number, the energy ranges in which they dominate are quite different for atoms of low atomic number.
The energy ranges are approximately, for aluminum:

Photoelectric effect: $h\nu < 5 \times 10^4$ eV
Scattering: $5 \times 10^4$ eV $< h\nu < 1 \times 10^7$ eV
Pair production: $1 \times 10^7$ eV $< h\nu$

Example 2-8. Evaluate, in terms of the total cross section $\sigma$, the attenuation of a parallel beam of x rays in passing through a thick slab of matter.

■ Referring to Figure 2-16, $I(0)$ photons are in the beam as it is incident on the front face of the slab of thickness $t$, which contains $\rho$ atoms per cm$^3$. Assume, for simplicity, that the area of the slab is 1 cm$^2$. Because of scattering and absorption processes, the parallel beam contains a smaller number $I(x)$ of photons after penetrating $x$ cm into the slab. Consider a thin lamina of the slab, of width $dx$ located at $x$. The number of atoms per cm$^2$ in the lamina is $\rho$ times its volume $dx$, or $\rho\,dx$. The number of beam photons that will be scattered or absorbed in the lamina is specified by the total cross section $\sigma$, in a definition analogous to (2-18). It is $\sigma I(x)\rho\,dx$. Thus the number of beam photons emerging from the lamina, $I(x + dx)$, which equals the number incident minus the number removed, is

$$I(x + dx) = I(x) - \sigma I(x)\rho\,dx$$

or

$$dI(x) \equiv I(x + dx) - I(x) = -\sigma I(x)\rho\,dx$$

We find $I(t)$, the number of beam photons emerging from the rear face of the slab, by solving for $dI(x)/I(x)$ and then integrating over $x$

$$\frac{dI(x)}{I(x)} = -\sigma\rho\,dx$$

$$\int_0^t \frac{dI(x)}{I(x)} = -\sigma\rho\int_0^t dx$$

$$\left[\ln I(x)\right]_0^t = -\sigma\rho t$$

$$\ln\frac{I(t)}{I(0)} = -\sigma\rho t$$

$$\frac{I(t)}{I(0)} = e^{-\sigma\rho t}$$

$$I(t) = I(0)e^{-\sigma\rho t} \tag{2-19}$$

The intensity of the beam, as measured by the number $I$ of photons it contains, decreases exponentially as the thickness $t$ of the slab increases. The quantity $\sigma\rho$, which is called the attenuation coefficient, has the dimensions (cm$^{-1}$) and is the reciprocal of the thickness of slab required to attenuate the beam intensity by a factor of $e$. This thickness is called the attenuation length $\Lambda$. That is

$$\Lambda = 1/\sigma\rho \tag{2-20}$$

Of course, the attenuation coefficient has the same dependence on photon energy as the total cross section.
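The lamina-by-lamina argument of Example 2-8 can be checked numerically against the closed form (2-19). In the sketch below the cross section and the atom density are assumed, illustrative values (the density is roughly that of lead), and the function names are not from the text.

```python
import math


def transmitted(i0, sigma_cm2, rho_per_cm3, thickness_cm):
    """Closed form (2-19): I(t) = I(0) exp(-sigma rho t)."""
    return i0 * math.exp(-sigma_cm2 * rho_per_cm3 * thickness_cm)


def attenuation_length(sigma_cm2, rho_per_cm3):
    """Attenuation length (2-20): 1 / (sigma rho), in cm."""
    return 1.0 / (sigma_cm2 * rho_per_cm3)


def transmitted_by_laminae(i0, sigma_cm2, rho_per_cm3, thickness_cm, steps=100_000):
    """Apply I(x + dx) = I(x) - sigma I(x) rho dx repeatedly across the slab."""
    dx = thickness_cm / steps
    beam = i0
    for _ in range(steps):
        beam -= sigma_cm2 * beam * rho_per_cm3 * dx
    return beam


SIGMA = 2.0e-23  # assumed total cross section per atom, cm^2 (illustrative)
RHO = 3.3e22     # assumed atoms per cm^3, roughly the value for lead

length = attenuation_length(SIGMA, RHO)
print(f"attenuation length: {length:.2f} cm")
print(f"closed form at t = 2 cm: {transmitted(1.0, SIGMA, RHO, 2.0):.5f}")
print(f"lamina sum at t = 2 cm:  {transmitted_by_laminae(1.0, SIGMA, RHO, 2.0):.5f}")
```

As the laminae are made thinner the stepwise result approaches the exponential, which is the content of the integration step in the example.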
Figure 2-18 shows measured attenuation coefficients of lead, tin, and aluminum for photons of relatively high energy.

This section summarizes many of the practical aspects of the electromagnetic radiation emission and absorption phenomena we have studied in the present chapter. But the fundamental aspects of these phenomena are better summarized by saying that they show electromagnetic radiation to be quantized into particles of energy called photons. It should also be emphasized that the phenomena of interference and diffraction show photons do not travel through a system from where they are emitted to where they are absorbed in the simple way that classical particles do. Instead, photons act as if they were guided by classical waves because photons travel through a system such as a diffraction apparatus in a way that is best described by the way that classical waves would propagate through the apparatus.

Figure 2-18 The attenuation coefficients for several atoms and a range of photon energies.

QUESTIONS

1. In the photoelectric experiments, the current (number of electrons emitted per unit time) is proportional to the intensity of light. Can this result alone be used to distinguish between the classical and quantum theories?
2. In Figure 2-2 why does the photoelectric current not rise vertically to its maximum (saturation) value when the applied potential difference is slightly more positive than $V_0$?
3. Why is it that even for incident radiation that is monochromatic, photoelectrons are emitted with a spread of velocities?
4. The existence of a cutoff frequency in the photoelectric effect is often regarded as the most potent objection to a wave theory. Explain.
5. Why are photoelectric measurements very sensitive to the nature of the photoelectric surface?
6. Do the results of photoelectric experiments invalidate Young's interference experiment?
7. Can you use the device of letting h → 0 to obtain classical results from quantum results in the case of the photoelectric effect? Explain.
8. Assume that the emission of photons from a source of radiation is random in direction. Would you expect the intensity (or energy density) to vary inversely as the square of the distance from the source in the photon theory as it does in the wave theory?
9. Does a photon of energy E have mass? If so, evaluate it.
10. Why, in Compton scattering, would you expect Δλ to be independent of the materials of which the scatterer is composed?
11. Would you expect to observe the Compton effect more readily with scattering targets composed of atoms with high atomic number or those composed of atoms with low atomic number? Explain.
12. Do you observe a Compton effect with visible light? Why?
13. Would you expect a definite minimum wavelength in the emitted radiation for a given value of the energy of an electron incident on the target of an x-ray tube from the classical electromagnetic theory of the process?
14. Does a television tube emit x rays? Explain.
15. What effect(s) does decreasing the voltage across an x-ray tube have on the resulting x-ray spectrum?
16. Discuss the bremsstrahlung process as the inverse of the Compton process. Of the photoelectric process.
17. Describe several methods that can be used to determine experimentally the value of Planck's constant h.
18. From what factors would you expect to judge whether a photon will lose its energy in interactions with matter by the photoelectric process, the Compton process, or the pair production process?
19. Can you think of experimental evidence contradicting the idea that vacuum is a sea of electrons in negative energy states?
20. Can electron-positron annihilation occur with the creation of one photon if a nearby nucleus is available for recoil momentum?
21.
Explain how pair annihilation with the creation of three photons is possible. Is it possible in principle to create even more than three photons in a single annihilation process?
22. What would be the inverse of the process in which two photons are created in electron-positron annihilation? Can it occur? Is it likely to occur?
23. What is wrong with taking the geometrical interpretation of a cross section as literally true?

PROBLEMS

1. (a) The energy required to remove an electron from sodium is 2.3 eV. Does sodium show a photoelectric effect for yellow light, with λ = 5890 Å? (b) What is the cutoff wavelength for photoelectric emission from sodium?
2. Light of a wavelength 2000 Å falls on an aluminum surface. In aluminum 4.2 eV are required to remove an electron. What is the kinetic energy of (a) the fastest and (b) the slowest emitted photoelectrons? (c) What is the stopping potential? (d) What is the cutoff wavelength for aluminum? (e) If the intensity of the incident light is 2.0 W/m², what is the average number of photons per unit time per unit area that strike the surface?
3. The work function for a clean lithium surface is 2.3 eV. Make a rough plot of the stopping potential $V_0$ versus the frequency of the incident light for such a surface, indicating its important features.
4. The stopping potential for photoelectrons emitted from a surface illuminated by light of wavelength λ = 4910 Å is 0.71 V. When the incident wavelength is changed the stopping potential is found to be 1.43 V. What is the new wavelength?
5. In a photoelectric experiment in which monochromatic light and a sodium photocathode are used, we find a stopping potential of 1.85 V for λ = 3000 Å and of 0.82 V for λ = 4000 Å. From these data determine (a) a value for Planck's constant, (b) the work function of sodium in electron volts, and (c) the threshold wavelength for sodium.
6. Consider light shining on a photographic plate.
The light will be recorded if it dissociates an AgBr molecule in the plate. The minimum energy to dissociate this molecule is of the order of $10^{-19}$ joule. Evaluate the cutoff wavelength greater than which light will not be recorded.
7. The relativistic expression for kinetic energy should be used for the electron in the photoelectric effect when v/c > 0.1, if errors greater than about 1% are to be avoided. For photoelectrons ejected from an aluminum surface ($w_0$ = 4.2 eV) what is the smallest wavelength of an incident photon for which the classical expression may be used?
8. X rays with λ = 0.71 Å eject photoelectrons from a gold foil. The electrons form circular paths of radius r in a region of magnetic induction B. Experiment shows that $rB = 1.88 \times 10^{-4}$ tesla-m. Find (a) the maximum kinetic energy of the photoelectrons and (b) the work done in removing the electron from the gold foil.
9. (a) Show that a free electron cannot absorb a photon and conserve both energy and momentum in the process. Hence, the photoelectric process requires a bound electron. (b) In the Compton effect, however, the electron can be free. Explain.
10. Under ideal conditions the normal human eye will record a visual sensation at 5500 Å if as few as 100 photons are absorbed per second. What power level does this correspond to?
11. An ultraviolet lightbulb, emitting at 4000 Å, and an infrared lightbulb, emitting at 7000 Å, each are rated at 40 W. (a) Which bulb radiates photons at the greater rate, and (b) how many more photons does it produce each second over the other bulb?
12. Solar radiation falls on the earth at a rate of 1.94 cal/cm²-min on a surface normal to the incoming rays. Assuming an average wavelength of 5500 Å, how many photons per cm²-min is this?
13. What are the frequency, wavelength, and momentum of a photon whose energy equals the rest mass energy of an electron?
14. In the photon picture of radiation, show that if beams of radiation of two different wavelengths are to have the same intensity (or energy density) then the numbers of the photons per unit cross-sectional area per sec in the beams are in the same ratio as the wavelengths.
15. Derive the relation

$$\cot\frac{\theta}{2} = \left(1 + \frac{h\nu}{m_0c^2}\right)\tan\varphi$$

between the direction of motion of the scattered photon and the recoil electron in the Compton effect.
16. Derive a relation between the kinetic energy K of the recoil electron and the energy E of the incident photon in the Compton effect. One form of the relation is

$$\frac{K}{E} = \frac{\left(\dfrac{2h\nu}{m_0c^2}\right)\sin^2\dfrac{\theta}{2}}{1 + \left(\dfrac{2h\nu}{m_0c^2}\right)\sin^2\dfrac{\theta}{2}}$$

(Hint: See Example 2-4.)
17. Photons of wavelength 0.024 Å are incident on free electrons. (a) Find the wavelength of a photon which is scattered 30° from the incident direction and the kinetic energy imparted to the recoil electron. (b) Do the same if the scattering angle is 120°. (Hint: See Example 2-4.)
18. An x-ray photon of initial energy $1.0 \times 10^5$ eV traveling in the +x direction is incident on a free electron at rest. The photon is scattered at right angles into the +y direction. Find the components of momentum of the recoiling electron.
19. (a) Show that ΔE/E, the fractional change in photon energy in the Compton effect, equals $(h\nu'/m_0c^2)(1 - \cos\theta)$. (b) Plot ΔE/E versus θ and interpret the curve physically.
20. What fractional increase in wavelength leads to a 75% loss of photon energy in a Compton collision?
21. Through what angle must a 0.20 MeV photon be scattered by a free electron so that it loses 10% of its energy?
22. What is the maximum possible kinetic energy of a recoiling Compton electron in terms of the incident photon energy hν and the electron's rest energy $m_0c^2$?
23. Determine the maximum wavelength shift in the Compton scattering of photons from protons.
24. (a) Show that the short wavelength cutoff in the x-ray continuous spectrum is given by $\lambda_{min}$ = 12.4 Å/V, where V is applied voltage in kilovolts. (b) If the voltage across an x-ray tube is 186 kV what is $\lambda_{min}$?
25. (a) What is the minimum voltage across an x-ray tube that will produce an x ray having the Compton wavelength? A wavelength of 1 Å? (b) What is the minimum voltage needed across an x-ray tube if the subsequent bremsstrahlung radiation is to be capable of pair production?
26. A 20 keV electron emits two bremsstrahlung photons as it is being brought to rest in two successive decelerations. The wavelength of the second photon is 1.30 Å longer than the wavelength of the first. (a) What was the energy of the electron after the first deceleration, and (b) what are the wavelengths of the photons?
27. A γ ray creates an electron-positron pair. Show directly that, without the presence of a third body to take up some of the momentum, energy and momentum cannot both be conserved. (Hint: Set the energies equal and show that this leads to unequal momenta before and after the interaction.)
28. A γ ray can produce an electron-positron pair in the neighborhood of an electron at rest as well as a nucleus. Show that in this case the threshold energy is $4m_0c^2$. (Hint: Do not ignore the recoil of the original electron, but assume that all three particles move off together.)
29. A particular pair is produced such that the positron is at rest and the electron has a kinetic energy of 1.0 MeV moving in the direction of flight of the pair-producing photon. (a) Neglecting the energy transferred to the nucleus of the nearby atom, find the energy of the incident photon. (b) What percentage of the photon's momentum is transferred to the nucleus?
30. Assume that an electron-positron pair is formed by a photon having the threshold energy for the process. (a) Calculate the momentum transferred to the nucleus in the process. (b) Assume the nucleus to be that of a lead atom and compute the kinetic energy of the recoil nucleus. Are we justified in neglecting this energy compared to the threshold energy assumed above?
31. An electron-positron pair at rest annihilate, creating two photons. At what speed must an observer move along the line of the photons in order that the wavelength of one photon be twice that of the other?
32. Show that the results of Example 2-8, expressed in terms of ρ and t, are valid independent of the assumed area of the slab.
33. Show that the attenuation length Λ is just equal to the average distance a photon will travel before being scattered or absorbed.
34. Use the data of Figure 2-17 to calculate the thickness of a lead slab which will attenuate a beam of 10 keV x rays by a factor of 100.

3
DE BROGLIE'S POSTULATE
WAVELIKE PROPERTIES OF PARTICLES

3-1 MATTER WAVES
de Broglie's postulate; de Broglie wavelength; Davisson-Germer experiment; Thomson experiment; diffraction of helium atoms and neutrons

3-2 THE WAVE-PARTICLE DUALITY
complementarity principle; Einstein's interpretation of duality for radiation; Born's interpretation of duality for matter; wave functions; superposition principle

3-3 THE UNCERTAINTY PRINCIPLE
statement of principle; interpretation; Bohr's explanation of its physical origin

3-4 PROPERTIES OF MATTER WAVES
wave and group velocities; equality of particle velocity and group velocity; spread of reciprocal wavelengths and frequencies in a wave group; derivation of uncertainty principle from de Broglie postulate; width of a quantum state

3-5 SOME CONSEQUENCES OF THE UNCERTAINTY PRINCIPLE
relation to complementarity; limitations imposed on quantum mechanics

3-6 THE PHILOSOPHY OF QUANTUM THEORY
Copenhagen interpretation of Bohr and Heisenberg; points of view of Einstein and de Broglie

QUESTIONS
PROBLEMS

3-1 MATTER WAVES

Maurice de Broglie was a French experimental physicist who, from the outset, had
supported Compton's view of the particle nature of radiation. His experiments and discussions impressed his brother Louis so much with the philosophic problems of physics at the time that Louis changed his career from history to physics. In his doctoral thesis, presented in 1924 to the Faculty of Science at the University of Paris, Louis de Broglie proposed the existence of matter waves. The thoroughness and originality of his thesis was recognized at once but, because of the apparent lack of experimental evidence, de Broglie's ideas were not considered to have any physical reality. It was Albert Einstein who recognized their importance and validity and in turn called them to the attention of other physicists. Five years later de Broglie won the Nobel Prize in physics, his ideas having been dramatically confirmed by experiment.

The hypothesis of de Broglie was that the dual, that is wave-particle, behavior of radiation applies equally well to matter. Just as a photon has a light wave associated with it that governs its motion, so a material particle (e.g., an electron) has an associated matter wave that governs its motion. Since the universe is composed entirely of matter and radiation, de Broglie's suggestion is essentially a statement about a grand symmetry of nature. Indeed, he proposed that the wave aspects of matter are related to its particle aspects in exactly the same quantitative way that is the case for radiation. According to de Broglie, for matter and for radiation alike the total energy E of an entity is related to the frequency ν of the wave associated with its motion by the equation

$$E = h\nu \tag{3-1a}$$

and the momentum p of the entity is related to the wavelength λ of the associated wave by the equation

$$p = h/\lambda \tag{3-1b}$$

Here the particle concepts, energy E and momentum p, are connected through Planck's constant h to the wave concepts, frequency ν and wavelength λ.
Equation (3-1b), in the following form, is called the de Broglie relation

$$\lambda = h/p \tag{3-2}$$

It predicts the de Broglie wavelength λ of a matter wave associated with the motion of a material particle having a momentum p.

Example 3-1. (a) What is the de Broglie wavelength of a baseball moving at a speed v = 10 m/sec?

Assume m = 1.0 kg. From (3-2)

$$\lambda = \frac{h}{p} = \frac{h}{mv} = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{1.0\ \text{kg} \times 10\ \text{m/sec}} = 6.6 \times 10^{-35}\ \text{m} = 6.6 \times 10^{-25}\ \text{Å}$$

(b) What is the de Broglie wavelength of an electron whose kinetic energy is 100 eV?

Here

$$\lambda = \frac{h}{p} = \frac{h}{\sqrt{2mK}} = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{(2 \times 9.1 \times 10^{-31}\ \text{kg} \times 100\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV})^{1/2}}$$

$$= \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{5.4 \times 10^{-24}\ \text{kg-m/sec}} = 1.2 \times 10^{-10}\ \text{m} = 1.2\ \text{Å}$$

Figure 3-1 The apparatus of Davisson and Germer. Electrons from filament F are accelerated by a variable potential difference V. After scattering from crystal C they are collected by detector D.

The wave nature of light propagation is not revealed by experiments in geometrical optics, for the important dimensions of the apparatus used there are very large compared to the wavelength of light. If a represents a characteristic dimension of an optical apparatus (e.g., the width of a lens, mirror, or slit) and λ is the wavelength of the light passing through the apparatus, we are in the domain of geometrical optics when λ/a → 0. The reason is that the diffraction effects in any apparatus are always confined to angles of about θ = λ/a, so diffraction effects are completely negligible when λ/a → 0. Note that geometrical optics involves ray propagation, which is similar to the trajectory motion of classical particles. However, when the characteristic dimension a of an optical apparatus becomes comparable to, or smaller than, the wavelength λ of the light going through it, we are in the domain of physical optics.
In this case, where λ/a ≳ 1, the diffraction angle θ = λ/a is large enough that diffraction effects are easily observed and the wave nature of light propagation becomes apparent. To observe wavelike aspects in the motion of matter, therefore, we need systems with apertures or obstacles of suitably small dimensions. The finest scale systems of apertures available to experimentalists at the time of de Broglie made use of the spacing between adjacent planes of atoms in a solid, where a ≈ 1 Å. (Now systems are available involving nuclear dimensions of ≈ $10^{-4}$ Å.) Considering the de Broglie wavelengths evaluated in Example 3-1, we see that we cannot expect to detect any evidence of wavelike motion for a baseball, where λ/a ≈ $10^{-25}$ for a ≈ 1 Å; but for a material particle of very much smaller mass than a baseball, the momentum p is reduced, and the de Broglie wavelength λ = h/p is increased sufficiently for diffraction effects to be observable. Using apparatus with characteristic dimensions a ≈ 1 Å, wavelike aspects in the motion of the λ = 1.2 Å electron of Example 3-1 should be very apparent.

Elsasser pointed out, in 1926, that the wave nature of matter might be tested in the same way that the wave nature of x rays was first tested, namely by allowing a beam of electrons of appropriate energy to fall on a crystalline solid. The atoms of the crystal serve as a three-dimensional array of diffracting centers for the electron wave, and so they should strongly scatter electrons in certain characteristic directions, just as for x-ray diffraction. This idea was confirmed in experiments by Davisson and Germer in the United States and by Thomson in Scotland.

Figure 3-1 shows schematically the apparatus of Davisson and Germer. Electrons from a heated filament are accelerated through a potential difference V and emerge from the "electron gun" G with kinetic energy eV. This electron beam falls at normal incidence on a single crystal of nickel at C.
The detector D is set at a particular angle θ and readings of the intensity of the scattered beam are taken at various values of the accelerating potential V. Figure 3-2, for example, shows that a strong scattered electron beam is detected at θ = 50° for V = 54 V. The existence of this peak in the electron scattering pattern demonstrates qualitatively the validity of de Broglie's postulate because it can only be explained as a constructive interference of waves scattered by the periodic arrangement of the atoms into planes of the crystal. The phenomenon is precisely analogous to the well-known "Bragg reflections" which occur in the scattering of x rays from the atomic planes of a crystal. It cannot be understood on the basis of classical particle motion, but only on the basis of wave motion. Classical particles cannot exhibit interference, but waves can!

Figure 3-2 Left: The collector current in detector D of Figure 3-1 as a function of the kinetic energy of the incident electrons, showing a diffraction maximum. The angle θ in Figure 3-1 is adjusted to 50°. If an appreciably smaller or larger value is used, the diffraction maximum disappears. Right: The current as a function of detector angle for the fixed value of electron kinetic energy 54 eV.

The interference involved here is not between waves associated with one electron and waves associated with another. Instead, it is an interference between different parts of the wave associated with a single electron that have been scattered from various regions of the crystal. This can be demonstrated by using an electron beam of such low intensity that the electrons go through the apparatus one at a time, and by showing that the pattern of the scattered electrons remains the same.
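The wavelengths of Example 3-1, and the ratios λ/a that decide between geometrical and physical optics, can be reproduced with a few lines of code. This is only a sketch: the function names are illustrative, the constants are the rounded values from the table of useful constants, and the nonrelativistic relation $p = \sqrt{2mK}$ is assumed, as in the example.

```python
import math

H = 6.626e-34      # Planck's constant, joule-sec
M_E = 9.109e-31    # electron rest mass, kg
EV = 1.602e-19     # joule per eV
ANGSTROM = 1e-10   # m

def wavelength_from_momentum(p):
    """de Broglie relation (3-2): lambda = h / p, with p in kg-m/sec."""
    return H / p

def wavelength_from_kinetic_energy(mass_kg, kinetic_ev):
    """lambda = h / sqrt(2 m K); nonrelativistic momentum is assumed."""
    p = math.sqrt(2.0 * mass_kg * kinetic_ev * EV)
    return wavelength_from_momentum(p)

# Example 3-1(a): 1.0 kg baseball at 10 m/sec
baseball = wavelength_from_momentum(1.0 * 10.0)
# Example 3-1(b): electron with 100 eV of kinetic energy
electron = wavelength_from_kinetic_energy(M_E, 100.0)

if __name__ == "__main__":
    a = 1.0 * ANGSTROM  # spacing of atomic planes in a solid, about 1 angstrom
    for label, lam in [("baseball", baseball), ("100 eV electron", electron)]:
        print(f"{label}: lambda = {lam / ANGSTROM:.3g} angstrom, "
              f"lambda/a = {lam / a:.3g}")
```

The baseball gives λ/a of order $10^{-25}$, deep in the domain of geometrical optics, while the electron gives λ/a of order 1, where diffraction should be easy to observe.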
Figure 3-3 shows the origin of a Bragg reflection, obeying the Bragg relation derived in the caption to that figure

$$n\lambda = 2d\sin\varphi \tag{3-3}$$

For the conditions of Figure 3-3 the effective interplanar spacing d can be shown by x-ray scattering from the same crystal to be 0.91 Å. Since θ = 50°, it follows that φ = 90° - 50°/2 = 65°. The wavelength calculated from (3-3), assuming n = 1, is

$$\lambda = 2d\sin\varphi = 2 \times 0.91\ \text{Å} \times \sin 65° = 1.65\ \text{Å}$$

The de Broglie wavelength for 54 eV electrons, calculated from (3-2), is

$$\lambda = h/p = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{4.0 \times 10^{-24}\ \text{kg-m/sec}} = 1.65\ \text{Å}$$

This impressive agreement gives quantitative confirmation of de Broglie's relation between λ, p, and h.

The breadth of the observed peak in Figure 3-2 is easily understood, also, for low-energy electrons cannot penetrate deeply into the crystal, so that only a small number of atomic planes contribute to the diffracted wave. Hence, the diffraction maximum is not sharp. Indeed, all the experimental results were in excellent qualitative and quantitative agreement with the de Broglie prediction, and they provided convincing evidence that material particles move according to the laws of wave motion.

Figure 3-3 Top: The strong diffracted beam at θ = 50° and V = 54 V arises from wavelike scattering from the family of atomic planes shown, which have a separation distance d = 0.91 Å. The Bragg angle is φ = 65°. For simplicity, refraction of the scattered wave as it leaves the crystal surface is not indicated. Bottom: Derivation of the Bragg relation, showing only two atomic planes and two rays of the incident and scattered beams. If an integral number of wavelengths nλ just fit into the distance 2l from incident to scattered wave fronts measured along the lower ray, then the contributions along the two rays to the scattered wave front will be in phase and a diffraction maximum will be obtained at the angle φ. Since l/d = cos(90° - φ) = sin φ, we have 2l = 2d sin φ, and so we obtain the Bragg relation nλ = 2d sin φ. The "first order" diffraction maximum (n = 1) is usually most intense.

In 1927, G. P. Thomson showed the diffraction of electron beams passing through thin films and independently confirmed the de Broglie relation λ = h/p in detail. Whereas the Davisson-Germer experiment is like Laue's in x-ray diffraction (reflection from the regular array of atomic planes in a large single crystal), Thomson's experiment is similar to the Debye-Hull-Scherrer method of powder diffraction of x rays (transmission through an aggregate of very small crystals oriented at random). Thomson used higher-energy electrons, which are much more penetrating, so that many hundred atomic planes contribute to the diffracted wave. The resulting diffraction pattern has a sharp structure. In Figure 3-4 we show, for comparison, an x-ray diffraction pattern and an electron diffraction pattern from polycrystalline substances (substances in which a large number of microscopic crystals are oriented at random).

Figure 3-4 Top: The experimental arrangement for Debye-Scherrer diffraction of x rays or electrons by a polycrystalline material. Bottom left: Debye-Scherrer pattern of x-ray diffraction by zirconium oxide crystals. Bottom right: Debye-Scherrer pattern of electron diffraction by gold crystals.

It is of interest that J. J. Thomson, who in 1897 discovered the electron (which he characterized as a particle with a definite charge-to-mass ratio) and was awarded the Nobel Prize in 1906, was the father of G. P. Thomson, who in 1927 experimentally discovered electron diffraction and was awarded the Nobel Prize (with Davisson) in 1937. Max Jammer writes of this, "One may feel inclined to say that Thomson, the father, was awarded the Nobel Prize for having shown that the electron is a particle, and Thomson, the son, for having shown that the electron is a wave."

Not only electrons but all material objects, charged or uncharged, show wavelike characteristics in their motion under the conditions of physical optics. For example, Estermann, Stern, and Frisch performed quantitative experiments on the diffraction of molecular beams of hydrogen and atomic beams of helium from a lithium fluoride crystal; and Fermi, Marshall, and Zinn showed interference and diffraction phenomena for slow neutrons. In Figure 3-5 we show a neutron diffraction pattern for a sodium chloride crystal. Even an interferometer operating with electron beams has been constructed. The existence of matter waves is well established.

It is instructive to note that we had to go to relatively long de Broglie wavelengths to find experimental evidence for the wave nature of matter. For both large and small wavelengths, both matter and radiation have both particle and wave aspects. The particle aspects are emphasized when their emission or absorption is studied, and the wave aspects are emphasized when their behavior in moving through a system is studied. But the wave aspects of their motion become more difficult to observe as their wavelengths become shorter. Once again we see the central role played by Planck's constant h. If h were zero then in λ = h/p we would obtain λ = 0 in all circumstances. All material particles would then always have a wavelength smaller than any characteristic dimension, and diffraction effects could never be observed. Although the value of h is definitely not zero, it is small.
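The Davisson-Germer comparison made earlier in this section, between the wavelength from the Bragg relation (3-3) and the de Broglie wavelength (3-2) of 54 eV electrons, can be checked numerically. The sketch below uses illustrative function names and the four-figure constants from the table of useful constants rather than the two-figure values of the text, and it assumes nonrelativistic electrons; with these inputs the two wavelengths agree to about 1%.

```python
import math

H = 6.626e-34      # Planck's constant, joule-sec
M_E = 9.109e-31    # electron rest mass, kg
EV = 1.602e-19     # joule per eV
ANGSTROM = 1e-10   # m

def bragg_wavelength(d, phi_deg, n=1):
    """Bragg relation (3-3) solved for lambda: lambda = 2 d sin(phi) / n."""
    return 2.0 * d * math.sin(math.radians(phi_deg)) / n

def de_broglie_wavelength(kinetic_ev, mass_kg=M_E):
    """lambda = h / sqrt(2 m K), nonrelativistic, returned in meters."""
    return H / math.sqrt(2.0 * mass_kg * kinetic_ev * EV)

theta = 50.0                   # detector angle of the diffraction peak, degrees
phi = 90.0 - theta / 2.0       # Bragg angle, 65 degrees
lam_bragg = bragg_wavelength(0.91, phi)               # angstroms, d = 0.91 angstrom
lam_matter = de_broglie_wavelength(54.0) / ANGSTROM   # angstroms

if __name__ == "__main__":
    print(f"Bragg wavelength:      {lam_bragg:.2f} angstrom")
    print(f"de Broglie wavelength: {lam_matter:.2f} angstrom")
    print(f"fractional difference: {abs(lam_matter - lam_bragg) / lam_bragg:.3f}")
```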
It is the smallness of h that obscures the existence of matter waves in the macroscopic world, for we must have very small momenta to obtain measurable wavelengths. For ordinary macroscopic particles the mass is so large that the momentum is always sufficiently large to make the de Broglie wavelength small enough to be beyond the range of experimental detection, and classical mechanics reigns supreme. In the microscopic world the masses of material particles are so small that their momenta are small even when their velocities are quite high. Thus their de Broglie wavelengths are large enough to be comparable to characteristic dimensions of systems of interest, such as atoms, and the wavelike properties are experimentally observable in their motion. But we should not forget that in their interaction, for instance when they are detected, their particlelike properties dominate even when their wavelengths are large.

Figure 3-5 Top: Laue pattern of x-ray diffraction by a single sodium chloride crystal. Bottom: Laue pattern of diffraction of neutrons from a nuclear reactor by a single sodium chloride crystal.

Example 3-2. In the experiments with helium atoms referred to earlier, a beam of atoms of nearly uniform speed of $1.635 \times 10^5$ cm/sec was obtained by allowing helium gas to escape through a small hole in its enclosing vessel into an evacuated chamber and then through narrow slits in parallel rotating circular disks of small separation (a mechanical velocity selector). A strongly diffracted beam of helium atoms was observed to emerge from the lithium fluoride crystal surface upon which the atoms were incident. The diffracted beam was detected with a highly sensitive pressure gage. The usual crystal diffraction analysis of the experimental results indicated a wavelength of $0.600 \times 10^{-8}$ cm. How does this agree with the calculated de Broglie wavelength?

The mass of a helium atom is

$$m = \frac{M}{N_0} = \frac{4.00\ \text{g/mole}}{6.02 \times 10^{23}\ \text{atom/mole}} = 6.65 \times 10^{-27}\ \text{kg}$$

According to the de Broglie equation the wavelength then is

$$\lambda = \frac{h}{p} = \frac{h}{mv} = \frac{6.63 \times 10^{-34}\ \text{joule-sec}}{6.65 \times 10^{-27}\ \text{kg} \times 1.635 \times 10^{3}\ \text{m/sec}} = 0.609 \times 10^{-10}\ \text{m} = 0.609 \times 10^{-8}\ \text{cm}$$

This result, 1.5% greater than the value measured by crystal diffraction, is well within the limits of error of the experiment.

Experiments like the one considered in Example 3-2 are very difficult since the intensities obtainable in atomic beams are quite low. Neutron diffraction experiments, using crystals of known lattice spacing, give confirmation of the existence of matter waves and precise confirmation of de Broglie's equation. The precision is due to the fact that the supply of neutrons from nuclear reactors is copious. Indeed, neutron diffraction is now an important method of studying crystal structure. Certain crystals, such as hydrogenous organic ones, are particularly well suited to neutron diffraction analysis, since neutrons are strongly scattered by hydrogen atoms whereas x rays are very weakly scattered by them. X rays interact chiefly with electrons in the atom, and electrons interact with the nuclear charge of the atom as well as the atomic electrons by electromagnetic forces, so that their interaction with hydrogen atoms is weak because the charge is small. Neutrons interact principally with the nucleus of the atom by nuclear forces, however, and the interaction is strong.

3-2 THE WAVE-PARTICLE DUALITY

In classical physics energy is transported either by waves or by particles. Classical physicists observed water waves carrying energy over the water surface or bullets transferring energy from gun to target. From such experiences they built a wave model for certain macroscopic phenomena and a particle model for other macroscopic phenomena, and they quite naturally extended these models into visually less accessible regions.
Thus they explained sound propagation in terms of a wave model and pressures of gases in terms of a particle model (kinetic theory). Their successes conditioned them to expect that all entities are either particles or waves. Indeed, these successes extended into the early twentieth century with applications of Maxwell's wave theory to radiation and the discovery of elementary particles of matter, such as the neutron and positron. Hence, classical physicists were quite unprepared to find that to understand radiation they needed to invoke a particle model in some situations, as in the Compton effect, and a wave model in other situations, as in the diffraction of x rays. Perhaps more striking is the fact that this same wave-particle duality applies to matter as well as to radiation. The charge-to-mass ratio of the electron and its ionization trail in matter (a sequence of localized collisions) suggest a particle model, but electron diffraction suggests a wave model. Physicists now know that they are compelled to use both models for the same entity. It is very important to note, however, that in any given measurement only one model applies; both models are not used under the same circumstances. When the entity is detected by some kind of interaction, it acts like a particle in the sense that it is localized; when it is moving it acts like a wave in the sense that interference phenomena are observed, and, of course, a wave is extended, not localized.

Niels Bohr summarized the situation in his principle of complementarity. The wave and particle models are complementary; if a measurement proves the wave character of radiation or matter, then it is impossible to prove the particle character in the same measurement, and conversely. Which model we use is determined by the nature of the measurement. Furthermore, our understanding of radiation, or of matter, is incomplete unless we take into account measurements which reveal the wave aspects and also those that reveal the particle aspects. Hence, radiation and matter are not simply waves nor simply particles. A more general and, to the classical mind, a more complicated model is needed to describe their behavior, even though in extreme situations a simple wave model or a simple particle model may apply.

The link between wave model and particle model is provided by a probability interpretation of the wave-particle duality. In the case of radiation it was Einstein who united the wave and particle theories; subsequently Max Born applied a similar argument to unite wave and particle theories of matter. In the wave picture the intensity of radiation, I, is proportional to $\overline{\mathcal{E}^2}$, where $\overline{\mathcal{E}^2}$ is the average value over one cycle of the square of the electric field strength of the wave. (I is the average value of the so-called Poynting vector and we use the symbol $\mathcal{E}$ instead of E for electric field to avoid confusion with the total energy E.) In the photon, or particle, picture the intensity of radiation is written as $I = Nh\nu$, where N is the average number of photons per unit time crossing unit area perpendicular to the direction of propagation. It was Einstein who suggested that $\overline{\mathcal{E}^2}$, which in electromagnetic theory is proportional to the radiant energy in a unit volume, could be interpreted as a measure of the average number of photons per unit volume.

Recall that Einstein introduced a granularity to radiation, abandoning the continuum interpretation of Maxwell. This leads to a statistical view of intensity. In this view, a point source of radiation emits photons randomly in all directions. The average number of photons crossing a unit area will decrease with increasing distance from source to area. This is due to the fact that the photons spread over a sphere of larger area the farther they are from the source. Since the area of a sphere is proportional to the square of its radius, we obtain, on the average, an inverse square law of intensity just as in the wave picture. In the wave picture we imagine that spherical waves spread out from the source, the intensity dropping inversely as the square of the distance from the source. Here, these waves, whose strength can be measured by $\overline{\mathcal{E}^2}$, can be regarded as guiding waves for the photons; the waves themselves have no energy (there are only photons), but they are a construct whose intensity measures the average number of photons per unit volume. We use the word "average" because the emission processes are statistical in nature. We do not specify exactly how many photons cross unit area in unit time, only their average number; the exact number can fluctuate in time and space, just as in kinetic theory of gases there are fluctuations about an average value for many quantities. We can say quite definitely, however, that the probability of having a photon cross unit area 3 m from the source is exactly one-ninth the probability that a photon will cross unit area 1 m from the source. In the formula $I = Nh\nu$, therefore, N is an average value and is a measure of the probability of finding a photon crossing unit area in unit time. If we equate the wave expression to the particle expression we have

$$I = \frac{1}{\mu_0 c}\,\overline{\mathcal{E}^2} = h\nu N$$

so that $\overline{\mathcal{E}^2}$ is proportional to N. Einstein's interpretation of $\overline{\mathcal{E}^2}$ as a probability measure of photon density then becomes clear. We expect that, as in kinetic theory, fluctuations about an average will become more noticeable at low intensities than at high intensities, so that the granular quantum phenomena contradict the continuum classical view more dramatically there.

In analogy to Einstein's view of radiation, Max Born proposed a similar uniting of the wave-particle duality for matter.
This came several years after Schroedinger developed his generalization of de Broglie's postulate, called quantum mechanics. We shall examine Schroedinger's theory quantitatively in later chapters. Here we wish merely to use Born's idea in a qualitative way to set the stage conceptually for the subsequent detailed analysis.

Let us associate more than just a wavelength and frequency with matter waves. We do this by introducing a function representing the de Broglie wave, called the wave function $\Psi$. For particles moving in the x direction with a precise value of linear momentum and energy, for example, the wave function can be written as a simple sinusoidal function of amplitude A, such as

$$\Psi(x,t) = A \sin 2\pi\left(\frac{x}{\lambda} - \nu t\right) \tag{3-4a}$$

This is analogous to

$$\mathcal{E}(x,t) = A \sin 2\pi\left(\frac{x}{\lambda} - \nu t\right) \tag{3-4b}$$

for the electric field of a sinusoidal electromagnetic wave of wavelength $\lambda$ and frequency $\nu$, moving in the positive x direction. The quantity $\overline{\Psi^2}$ will play a role for matter waves analogous to that played by $\overline{\mathcal{E}^2}$ for waves of radiation. That quantity, the average of the square of the wave function of matter waves, is a measure of the probability of finding a particle in unit volume at a given place and time. Just as $\mathcal{E}$ is a function of space and time, so is $\Psi$; and, as we shall see later, just as $\mathcal{E}$ satisfies a wave equation, so does $\Psi$ (Schroedinger's equation). The quantity $\mathcal{E}$ is a (radiation) wave associated with a photon, and $\Psi$ is a (matter) wave associated with a material particle.

As Born says: "According to this view, the whole course of events is determined by the laws of probability; to a state in space there corresponds a definite probability, which is given by the de Broglie wave associated with the state. A mechanical process is therefore accompanied by a wave process, the guiding wave, described by Schroedinger's equation, the significance of which is that it gives the probability of a definite course of the mechanical process.
If, for example, the amplitude of the guiding wave is zero at a certain point in space, this means that the probability of finding the electron at this point is vanishingly small."

Just as in the Einstein view of radiation we do not specify the exact location of a photon at a given time, but specify instead by $\overline{\mathcal{E}^2}$ the probability of finding a photon at a certain location at a given time, so here in Born's view we do not specify the exact location of a particle at a given time, but specify instead by $\overline{\Psi^2}$ the probability of finding a particle at a certain location at a given time. Just as we are accustomed to adding wave functions ($\mathcal{E}_1 + \mathcal{E}_2 = \mathcal{E}$) for two superposed electromagnetic waves whose resultant intensity is given by $\overline{\mathcal{E}^2}$, so we shall add wave functions for two superposed matter waves ($\Psi_1 + \Psi_2 = \Psi$) whose resultant intensity is given by $\overline{\Psi^2}$. That is, a principle of superposition applies to matter as well as to radiation. This is in accordance with the striking experimental fact that matter exhibits interference and diffraction properties, a fact that simply cannot be understood on the basis of ideas in classical mechanics. Because waves can be superposed either constructively (in phase) or destructively (out of phase), two waves can combine either to yield a resultant wave of large intensity or to cancel, but two classical particles of matter cannot combine in such a way as to cancel.

The student might accept the logic of this fusion of wave and particle concepts but nevertheless ask whether a probabilistic or statistical interpretation is necessary. It was Heisenberg and Bohr who, in 1927, first showed how essential the concept of probability is to the union of wave and particle descriptions of matter and radiation. We investigate these matters in succeeding sections.

3-3 THE UNCERTAINTY PRINCIPLE

The use of probability considerations is not foreign to classical physics. Classical statistical mechanics makes use of probability theory, for example. However, in classical physics the basic laws (such as Newton's laws) are deterministic, and statistical analysis is simply a practical device for treating very complicated systems. According to Heisenberg and Bohr, however, the probabilistic view is the fundamental one in quantum physics and determinism must be discarded. Let us see how this conclusion is reached.

In classical mechanics the equations of motion of a system with given forces can be solved to give us the position and momentum of a particle at all values of the time. All we need to know are the precise position and momentum of the particle at some value of the time t = 0 (the initial conditions) and the future motion is determined exactly. This mechanics has been used with great success in the macroscopic world, for example in astronomy, to predict the subsequent motions of objects in terms of their initial motions. Note, however, that in the process of making observations the observer interacts with the system. An example from contemporary astronomy is the precise measurement of the position of the moon by bouncing radar from it. The motion of the moon is disturbed by the measurement, but due to the very large mass of the moon the disturbance can be ignored. On a somewhat smaller scale, as in a very well-designed macroscopic experiment on earth, such disturbances are also usually small, or at least controllable, and they can be taken into account accurately ahead of time by suitable calculations. Hence, it was naturally assumed by classical physicists that in the realm of microscopic systems the position and momentum of an object, such as an electron, could be determined precisely by observations in a similar way. Heisenberg and Bohr questioned this assumption.

The situation is somewhat similar to that existing at the birth of relativity theory. Physicists spoke of length intervals and time intervals, i.e., space and time, without asking critically how one actually measures them. For example, they spoke of the simultaneity of two separated events without even asking how one would physically go about establishing simultaneity. In fact, Einstein showed that simultaneity was not an absolute concept at all, as had been assumed previously, but that two separated events that are simultaneous to one observer occur at different times to another observer moving with respect to the first. Simultaneity is a relative concept. Similarly then, we must ask ourselves how we actually measure position and momentum.

Can we determine by actual experiment at the same instant both the position and momentum of matter or radiation? The answer given by quantum theory is: not more accurately than is allowed by the Heisenberg uncertainty principle. There are two parts to this principle, also called the indeterminacy principle. The first has to do with the simultaneous measurement of position and momentum. It states that experiment cannot simultaneously determine the exact value of a component of momentum, $p_x$ say, of a particle and also the exact value of its corresponding coordinate, x. Instead, our precision of measurement is inherently limited by the measurement process itself such that

$$\Delta p_x \Delta x \ge \hbar/2 \tag{3-5}$$

where the momentum $p_x$ is known to within an uncertainty of $\Delta p_x$ and the position x at the same time to within an uncertainty $\Delta x$. Here $\hbar$ (read h-bar) is a shorthand symbol for $h/2\pi$, where h is Planck's constant. That is

$$\hbar \equiv h/2\pi$$

There are corresponding relations for other components of momentum, namely $\Delta p_y \Delta y \ge \hbar/2$ and $\Delta p_z \Delta z \ge \hbar/2$, and for angular momentum as well. It is important to realize that this principle has nothing to do with improvements in instrumentation leading to better simultaneous determinations of $p_x$ and x. Rather the principle says that even with ideal instruments we can never in principle do better than $\Delta p_x \Delta x \ge \hbar/2$.
Note also that the product of uncertainties is involved, so that, for example, the more we modify an experiment to improve our measure of $p_x$, the more we give up ability to determine x accurately. If $p_x$ is known exactly we know nothing at all about x (i.e., if $\Delta p_x = 0$, $\Delta x = \infty$). Hence, the restriction is not on the accuracy to which x or $p_x$ can be measured, but on the product $\Delta p_x \Delta x$ in a simultaneous measurement of both.

The second part of the uncertainty principle has to do with the measurement of the energy E and the time t required for the measurements, as for example, the time interval $\Delta t$ during which a photon of energy spread $\Delta E$ is emitted from an atom. In this case

$$\Delta E \Delta t \ge \hbar/2 \tag{3-6}$$

where $\Delta E$ is the uncertainty in our knowledge of the energy E of a system and $\Delta t$ the time interval characteristic of the rate of change in the system.

Heisenberg's relations will be shown later to follow from the de Broglie postulate plus simple properties common to all waves. Because the de Broglie postulate is verified by the experiments we have already discussed, it is fair to say that the uncertainty principle is grounded in experiment. We shall also consider soon the consistency of the principle with other experiments. Notice first, however, that it is Planck's constant h that again distinguishes the quantum results from the classical ones. If h, or $\hbar$, in (3-5) and (3-6) were zero, there would be no basic limitation on our measurement at all, which is the classical view. Again it is the smallness of h that takes the principle out of the range of our ordinary experiences. This is analogous to the smallness of the ratio v/c in macroscopic situations taking relativity out of the range of ordinary experience. In principle, therefore, classical physics is of limited validity and in the microscopic domain it will lead to contradictions with experimental results.
For if we cannot determine x and p simultaneously, then we cannot specify the initial conditions of motion exactly; therefore, we cannot precisely determine the future behavior of a system. Instead of making deterministic predictions, we can only state the possible results of an observation, giving the relative probabilities of their occurrence. Indeed, since the act of observing a system disturbs it in a manner that is not completely predictable, the observation changes the previous motion of the system to a new state of motion which cannot be completely known.

Let us now illustrate the physical origin of the uncertainty principle. With the insight thereby gained we shall better appreciate a more formal proof given in the following section. First, we use a thought experiment due to Bohr to verify (3-5). Let us say that we wish to measure as accurately as possible the position of a "point" particle, like an electron. For greatest precision we use a microscope to view the electron, as in Figure 3-6. To see the electron we must illuminate it, for it is actually the light photon scattered by the electron that the observer sees. At this stage, even before any calculations are made, we can see the uncertainty principle emerge. The very act of observing the electron disturbs it. The moment we illuminate the electron, it recoils because of the Compton effect, in a way that we shall soon find cannot be completely determined. If we don't illuminate the electron, however, we don't see (detect) it. Hence the uncertainty principle refers to the measuring process itself, and it expresses the fact that there is always an undetermined interaction between observer and observed; there is nothing we can do to avoid the interaction or to allow for it ahead of time.

Figure 3-6 Bohr's microscope thought experiment. Top: The apparatus. Middle: The scattering of an illuminating photon by the electron. Bottom: The diffraction pattern image of the electron seen by the observer. (Labels in the figure: observer, eyepiece, objective lens, region available to photons entering lens, electron, light source, incident photon of momentum $h/\lambda$, scattered photon momentum, x-component of scattered photon momentum $(h/\lambda)\sin\theta$, x-component of recoil electron momentum $(h/\lambda)\sin\theta$, and the width $\Delta x$.)

In the case at hand we can try to reduce the disturbance to the electron as much as possible by using a very weak source of light. The very weakest we can get is to assume that we can see the electron if only one scattered photon enters the objective lens of the microscope. The magnitude of the momentum of the photon is $p = h/\lambda$. But the photon may have been scattered anywhere within the angular range $2\theta'$ subtended by the objective lens at the electron. This is why the interaction cannot be allowed for. Hence, we find that the x component of the momentum of the photon can vary from $+p\sin\theta'$ to $-p\sin\theta'$ and is uncertain after the scattering by an amount

$$\Delta p_x = 2p\sin\theta' = (2h/\lambda)\sin\theta'$$

Conservation of momentum then requires that the electron receive a recoil momentum in the x direction that is equal in magnitude to the x momentum change in the photon and, therefore, the x momentum of the electron is uncertain by this same amount. Notice that to reduce $\Delta p_x$ we can use light of longer wavelength, or use a microscope with an objective lens subtending a smaller angle.

What about the location along x of the electron? Recall that a microscope's image of a point object is not a point, but a diffraction pattern; the image of the electron is "fuzzy." The resolving power of a microscope determines the ultimate accuracy to which the electron can be located.
If we take the width of the central diffraction maximum as a measure of the uncertainty in x, a well-known expression for the resolving power of a microscope gives

$$\Delta x = \lambda/\sin\theta'$$

(Note that, since $\sin\theta' \simeq \theta'$, this is an example of the general relation $a \simeq \lambda/\theta$ between the characteristic dimension in a diffraction apparatus, the wavelength of the diffracted waves, and the diffraction angle.) The one scattered photon at our disposal must have originated then somewhere within this range from the axis of the microscope, so the uncertainty in the electron's location is $\Delta x$. (We cannot be sure exactly where any one photon originates even though in a large number of repetitions of the experiment the photons forming the total image will produce the diffraction pattern shown in the figure.) Notice that to reduce $\Delta x$ we can use light of shorter wavelength, or a microscope with an objective lens subtending a larger angle.

If now we take the product of the uncertainties we find

$$\Delta p_x \Delta x = \left(\frac{2h}{\lambda}\sin\theta'\right)\left(\frac{\lambda}{\sin\theta'}\right) = 2h \tag{3-7}$$

in reasonable agreement with the ultimate limit $\hbar/2$ set by the uncertainty principle. We cannot simultaneously make $\Delta p_x$ and $\Delta x$ as small as we wish, for the procedure that makes one small makes the other large. For instance, if we use light of short wavelength (e.g., $\gamma$ rays) to reduce $\Delta x$ by obtaining better resolution, we increase the Compton recoil and increase $\Delta p_x$, and conversely. Indeed, the wavelength $\lambda$ and the angle $\theta'$ subtended by the objective lens do not even appear in the result. In practice an experiment might do much worse than (3-7) suggests, for that result represents the very ideal possible. We arrive at it, however, from genuinely measurable physical phenomena, namely the Compton effect and the resolving power of a lens.

There really should be no mystery in the student's mind about our result. It is a direct result of quantization of radiation.
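The cancellation of the wavelength and the lens angle in (3-7) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `microscope_uncertainties` and the sample wavelengths and angles are illustrative assumptions, and the value of h is the one quoted in the book's table of constants.

```python
import math

H = 6.626e-34                 # Planck's constant, joule-sec
HBAR = H / (2.0 * math.pi)    # h-bar = h / 2 pi

def microscope_uncertainties(wavelength_m, half_angle_rad):
    """Return (dp_x, dx) for the idealized microscope (hypothetical helper).

    dp_x = (2h/lambda) sin(theta')  -- spread in the photon's x momentum
    dx   = lambda / sin(theta')     -- width of the central diffraction maximum
    """
    dp_x = (2.0 * H / wavelength_m) * math.sin(half_angle_rad)
    dx = wavelength_m / math.sin(half_angle_rad)
    return dp_x, dx

# Visible light, x rays, and gamma rays; three lens half-angles (all assumed values).
for wavelength in (5e-7, 1e-10, 1e-12):
    for angle_deg in (10.0, 30.0, 60.0):
        dp_x, dx = microscope_uncertainties(wavelength, math.radians(angle_deg))
        print(f"lambda={wavelength:.0e} m  theta'={angle_deg:4.1f} deg  "
              f"dp_x*dx / h = {dp_x * dx / H:.3f}")
```

Shortening the wavelength shrinks `dx` and enlarges `dp_x` by the same factor, so the product stays at 2h, which is above the limit ħ/2 for every combination tried.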
We had to have at least one photon illuminating the electron, or else no illumination at all; and even one photon carries a momentum of magnitude $p = h/\lambda$. It is this single scattered photon that provides the necessary interaction between the microscope and the electron. This interaction disturbs the particle in a way that cannot be exactly predicted or controlled. As a result, the coordinates and momentum of the particle cannot be completely known after the measurement. If classical physics were valid, then since radiation is regarded there as continuous rather than granular, we could reduce the illumination to arbitrarily small levels and deliver arbitrarily small momentum while using arbitrarily small wavelengths for "perfect" resolution. In principle there would be no simultaneous lower limit to resolution or momentum recoil and there would be no uncertainty principle. But we cannot do this; the single photon is indivisible. Again we see, from $\Delta p_x \Delta x \ge \hbar/2$, that Planck's constant is a measure of the minimum uncontrollable disturbance that distinguishes quantum physics from classical physics.

Now let us consider (3-6) relating energy and time uncertainties. For the case of a free particle we can obtain (3-6) from (3-5), which relates position and momentum, as follows. Consider an electron moving along the x axis whose energy we can write as $E = p_x^2/2m$. If $p_x$ is uncertain by $\Delta p_x$, then the uncertainty in E is given by $\Delta E = (p_x/m)\Delta p_x = v_x \Delta p_x$. Here $v_x$ can be interpreted as the recoil velocity along x of the electron which is illuminated with light in a position measurement. If the time interval required for the measurement is $\Delta t$, then the uncertainty in its x position is $\Delta x = v_x \Delta t$. Combining $\Delta t = \Delta x/v_x$ and $\Delta E = v_x \Delta p_x$, we obtain $\Delta E \Delta t = \Delta p_x \Delta x$. But $\Delta p_x \Delta x \ge \hbar/2$.
Hence

$$\Delta E \Delta t \ge \hbar/2$$

Example 3-3. The speed of a bullet (m = 50 g) and the speed of an electron (m = $9.1 \times 10^{-28}$ g) are measured to be the same, namely 300 m/sec, with an uncertainty of 0.01%. With what fundamental accuracy could we have located the position of each, if the position is measured simultaneously with the speed in the same experiment?

For the electron

$$p = mv = 9.1 \times 10^{-31}\ \text{kg} \times 300\ \text{m/sec} = 2.7 \times 10^{-28}\ \text{kg-m/sec}$$

and

$$\Delta p = m\,\Delta v = 0.0001 \times 2.7 \times 10^{-28}\ \text{kg-m/sec} = 2.7 \times 10^{-32}\ \text{kg-m/sec}$$

so that

$$\Delta x \ge \frac{h}{4\pi\,\Delta p} = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{4\pi \times 2.7 \times 10^{-32}\ \text{kg-m/sec}} = 2 \times 10^{-3}\ \text{m} = 0.2\ \text{cm}$$

For the bullet

$$p = mv = 0.05\ \text{kg} \times 300\ \text{m/sec} = 15\ \text{kg-m/sec}$$

and

$$\Delta p = 0.0001 \times 15\ \text{kg-m/sec} = 1.5 \times 10^{-3}\ \text{kg-m/sec}$$

so that

$$\Delta x \ge \frac{h}{4\pi\,\Delta p} = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{4\pi \times 1.5 \times 10^{-3}\ \text{kg-m/sec}} = 3 \times 10^{-32}\ \text{m}$$

Hence, for macroscopic objects such as bullets the uncertainty principle sets no practical limit to our measuring procedure, $\Delta x$ in this example being about $10^{-17}$ times the diameter of a nucleus; but, for microscopic objects such as electrons, there are practical limits, $\Delta x$ in this example being about $10^{7}$ times the diameter of an atom.

3-4 PROPERTIES OF MATTER WAVES

In this section we shall derive the uncertainty principle relations by combining the de Broglie-Einstein relations, $p = h/\lambda$ and $E = h\nu$, with simple mathematical properties that are universal to all waves. We begin a development of these properties by calling attention to an apparent paradox.

The velocity of propagation w of a wave with wavelength and frequency $\lambda$ and $\nu$ is given by the familiar relation, which we shall verify later

$$w = \lambda\nu \tag{3-8}$$

Let us evaluate w for a de Broglie wave associated with a particle of momentum p and total energy E. We obtain

$$w = \lambda\nu = \frac{h}{p}\,\frac{E}{h} = \frac{E}{p}$$

Now assume the particle is moving at nonrelativistic velocity v in a region of zero potential energy. (The validity of our conclusions will not be limited by these assumptions.) Evaluating p and E in terms of v and the mass m of the particle, we find

$$w = \frac{E}{p} = \frac{mv^2/2}{mv} = \frac{v}{2} \tag{3-9}$$

This result seems disturbing because it appears that the matter wave would not be able to keep up with the particle whose motion it controls. However, there is really no difficulty, as the following argument shows.

Imagine that a particle is moving along the x axis under the influence of no force because its potential energy has the constant value zero. Moving along that axis is also its associated matter wave. Assume, for the sake of this thought experiment, that we have distributed along the axis a set of (hypothetical) instruments which are capable of measuring the amplitude of the matter wave. At some time, say t = 0, we record the readings of these instruments. The results of the experiment can be presented as a plot of the instantaneous values of the wave, which we designate by the symbol $\Psi(x,t)$, as a function of x at a fixed time t = 0. It is not necessary to know much about matter waves at present to realize that the plot must look qualitatively like the one shown in Figure 3-7. The amplitude of the matter wave must be modulated in such a way that its value is nonzero only over some finite region of space in the vicinity of the particle. This is necessary because the matter wave must somehow be associated in space with the particle whose motion it controls.

Figure 3-7 A de Broglie wave for a particle. (The plot shows $\Psi(x,t)$ versus x at t = 0.)

The matter wave is in the form of a group of waves and, as time passes, the group surely must move along the x axis with the same velocity as the particle. The student may recall, from his study of classical wave motion, that for such a moving group of waves it is necessary to distinguish between the velocity g of the group and the quite different velocity w of the individual oscillations of the waves.
This is encouraging, but of course we must prove that g is equal to the velocity of the particle. To do this, we develop a relation between g and the quantities $\nu$ and $\lambda$ comparable to the relation of (3-8) between w and these two quantities. We start by considering the simplest type of wave motion, a sinusoidal wave of frequency $\nu$ and wavelength $\lambda$, which is of constant unit amplitude from $-\infty$ to $+\infty$, but which is moving with uniform velocity in the direction of increasing x. Such a wave can be represented mathematically by the function

$$\Psi(x,t) = \sin 2\pi\left(\frac{x}{\lambda} - \nu t\right) \tag{3-10a}$$

or, in a more convenient form

$$\Psi(x,t) = \sin 2\pi(\kappa x - \nu t) \qquad \text{where } \kappa \equiv 1/\lambda \tag{3-10b}$$

That this does represent the wave just described can be seen from the following considerations:

1. Holding x fixed at any value, we see that the function oscillates in time sinusoidally with frequency $\nu$ and amplitude one.
2. Holding t fixed, we see that the function has a sinusoidal dependence on x, with wavelength $\lambda$ or reciprocal wavelength $\kappa$.
3. The zeros of the function, which correspond to the nodes of the wave it represents, are found at positions $x_n$ for which

$$2\pi(\kappa x_n - \nu t) = \pi n \qquad n = 0, \pm 1, \pm 2, \ldots$$

or

$$x_n = \frac{n}{2\kappa} + \frac{\nu}{\kappa}\,t$$

Thus these nodes, and in fact all points on the wave, are moving in the direction of increasing x with velocity $w = dx_n/dt$, which is equal to

$$w = \nu/\kappa$$

Note that this is identical with (3-8) since $\kappa = 1/\lambda$.

Next we discuss the case in which the amplitude of the waves is modulated to form a group. We can obtain mathematically one group of waves moving in the direction of increasing x, similar to the group of matter waves pictured in Figure 3-7, by adding together an infinitely large number of waves of the form of (3-10b), each with infinitesimally differing frequencies $\nu$ and reciprocal wavelengths $\kappa$. (We shall soon explain how this happens.) The mathematical techniques become a little involved, however, and for our purposes it will suffice to consider what happens when we add together only two such waves.
Thus we take

$$\Psi(x,t) = \Psi_1(x,t) + \Psi_2(x,t) \tag{3-11}$$

where

$$\Psi_1(x,t) = \sin 2\pi[\kappa x - \nu t]$$

and

$$\Psi_2(x,t) = \sin 2\pi[(\kappa + d\kappa)x - (\nu + d\nu)t]$$

Now

$$\sin A + \sin B = 2\cos[(A - B)/2]\,\sin[(A + B)/2]$$

Applying this to the case at hand, we have

$$\Psi(x,t) = 2\cos 2\pi\left[\frac{d\kappa}{2}x - \frac{d\nu}{2}t\right]\sin 2\pi\left[\frac{(2\kappa + d\kappa)}{2}x - \frac{(2\nu + d\nu)}{2}t\right]$$

Since $d\nu \ll 2\nu$ and $d\kappa \ll 2\kappa$, this is

$$\Psi(x,t) = 2\cos 2\pi\left(\frac{d\kappa}{2}x - \frac{d\nu}{2}t\right)\sin 2\pi(\kappa x - \nu t) \tag{3-12}$$

A plot of $\Psi(x,t)$ as a function of x for a fixed value of t = 0 is shown in Figure 3-8. The second term of $\Psi(x,t)$ is a wave of the same form as (3-10b), but this wave is modulated by the first term so that the oscillations of $\Psi(x,t)$ fall within an envelope of periodically varying amplitude. Two waves of slightly different frequency and reciprocal wavelength alternately interfere and reinforce in such a way as to produce a succession of groups. These groups, and the individual waves which they contain, are both moving in the direction of increasing x. The velocity w of the individual waves can be evaluated by considering the second term of $\Psi(x,t)$, and the velocity g of the groups can be evaluated from the first term. Proceeding as in consideration 3, we find again

$$w = \frac{\nu}{\kappa} \tag{3-13a}$$

and also the new result

$$g = \frac{d\nu/2}{d\kappa/2} = \frac{d\nu}{d\kappa} \tag{3-13b}$$

Figure 3-8 The sum of two sinusoidal waves of slightly different frequencies and reciprocal wavelengths $\kappa$. (The plot shows $\Psi(x,t)$ versus x at t = 0, with the individual waves moving at velocity w and the groups at velocity g.)

It can be shown that, for an infinitely large number of waves that combine to form one moving group, the dependence of the wave velocity w, and the group velocity g, on $\nu$, $\kappa$, and $d\nu/d\kappa$ is exactly the same as for the simple case we have considered. Equations (3-13a) and (3-13b) have general validity.

Finally we are in a position to calculate the group velocity g of the group of matter waves associated with the moving particle.
From the Einstein and de Broglie relations, we have

$$\nu = E/h \qquad \text{and} \qquad \kappa = 1/\lambda = p/h$$

so

$$d\nu = dE/h \qquad \text{and} \qquad d\kappa = dp/h$$

Thus the group velocity is

$$g = d\nu/d\kappa = dE/dp$$

Setting

$$E = \frac{mv^2}{2} \qquad \text{and} \qquad p = mv$$

we obtain

$$\frac{dE}{dp} = \frac{mv\,dv}{m\,dv} = v$$

which gives us the satisfying result that

$$g = v$$

The velocity of the group of matter waves is just equal to the velocity of the particle whose motion they govern, and de Broglie's postulate is internally consistent. The same conclusion is obtained when relativistic expressions for E and p are used in evaluating dE/dp.

Now we shall derive the uncertainty relations by combining the de Broglie-Einstein relations, $p = h/\lambda$ and $E = h\nu$, with properties of groups of waves. First consider a simple limiting case. Let $\lambda$ be the wavelength of a de Broglie wave associated with a particle. We can picture a definite (monochromatic) wavelength in terms of a single sinusoidal wave extending over all values of x, i.e., an infinitely long unmodulated wave like

$$\Psi = A\sin 2\pi(\kappa x - \nu t) \qquad \text{or} \qquad \Psi = A\cos 2\pi(\kappa x - \nu t)$$

If the wavelength has the definite value $\lambda$ there is no uncertainty $\Delta\lambda$ and the associated particle momentum $p = h/\lambda$ is also definite so $\Delta p_x = 0$. In such a wave the amplitude has the constant value A everywhere; it is the same over the entire infinite range of x. Therefore, the probability of finding the particle, which Born tells us is to be related to the amplitude of the wave, is not concentrated in a particular range of x. In other words, the location of the particle is completely unknown. The particle can be anywhere, so that $\Delta x = \infty$. Analogous statements are that since $E = h\nu$, and since the frequency is definite, then $\Delta E = 0$. But to be sure that the amplitude of the wave is perfectly constant in time we must observe the wave for an infinite time, so that $\Delta t = \infty$. For this simple case we satisfy $\Delta p_x \Delta x \ge \hbar/2$ and $\Delta E \Delta t \ge \hbar/2$ in the limits $\Delta p_x = 0$, $\Delta x = \infty$, and $\Delta E = 0$, $\Delta t = \infty$.
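The group-velocity result can also be checked numerically. The sketch below is an illustration, not part of the original text: it evaluates g as the ratio of a small frequency difference to the matching small difference in reciprocal wavelength by a central difference, and compares it with the particle velocity for both the nonrelativistic and the relativistic energy-momentum relations. The helper names, the step size, and the sample speeds are assumptions; the constants are those quoted in the book's table.

```python
import math

H = 6.626e-34      # Planck's constant, joule-sec
C = 2.998e8        # speed of light in vacuum, m/sec
M_E = 9.109e-31    # electron rest mass, kg

def energy_nonrel(p, m):
    """Kinetic energy p^2 / 2m of a free nonrelativistic particle."""
    return p * p / (2.0 * m)

def energy_rel(p, m):
    """Relativistic total energy sqrt((pc)^2 + (mc^2)^2)."""
    return math.sqrt((p * C) ** 2 + (m * C ** 2) ** 2)

def group_velocity(energy, p, m, rel_step=1e-6):
    """g = d(nu)/d(kappa), with nu = E/h and kappa = p/h (central difference)."""
    dp = p * rel_step
    d_nu = (energy(p + dp, m) - energy(p - dp, m)) / H
    d_kappa = 2.0 * dp / H
    return d_nu / d_kappa

def wave_velocity(energy, p, m):
    """w = nu / kappa = E / p."""
    return energy(p, m) / p

# Nonrelativistic electron with an assumed speed of 1e6 m/sec.
v_slow = 1.0e6
p_slow = M_E * v_slow
print("nonrelativistic: g/v =", group_velocity(energy_nonrel, p_slow, M_E) / v_slow,
      " w/v =", wave_velocity(energy_nonrel, p_slow, M_E) / v_slow)

# Relativistic electron with an assumed speed of 0.6 c.
v_fast = 0.6 * C
gamma = 1.0 / math.sqrt(1.0 - (v_fast / C) ** 2)
p_fast = gamma * M_E * v_fast
print("relativistic:    g/v =", group_velocity(energy_rel, p_fast, M_E) / v_fast)
```

In both cases the ratio g/v comes out equal to one to within the accuracy of the finite difference, while the nonrelativistic wave velocity w is v/2 as in (3-9).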
In order to have a wave whose amplitude varies with x or t, we must superpose several monochromatic waves of different wavelengths or frequencies. For two such waves superposed we obtain the familiar phenomenon of beats, as we have seen earlier in this section, with the amplitude being modulated in a regular way throughout space or time. If we wish to construct a wave having a finite extent in space (a single group with a definite beginning and end), then we must superpose sinusoidal waves having a continuous spectrum of wavelengths with a range $\Delta\lambda$. The amplitude of such a group will be zero everywhere outside a region of extent $\Delta x$.

To help visualize this, consider first a case in which we superpose a finite number of sinusoidal waves of slightly different wavelengths $\lambda$, or reciprocal wavelengths $\kappa$. Figure 3-9 shows seven component sinusoidal waves $\Psi_\kappa = A_\kappa \cos 2\pi(\kappa x - \nu t)$, at time t = 0. Their reciprocal wavelengths $\kappa = 1/\lambda$ take on integral values from $\kappa = 9$ to $\kappa = 15$. The amplitude of each is given by $A_\kappa$, with $A_{12} = 1$, $A_{13} = A_{11} = 1/2$, $A_{14} = A_{10} = 1/3$, and $A_{15} = A_9 = 1/4$, as shown in the figure. All the waves are in phase at x = 0 where they are centered (this is why cosines are used), but they get out of phase with one another proceeding in either direction from that point. As a result, their sum $\Psi = \Psi_9 + \cdots + \Psi_{15}$ oscillates with maximum amplitude at x = 0, but its oscillations die out with increasing or decreasing x as the phase relations of the component waves get scrambled. The superposition thus contains a group whose extent in space $\Delta x$ has a value that can be read from the figure to be slightly larger than 1/12, if we adopt the usual convention and measure from maximum amplitude to half-maximum amplitude. With an analogous convention, the range of reciprocal wavelengths used to compose the group, $\Delta\kappa$, has a value of 1. Note that the approximate value of the product $\Delta x \Delta\kappa$ equals 1/12.
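The seven-wave construction just described is easy to reproduce. The sketch below is illustrative only: it sums the seven cosines at t = 0 with the amplitudes given in the text and locates the point where the slowly varying envelope first falls to half its peak value. The function names and the scanning step are assumptions, and the envelope half-width found this way need not coincide exactly with the value read from the oscillating curve in the figure.

```python
import math

# Amplitudes A_kappa for the reciprocal wavelengths kappa = 9 ... 15 (from the text).
AMPLITUDES = {9: 1/4, 10: 1/3, 11: 1/2, 12: 1.0, 13: 1/2, 14: 1/3, 15: 1/4}
KAPPA_CENTER = 12

def psi(x):
    """Sum of the seven component waves A_kappa cos(2 pi kappa x) at t = 0."""
    return sum(a * math.cos(2.0 * math.pi * k * x) for k, a in AMPLITUDES.items())

def envelope(x):
    """Slowly varying factor: psi(x) = envelope(x) * cos(2 pi * 12 * x).

    The factorization holds because the amplitudes are symmetric about kappa = 12.
    """
    return sum(a * math.cos(2.0 * math.pi * (k - KAPPA_CENTER) * x)
               for k, a in AMPLITUDES.items())

def half_width(step=1e-5):
    """Distance from x = 0 to where the envelope first drops to half its peak."""
    peak = envelope(0.0)
    x = 0.0
    while envelope(x) > peak / 2.0:
        x += step
    return x

print("peak amplitude psi(0)      =", psi(0.0))
print("envelope half-width        =", half_width())
print("half-width in units of 1/12 =", half_width() * 12.0)
print("psi repeats with period 1:  ", abs(psi(1.0) - psi(0.0)) < 1e-9)
```

Because every reciprocal wavelength here is an integer, the whole pattern repeats with period 1 along x, which is the origin of the auxiliary groups discussed next.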
Indicated on the right edge of the figure is the presence of an auxiliary group, of the same shape as the central group. Auxiliary groups are formed at uniformly spaced intervals along the positive and negative $x$ axis. They occur because, with only a finite number of component waves, there are points on the axis separated from $x = 0$ by a distance which is exactly some different integral number of wavelengths for each component. At these points the components are in phase again, and so the group is repeated. If the number of component waves spanning a fixed range $\Delta\kappa$ of reciprocal wavelengths is doubled, the width of the central group will be essentially unchanged but the distances separating it from the auxiliary groups will be doubled.

If we combine an infinitely large number of sinusoidal component waves, each with infinitesimally different reciprocal wavelength drawn from the same range $\kappa = 9$ to 15, we obtain a central group quite similar to the one shown in Figure 3-9, but the auxiliary groups will not be present. The reason is that in such a case there is no length of the $x$ axis into which an exactly integral number of wavelengths fits for every one of the infinite number of components. The components are all in phase at and near $x = 0$, and so they combine constructively to form the group. Proceeding away from this point, in either direction, the component waves begin to get out of phase with each other because their wavelengths or reciprocal wavelengths differ. Beyond certain points the phases of the infinite number of components become completely random, and so the component waves sum up to zero. Furthermore, they never again get back into phase. Thus the components form one group of restricted length $\Delta x$.
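The statement that adding components within the same range pushes the auxiliary groups farther away can be illustrated with a short calculation. This is a sketch under stated assumptions, not the construction used for the figure: the components are spaced uniformly from $\kappa = 9$ to 15, and the amplitude rule $A_\kappa = 1/(1 + |\kappa - 12|)$ is a hypothetical interpolation chosen only because it reproduces the values 1, 1/2, 1/3, 1/4 at integral $\kappa$.

```python
import math


def components(spacing):
    """(kappa, amplitude) pairs from kappa = 9 to 15 in steps of `spacing`.

    The amplitude rule 1 / (1 + |kappa - 12|) is an assumption of this sketch.
    """
    n = round(6 / spacing)
    kappas = [9 + i * spacing for i in range(n + 1)]
    return [(k, 1.0 / (1.0 + abs(k - 12))) for k in kappas]


def psi(x, comps):
    """Superposition of the cosine components at t = 0."""
    return sum(a * math.cos(2 * math.pi * k * x) for k, a in comps)


def first_repeat(spacing, x_max=5.0, steps_per_unit=100):
    """First x > 0 on the grid where all components are back in phase."""
    comps = components(spacing)
    peak = psi(0.0, comps)
    for n in range(1, int(x_max * steps_per_unit) + 1):
        x = n / steps_per_unit
        if psi(x, comps) >= peak * (1 - 1e-9):
            return x
    return None


if __name__ == "__main__":
    print("7 components, spacing 1   :", first_repeat(1.0))
    print("13 components, spacing 1/2:", first_repeat(0.5))
```

Halving the spacing (7 components become 13) moves the first auxiliary group from $x = 1$ to $x = 2$. In the limit of a continuous distribution the repeat distance grows without bound, which is the case described in the paragraph above.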
Figure 3-9 Showing, at $t = 0$, the superposition of seven cosine waves $\Psi_\kappa = A_\kappa \cos 2\pi(\kappa x - \nu t)$ with uniformly spaced reciprocal wavelengths drawn from the range $\kappa = 9$ to $\kappa = 15$. Their amplitudes $A_\kappa$ maximize at the value $A_{12} = 1$ for the wave whose $\kappa$ lies in the center of the range, and they decrease symmetrically through the values 1/2, 1/3, and 1/4 for the other waves as their $\kappa$ approach the ends of the range. The sum $\Psi = \sum_\kappa \Psi_\kappa$ of these waves consists of a group centered on $x = 0$, plus repeating groups of the same shape periodically spaced along the $x$ axis in both directions from $x = 0$. With $\Delta x$ defined as the maximum amplitude to half-maximum amplitude width of $\Psi$, and $\Delta\kappa$ defined as the range of reciprocal wavelengths of the components of $\Psi$ from maximum amplitude to half-maximum amplitude, we have $\Delta x \simeq 1/12$, $\Delta\kappa = 1$, and $\Delta x \Delta\kappa \simeq 1/12$. (Horizontal axis: $x$ in units of 1/12.)

It is clear that the larger the range of reciprocal wavelengths $\Delta\kappa$ from which the components are drawn, the smaller the length $\Delta x$ of the group; the reason is simply that if the wavelengths cover a bigger span the phases will become random in a shorter distance. In fact, $\Delta x$ is just inversely proportional to $\Delta\kappa$. The exact value of the proportionality constant depends on the relative amplitudes of the component waves, as does the exact shape of the group that they form. The mathematics used in carrying out the procedure just described involves the so-called Fourier integral. Appendix D applies the Fourier integral to a simple case, obtaining numerical results that are similar to the results we obtained from the construction in Figure 3-9.
Furthermore, the Fourier integral can be used to prove the following relation

$$\Delta x \, \Delta\kappa \ge 1/4\pi \qquad (3\text{-}14)$$

This relation states that the optimum job that can be done in composing a group of (half-width at half-maximum amplitude) length $\Delta x$ from components with reciprocal wavelengths covering a (half-width at half-maximum amplitude) range of $\Delta\kappa$ yields $\Delta x = 1/4\pi\Delta\kappa$, or $\Delta x \Delta\kappa = 1/4\pi$. Generally a somewhat larger value of this product is obtained.

A group of waves traveling through space of limited extent passes any given point of observation in a limited time. If $\Delta t$ is the duration of the group, or pulse, of waves then it necessarily must be composed from component sinusoidals whose frequencies span a range $\Delta\nu$, where

$$\Delta t \, \Delta\nu \ge 1/4\pi \qquad (3\text{-}15)$$

Thus the frequency of the group is spread over the range $\Delta\nu$ if its duration covers the range $\Delta t$, just as its reciprocal wavelength is uncertain to within $\Delta\kappa$ if its width is $\Delta x$. Equation (3-15) is also obtained from a Fourier integral. It and (3-14) are different expressions of the same property; but the frequency-time relation, or at least some of its implications, may be more familiar to the student, as the following example shows.

Example 3-4. The signal from a television station contains pulses of full-width $\Delta t \simeq 10^{-6}$ sec. Explain why it is not feasible to transmit television in the AM broadcasting band.

• The full-width range of frequencies in the signal is, from (3-15), $\Delta\nu \simeq 1/10^{-6}\ \text{sec} = 10^6\ \text{sec}^{-1} = 10^6$ Hz. Thus the entire broadcast band ($\nu \simeq 0.5 \times 10^6$ Hz to $\nu \simeq 1.5 \times 10^6$ Hz) would be able to accommodate only a single television "channel." There would also be serious difficulties in building transmitters and receivers with such a very large fractional bandpass. At the frequencies used in television transmission ($\nu \simeq 10^8$ Hz) many channels fit into a reasonable portion of the spectrum, and the bandpass requirements are nominal. •

Equations (3-14) and (3-15) are universal properties of all waves. If we apply them to matter waves by combining them with the de Broglie-Einstein relations, we immediately obtain the Heisenberg uncertainty relations. That is, if in

$$\Delta x \, \Delta\kappa = \Delta x \, \Delta(1/\lambda) \ge 1/4\pi$$

we set $p = h/\lambda$ or $1/\lambda = p/h$, we obtain

$$\Delta x \, \Delta(p/h) = (1/h)\,\Delta x \, \Delta p \ge 1/4\pi$$

or

$$\Delta p \, \Delta x \ge \hbar/2 \qquad (3\text{-}16)$$

And if in

$$\Delta t \, \Delta\nu \ge 1/4\pi$$

we set $E = h\nu$ or $\nu = E/h$, we obtain

$$\Delta t \, \Delta(E/h) = (1/h)\,\Delta t \, \Delta E \ge 1/4\pi$$

or

$$\Delta E \, \Delta t \ge \hbar/2 \qquad (3\text{-}17)$$

These results agree with our original statements of the relations in (3-5) and (3-6).

To summarize, we have seen that physical measurement necessarily involves interaction between the observer and the system being observed. Matter and radiation are the entities available to us for such measurements. The relations $p = h/\lambda$ and $E = h\nu$ apply to matter and to radiation, being the expression of the wave-particle duality. When we combine these relations with the properties universal to all waves we obtain the uncertainty relations. Hence, the uncertainty principle is a necessary consequence of this duality, that is, of the de Broglie-Einstein relations, and the uncertainty principle itself is the basis for the Heisenberg-Bohr contention that probability is fundamental to quantum physics.

Example 3-5. An atom can radiate at any time after it is excited. It is found that in a typical case the average excited atom has a life-time of about $10^{-8}$ sec. That is, during this period it emits a photon and is deexcited.

(a) What is the minimum uncertainty $\Delta\nu$ in the frequency of the photon?

• From (3-15) we have $\Delta\nu \, \Delta t \ge 1/4\pi$ or $\Delta\nu \ge 1/4\pi\Delta t$. With $\Delta t = 10^{-8}$ sec we obtain $\Delta\nu \ge 8 \times 10^6\ \text{sec}^{-1}$. •

(b) Most photons from sodium atoms are in two spectral lines at about $\lambda = 5890$ Å. What is the fractional width of either line, $\Delta\nu/\nu$?

• For $\lambda = 5890$ Å, we obtain $\nu = c/\lambda = 3 \times 10^{10}\ \text{cm-sec}^{-1}/5890 \times 10^{-8}\ \text{cm} = 5.1 \times 10^{14}\ \text{sec}^{-1}$.
Hence $\Delta\nu/\nu = 8 \times 10^6\ \text{sec}^{-1}/5.1 \times 10^{14}\ \text{sec}^{-1} = 1.6 \times 10^{-8}$, or about two parts in 100 million. This is the so-called natural width of the spectral line. The line is much broader in practice because of the Doppler broadening and pressure broadening due to the motions and collisions of atoms in the source. •

(c) Calculate the uncertainty $\Delta E$ in the energy of the excited state of the atom.

• The energy of the excited state is not precisely measurable because only a finite time is available to make the measurement. That is, the atom does not stay in an excited state for an indefinite time but decays to its lowest energy state, emitting a photon in the process. The spread in energy of the photon equals the spread in energy of the excited state of the atom in accordance with the energy conservation principle. From (3-17), with $\Delta t$ equal to the mean life-time of the excited state, we have

$$\Delta E \ge \frac{\hbar/2}{\Delta t} = \frac{h}{4\pi\Delta t} = \frac{6.63 \times 10^{-34}\ \text{joule-sec}}{4\pi \times 10^{-8}\ \text{sec}} = \frac{4.14 \times 10^{-15}\ \text{eV-sec}}{4\pi \times 10^{-8}\ \text{sec}} \simeq 3.3 \times 10^{-8}\ \text{eV}$$

This agrees, of course, with the value obtained from part (a) by multiplying the uncertainty in photon frequency $\Delta\nu$ by $h$ to obtain $\Delta E = h\Delta\nu$. The energy spread of an excited state is usually called the width of the state. •

(d) From the previous results determine, to within an accuracy $\Delta E$, the energy $E$ of the excited state of a sodium atom, relative to its lowest energy state, that emits a photon whose wavelength is centered at 5890 Å.

• We have $\Delta\nu/\nu = h\Delta\nu/h\nu = \Delta E/E$. Hence, $E = \Delta E/(\Delta\nu/\nu) = 3.3 \times 10^{-8}\ \text{eV}/1.6 \times 10^{-8} = 2.1$ eV, in which we have used the results of the calculations in parts (b) and (c). •

Figure 3-10 Measurement of the $y$ coordinate of an electron in a broad parallel beam, by requiring it to pass through a slit. The intensity pattern of the diffracted electron wave is indicated by using the line representing the photographic plate as an axis for a plot of the pattern. (Labels in the figure: incident electron beam, slit, photographic plate.)

Example 3-6. A measurement is made on the $y$ coordinate of an electron, which is a member of a broad parallel beam moving in the $x$ direction, by introducing into the beam a slit of narrow width $\Delta y$. Show that as a result an uncertainty $\Delta p_y$ is introduced in the $y$ component of momentum of the electron, such that $\Delta p_y \Delta y \ge \hbar/2$, as required by the uncertainty principle. Do this by considering the diffraction of the wave associated with the electron.

• In propagating through the apparatus shown in Figure 3-10, the wave will be diffracted by the slit. The angle $\theta$ to the first minimum of the "single-slit" diffraction pattern sketched in the figure is given by $\sin\theta = \lambda/\Delta y$. (This is another example of the general relation $\theta \simeq \lambda/a$ between diffraction angle, wavelength, and characteristic dimension of a diffraction apparatus.) Since the propagation of the wave governs the motion of the associated particle, the diffraction pattern also gives the relative probabilities for the electron to arrive at different locations on the photographic plate. Thus the electron passing through the slit will be deflected through an angle which lies anywhere within a range from about $-\theta$ to $+\theta$. Even though its $y$ momentum was known with great precision to be zero before passing through the slit (because very little was then known about its $y$ position), after passing the slit where the measurement of its $y$ position was made its $y$ momentum can be anywhere within a range from about $-p_y$ to $+p_y$, where $\sin\theta = p_y/p$. So the $y$ momentum of the electron is made uncertain by the $y$ position measurement due to diffraction of the electron wave. The uncertainty is

$$\Delta p_y \simeq p_y = p \sin\theta = p\lambda/\Delta y$$

Using the de Broglie relation $p = h/\lambda$ to connect the momentum of the particle with the wavelength of the wave, we obtain

$$\Delta p_y = h/\Delta y \qquad \text{or} \qquad \Delta p_y \Delta y = h$$

Our result agrees with the limit set by the uncertainty principle.
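To put numbers on the slit argument, the following sketch (an illustration added here, not part of the original example) evaluates the de Broglie wavelength, the angle to the first diffraction minimum, and the resulting spread $\Delta p_y$ for an electron. The 100 eV kinetic energy and the slit widths are assumed values chosen only for illustration, and the nonrelativistic momentum formula is used.

```python
import math

H = 6.626e-34    # Planck's constant, joule-sec
M_E = 9.109e-31  # electron rest mass, kg
EV = 1.602e-19   # joule per eV


def slit_momentum_spread(kinetic_energy_ev, slit_width_m):
    """Return (wavelength, theta in degrees, delta_p_y) for a single slit.

    Uses p = sqrt(2 m K) (nonrelativistic), lambda = h / p,
    sin(theta) = lambda / delta_y, and delta_p_y = p sin(theta).
    """
    p = math.sqrt(2.0 * M_E * kinetic_energy_ev * EV)
    wavelength = H / p
    sin_theta = wavelength / slit_width_m
    if sin_theta > 1.0:
        raise ValueError("slit narrower than one wavelength: no first minimum")
    dp_y = p * sin_theta
    return wavelength, math.degrees(math.asin(sin_theta)), dp_y


if __name__ == "__main__":
    for width in (1e-6, 1e-8, 1e-9):  # assumed slit widths, in meters
        lam, theta, dp_y = slit_momentum_spread(100.0, width)
        print(f"dy = {width:.0e} m  lambda = {lam:.3e} m  "
              f"theta = {theta:.4f} deg  dp_y*dy/h = {dp_y * width / H:.6f}")
```

The product $\Delta p_y \Delta y$ equals $h$ for every slit width, as the algebra requires: narrowing the slit widens the diffraction pattern in exact proportion. Since $h = 4\pi(\hbar/2)$, the product lies comfortably above the lower limit $\hbar/2$.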
Diffraction, which refers to waves, and the uncertainty principle, which refers to particles, provide alternative but equivalent ways of treating this and all similar problems. •

Note that in Example 3-6 the wave associated with a single electron is regarded as being diffracted. The probability that the electron hits some point on the photographic plate is determined by the intensity of the electron wave. If only one electron goes through the apparatus it can hit anywhere except at the zero intensity locations of the diffraction pattern, and it will most likely hit somewhere near the principal maximum. If many electrons go through the apparatus each of their waves is diffracted independently in the same way and their points of arrival on the photographic plate are distributed according to the same pattern. The fact that diffraction phenomena involve interference between different parts of a wave belonging to a single particle, and not interference between waves belonging to different particles, was first shown experimentally by G. I. Taylor for the case of photons and light waves. Using light of such low intensity that the photons were known to be going through a diffraction apparatus one at a time, he obtained, after a very long exposure, a diffraction pattern. Then turning the intensity up to normal levels where many photons were in the apparatus at any time, he obtained the same diffraction pattern. Essentially the same experiment has subsequently been performed for electrons and other material particles.

3-5 SOME CONSEQUENCES OF THE UNCERTAINTY PRINCIPLE

The uncertainty principle allows us to understand why it is possible for radiation, and matter, to have a dual (wave-particle) nature.
If we try experimentally to determine whether radiation is a wave or a particle, for example, we find that an experiment which forces radiation to reveal its wave character strongly suppresses its particle character. If we modify the experiment to bring out the particle character, its wave character is suppressed. We can never bring the wave and the particle view face to face in the same experimental situation. Radiation, and also matter, are like coins that can be made to display either face at will but not both simultaneously. This, of course, is the essence of Bohr's principle of complementarity; the ideas of wave and of particle complement rather than contradict one another.

Consider Young's two-slit interference experiment with light. On the wave picture the original wave front is split into two coherent wave fronts by the slits, and these overlapping wave fronts produce the interference fringes on the screen that are so characteristic of wave phenomena. Suppose now that we replace the screen by a photoelectric surface. Measurements of where the photoelectrons are ejected from the surface yield a pattern corresponding to the double-slit intensity pattern, so the wavelike aspects of the radiation seem to be present. But if the energy and time distributions of the ejected photoelectrons are measured, we obtain evidence which shows that the radiation consists of photons, so the particlelike aspects will seem to be present. If we then think of radiation as photons whose motion is governed by the wave propagation properties of certain associated (de Broglie) waves, we are faced with another apparent paradox. Each photon must pass through either one slit or the other; if this is the case, how can its motion beyond the slits be influenced by the interaction of its associated waves with a slit through which it did not pass?
The fallacy in the paradox lies in the statement that each photon must pass through either one slit or the other. How can we actually determine experimentally whether a photon detected at the screen has gone through the upper or the lower of the two slits? To do this we would have to set up a detector at each slit, but the detector that interacts with the photon at a slit throws it out of the path that it would otherwise follow. We can show from the uncertainty principle that a detector with enough space resolution to determine through which slit the photon passes disturbs its momentum so much that the double-slit interference pattern is destroyed. In other words, if we do prove that each photon actually passes through one slit or the other, we shall no longer obtain the interference pattern. If we wish to observe the interference pattern, we must refrain from disturbing the photons and not try to observe them as particles along their paths to the screen. We can observe either the wave or the particle behavior of radiation; but the uncertainty principle prevents us from observing both together, and so this dual behavior is not really self-contradictory. The same is true of the wave-particle behavior of matter. The uncertainty principle also makes it clear that the mechanics of quantum systems must necessarily be expressed in terms of probabilities. In classical mechanics, if at any instant we know exactly the position and momentum of each particle in an isolated system, then we can predict the exact behavior of the particles of the system for all future time. In quantum mechanics, however, the uncertainty principle shows us that it is impossible to do this for systems involving small distances and momenta because it is impossible to know, with the required accuracy, the instantaneous positions and momenta of the particles. As a result, we shall be able to make predictions only of the probable behavior of these particles. 
Example 3-7. Consider a microscopic particle moving freely along the $x$ axis. Assume that at the instant $t = 0$ the position of the particle is measured and is uncertain by the amount $\Delta x_0$. Calculate the uncertainty in the measured position of the particle at some later time $t$.

• The uncertainty in the momentum of the particle at $t = 0$ is at least

$$\Delta p_x = \hbar/2\Delta x_0$$

Therefore, the velocity of the particle at that instant is uncertain by at least

$$\Delta v_x = \Delta p_x/m = \hbar/2m\Delta x_0$$

and the distance $x$ travelled by the particle in the time $t$ cannot be known more accurately than within

$$\Delta x = t\,\Delta v_x = \hbar t/2m\Delta x_0$$

If by a measurement at $t = 0$ we have localized the particle within the range $\Delta x_0$, then in a measurement of its position at time $t$ the particle could be found anywhere within a range at least as large as $\Delta x$. Note that $\Delta x$ is inversely proportional to $\Delta x_0$, so that the more carefully we localize the particle at the initial instant, the less we shall know about its final position. Also, the uncertainty $\Delta x$ increases linearly with time $t$. This corresponds to a spreading out, as time goes on, of the group of waves associated with the motion of the particle. •

3-6 THE PHILOSOPHY OF QUANTUM THEORY

Although there is agreement by all physicists that quantum theory works in the sense that it predicts results that are in excellent agreement with experiment, there is a growing controversy over its philosophic foundation. Niels Bohr has been the principal architect of the present interpretation, known as the Copenhagen interpretation, of quantum mechanics. His approach is supported by the vast majority of theoretical physicists today. Nevertheless, a sizable body of physicists, not all in agreement with one another, questions the Copenhagen interpretation. The principal critic of this interpretation was Albert Einstein. The Einstein-Bohr debates are a fascinating part of the history of physics. Bohr felt that he had met every challenge that Einstein invented by way of thought experiments intended to refute the uncertainty principle. Einstein finally conceded the logical consistency of the theory and its agreement with the experimental facts, but he remained unconvinced to the end that it represented the ultimate physical reality. "God does not play dice with the universe," he said, referring to the abandonment of strict causality and individual events by quantum theory in favor of a fundamentally statistical interpretation.

Heisenberg has stated the commonly accepted view succinctly: "We have not assumed that the quantum theory, as opposed to the classical theory, is essentially a statistical theory, in the sense that only statistical conclusions can be drawn from exact data .... In the formulation of the causal law, namely, `If we know the present exactly, we can predict the future,' it is not the conclusion, but rather the premise which is false. We cannot know, as a matter of principle, the present in all its details."

Among the critics of the Bohr-Heisenberg view of a fundamental indeterminacy in physics is Louis de Broglie. In a foreword to a book by David Bohm, a young colleague of Einstein's whose attempts at a new theory revived interest in reexamining the philosophic basis of quantum theory, de Broglie writes: "We can reasonably accept that the attitude adopted for nearly 30 years by theoretical quantum physicists is, at least in appearance, the exact counterpart of information which experiment has given us of the atomic world. At the level now reached by research in microphysics it is certain that the methods of measurement do not allow us to determine simultaneously all the magnitudes which would be necessary to obtain a picture of the classical type of corpuscles (this can be deduced from Heisenberg's uncertainty principle), and that the perturbations introduced by the measurement, which are impossible to eliminate, prevent us in general from predicting precisely the result which it will produce and allow only statistical predictions. The construction of purely probabilistic formulae that all theoreticians use today was thus completely justified. However, the majority of them, often under the influence of preconceived ideas derived from positivist doctrine, have thought that they could go further and assert that the uncertain and incomplete character of the knowledge that experiment at its present stage gives us about what really happens in microphysics is the result of a real indeterminacy of the physical states and of their evolution. Such an extrapolation does not appear in any way to be justified. It is possible that looking into the future to a deeper level of physical reality we will be able to interpret the laws of probability and quantum physics as being the statistical results of the development of completely determined values of variables which are at present hidden from us. It may be that the powerful means we are beginning to use to break up the structure of the nucleus and to make new particles appear will give us one day a direct knowledge which we do not now have at this deeper level. To try to stop all attempts to pass beyond the present viewpoint of quantum physics could be very dangerous for the progress of science and would furthermore be contrary to the lessons we may learn from the history of science.
This teaches us, in effect, that the actual state of our knowledge is always provisional and that there must be, beyond what is actually known, immense new regions to discover." (From Causality and Chance in Modern Physics by David Bohm, © 1957 D. Bohm; reprinted by permission of D. Van Nostrand Co.)

The student should notice here the acceptance of the correctness of quantum mechanics at the atomic and nuclear level. The search for a deeper level, where quantum mechanics might be superseded, is motivated much more by objection to its philosophic indeterminism than by other considerations. According to Einstein, "The belief in an external world independent of the perceiving subject is the basis of all natural science." Quantum mechanics, however, regards the interactions of object and observer as the ultimate reality. It uses the language of physical relations and processes rather than that of physical qualities and properties. It rejects as meaningless and useless the notion that behind the universe of our perception there lies a hidden objective world ruled by causality; instead it confines itself to the description of the relations among perceptions. Nevertheless, there is a reluctance by many to give up attributing objective properties to elementary particles, say, and dealing instead with our subjective knowledge of them, and this motivates their search for a new theory. According to de Broglie, such a search is in the interest of science. Whether it will lead to a new theory that in some currently unexplored realm contradicts quantum theory and also alters its philosophic foundations, no one knows.

QUESTIONS

1. Why is the wave nature of matter not more apparent to us in our daily observations?
2. Does the de Broglie wavelength apply only to "elementary particles" such as an electron or neutron, or does it apply as well to compound systems of matter having internal structure? Give examples.
3.
If, in the de Broglie formula, we let $m \to \infty$, do we get the classical result for macroscopic particles?
4. Can the de Broglie wavelength of a particle be smaller than a linear dimension of the particle? Larger? Is there necessarily any relation between such quantities?
5. Is the frequency of a de Broglie wave given by $E/h$? Is the velocity given by $\lambda\nu$? Is the velocity equal to $c$? Explain.
6. Can we measure the frequency $\nu$ for de Broglie waves? If so, how?
7. How can electron diffraction be used to study properties of the surface of a solid?
8. How do we account for regularly reflected beams in diffraction experiments with electrons and atoms?
9. Does the Bragg formula have to be modified for electrons to account for the refraction of electron waves at the crystal surface?
10. Do electron diffraction experiments give different information about crystals than can be obtained from x-ray diffraction experiments? From neutron diffraction experiments? Discuss.
11. Could crystallographic studies be carried out with protons? With neutrons?
12. Discuss the analogy: physical optics is to geometrical optics as wave mechanics is to classical mechanics.
13. Is an electron a particle? Is it a wave? Explain.
14. Does the de Broglie wavelength associated with a particle depend on the motion of the reference frame of the observer? What effect does this have on the wave-particle duality?
15. Give examples of how the process of measurement disturbs the system being measured.
16. Show the relation between the uncontrollable nature of the Compton recoil in Bohr's $\gamma$-ray microscope experiment and the fact that there are four unknowns and only three conservation equations in the Compton effect.
17. The uncertainty principle is sometimes stated in terms of angular quantities as $\Delta L_\varphi \Delta\varphi \ge \hbar/2$ where $\Delta L_\varphi$ is the uncertainty in a component of angular momentum and $\Delta\varphi$ is the uncertainty in the corresponding angular position. In some quantum mechanical systems the angular momentum is measured to have a definite (quantized) magnitude. Does this contradict this statement of the uncertainty principle?
18. Argue from the Heisenberg uncertainty principle that the lowest energy of an oscillator cannot be zero.
19. Discuss similarities and differences between a matter wave and an electromagnetic wave.
20. Explain qualitatively the results of Example 3-7 that the uncertainty in position of a particle increases the more accurately we localize the particle initially and that the uncertainty increases with time.
21. Does the fact that interference occurs between various parts of the wave associated with a single particle (as in the G. I. Taylor experiments) simplify or complicate quantum physics?
22. Games of chance contain events which are ruled by statistics. Do such games violate the strict determination of individual events? Do they violate cause and effect?
23. According to operational philosophy, if we cannot prescribe a feasible operation for determining a physical quantity, the quantity should be given up as having no physical reality. What are the merits and drawbacks of this point of view in your opinion?
24. Bohm and de Broglie suggest that there may be hidden variables at a level deeper than quantum theory which are strictly determined. Draw an analogy to the relation between statistical mechanics and Newton's law of motion.
25. In your opinion is there an objective physical reality independent of our subjective sense impressions? How is this question answered by defenders of the Copenhagen interpretation? By critics of the Copenhagen interpretation?
26. Are our concepts limited in principle by our everyday experiences or is this only our conceptual starting point? How is this question related to a resolution of the wave-particle duality?

PROBLEMS

1. A bullet of mass 40 g travels at 1000 m/sec. (a) What wavelength can we associate with it? (b) Why does the wave nature of the bullet not reveal itself through diffraction effects?
2. The wavelength of the yellow spectral emission of sodium is 5890 Å. At what kinetic energy would an electron have the same de Broglie wavelength?
3. An electron and a photon each have a wavelength of 2.0 Å. What are their (a) momenta and (b) total energies? (c) Compare the kinetic energies of the electron and the photon.
4. A nonrelativistic particle is moving three times as fast as an electron. The ratio of their de Broglie wavelengths, particle to electron, is $1.813 \times 10^{-4}$. Identify the particle.
5. A thermal neutron has a kinetic energy $(3/2)kT$ where $T$ is room temperature, 300°K. Such neutrons are in thermal equilibrium with normal surroundings. (a) What is the energy in electron volts of a thermal neutron? (b) What is its de Broglie wavelength?
6. A particle moving with kinetic energy equal to its rest energy has a de Broglie wavelength of $1.7898 \times 10^{-6}$ Å. If the kinetic energy doubles, what is the new de Broglie wavelength?
7. (a) Show that the de Broglie wavelength of a particle, of charge $e$, rest mass $m_0$, moving at relativistic speeds is given as a function of the accelerating potential $V$ as
$$\lambda = \frac{h}{\sqrt{2m_0 eV}}\left(1 + \frac{eV}{2m_0c^2}\right)^{-1/2}$$
(b) Show how this agrees with $\lambda = h/p$ in the nonrelativistic limit.
8. Show that for a relativistic particle of rest energy $E_0$, the de Broglie wavelength in Å is given by
$$\lambda = \frac{1.24 \times 10^{-2}}{E_0(\text{MeV})}\,\frac{(1 - \beta^2)^{1/2}}{\beta}$$
where $\beta = v/c$.
9. Determine at what energy, in electron volts, the nonrelativistic expression for the de Broglie wavelength will be in error by 1% for (a) an electron and (b) a neutron. (Hint: See Problem 7.)
10. (a) Show that for a nonrelativistic particle, a small change in speed leads to a change in de Broglie wavelength given from
$$\frac{\Delta\lambda}{\lambda_0} = \frac{\Delta v}{v_0}$$
(b) Derive an analogous formula for a relativistic particle.
11. The 50-GeV (i.e., $50 \times 10^9$ eV) electron accelerator at Stanford University provides an electron beam of very short wavelength, suitable for probing the fine details of nuclear structure by scattering experiments. What is this wavelength and how does it compare to the size of an average nucleus? (Hint: At these energies it is simpler to use the extreme relativistic relationship between momentum and energy, namely $p = E/c$. This is the same relationship used for photons, and it is justified whenever the kinetic energy of a particle is very much greater than its rest energy $m_0c^2$, as in this case.)
12. Make a plot of de Broglie wavelength against kinetic energy for (a) electrons and (b) protons. Restrict the range of energy values to those in which classical mechanics applies reasonably well. A convenient criterion is that the maximum kinetic energy on each plot be only about, say, 5% of the rest energy $m_0c^2$ for the particular particle.
13. In the experiment of Davisson and Germer, (a) show that the second- and third-order diffracted beams, corresponding to the strong first maximum of Figure 3-2, cannot occur and (b) find the angle at which the first-order diffracted beam would occur if the accelerating potential were changed from 54 to 60 V. (c) What accelerating potential is needed to produce a second-order diffracted beam at 50°?
14. Consider a crystal with the atoms arranged in a cubic array, each atom a distance 0.91 Å from its nearest neighbor. Examine the conditions for Bragg reflection from atomic planes connecting diagonally placed atoms. (a) Find the longest wavelength electrons that can produce a first-order maximum. (b) If 300 eV electrons are used, at what angle from the crystal normal must they be incident to produce a first-order maximum?
15. What is the wavelength of a hydrogen atom moving with a velocity corresponding to the mean kinetic energy for thermal equilibrium at 20°C?
16. The principal planar spacing in a potassium chloride crystal is 3.14 Å. Compare the angle for first-order Bragg reflection from these planes of electrons of kinetic energy 40 keV to that of 40 keV photons.
17. Electrons incident on a crystal suffer refraction due to an attractive potential of about 15 V that crystals present to electrons (due to the ions in the crystal lattice). If the angle of incidence of an electron beam is 45° and the electrons have an incident energy of 100 eV, what is the angle of refraction?
18. What accelerating voltage would be required for electrons in an electron microscope to obtain the same ultimate resolving power as that which could be obtained from a "$\gamma$-ray microscope" using 0.2 MeV $\gamma$ rays?
19. The highest achievable resolving power of a microscope is limited only by the wavelength used; that is, the smallest detail that can be separated is about equal to the wavelength. Suppose we wish to "see" inside an atom. Assuming the atom to have a diameter of 1.0 Å, this means that we wish to resolve detail of separation about 0.1 Å. (a) If an electron microscope is used, what minimum energy of electrons is needed? (b) If a photon microscope is used, what energy of photons is needed? In what region of the electromagnetic spectrum are these photons? (c) Which microscope seems more practical for this purpose? Explain.
20. Show that for a free particle the uncertainty relation can also be written as $\Delta\lambda\,\Delta x \ge \lambda^2/4\pi$ where $\Delta x$ is the uncertainty in location of the wave and $\Delta\lambda$ the simultaneous uncertainty in wavelength.
21. If $\Delta\lambda/\lambda = 10^{-7}$ for a photon, what is the simultaneous value of $\Delta x$ for (a) $\lambda = 5.00 \times 10^{-4}$ Å ($\gamma$ ray)? (b) $\lambda = 5.00$ Å (x ray)? (c) $\lambda = 5000$ Å (light)?
22. In a repetition of Thomson's experiment for measuring $e/m$ for the electron, a beam of $10^4$ eV electrons is collimated by passage through a slit of width 0.50 mm. Why is the beamlike character of the emergent electrons not destroyed by diffraction of the electron wave at this slit?
23. A 1 MeV electron leaves a track in a cloud chamber. The track is a series of water droplets each about $10^{-5}$ m in diameter. Show, from the ratio of the uncertainty in transverse momentum to the momentum of the electron, that the electron path should not noticeably differ from a straight line.
24. Show that if the uncertainty in the location of a particle is about equal to its de Broglie wavelength, then the uncertainty in its velocity is about equal to one tenth its velocity.
25. (a) Show that the smallest possible uncertainty in the position of an electron whose speed is given by $\beta = v/c$ is
$$\Delta x_{\min} = \frac{h}{4\pi m_0 c}(1 - \beta^2)^{1/2} = \frac{\lambda_C}{4\pi}\sqrt{1 - \beta^2}$$
where $\lambda_C$ is the Compton wavelength $h/m_0c$. (b) What is the meaning of this equation for $\beta = 0$? For $\beta = 1$?
26. A microscope using photons is employed to locate an electron in an atom to within a distance of 0.2 Å. What is the uncertainty in the velocity of the electron located in this way?
27. The velocity of a positron is measured to be: $v_x = (4.00 \pm 0.18) \times 10^5$ m/sec, $v_y = (0.34 \pm 0.12) \times 10^5$ m/sec, $v_z = (1.41 \pm 0.08) \times 10^5$ m/sec. Within what minimum volume was the positron located at the moment the measurement was carried out?
28. (a) Consider an electron whose position is somewhere in an atom of diameter 1 Å. What is the uncertainty in the electron's momentum? Is this consistent with the binding energy of electrons in atoms? (b) Imagine an electron to be somewhere in a nucleus of diameter $10^{-12}$ cm. What is the uncertainty in the electron's momentum? Is this consistent with the binding energy of nuclear constituents? (c) Consider now a neutron, or a proton, to be in such a nucleus. What is the uncertainty in the neutron's, or proton's, momentum? Is this consistent with the binding energy of nuclear constituents?
29. The lifetime of an excited state of a nucleus is usually about $10^{-12}$ sec. What is the uncertainty in energy of the $\gamma$-ray photon emitted?
30. An atom in an excited state has a lifetime of $1.2 \times 10^{-8}$ sec; in a second excited state the lifetime is $2.3 \times 10^{-8}$ sec. What is the uncertainty in energy for the photon emitted when an electron makes a transition between these two levels?
31. Use relativistic expressions for total energy and momentum to verify that the group velocity $g$ of a matter wave equals the velocity $v$ of the associated particle.
32. The energy of a linear harmonic oscillator is $E = p^2/2m + Cx^2/2$. (a) Show, using the uncertainty relation, that this can be written as
$$E = \frac{h^2}{32\pi^2 m x^2} + \frac{Cx^2}{2}$$
(b) Then show that the minimum energy of the oscillator is $h\nu/2$ where
$$\nu = \frac{1}{2\pi}\sqrt{\frac{C}{m}}$$
is the oscillatory frequency. (Hint: This result depends on the $\Delta x \Delta p_x$ product achieving its limiting value $\hbar/2$. Find $E$ in terms of $\Delta x$ or $\Delta p_x$ as in part (a), then minimize $E$ with respect to $\Delta x$ or $\Delta p_x$ in part (b). Note that classically the minimum energy would be zero.)
33. A TV tube manufacturer is attempting to improve the picture resolution, while keeping costs down, by designing an electron gun that produces an electron beam which will make the smallest possible spot on the face of the tube, using only an electron emitting cathode followed by a system of two well-spaced apertures. (a) Show that there is an optimum diameter for the second aperture. (b) Using reasonable TV tube parameters, estimate the minimum possible spot size.
34. A boy on top of a ladder of height $H$ is dropping marbles of mass $m$ to the floor and trying to hit a crack in the floor. To aim, he is using equipment of the highest possible precision. (a) Show that the marbles will miss the crack by an average distance of the order of $(\hbar/m)^{1/2}(H/g)^{1/4}$, where $g$ is the acceleration due to gravity. (b) Using reasonable values of $H$ and $m$, evaluate this distance.
35.
Show that in order to be able to determine through which slit of a double-slit system each photon passes without destroying the double-slit diffraction pattern, the condition AyAp y « h/2 must be satisfied. Since this condition violates the uncertainty principle, it cannot be met. 4 BOHR'S MODEL OF THE ATOM 4 1 - THOMSON'S MODEL 86 properties of model; a particles; multiple scattering; Geiger-Marsden experiment; failure of model 4 2 - RUTHERFORD'S MODEL 90 nuclei; cc-particle trajectories; impact parameter and distance of closest approach; Rutherford's calculation; comparison with Geiger-Marsden experiment; nuclear radii; definition of differential cross section; solid angle; Rutherford scattering cross section 4-3 THE STABILITY OF THE NUCLEAR ATOM 95 radiation by an accelerated classical charged body 4 4 - ATOMIC SPECTRA 96 line spectra; hydrogen series; Balmer formula; Rydberg constant; alkali series; absorption spectra 4 5 - BOHR'S POSTULATES 98 statement of postulates; orbital angular momentum quantization; appraisal 46 - BOHR'S MODEL 100 Bohr's calculation; orbit radii; one-electron atom energy quantization; comparison with Balmer formula; singly ionized helium 47 - CORRECTION FOR FINITE NUCLEAR MASS 105 reduced mass; Rydberg constant evaluation; positronium; deuterium; muonic atom 48 - ATOMIC ENERGY STATES 107 Franck Hertz experiment; ionization energy; continuum states - 4 9 - INTERPRETATION OF THE QUANTIZATION RULES 110 Wilson- Sommerfeld quantization rules; phase space and phase diagrams; simple harmonic oscillator; one-electron atom and de Broglie's interpretation; particle in one-dimensional box 4 10 - SOMMERFELD'S MODEL 114 quantization of elliptical orbits; principal and azimuthal quantum numbers; degeneracy; effect of relativity; hydrogen fine structure; fine-structure constant; selection rules 85 BOHR 'S MOD EL OF THE ATOM 4-11 THE CORRESPONDENCE PRINCIPLE 117 statement of principle; justification; charged simple harmonic oscillator; hydrogen atom 
4-12 A CRITIQUE OF THE OLD QUANTUM THEORY
recapitulation; failures of the old quantum theory; search for a replacement

QUESTIONS

PROBLEMS

4-1 THOMSON'S MODEL

By 1910 experimental evidence had been accumulated which showed that atoms contain electrons (e.g., scattering of x rays by atoms, photoelectric effect, etc.). These experiments also provided an estimate of $Z$, the number of electrons in an atom, and found it to be roughly equal to $A/2$, where $A$ is the chemical atomic weight of the atom in question. Since atoms are normally neutral, they must also contain positive charge equal in magnitude to the negative charge carried by their normal complement of electrons. Thus a neutral atom has a negative charge $-Ze$, where $-e$ is the electron charge, and also a positive charge of the same magnitude. That the mass of an electron is very small compared to the mass of even the lightest atom implies that most of the mass of the atom must be associated with the positive charge.

These considerations naturally led to the question of the distribution of the positive and negative charges within the atom. J. J. Thomson proposed a tentative description, or model, of an atom according to which the negatively charged electrons were located within a continuous distribution of positive charge. The positive charge distribution was assumed to be spherical in shape with a radius of the known order of magnitude of the radius of an atom, $10^{-10}$ m. (This value can be obtained from the density of a typical solid, its atomic weight, and Avogadro's number.) Owing to their mutual repulsion, the electrons would be uniformly distributed through the sphere of positive charge. Figure 4-1 illustrates this "plum pudding" model of the atom. In an atom in its lowest possible energy state, the electrons would be fixed at their equilibrium positions. In excited atoms (e.g., atoms in a material at high temperature), the electrons would vibrate about their equilibrium positions.
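To get a feel for the vibration frequencies such a model implies, here is a minimal numerical sketch for a one-electron Thomson atom. It assumes a total positive charge $+e$ spread uniformly over a sphere of radius $1.0 \times 10^{-10}$ m, for which Gauss's law gives a restoring force $F = -ka$ with $k = e^2/4\pi\epsilon_0 r'^3$. The function name `thomson_oscillation` and the constant names are illustrative choices, not anything defined in the text.

```python
import math

# Constants as quoted in the book's table of useful constants.
E_CHARGE = 1.602e-19        # electron charge magnitude, coul
COULOMB_CONST = 8.988e9     # 1/(4*pi*eps0), nt-m^2/coul^2
ELECTRON_MASS = 9.109e-31   # electron rest mass, kg
LIGHT_SPEED = 2.998e8       # speed of light in vacuum, m/sec


def thomson_oscillation(radius_m):
    """Return (k, frequency, wavelength) for one electron inside a uniform
    sphere of total positive charge +e and the given radius (hypothetical helper)."""
    # Force constant of the linear restoring force F = -k a, in nt/m.
    k = COULOMB_CONST * E_CHARGE**2 / radius_m**3
    # Simple harmonic motion: frequency = (1 / 2 pi) * sqrt(k / m), in 1/sec.
    frequency = math.sqrt(k / ELECTRON_MASS) / (2.0 * math.pi)
    # Radiation of the same frequency has wavelength c / frequency, in m.
    wavelength = LIGHT_SPEED / frequency
    return k, frequency, wavelength


# Assumed radius: 1.0e-10 m, the order of magnitude of an atomic radius.
k, frequency, wavelength = thomson_oscillation(1.0e-10)
print(f"force constant k = {k:.2e} nt/m")
print(f"frequency        = {frequency:.2e} /sec")
print(f"wavelength       = {wavelength:.2e} m")
```

With these inputs the force constant comes out near $2.3 \times 10^{2}$ nt/m, the frequency near $2.5 \times 10^{15}$ /sec, and the wavelength near 1200 Å, which lies in the far ultraviolet. A different assumed radius shifts the single frequency but never produces more than one.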
Since classical electromagnetic theory predicts that an accelerated charged body, such as a vibrating electron, emits electromagnetic radiation, it was possible to understand qualitatively the emission of such radiation by excited atoms on the basis of Thomson's model. Quantitative agreement with experimentally observed spectra was lacking, however.

Figure 4-1 Thomson's model of the atom: a sphere of positive charge embedded with electrons.

Example 4-1. (a) Assume that there is one electron of charge $-e$ inside a spherical region of uniform positive charge density $\rho$ (a Thomson hydrogen atom). Show that its motion, if it has kinetic energy, can be simple harmonic oscillation about the center of the sphere.

Let the electron be displaced to a distance $a$ from the center, with $a$ less than the radius of the sphere. From Gauss's law, we know that we can calculate the force on it by using Coulomb's law

$$F = -\frac{1}{4\pi\epsilon_0}\left(\frac{4}{3}\pi a^3 \rho\right)\frac{e}{a^2} = -\frac{\rho e a}{3\epsilon_0}$$

where $(4/3)\pi a^3 \rho$ is the net positive charge in a sphere of radius $a$. Hence, we can write $F = -ka$, where the constant $k = \rho e/3\epsilon_0$. If the electron at $a$ is freed with no initial velocity, this force will produce simple harmonic motion along a diameter of the sphere, since it is always directed towards the center and has a strength which is proportional to the displacement from the center.

(b) Let the total positive charge have the magnitude of one electron charge (so that the atom has no net charge), and let it be distributed over a sphere of radius $r' = 1.0 \times 10^{-10}$ m. Find the force constant $k$ and the frequency of the motion of the electron.

We have

$$\rho = \frac{e}{\frac{4}{3}\pi r'^3}$$

so that

$$k = \frac{\rho e}{3\epsilon_0} = \frac{e}{\frac{4}{3}\pi r'^3}\,\frac{e}{3\epsilon_0} = \frac{e^2}{4\pi\epsilon_0 r'^3} = \frac{9.0 \times 10^{9}\ \text{nt-m}^2/\text{coul}^2 \times (1.6 \times 10^{-19}\ \text{coul})^2}{(1.0 \times 10^{-10}\ \text{m})^3} = 2.3 \times 10^{2}\ \text{nt/m}$$

The frequency of the simple harmonic motion is then

$$\nu = \frac{1}{2\pi}\sqrt{\frac{k}{m}} = \frac{1}{2\pi}\sqrt{\frac{2.3 \times 10^{2}\ \text{nt/m}}{9.11 \times 10^{-31}\ \text{kg}}} = 2.5 \times 10^{15}\ \text{sec}^{-1}$$

Since (in analogy to radiation emitted by electrons oscillating in an antenna) the radiation emitted by the atom will have this same frequency, it will correspond to a wavelength

$$\lambda = \frac{c}{\nu} = \frac{3.0 \times 10^{8}\ \text{m/sec}}{2.5 \times 10^{15}/\text{sec}} = 1.2 \times 10^{-7}\ \text{m} = 1200\ \text{Å}$$

in the far ultraviolet portion of the electromagnetic spectrum. It is easy to show that an electron moving in a stable circular orbit of any radius inside the Thomson atom revolves at this same frequency, and so it would radiate at this frequency also. Of course, a different assumed radius of the sphere of positive charge would give a different frequency. But the fact that a Thomson hydrogen atom has only one characteristic emission frequency conflicts with the very large number of different frequencies observed in the spectrum of hydrogen.

Conclusive proof of the inadequacy of Thomson's model was obtained in 1911 by Ernest Rutherford, a former student of Thomson's, from the analysis of experiments on the scattering of α particles by atoms. Rutherford's analysis showed that, instead of being spread throughout the atom, the positive charge is concentrated in a very small region, or nucleus, at the center of the atom. This was one of the most important developments in atomic physics and was the foundation of the subject of nuclear physics.

Rutherford had already been awarded the Nobel Prize in 1908 for his "investigations in regard to the decay of elements and ... the chemistry of radioactive substances." He was a talented, hard-working physicist with enormous drive and self-confidence. In a letter written later in life, the then Lord Rutherford wrote, "I've just been reading some of my early papers and, you know, when I'd finished, I said to myself, 'Rutherford, my boy, you used to be a damned clever fellow.'" Though pleased at winning a Nobel Prize, he was not happy that it was a chemistry prize rather than one in physics. (Any research in the elements was then considered chemistry.) In his speech accepting the prize he noted that he had observed many transformations in his work with radioactivity but never had seen one as rapid as his own, from physicist to chemist.

Rutherford already knew α particles to be doubly ionized helium atoms (i.e., He atoms with two electrons removed), emitted spontaneously from several radioactive materials at high speed. In Figure 4-2 we show a typical arrangement that he and his colleagues used to study the scattering of α particles on passing through thin foils of various substances. The radioactive source emits α particles which are collimated into a narrow parallel beam by a pair of diaphragms. The parallel beam is incident upon a foil of some substance, usually a metal. The foil is so thin that the particles pass completely through with only a small decrease in speed. In traversing the foil, however, each α particle experiences many small deflections due to the Coulomb force acting between its charge and the positive and negative charges of the atoms of the foil. Since the deflection of an α particle in passing through a single atom depends on the details of its trajectory through the atom, the net deflection in passing through the entire foil will be different for different α particles in the beam. As a result, the beam emerges from the foil not as a parallel beam but as a divergent beam. A quantitative measure of its divergence is found by measuring the number of α particles scattered into each angular range $\Theta$ to $\Theta + d\Theta$.

Figure 4-2 Arrangement of an α-particle scattering experiment. The region traversed by the α particles is evacuated.

The α-particle detector consisted of a layer of the crystalline compound ZnS and a microscope. The crystal ZnS has the useful property of producing a small flash of light when struck by an α particle. If observed with a microscope, the flash due to the incidence of a single α particle can be distinguished.
In the experiment an observer counts the number of light flashes produced per unit time as a function of the angular position of the detector.

Let $\mathcal{N}$ represent the number of atoms that deflect an α particle in its passage through the foil. If $\theta$ represents the angle of deflection in passing through one atom, as in Figure 4-3, and $\Theta$ is the net deflection in passing through all the atoms in its trajectory through the foil, then statistical theory shows that

$$(\overline{\Theta^2})^{1/2} = \sqrt{\mathcal{N}}\,(\overline{\theta^2})^{1/2} \tag{4-1}$$

Here $(\overline{\Theta^2})^{1/2}$ is the root mean square net deflection, or scattering, angle and $(\overline{\theta^2})^{1/2}$ is the root mean square scattering angle in a deflection from a single atom. The factor $\sqrt{\mathcal{N}}$ comes from the randomness of the deflection; if all deflections were in the same direction, clearly we would obtain $\mathcal{N}$ instead of $\sqrt{\mathcal{N}}$. More generally, statistical theory gives the following angular distribution of the scattered α particles

$$N(\Theta)\,d\Theta = \frac{2I\Theta}{\overline{\Theta^2}}\,e^{-\Theta^2/\overline{\Theta^2}}\,d\Theta \tag{4-2}$$

where $N(\Theta)\,d\Theta$ is the number of α's scattered within the angular range $\Theta$ to $\Theta + d\Theta$, and $I$ is the number of α's passing through the foil.

Figure 4-3 An α particle passing through a Thomson model atom. The angle $\theta$ specifies the deflection of the α particle.

Because electrons have a very small mass compared to the α particle, they can in any case produce only small α-particle deflections; and because the positive charge is distributed over all the volume of the $r' \simeq 10^{-10}$ m radius Thomson atom, it cannot provide a Coulomb repulsion intense enough to produce a large deflection of the α particle. Indeed, using Thomson's model we find that the deflection caused by one atom is $\theta \lesssim 10^{-4}$ rad. This result and (4-1) and (4-2) comprise the α-particle scattering predictions of the Thomson model of the atom. Rutherford and his group tested these predictions.

Example 4-2. (a) In a typical experiment (Geiger and Marsden, 1909), α particles were scattered by a gold foil $10^{-6}$ m thick. The average scattering angle was found to be $(\overline{\Theta^2})^{1/2} \simeq 1° \simeq 2 \times 10^{-2}$ rad. Calculate $(\overline{\theta^2})^{1/2}$.

The number of atoms traversed by the α particle is approximately equal to the thickness of the foil divided by the diameter of the atom. Hence

$$\mathcal{N} \simeq \frac{10^{-6}\ \text{m}}{10^{-10}\ \text{m}} = 10^{4}$$

The average deflection angle in traversing a single atom then, from (4-1), is

$$(\overline{\theta^2})^{1/2} = \frac{(\overline{\Theta^2})^{1/2}}{\sqrt{\mathcal{N}}} \simeq \frac{2 \times 10^{-2}}{10^{2}} \simeq 2 \times 10^{-4}\ \text{rad}$$

not in disagreement with the Thomson atom estimate $\theta \lesssim 10^{-4}$ rad.

(b) More than 99% of the α particles were scattered at angles less than 3°. The measurements, using 1° for $(\overline{\Theta^2})^{1/2}$, were in agreement with (4-2) for $N(\Theta)\,d\Theta$ for angles $\Theta$ in this range; but the angular distribution of the small number of particles scattered at larger angles was in marked disagreement with (4-2). It was found, for example, that the fraction of α's scattered at angles greater than 90°, $N(\Theta > 90°)/I$, was about $10^{-4}$. What does (4-2) predict?

We have

$$\frac{N(\Theta > 90°)}{I} = \frac{\int_{90°}^{180°} N(\Theta)\,d\Theta}{I} \simeq e^{-(90)^2} = 10^{-3500}$$

a strikingly different result from the experimental value of $10^{-4}$. In general the number of scattered α particles was observed to be very much larger than the predicted number for all scattering angles greater than a few degrees.

The existence of a small but nonzero probability for scattering at large angles could not be explained at all in terms of Thomson's model of the atom, which basically involves small angle scattering from many atoms. To scientists accustomed to thinking in terms of this model it came as a great surprise that some α particles were deflected through very large angles, up to 180°. In Rutherford's words: "It was quite the most incredible event that ever happened to me in my life. It was as incredible as if you fired a 15-inch shell at a piece of tissue paper and it came back and hit you."

Experiments using foils of various thicknesses showed that the number of large angle scatterings was proportional to $\mathcal{N}$, the number of atoms traversed by the α particle. This is just the dependence on $\mathcal{N}$ that would arise if there were a small probability that an α particle could be scattered through a large angle in traversing a single atom. That cannot happen in Thomson's model of the atom, and this led Rutherford in 1911 to propose a new model.

4-2 RUTHERFORD'S MODEL

In Rutherford's model of the structure of the atom, all the positive charge of the atom, and consequently essentially all its mass, are assumed to be concentrated in a small region in the center called the nucleus. If the dimensions of the nucleus are small enough, an α particle passing very near it can be scattered by a strong Coulomb repulsion through a large angle in the traversal of a single atom. If, instead of using $r' = 10^{-10}$ m for the radius of the positive charge distribution of the Thomson atom, which leads to a maximum deflection angle $\theta \simeq 10^{-4}$ rad, we ask what the radius $r'$ of a nucleus should be to obtain $\theta \simeq 1$ rad, say, we find $r' = 10^{-14}$ m. This, as we shall see, turns out to be a good estimate of the radius of the atomic nucleus.

Rutherford made a detailed calculation of the angular distribution to be expected for the scattering of α particles from atoms of the type proposed in his model. The calculation was concerned only with scattering at angles greater than several degrees. Hence, scattering due to atomic electrons can be ignored. The scattering is then due to the repulsive Coulomb force acting between the positively charged α particle and the positively charged nucleus.
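The Thomson-model estimates of Example 4-2 can be reproduced with a short sketch. It assumes the inputs quoted there (foil thickness $10^{-6}$ m, atomic diameter $10^{-10}$ m, root mean square net deflection of 1°, or about $2 \times 10^{-2}$ rad) and uses the fact that integrating (4-2) from a minimum angle upward gives $e^{-\Theta_{\min}^2/\overline{\Theta^2}}$; the contribution of the upper limit at 180° is negligible and is dropped. The function names are illustrative. The fraction is computed as a base-10 logarithm because $e^{-8100}$ underflows to zero in ordinary floating point.

```python
import math


def atoms_traversed(foil_thickness_m, atom_diameter_m):
    """Approximate number of atoms an alpha particle passes through."""
    return foil_thickness_m / atom_diameter_m


def single_atom_rms_angle(net_rms_angle, n_atoms):
    """Invert (4-1): rms deflection per atom = net rms deflection / sqrt(N)."""
    return net_rms_angle / math.sqrt(n_atoms)


def log10_fraction_beyond(theta_min, net_rms_angle):
    """log10 of the fraction scattered beyond theta_min according to (4-2).

    Both angles must be in the same units. The integral of (4-2) from
    theta_min upward is exp(-(theta_min / rms)**2); the logarithm is
    returned directly to avoid floating-point underflow.
    """
    return -((theta_min / net_rms_angle) ** 2) * math.log10(math.e)


# Inputs taken from Example 4-2.
n_atoms = atoms_traversed(1.0e-6, 1.0e-10)
theta_single = single_atom_rms_angle(2.0e-2, n_atoms)   # rad
log_fraction = log10_fraction_beyond(90.0, 1.0)         # angles in degrees

print(f"atoms traversed         ~ {n_atoms:.0e}")
print(f"rms deflection per atom ~ {theta_single:.1e} rad")
print(f"log10 of fraction scattered beyond 90 deg ~ {log_fraction:.0f}")
```

The logarithm comes out close to $-3500$, against a measured fraction of about $10^{-4}$, which is the discrepancy that the nuclear model has to account for.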
Furthermore, the calculation considered only the scattering from heavy atoms, to permit the assumption that the mass of the nucleus is so large compared to that of the α particle that the nucleus does not recoil appreciably (remains fixed in space) during the scattering process. It was also assumed that the α particle does not actually penetrate the nuclear region, so that the particle and the nucleus (both assumed to be spherical) act like point charges as far as the Coulomb force is concerned. We shall see later that all these assumptions are quite valid except for the scattering of α particles from the lighter nuclei, and we can correct for the finite nuclear mass in such cases. The calculation, finally, uses nonrelativistic mechanics, since $v/c \simeq 1/20$.

Figure 4-4 illustrates the scattering of an α particle, of charge $+ze$ and mass $M$, in passing near a nucleus of charge $+Ze$. The nucleus is fixed at the origin of the coordinate system. When the particle is very far from the nucleus, the Coulomb force on it is negligible, so that the particle approaches the nucleus along a straight line with constant speed $v$. After the scattering, the particle will move off finally along a straight line again with constant speed $v'$. The position of the particle relative to the nucleus is specified by the radial coordinate $r$ and the polar angle $\varphi$, with the latter measured from an axis drawn parallel to the initial trajectory line. The perpendicular distance from that axis to the line of initial motion is called the impact parameter, specified by $b$. The scattering angle $\theta$ is just the angle between the axis and a line drawn through the origin parallel to the line of final motion; the perpendicular distance between these two lines is $b'$.

Figure 4-4 The hyperbolic Rutherford trajectory, showing the polar coordinates $r$, $\varphi$ and the parameters $b$, $D$. These two parameters completely determine the trajectory, in particular the scattering angle $\theta$ and the distance of closest approach $R$. The nuclear point charge $Ze$ lies at a focus of the branch of the hyperbola.

Example 4-3. Show that $v = v'$ and $b = b'$.

The force acting on the particle, being a Coulomb force, is always in the radial direction. Hence, the angular momentum of the particle about the origin has a constant value, $L$. Specifically then, the initial angular momentum is equal to the final angular momentum, or

$$Mvb = Mv'b' = L$$

Of course, the kinetic energy of the particle does not remain constant during the scattering, but the initial kinetic energy must be equal to the final kinetic energy since the nucleus is assumed to remain stationary. Thus

$$\tfrac{1}{2}Mv^2 = \tfrac{1}{2}Mv'^2$$

Therefore, $v = v'$ and so from the previous equation $b = b'$, as drawn in Figure 4-4.

By a straightforward calculation of classical mechanics, using the repulsive Coulomb force $(1/4\pi\epsilon_0)(zZe^2/r^2)$, we can obtain the following equation for the trajectory of the α particle (see Appendix E for a derivation)

$$\frac{1}{r} = \frac{1}{b}\sin\varphi + \frac{D}{2b^2}(\cos\varphi - 1) \tag{4-3}$$

the equation of a hyperbola in polar coordinates. Here $D$ is a constant, defined by

$$D \equiv \frac{1}{4\pi\epsilon_0}\,\frac{zZe^2}{Mv^2/2} \tag{4-4}$$

It is a convenient parameter equal to the distance of closest approach to the nucleus in a head-on collision ($b = 0$), since $D$ is the distance at which the potential energy $(1/4\pi\epsilon_0)(zZe^2/D)$ is equal to the initial kinetic energy $Mv^2/2$ (simply equate the two and solve for $D$). At this point the particle would come to a stop and then reverse its direction of motion. The scattering angle $\theta$ follows from (4-3) by finding the value of $\varphi$ as $r \to \infty$ and setting $\theta = \pi - \varphi$. In this way we find

$$\cot\frac{\theta}{2} = \frac{2b}{D} \tag{4-5}$$

Example 4-4. Evaluate $R$, the distance of closest approach of the particle to the center of the nucleus (the origin in Figure 4-4).

The radial coordinate $r$ will equal $R$ when the polar angle is $\varphi = (\pi - \theta)/2$. Evaluating (4-3) for this angle, we get

$$\frac{1}{R} = \frac{1}{b}\sin\left(\frac{\pi - \theta}{2}\right) + \frac{D}{2b^2}\left[\cos\left(\frac{\pi - \theta}{2}\right) - 1\right]$$

Now, from (4-5) we can put

$$b = \frac{D}{2}\cot\frac{\theta}{2} = \frac{D}{2\tan(\theta/2)}$$

and, after some manipulation, obtain

$$R = \frac{D}{2}\left[1 + \frac{1}{\cos\left(\frac{\pi - \theta}{2}\right)}\right]$$

or

$$R = \frac{D}{2}\left[1 + \frac{1}{\sin(\theta/2)}\right] \tag{4-6}$$

This result can be checked physically. Note that as $\theta \to \pi$, corresponding to $b = 0$ or a head-on collision, $R \to D$, the distance of closest approach. Also, as $\theta \to 0$, corresponding to no deflection at all, both $b$ and $R$ go to infinity, as would be expected.

From (4-5) we see that, in the scattering of an α particle by a single nucleus, if the impact parameter is in the range $b$ to $b + db$ then the scattering angle is in the range $\theta$ to $\theta + d\theta$, where the relation between $b$ and $\theta$ is given by that equation. This is illustrated in Figure 4-5. The problem of calculating the number $N(\Theta)\,d\Theta$ of α particles scattered into the angular range $\Theta$ to $\Theta + d\Theta$ in traversing the entire foil is therefore equivalent to the problem of calculating the number which are incident, with impact parameter from $b$ to $b + db$, upon the nuclei in the foil. As we show in the following example, the result is

$$N(\Theta)\,d\Theta = \left(\frac{1}{4\pi\epsilon_0}\right)^2\left(\frac{zZe^2}{2Mv^2}\right)^2\frac{I\rho t\,2\pi\sin\Theta\,d\Theta}{\sin^4(\Theta/2)} \tag{4-7}$$

where $I$ is the number of α particles incident on a foil of thickness $t$ cm containing $\rho$ nuclei per cubic centimeter.

Example 4-5. Verify (4-7).

Consider a segment of the foil with a cross-sectional area of 1 cm$^2$, as shown in Figure 4-6. A ring, of inner radius $b$ and outer radius $b + db$, is drawn around an incident axis passing through each nucleus, the area of each ring being $2\pi b\,db$. The number of such rings in this segment of the foil is $\rho t$. The probability that an α particle will pass through one of these rings, $P(b)\,db$, is equal to the total area obscured by the rings, as seen by the incident α particles, divided by the total area of the segment. We assume the foil to be thin enough that we can ignore overlapping of rings from different nuclei.
The process involves single scattering and the probability for appreciable scattering by more than one nucleus is very low. Hence

$$P(b)\,db = \rho t\,2\pi b\,db$$

Figure 4-5 The relation between the impact parameter $b$ and the scattering angle $\theta$. As $b$ increases (less close nuclear approach) the angle $\theta$ decreases (smaller scattering angle). The α particles with impact parameters between $b$ and $b + db$ are scattered into the angular range between $\theta$ and $\theta + d\theta$.

But

$$b = \frac{D}{2}\cot\frac{\theta}{2}$$

so that

$$db = -\frac{D}{2}\,\frac{d\theta/2}{\sin^2(\theta/2)}$$

and

$$b\,db = -\frac{D^2}{8}\,\frac{\cos(\theta/2)\,d\theta}{\sin^3(\theta/2)} = -\frac{D^2}{16}\,\frac{\sin\theta\,d\theta}{\sin^4(\theta/2)}$$

Thus

$$P(b)\,db = -\frac{\pi}{8}\,\rho t D^2\,\frac{\sin\theta\,d\theta}{\sin^4(\theta/2)}$$

But $-P(b)\,db$ is equal to the probability that the incident particles will be scattered into the angular range $\theta$ to $\theta + d\theta$. The minus sign arises from the fact that a decrease in $b$, i.e., $-db$, corresponds to an increase in $\theta$, i.e., $+d\theta$. Using our earlier notation $\Theta$ for the scattering angle in passing through the entire foil, this is

$$\frac{N(\Theta)\,d\Theta}{I} = -P(b)\,db = \frac{\pi}{8}\,\rho t D^2\,\frac{\sin\Theta\,d\Theta}{\sin^4(\Theta/2)}$$

Finally, with $D = (1/4\pi\epsilon_0)\,zZe^2/(Mv^2/2)$, we obtain (4-7).

If we compare the Rutherford atom result, (4-7), to the Thomson atom result, (4-2), we see that although the angular factor decreases rapidly with increasing angle in both, the decrease is very much less rapid for Rutherford's prediction. Large angle scattering is very much more probable in single scattering from a nuclear atom than in multiple small angle scattering from a plum pudding atom. Detailed experimental tests of (4-7) were performed within a few months of its derivation by Geiger and Marsden, with the following results:

1. The angular dependence was tested, using foils of Ag and Au, over the angular range 5° to 150°. Although $N(\Theta)\,d\Theta$ varies by a factor of about $10^5$ over this range, the experimental data remained proportional to the theoretical angular distribution to within a few percent.

2. The quantity $N(\Theta)\,d\Theta$ was found indeed to be proportional to the thickness $t$ of the foil, over a range of about a factor of 10 in thickness, for all the elements investigated.

3. Equation (4-7) predicts that the number of scattered α's will be inversely proportional to the square of their kinetic energy, $Mv^2/2$. This was tested by using α particles from several different radioactive sources, and the predicted energy dependence was confirmed experimentally over an available energy variation of about a factor of 3.

4. Finally, the equation predicts $N(\Theta)\,d\Theta$ to be proportional to $(Ze)^2$, the square of the nuclear charge. At the time $Z$ was not known for the various atoms. Assuming (4-7) to be valid, the experiment was used to determine $Z$, and it was found that $Z$ was equal to the chemical atomic number of the target atoms. This implied that the first atom, H, in the periodic table contains one electron, the second atom, He, contains two electrons, the third atom, Li, contains three, etc., since $Z$ is also the number of electrons in the neutral atom. This result was soon independently confirmed by x-ray techniques that will be discussed in Chapter 9.

Figure 4-6 A beam of α particles incident on a foil of 1 cm$^2$ area and thickness $t$ cm. The rings, which are purely geometrical constructs and not anything physical, are centered on nuclei. Actually there are enormously many more rings than shown and the rings are very much smaller than shown.

Rutherford, his model now confirmed, was able to put limits on the size of the nucleus. The distance of closest approach, $D$, is the smallest value that $R$ takes on, which is $R$ at $\theta = 180°$. Hence

$$R_{180°} = D = \frac{1}{4\pi\epsilon_0}\,\frac{zZe^2}{Mv^2/2}$$

The nuclear radius must be no larger than $D$ because the results are based on the assumption that the force acting on the α particle is always strictly a Coulomb force between two point charges.
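To give a sense of scale for $D$ and for the relations (4-5) and (4-6), the following sketch evaluates them numerically. The inputs are assumptions chosen for illustration only: an α particle ($z = 2$) with a kinetic energy of 5.3 MeV, taken here as a representative value for a natural α emitter, incident on gold ($Z = 79$). The function names are likewise illustrative.

```python
import math

E_CHARGE = 1.602e-19      # electron charge magnitude, coul
COULOMB_CONST = 8.988e9   # 1/(4*pi*eps0), nt-m^2/coul^2
JOULE_PER_MEV = 1.602e-13


def head_on_distance(z, Z, kinetic_energy_joule):
    """D of (4-4): distance at which the Coulomb potential energy
    equals the initial kinetic energy M v^2 / 2."""
    return COULOMB_CONST * z * Z * E_CHARGE**2 / kinetic_energy_joule


def impact_parameter(D, theta):
    """b from (4-5): cot(theta/2) = 2b/D, with theta in radians."""
    return 0.5 * D / math.tan(0.5 * theta)


def closest_approach(D, theta):
    """R from (4-6): R = (D/2) [1 + 1/sin(theta/2)], theta in radians."""
    return 0.5 * D * (1.0 + 1.0 / math.sin(0.5 * theta))


# Assumed inputs: alpha particle (z = 2) of 5.3 MeV on gold (Z = 79).
D = head_on_distance(2, 79, 5.3 * JOULE_PER_MEV)
print(f"D = {D:.2e} m")
for theta_deg in (10.0, 60.0, 150.0, 180.0):
    theta = math.radians(theta_deg)
    print(f"theta = {theta_deg:5.1f} deg: "
          f"b = {impact_parameter(D, theta):.2e} m, "
          f"R = {closest_approach(D, theta):.2e} m")
```

Under these assumptions $D$ comes out at a few times $10^{-14}$ m, and $R$ falls toward $D$ as the scattering angle approaches 180°, so only the large-angle events probe distances comparable to nuclear dimensions.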
This assumption would not be true if the particle penetrated the nuclear region at its distance of closest approach.

The previous equation shows that $R_{180°}$ decreases as $Z$ decreases. The question arises: How much can $R_{180°}$ decrease before $R_{180°}$ is less than the nuclear radius? Departures from the predicted Rutherford scattering were actually observed from the very light (low $Z$) nuclei. Part of this was due to a violation, for the very light nuclei, of the assumption that the nuclear mass is large compared to the α-particle mass; however, deviations remained even after the finite nuclear mass was taken into account in the theory. This suggests that penetration of the nucleus occurs in these cases, thereby altering the predicted scattering. Hence, the nuclear radius can be defined as the value of $R$ at the limiting scattering angle, or limiting incident energy, at which deviations from Rutherford scattering set in. In Figure 4-7, for example, we show data from Rutherford's group for the scattering of α particles, of various energies, at a fixed large angle from an Al foil. The ordinate is the ratio of the observed number of scattered particles to the number predicted by the Rutherford theory (corrected for the finite nuclear mass). The abscissa is the distance of closest approach calculated from (4-6). These data imply that the radius of the Al nucleus is about $10^{-14}$ m = 10 F. (The unit of distance used in nuclear physics is the fermi, which equals $10^{-15}$ m. Note that 1 F = $10^{-5}$ Å, where Å, the angstrom, is the unit used in atomic physics.)

Figure 4-7 Some data obtained in the scattering of α particles from a radioactive source by aluminum. The abscissa is the distance of closest approach to the nuclear center, $R$, in units of $10^{-14}$ m.

The Rutherford scattering formula, (4-7), is usually expressed in terms of a differential cross section $d\sigma/d\Omega$. This quantity is defined so that the number $dN$ of α particles scattered into a solid angle $d\Omega$ at scattering angle $\Theta$ is

$$dN = \frac{d\sigma}{d\Omega}\,In\,d\Omega \tag{4-8}$$

if $I$ α particles are incident on a target foil containing $n$ nuclei per square centimeter. The definition is analogous to the definition of a cross section $\sigma$ in (2-18)

$$N = \sigma In$$

It is illustrated in Figure 4-8. The solid angle $d\Omega$, which is essentially a two-dimensional angular range, is measured numerically by the area which the angular range includes on a sphere of unit radius centered where the scatterings occur. For Rutherford scattering, which is symmetric about the axis of the incident beam, we are interested in the solid angle $d\Omega$ corresponding to all events in which the scattering angle lies in the range $d\Theta$ at $\Theta$. As is shown in the figure

$$d\Omega = 2\pi\sin\Theta\,d\Theta$$

Figure 4-8 Illustrating the definition of the differential cross section $d\sigma/d\Omega$. An incident beam of $I$ particles strikes a target containing $n$ nuclei per cm$^2$; $dN$ particles are emitted into the solid angle $d\Omega = \text{area}/r^2 = 2\pi\sin\Theta\,d\Theta$. If the target is thin enough for an incident particle to have negligible chance of interacting with more than one nucleus while passing through the target, then $dN = (d\sigma/d\Omega)\,In\,d\Omega$.

Using this in (4-7), writing $N(\Theta)\,d\Theta$ in that equation as $dN$, and also writing the term $\rho t$ appearing there as $n$, we immediately obtain

$$dN = \left(\frac{1}{4\pi\epsilon_0}\right)^2\left(\frac{zZe^2}{2Mv^2}\right)^2\frac{1}{\sin^4(\Theta/2)}\,In\,d\Omega$$

Comparison with the definition of (4-8) then shows that the Rutherford scattering differential cross section is

$$\frac{d\sigma}{d\Omega} = \left(\frac{1}{4\pi\epsilon_0}\right)^2\left(\frac{zZe^2}{2Mv^2}\right)^2\frac{1}{\sin^4(\Theta/2)} \tag{4-9}$$

4-3 THE STABILITY OF THE NUCLEAR ATOM

The detailed experimental verification of the predictions of Rutherford's nuclear model of the atom left little room for doubt concerning the validity of the model. At the center of the atom is a nucleus whose mass is approximately that of the entire atom and whose charge is equal to the atomic number $Z$ times $e$; around this nucleus there exist $Z$ electrons, neutralizing the atom as a whole. But serious questions emerge about the stability of such an atom.
If we assume, for example, that the electrons in the atom are stationary, there exists no stable arrangement of the electrons which would prevent the electrons from falling into the nucleus under the influence of its Coulomb attraction. We cannot allow the atom to collapse (back to a nuclear-sized plum pudding) because then its radius would be of the order of a nuclear radius, which is four orders of magnitude smaller than diverse experiments show the radius of the atom to be. At first glance it seems that we can simply allow the electrons to circulate about the nucleus in orbits similar to the orbits of the planets circulating about the sun. Such a system can be stable mechanically, as is the solar system. A serious difficulty arises, however, in trying to carry over this idea from the planetary system to the atomic system. The problem is that the charged electrons would be constantly accelerating in their motion around the nucleus and, according to classical electromagnetic theory, all accelerating charged bodies radiate energy in the form of electromagnetic radiation (see Appendix B). The energy would be emitted at the expense of the mechanical energy of the electron, and the electron would spiral into the nucleus. Again we have an atom which would rapidly collapse to nuclear dimensions. (For an atom of diameter $10^{-10}$ m the time of collapse can be computed to be $10^{-12}$ sec!) Furthermore, the continuous spectrum of the radiation that would be emitted in this process is not in agreement with the discrete spectrum which is known to be emitted by atoms.

This difficult problem of the stability of atoms actually led to a simple model of atomic structure. A key feature of this very successful model, proposed by Niels Bohr in 1913, was the prediction of the spectrum of radiation emitted by certain atoms.
Hence, it is appropriate at this point to describe some of the principal features of such spectra.

4-4 ATOMIC SPECTRA

A typical apparatus used in the measurement of atomic spectra is indicated in Figure 4-9. The source consists of an electric discharge passing through a region containing a monatomic gas. Owing to collisions with electrons, and with each other, some of the atoms in the discharge are put into a state in which their total energy is greater than it is in a normal atom. In returning to their normal energy state, the atoms give up their excess energy by emitting electromagnetic radiation. The radiation is collimated by the slit and then it passes through a prism (or diffraction grating for better resolution) where it is broken up into its wavelength spectrum, which is recorded on the photographic plate.

Figure 4-9 Schematic of an apparatus used to measure atomic spectra.

The nature of the observed spectra is indicated on the photographic plate. In contrast to the continuous spectrum of electromagnetic radiation emitted, for instance, from the surface of solids at high temperature, the electromagnetic radiation emitted by free atoms is concentrated at a number of discrete wavelengths. Each of these wavelength components is called a line because of the line (image of the slit) which it produces on the photographic plate. Investigation of the spectra emitted from different kinds of atoms shows that each kind of atom has its own characteristic spectrum, i.e., a characteristic set of wavelengths at which the lines of the spectrum are found. This feature is of greatest practical importance because it makes spectroscopy a very useful addition to the usual techniques of chemical analysis. Chiefly for this reason much effort was devoted to the accurate measurement of atomic spectra, and, in fact, much effort was needed because the spectra consist of many hundreds of lines and in general are very complicated.

However, the spectrum of hydrogen is relatively simple. This is perhaps not surprising since hydrogen, which contains just one electron, is itself the simplest atom. Most of the universe consists of isolated hydrogen atoms, so that the hydrogen spectrum is of considerable practical interest. There are historical and theoretical reasons as well for studying it, as will become apparent later. Figure 4-10 shows that part of the atomic hydrogen spectrum which falls approximately within the wavelength range of visible light. We see that the spacing, in wavelengths, between adjacent lines of the spectrum continuously decreases with decreasing wavelength of the lines, so that the series of lines converges to the so-called series limit at 3645.6 Å. The short wavelength lines, including the series limit, are hard to observe experimentally because of their close spacing and because they are in the ultraviolet.

Figure 4-10 A photograph of the visible part of the hydrogen spectrum, with the lines labeled by designation (H$_\alpha$, H$_\beta$, H$_\gamma$, ...), wavelength in Å, and color (red, blue, violet, near ultraviolet). (Spectrum from W. Finkelnburg, Structure of Matter, Springer-Verlag, Heidelberg, 1964.)

The obvious regularity of the H spectrum tempted several people to look for an empirical formula which would represent the wavelength of the lines. Such a formula was discovered in 1885 by Balmer. He found that the simple equation

$$\lambda = 3646\, \frac{n^2}{n^2 - 4} \qquad \text{(in Å units)}$$

where n = 3 for H$_\alpha$, n = 4 for H$_\beta$, n = 5 for H$_\gamma$, etc., was able to predict the wavelength of the first nine lines of the series, which were all that were known at the time, to better than one part in 1000. This discovery initiated a search for similar empirical formulas that would apply to series of lines which can sometimes be identified in the complicated distribution of lines that constitute the spectra of other elements. Most of this work was done around 1890 by Rydberg, who found it convenient to deal with the reciprocal of the wavelength of the lines, instead of their wavelength. In terms of reciprocal wavelength κ the Balmer formula can be written

$$\kappa = \frac{1}{\lambda} = R_H\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \qquad n = 3, 4, 5, \ldots \qquad (4\text{-}10)$$

where $R_H$ is the so-called Rydberg constant for hydrogen. From recent spectroscopic data, its value is known to be

$$R_H = 10967757.6 \pm 1.2\ \text{m}^{-1}$$

This indicates the accuracy possible in spectroscopic measurements.

Formulas of this type were found for a number of series. For instance, we now know of the existence of five series of lines in the hydrogen spectrum, as shown in Table 4-1.

Table 4-1 The Hydrogen Series

| Names | Wavelength Ranges | Formulas | Values of n |
|---|---|---|---|
| Lyman | Ultraviolet | $\kappa = R_H\left(\frac{1}{1^2} - \frac{1}{n^2}\right)$ | n = 2, 3, 4, ... |
| Balmer | Near ultraviolet and visible | $\kappa = R_H\left(\frac{1}{2^2} - \frac{1}{n^2}\right)$ | n = 3, 4, 5, ... |
| Paschen | Infrared | $\kappa = R_H\left(\frac{1}{3^2} - \frac{1}{n^2}\right)$ | n = 4, 5, 6, ... |
| Brackett | Infrared | $\kappa = R_H\left(\frac{1}{4^2} - \frac{1}{n^2}\right)$ | n = 5, 6, 7, ... |
| Pfund | Infrared | $\kappa = R_H\left(\frac{1}{5^2} - \frac{1}{n^2}\right)$ | n = 6, 7, 8, ... |

For alkali element atoms (Li, Na, K, ...) the series formulas are of the same general structure. That is

$$\kappa = \frac{1}{\lambda} = R\left[\frac{1}{(m - a)^2} - \frac{1}{(n - b)^2}\right] \qquad (4\text{-}11)$$

where R is the Rydberg constant for the particular element, a and b are constants for the particular series, m is an integer which is fixed for the particular series, and n is a variable integer. To within about 0.05% the Rydberg constant has the same value for all elements, although it does show a very slight systematic increase with increasing atomic weight.

We have been discussing the emission spectrum of an atom. A closely related property is the absorption spectrum. This may be measured with apparatus similar to that shown in Figure 4-9 except that a source emitting a continuous spectrum is used and a glass-walled cell, containing the monatomic gas to be investigated, is inserted somewhere between the source and the prism.
After exposure and development, the photographic plate is found to be darkened everywhere except for a number of unexposed lines. These lines represent a set of discrete wavelength components which were missing from the otherwise continuous spectrum incident upon the prism, and which must have been absorbed by the atoms in the gas cell. It is observed that for every line in the absorption spectrum of an element there is a corresponding (same wavelength) line in its emission spectrum; however, the reverse is not true. Only certain emission lines show up in the absorption spectrum. For hydrogen gas, normally only lines corresponding to the Lyman series appear in the absorption spectrum; but, when the gas is at very high temperatures, e.g., at the surface of a star, lines corresponding to the Balmer series are found.

4-5 BOHR'S POSTULATES

All these features of atomic spectra, and many more which we have not discussed, must be explained by any successful model of atomic structure. Furthermore, the very great precision of spectroscopic measurements imposes severe requirements on the accuracy with which such a model must be able to predict the quantitative features of the spectra.

Nevertheless, in 1913 Niels Bohr developed a model which was in accurate quantitative agreement with certain of the spectroscopic data (e.g., the hydrogen spectrum). It had the additional attraction that the mathematics involved was very easy to understand. Although the student has probably seen something of Bohr's model in studying elementary physics, or chemistry, we shall consider it in detail here in order to obtain various results that we shall want to make comparisons with elsewhere in this book, and also in order to take a careful look at the rather confusing postulates on which the model is based. These postulates are:

1. An electron in an atom moves in a circular orbit about the nucleus under the influence of the Coulomb attraction between the electron and the nucleus, obeying the laws of classical mechanics.
2. Instead of the infinity of orbits which would be possible in classical mechanics, it is only possible for an electron to move in an orbit for which its orbital angular momentum L is an integral multiple of $\hbar$, Planck's constant divided by 2π.
3. Despite the fact that it is constantly accelerating, an electron moving in such an allowed orbit does not radiate electromagnetic energy. Thus, its total energy E remains constant.
4. Electromagnetic radiation is emitted if an electron, initially moving in an orbit of total energy $E_i$, discontinuously changes its motion so that it moves in an orbit of total energy $E_f$. The frequency of the emitted radiation ν is equal to the quantity $(E_i - E_f)$ divided by Planck's constant h.

The first postulate bases Bohr's model on the existence of the atomic nucleus. The second postulate introduces quantization. Note the difference, however, between Bohr's quantization of the orbital angular momentum of an atomic electron moving under the influence of an inverse square (Coulomb) force

$$L = n\hbar \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}12)$$

and Planck's quantization of the energy of a particle, such as an electron, executing simple harmonic motion under the influence of a harmonic restoring force: $E = nh\nu$, n = 0, 1, 2, .... We shall see in the next section that the quantization of the orbital angular momentum of the atomic electron does lead to the quantization of its total energy, but with an energy quantization equation which is different from Planck's equation. The third postulate removes the problem of the stability of an electron moving in a circular orbit, due to the emission of the electromagnetic radiation required of the electron by classical theory, by simply postulating that this particular feature of the classical theory is not valid for the case of an atomic electron. The postulate was based on the fact that atoms are observed by experiment to be stable, even though this is not predicted by the classical theory. The fourth postulate

$$\nu = \frac{E_i - E_f}{h} \qquad (4\text{-}13)$$

is really just Einstein's postulate that the frequency of a photon of electromagnetic radiation is equal to the energy carried by the photon divided by Planck's constant.

These postulates do a thorough job of mixing classical and nonclassical physics. The electron moving in a circular orbit is assumed to obey classical mechanics, and yet the nonclassical idea of quantization of orbital angular momentum is included. The electron is assumed to obey one feature of classical electromagnetic theory (Coulomb's law), and yet not to obey another feature (emission of radiation by an accelerated charged body). However, we should not be surprised if the laws of classical physics, which are based on our experience with macroscopic systems, are not completely valid when dealing with microscopic systems such as the atom.

4-6 BOHR'S MODEL

The justification of Bohr's postulates, or of any set of postulates, can be found only by comparing the predictions that can be derived from the postulates with the results of experiment. In this section we derive some of these predictions and compare them with the data of Section 4-4.

Consider an atom consisting of a nucleus of charge +Ze and mass M, and a single electron of charge −e and mass m. For a neutral hydrogen atom Z = 1, for a singly ionized helium atom Z = 2, for a doubly ionized lithium atom Z = 3, etc. We assume that the electron revolves in a circular orbit about the nucleus. Initially we suppose the mass of the electron to be completely negligible compared to the mass of the nucleus, and consequently assume that the nucleus remains fixed in space. The condition of mechanical stability of the electron is

$$\frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r^2} = m\frac{v^2}{r} \qquad (4\text{-}14)$$

where v is the speed of the electron in its orbit, and r is the radius of the orbit.
The left side of this equation is the Coulomb force acting on the electron, and the right side is ma, where a is the centripetal acceleration keeping the electron in its circular orbit. Now, the orbital angular momentum of the electron, L = mvr, must be a constant, because the force acting on the electron is entirely in the radial direction. Applying the quantization condition, (4-12), to L, we have

$$mvr = n\hbar \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}15)$$

Solving for v and substituting into (4-14), we obtain

$$Ze^2 = 4\pi\epsilon_0 m v^2 r = 4\pi\epsilon_0 m r\left(\frac{n\hbar}{mr}\right)^2 = 4\pi\epsilon_0\frac{n^2\hbar^2}{mr}$$

so

$$r = 4\pi\epsilon_0\frac{n^2\hbar^2}{mZe^2} \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}16)$$

and

$$v = \frac{n\hbar}{mr} = \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{n\hbar} \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}17)$$

The application of the angular momentum quantization condition has restricted the possible circular orbits to those of radii given by (4-16). Note that these radii are proportional to the square of the quantum number n. If we evaluate the radius of the smallest orbit (n = 1) for a hydrogen atom (Z = 1) by inserting the known values of $\hbar$, m, and e, we obtain $r = 5.3 \times 10^{-11}$ m ≈ 0.5 Å. We shall show later that the electron has its minimum total energy when in the orbit corresponding to n = 1. Consequently we may interpret the radius of this orbit as a measure of the radius of a hydrogen atom in its normal state. It is in good agreement with the estimate, mentioned previously, that the order of magnitude of an atomic radius is 1 Å. Hence, Bohr's postulates predict a reasonable size for the atom. Evaluating the orbital velocity of an electron in the smallest orbit of a hydrogen atom from (4-17), we find $v = 2.2 \times 10^6$ m/sec. It is apparent from the equation that this is the largest velocity possible for a hydrogen atom electron. The fact that this velocity is less than 1% of the velocity of light is the justification for using classical mechanics instead of relativistic mechanics in the Bohr model.
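The numerical values quoted for the first Bohr orbit can be checked directly from (4-16) and (4-17). The following is a minimal sketch, not part of the original text; the function names `bohr_radius` and `bohr_speed` are illustrative choices, and the constants are the rounded values from the book's table of useful constants.

```python
# Constants as quoted (rounded) in the table of useful constants, SI units.
COULOMB_K = 8.988e9    # 1/(4*pi*eps0), nt-m^2/coul^2
HBAR = 1.055e-34       # Planck's constant divided by 2*pi, joule-sec
M_E = 9.109e-31        # electron rest mass, kg
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
C_LIGHT = 2.998e8      # speed of light in vacuum, m/sec


def bohr_radius(n, Z=1, mass=M_E):
    """Orbit radius from (4-16): r = 4*pi*eps0 * n^2 * hbar^2 / (m * Z * e^2)."""
    return n**2 * HBAR**2 / (COULOMB_K * mass * Z * E_CHARGE**2)


def bohr_speed(n, Z=1):
    """Orbital speed from (4-17): v = (1/(4*pi*eps0)) * Z * e^2 / (n * hbar)."""
    return COULOMB_K * Z * E_CHARGE**2 / (n * HBAR)


if __name__ == "__main__":
    for n in (1, 2, 3):
        r = bohr_radius(n)
        v = bohr_speed(n)
        # The product m*v*r should reproduce n*hbar, which is condition (4-15).
        print(f"n={n}: r={r:.3e} m  v={v:.3e} m/sec  "
              f"v/c={v / C_LIGHT:.4f}  mvr/hbar={M_E * v * r / HBAR:.3f}")
```

For n = 1 and Z = 1 this should give a radius of roughly $5.3 \times 10^{-11}$ m and a speed of roughly $2.2 \times 10^6$ m/sec, in line with the values in the text. Passing a larger Z shows how quickly the speed grows toward the relativistic range.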
On the other hand, (4-17) shows that for large values of Z the electron velocity becomes relativistic; the model could not be applied in such cases. That equation also makes it apparent why Bohr could not allow the quantum number n ever to assume the value n = 0, as it may in Planck's quantization equation.

Next we calculate the total energy of an atomic electron moving in one of the allowed orbits. Let us define the potential energy to be zero when the electron is infinitely distant from the nucleus. Then the potential energy V at any finite distance r can be obtained by integrating the work that would be done by the Coulomb force acting from r to ∞. Thus

$$V = -\int_r^\infty \frac{Ze^2}{4\pi\epsilon_0 r^2}\, dr = -\frac{Ze^2}{4\pi\epsilon_0 r}$$

The potential energy is negative because the Coulomb force is attractive; it takes work to move the electron from r to infinity against this force. The kinetic energy of the electron, K, can be evaluated, with the aid of (4-14), to be

$$K = \frac{1}{2}mv^2 = \frac{Ze^2}{4\pi\epsilon_0 2r}$$

The total energy of the electron, E, is then

$$E = K + V = -\frac{Ze^2}{4\pi\epsilon_0 2r} = -K$$

Using (4-16) for r in the preceding equation, we have

$$E = -\frac{mZ^2e^4}{(4\pi\epsilon_0)^2 2\hbar^2}\frac{1}{n^2} \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}18)$$

We see that the quantization of the orbital angular momentum of the electron leads to a quantization of its total energy.

The information contained in (4-18) is presented as an energy-level diagram in Figure 4-11. The energy of each level, as evaluated from (4-18), is shown on the left, in terms of joules and electron volts, and the quantum number of the level is shown on the right. The diagram is so constructed that the distance from any level to the level of zero energy is proportional to the energy of that level. Note that the lowest (most negative) allowed value of total energy occurs for the smallest quantum number n = 1. As n increases, the total energy of the quantum state becomes less negative, with E approaching zero as n approaches infinity. Since the state of lowest total energy is, of course, the most stable state for the electron, we see that the normal state of the electron in a one-electron atom is the state for which n = 1.

Figure 4-11 An energy-level diagram for the hydrogen atom. The levels shown are n = ∞ at E = 0; n = 4 at $-1.36 \times 10^{-19}$ joule = −0.85 eV; n = 3 at $-2.41 \times 10^{-19}$ joule = −1.51 eV; n = 2 at $-5.42 \times 10^{-19}$ joule = −3.39 eV; and n = 1 at $-21.7 \times 10^{-19}$ joule = −13.6 eV.

Example 4-6. Calculate the binding energy of the hydrogen atom (the energy binding the electron to the nucleus) from (4-18).

The binding energy is numerically equal to the energy of the lowest state in Figure 4-11, corresponding to n = 1 in (4-18). This yields, with Z = 1

$$E = -\left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{2\hbar^2} = -\frac{(9.0 \times 10^9\ \text{nt-m}^2/\text{coul}^2)^2 \times 9.11 \times 10^{-31}\ \text{kg} \times (1.60 \times 10^{-19}\ \text{coul})^4}{2 \times (1.05 \times 10^{-34}\ \text{joule-sec})^2}$$

$$= -2.17 \times 10^{-18}\ \text{joule} = -13.6\ \text{eV}$$

which agrees very well with the experimentally observed binding energy for hydrogen.

Next we calculate the frequency ν of the electromagnetic radiation emitted when the electron makes a transition from the quantum state $n_i$ to the quantum state $n_f$, that is, when an electron initially moving in an orbit characterized by the quantum number $n_i$ discontinuously changes its motion so that it moves in an orbit characterized by quantum number $n_f$. Using Bohr's fourth postulate (4-13), and (4-18), we have

$$\nu = \frac{E_i - E_f}{h} = +\left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{mZ^2e^4}{4\pi\hbar^3}\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$$

In terms of the reciprocal wavelength $\kappa = 1/\lambda = \nu/c$, this is

$$\kappa = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3 c}Z^2\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$$

or

$$\kappa = R_\infty Z^2\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \qquad (4\text{-}19)$$

where

$$R_\infty \equiv \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3 c}$$

and where $n_i$ and $n_f$ are integers.

The essential predictions of the Bohr model are contained in (4-18) and (4-19). Let us first discuss the emission of electromagnetic radiation by a one-electron Bohr atom in terms of these equations.

1.
The normal state of the atom will be the state in which the electron has the lowest energy, i.e., the state n = 1. This is called the ground state. (Ground state means fundamental state, the term originating from the German word Grund, meaning fundamental.)
2. In an electric discharge, or in some other process, the atom receives energy due to collisions, etc. This means that the electron must make a transition to a state of higher energy, or excited state, in which n > 1.
3. Obeying the common tendency of all physical systems, the atom will emit its excess energy and return to the ground state. This is accomplished by a series of transitions in which the electron drops to excited states of successively lower energy, finally reaching the ground state. In each transition electromagnetic radiation is emitted with a wavelength which depends on the energy lost by the electron, i.e., on the initial and final quantum numbers. In a typical case, the electron might be excited into state n = 7 and drop successively through the states n = 4 and n = 2 to the ground state n = 1. Three lines of the atomic spectrum are emitted with reciprocal wavelengths given by (4-19) for $n_i = 7$ and $n_f = 4$, $n_i = 4$ and $n_f = 2$, and $n_i = 2$ and $n_f = 1$.
4. In the very large number of excitation and deexcitation processes which take place during a measurement of an atomic spectrum, all possible transitions occur and the complete spectrum is emitted. The reciprocal wavelengths, or wavelengths, of the set of lines which constitute the spectrum are given by (4-19), where we allow $n_i$ and $n_f$ to take on all possible integral values subject only to the restriction that $n_i > n_f$.

For hydrogen (Z = 1) let us consider the subset of spectral lines which arises from transitions in which $n_f = 2$. According to (4-19) the reciprocal wavelengths of these lines are given by

$$\kappa = R_\infty\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \qquad n_f = 2 \text{ and } n_i > n_f$$

or

$$\kappa = R_\infty\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \qquad n = 3, 4, 5, 6, \ldots$$
This is identical with the series formula for the Balmer series of the hydrogen spectrum (4-10), if $R_\infty$ is equal to $R_H$. According to the Bohr model

$$R_\infty = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3 c}$$

Although the numerical values of some of the quantities entering into this equation were not very accurately known at the time, Bohr evaluated $R_\infty$ in terms of these quantities and found that the resulting value was in quite good agreement with the experimental value of $R_H$. In the next section we shall make a detailed comparison, using recent data, between the experimental value of $R_H$ and Bohr's prediction, and we shall show that the two agree almost perfectly.

According to the Bohr model, each of the five known series of the hydrogen spectrum arises from a subset of transitions in which the electron goes to a certain final quantum state $n_f$. For the Lyman series $n_f = 1$; for the Balmer $n_f = 2$; for the Paschen $n_f = 3$; for the Brackett $n_f = 4$; and for the Pfund $n_f = 5$. The first three of these series are conveniently illustrated in terms of the energy-level diagram of Figure 4-12. The transition giving rise to a particular line of a series is indicated in this diagram by an arrow going from the initial quantum state $n_i$ to the final quantum state $n_f$. Only the arrows corresponding to the first few lines of each series and to the series limit are shown. Since the distance between any two energy levels in such a diagram is proportional to the difference between the energy of the two levels, and since (4-13) states that the frequency ν (or reciprocal wavelength) is proportional to the energy difference, the length of any arrow is proportional to the frequency (or reciprocal wavelength) for the corresponding spectral line.

The wavelengths of the lines of all these series are fitted very accurately by (4-19) by using the appropriate value of $n_f$. This was a great triumph for Bohr's model.
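The way (4-18) and (4-19) generate the five hydrogen series can be illustrated numerically. The sketch below is an illustration rather than anything from the original text: the function and variable names are hypothetical, the Rydberg constant is the experimental $R_H$ quoted in Section 4-4, and the energy levels use the rounded 13.6 eV ground-state value.

```python
R_H = 10967757.6          # experimental Rydberg constant for hydrogen, m^-1
GROUND_STATE_EV = -13.6   # n = 1 energy of hydrogen from (4-18), rounded

# Final quantum number n_f for each named series, as listed in the text.
SERIES = {"Lyman": 1, "Balmer": 2, "Paschen": 3, "Brackett": 4, "Pfund": 5}


def energy_level_ev(n, Z=1):
    """Energy of level n from (4-18), scaled from the hydrogen ground state."""
    return GROUND_STATE_EV * Z**2 / n**2


def wavelength_angstrom(n_i, n_f, rydberg=R_H, Z=1):
    """Wavelength of the n_i -> n_f line from (4-19), with kappa = 1/lambda."""
    if n_i <= n_f:
        raise ValueError("emission requires n_i > n_f")
    kappa = rydberg * Z**2 * (1.0 / n_f**2 - 1.0 / n_i**2)  # m^-1
    return 1e10 / kappa                                      # metres -> angstroms


def series_limit_angstrom(n_f, rydberg=R_H, Z=1):
    """Series limit: the n_i -> infinity case of (4-19)."""
    return 1e10 * n_f**2 / (rydberg * Z**2)


if __name__ == "__main__":
    for name, n_f in SERIES.items():
        first = wavelength_angstrom(n_f + 1, n_f)
        limit = series_limit_angstrom(n_f)
        print(f"{name:9s} n_f={n_f}: first line {first:9.1f} A, limit {limit:9.1f} A")
    # The cascade 7 -> 4 -> 2 -> 1 described in the text emits three lines.
    for n_i, n_f in [(7, 4), (4, 2), (2, 1)]:
        print(f"{n_i} -> {n_f}: {wavelength_angstrom(n_i, n_f):9.1f} A")
```

With these inputs the Balmer series limit comes out near 3647 Å, within about 0.05% of the 3645.6 Å quoted in Section 4-4. Calling `wavelength_angstrom` with `Z=2` gives reciprocal wavelengths four times larger, which is the scaling (4-19) predicts for singly ionized helium when the small nuclear-mass correction is ignored.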
The success of the model was particularly impressive because the Lyman, Brackett, and Pfund series had not been discovered at the time the model was developed by Bohr. The existence of these series was predicted, and the series were soon found experimentally by the persons after whom they are named.

The model worked equally well when applied to the case of one-electron atoms with Z = 2, i.e., singly ionized helium atoms He$^+$. Such atoms can be produced by passing a particularly violent electric discharge (a spark) through normal helium gas. They make their presence apparent by emitting a simpler spectrum than that emitted by normal helium atoms. In fact, the atomic spectrum of He$^+$ is exactly the same as the hydrogen spectrum except that the reciprocal wavelengths of all the lines are almost exactly four times as great. This is explained very easily, in terms of the Bohr model, by setting $Z^2 = 4$ in (4-19).

The properties of the absorption spectrum of one-electron atoms are also easy to understand in terms of the Bohr model. Since the atomic electron must have a total energy exactly equal to the energy of one of the allowed energy states, the atom can only absorb discrete amounts of energy from the incident electromagnetic radiation. This fact leads to the idea that we consider the incident radiation to be a beam of photons, and that only those photons can be absorbed whose frequency is given by E = hν, where E is one of the discrete amounts of energy which can be absorbed by the atom. The process of absorbing electromagnetic radiation is then just the inverse of the normal emission process, and the lines of the absorption spectrum will have exactly the same wavelengths as the lines of the emission spectrum. Normally the atom is always initially in the ground state n = 1, so that only absorption processes from n = 1 to n > 1 can occur. Thus, only the absorption lines which correspond (for hydrogen) to the Lyman series will normally be observed. However, if the gas containing the absorbing atoms is at a very high temperature, then, owing to collisions, some of the atoms will initially be in the first excited state n = 2, and absorption lines corresponding to the Balmer series will be observed.

Figure 4-12 Top: The energy-level diagram for hydrogen with the quantum number n for each level and some of the transitions that appear in the spectrum. An infinite number of levels is crowded in between the levels marked n = 4 and n = ∞. Bottom: The corresponding spectral lines for the three series indicated. Within each series the spectral lines follow a regular pattern, approaching the series limit at the shortwave end of the series. As drawn here, neither the wavelength nor frequency scale is linear, being chosen as they are merely for clarity of illustration. A linear wavelength scale would more nearly represent the actual appearance of the photographic plate obtained from a spectroscope. The Brackett and Pfund series, which are not shown, lie in the far infrared part of the spectrum.

Example 4-7. Estimate the temperature of a gas containing hydrogen atoms at which the Balmer series lines will be observed in the absorption spectrum.

■ The Boltzmann probability distribution (see Appendix C) shows that the ratio of the number $n_2$ of atoms in the first excited state to the number $n_1$ of atoms in the ground state, in a large sample in thermal equilibrium at temperature T, is

$$\frac{n_2}{n_1} = \frac{e^{-E_2/kT}}{e^{-E_1/kT}}$$

where k is Boltzmann's constant, $k = 1.38 \times 10^{-23}$ joule/°K $= 8.62 \times 10^{-5}$ eV/°K. For hydrogen atoms the energies of these two states are given in the energy-level diagram of Figure 4-11: $E_1 = -13.6$ eV, $E_2 = -3.39$ eV. Hence

$$\frac{n_2}{n_1} = e^{-(-3.39 + 13.6)\ \text{eV}/(8.62 \times 10^{-5}\ \text{eV}/°\text{K})T} = e^{-1.18 \times 10^5\ °\text{K}/T}$$

Therefore, a significant fraction of the hydrogen atoms will initially be in the first excited state only when T is not too much smaller than $10^5$ °K; and only when they absorb from that state can they produce absorption lines of the Balmer series.

The situation is complicated by the fact that the n = ∞ level is not far above the n = 2 level. This proximity makes the probability that hydrogen atoms will initially be ionized increase with increasing temperature about as rapidly as the probability that the atoms will initially be in their first excited state. But no absorption lines at all can be produced by initially ionized hydrogen atoms. Detailed calculations predict that the maximum amount of Balmer absorption should be observed when the temperature is about $10^4$ °K.

Balmer absorption lines are actually observed in the hydrogen gas of some stellar atmospheres. This gives us a way of estimating the temperature of the surface of a star.

4-7 CORRECTION FOR FINITE NUCLEAR MASS

In the previous section we assumed the mass of the atomic nucleus to be infinitely large compared to the mass of the atomic electron, so that the nucleus remains fixed in space. This is a good approximation even for hydrogen, which contains the lightest nucleus, since the mass of that nucleus is about 2000 times larger than the electron mass. However, the spectroscopic data are so very accurate that before we make a detailed numerical comparison of these data with the Bohr model we must take into account the fact that the nuclear mass is actually finite. In such a case the electron and the nucleus move about their common center of mass. However, it is not difficult to show that in such a planetarylike system the electron moves relative to the nucleus as though the nucleus were fixed and the mass m of the electron were slightly reduced to the value μ, the reduced mass of the system. The equations of motion of the system are the same as those we have considered if we simply substitute μ for m, where

$$\mu = \frac{mM}{m + M} \qquad (4\text{-}20)$$

is less than m by a factor 1/(1 + m/M). Here M is the mass of the nucleus.

To handle this situation Bohr modified his second postulate to require that the total orbital angular momentum of the atom, L, is an integral multiple of Planck's constant divided by 2π. This is achieved by generalizing (4-15) to

$$\mu vr = n\hbar \qquad n = 1, 2, 3, \ldots \qquad (4\text{-}21)$$

Using μ instead of m in this equation takes into account the angular momentum of the nucleus as well as that of the electron. Making similar modifications to the rest of Bohr's derivation for the case of finite nuclear mass, we find that many of the equations are identical with those derived before, except that the electron mass m is replaced by the reduced mass μ. In particular, the formula for the reciprocal wavelengths of the spectral lines becomes

$$\kappa = R_M Z^2\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \qquad \text{where} \qquad R_M \equiv \frac{M}{m + M}R_\infty = \frac{\mu}{m}R_\infty \qquad (4\text{-}22)$$

The quantity $R_M$ is the Rydberg constant for a nucleus of mass M. As M/m → ∞, it is apparent that $R_M \to R_\infty$, the Rydberg constant for an infinitely heavy nucleus which appears in (4-19). In general, the Rydberg constant $R_M$ is less than $R_\infty$ by the factor 1/(1 + m/M). For the most extreme case of hydrogen, M/m = 1836 and $R_M$ is less than $R_\infty$ by about one part in 2000.

If we evaluate $R_H$ from (4-22), using the currently accepted values of the quantities m, M, e, c, and h, we find $R_H = 10968100$ m$^{-1}$. Comparing this with the experimental value of $R_H$ given in Section 4-4, we see that the Bohr model, corrected for finite nuclear mass, agrees with the spectroscopic data to within three parts in 100,000!

Example 4-8.
In Chapter 2 we spoke of the positronium "atom," consisting of a positron and - an electron revolving about their common center of mass, which lies halfway between them. (a) If such a system were a normal atom, how would its emission spectrum compare to that of the hydrogen atom? ■ In this case the "nuclear" mass M is that of the positron, which equals m, the mass of the electron. Hence, the reduced mass (4-20) is mM m 2m 2 The corresponding Rydberg constant R M is, according to (4-22) R, R a, 2 m m+m Rc° The energy states of the positronium atom then would be given by R M hcZ 2 n2 Epositronium = ro L m2 = m+M R cc hcZ 2 2n2 and the reciprocal wavelengths of the emitted spectral lines by 1 v h= _ â c R ao 2 ^ 1 - 1 Z 2 of n i = 1 The frequencies of the emitted lines would then be half, and the wavelengths double, that of a hydrogen atom (with infinitely heavy nucleus), Z being equal to one for positronium and for hydrogen. • (b) What would be the electron-positron separator, D, in the ground state orbit of positronium? • In (4-16) we merely replace m by p = m/2 and we find D positronium — 471E Dn 2 h 2 2 pZe2 47iE^ n 2 h 2 mZe2 2r hydrogen Hence, for any quantum state n the distance of the electron from the "nucleus" is twice as great in the positronium atom as in the hydrogen atom (with infinitely heavy nucleus). 4 A muonic atom contains a nucleus of charge Ze and a negative muon, p - , moving about it. The p - is an elementary particle with charge —e and a mass that is 2197 times as large as an electron mass. (a) Calculate the muon-nucleus separation, D, of the first Bohr orbit of a muonic atom with Z = 1. • The reduced mass of the system, with m u _ = 207m, and M = 1836m e , is, from (4-20) Example 4 9. 
$$\mu = \frac{207m_e \times 1836m_e}{207m_e + 1836m_e} = 186m_e$$

Then, from (4-16), with $n = 1$, $Z = 1$, and $m = 186m_e$, we obtain

$$D_1 = \frac{4\pi\epsilon_0\hbar^2}{186m_e e^2} = \frac{1}{186} \times 5.3 \times 10^{-11}\ \text{m} = 2.8 \times 10^{-13}\ \text{m} = 2.8 \times 10^{-3}\ \text{Å}$$

Therefore the $\mu^-$ is much closer to the nuclear (proton) surface than is the electron in a hydrogen atom. It is this feature which makes such muonic atoms interesting, information about nuclear properties being revealed from their study.

(b) Calculate the binding energy of a muonic atom with $Z = 1$.

From (4-18), with $Z = 1$, $n = 1$, and $m = \mu = 186m_e$, we have

$$E = -\frac{186\,m_e e^4}{(4\pi\epsilon_0)^2\,2\hbar^2} = -186 \times 13.6\ \text{eV} = -2530\ \text{eV}$$

as the ground state energy. Hence, the binding energy is 2530 eV.

(c) What is the wavelength of the first line in the Lyman series for such an atom?

From (4-22), with $Z = 1$, we have

$$\kappa = R_M\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$$

For the first Lyman line, $n_i = 2$ and $n_f = 1$. In this case, $R_M = (\mu/m_e)R_\infty = 186R_\infty$. Hence

$$\kappa = \frac{1}{\lambda} = 186R_\infty\left(1 - \frac{1}{4}\right) = 139.5R_\infty$$

With $R_\infty = 109737\ \text{cm}^{-1}$ we obtain $\lambda = 6.5$ Å, so that the Lyman lines lie in the x-ray part of the spectrum. X-ray techniques are necessary, therefore, to study the spectrum of muonic atoms.

Example 4-10. Ordinary hydrogen contains about one part in 6000 of deuterium, or heavy hydrogen. This is a hydrogen atom whose nucleus contains a proton and a neutron. How does the doubled nuclear mass affect the atomic spectrum?

The spectrum would be identical if it were not for the correction for finite nuclear mass. For a normal hydrogen atom

$$R_H = \frac{\mu}{m}R_\infty = \frac{R_\infty}{1 + m/M} = \frac{109737}{1 + 1/1836} = 109678\ \text{cm}^{-1}$$

For an atom of heavy hydrogen, or deuterium

$$R_D = \frac{\mu}{m}R_\infty = \frac{R_\infty}{1 + m/M} = \frac{109737}{1 + 1/(2 \times 1836)} = 109707\ \text{cm}^{-1}$$

Hence, $R_D$ is a bit larger than $R_H$, so that the spectral lines of the deuterium atom are shifted to slightly shorter wavelengths compared to hydrogen. Indeed, deuterium was discovered in 1932 by H. C. Urey following the observation of these shifted spectral lines.
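The reduced-mass arithmetic used in Examples 4-8 through 4-10 can be reproduced in a few lines. The sketch below is only an illustration: the function names `rydberg` and `wavelength_angstrom` are hypothetical, masses are expressed in units of the electron mass, and the inputs are the rounded values quoted in the text ($R_\infty = 109737\ \text{cm}^{-1}$, $M = 1836m_e$), so the last digit may differ slightly from the rounded results printed in the examples.

```python
# Reduced-mass correction to the Rydberg constant, following (4-20) and (4-22).
R_INF_CM = 109737.0   # R_infinity in cm^-1, as quoted in the text
M_PROTON = 1836.0     # proton mass in electron masses (rounded text value)


def rydberg(orbiting_mass, nuclear_mass, r_inf=R_INF_CM):
    """R_M = (mu / m_e) * R_inf, with both masses given in electron masses."""
    mu = orbiting_mass * nuclear_mass / (orbiting_mass + nuclear_mass)
    return mu * r_inf


def wavelength_angstrom(r_m, n_f, n_i, z=1):
    """Wavelength of the n_i -> n_f line from kappa = R_M Z^2 (1/n_f^2 - 1/n_i^2)."""
    kappa = r_m * z**2 * (1.0 / n_f**2 - 1.0 / n_i**2)   # in cm^-1
    return 1.0e8 / kappa                                   # 1 cm = 1e8 angstrom


r_h = rydberg(1.0, M_PROTON)             # ordinary hydrogen
r_d = rydberg(1.0, 2.0 * M_PROTON)       # deuterium, nucleus taken as twice as heavy
r_ps = rydberg(1.0, 1.0)                 # positronium: M = m
r_mu = rydberg(207.0, M_PROTON)          # muonic hydrogen

print(f"R_H  = {r_h:.1f} cm^-1")
print(f"R_D  = {r_d:.1f} cm^-1")
print(f"R_Ps = {r_ps:.1f} cm^-1")
print(f"muonic Lyman-alpha wavelength = {wavelength_angstrom(r_mu, 1, 2):.2f} angstrom")

# Separation of the H-alpha pair (n_i = 3 -> n_f = 2) for hydrogen and deuterium.
h_alpha_shift = wavelength_angstrom(r_h, 2, 3) - wavelength_angstrom(r_d, 2, 3)
print(f"H-alpha isotope shift = {h_alpha_shift:.2f} angstrom")
```

Because only the ratio $\mu/m_e$ enters, the same two functions cover every two-body system treated in these examples.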
By increasing the concentration of the heavy isotope above its normal value in a hydrogen discharge tube, we now can enhance the intensity of the deuterium lines which, ordinarily, are difficult to detect. We then readily observe pairs of hydrogen lines; the shorter wavelength members of the pair correspond exactly to those predicted from $R_D$ earlier. The resolution needed is easily obtained, the H$_\alpha$-line pair being separated by about 1.8 Å, for example, several thousand times greater than the minimum resolvable separation.

4-8 ATOMIC ENERGY STATES

The Bohr model predicts that the total energy of an atomic electron is quantized. For example, (4-18) gives the allowed energy values for the electron in a one-electron atom. Although we have not attempted to derive similar expressions for the electrons in a multielectron atom, it is clear that according to the model the total energy of each of the electrons will also be quantized and, consequently, that the same must be true of the atom's total energy content. The Planck theory of blackbody radiation had also predicted that in the process of emission and absorption of radiation, the atoms in the cavity wall behaved as though they had quantized energy states. Hence, according to the old quantum theory every atom can have only certain discretely separated energy states.

Direct confirmation that the internal energy states of an atom are quantized came from a simple experiment performed by Franck and Hertz in 1914. The type of apparatus used by these investigators is indicated in Figure 4-13. Electrons are emitted thermally at low energy from the heated cathode C. They are accelerated to the anode A by a potential $V$ applied between the two electrodes. Some of the electrons pass through holes in A and travel to plate P, providing their kinetic energy upon leaving A is enough to overcome a small retarding potential $V_r$ applied between P and A. The entire tube is filled at a low pressure with a gas or vapor of the atoms to be investigated.
The experiment involves measuring the electron current reaching P (indicated by the current $I$ flowing through the meter) as a function of the accelerating voltage $V$.

Figure 4-13 Schematic of the apparatus used by Franck and Hertz to prove that atomic energy states are quantized. (Labels in the figure: heater, cathode C, anode A, plate P, retarding potential $V_r$, and the gas or vapor of atoms being investigated.)

The first experiment was performed with the tube containing Hg vapor. The nature of the results is indicated in Figure 4-14. At low accelerating voltage, the current $I$ is observed to increase with increasing voltage $V$. When $V$ reaches 4.9 V, the current abruptly drops. This was interpreted as indicating that some interaction between the electrons and the Hg atoms suddenly begins when the electrons attain a kinetic energy of 4.9 eV. Apparently a significant fraction of the electrons of this energy excite the Hg atoms and in so doing entirely lose their kinetic energy. If $V$ is only slightly more than 4.9 V, the excitation process must occur just in front of the anode A, and after the process the electrons cannot gain enough kinetic energy in falling toward A to overcome the retarding potential $V_r$ and reach plate P. At somewhat larger $V$, the electrons can gain enough kinetic energy after the excitation process to overcome $V_r$ and reach P. The sharpness of the break in the curve indicates that electrons of energy less than 4.9 eV are not able to transfer their energy to an Hg atom. This interpretation is consistent with the existence of discrete energy states for the Hg atom. Assuming the first excited state of Hg to be 4.9 eV higher in energy than the ground state, an Hg atom would simply not be able to accept energy from the bombarding electrons unless these electrons had at least 4.9 eV.

Figure 4-14
The voltage dependence of the current measured in the Franck-Hertz experiment.

Figure 4-15 A considerably simplified energy-level diagram for mercury. Lying above the highest discrete energy level at $E = 0$ is a continuum of levels. (The diagram shows the ground state at $\mathcal{E} = -10.4$ eV, the first and second excited states, and the continuum above $E = 0$.)

Now, if the separation between the ground state and the first excited state is actually 4.9 eV, there should be a line in the Hg emission spectrum corresponding to the atom's loss of 4.9 eV in undergoing a transition from the first excited state to the ground state. Franck and Hertz found that when the energy of the bombarding electrons is less than 4.9 eV no spectral lines at all are emitted from the Hg vapor in the tube, and when the energy is not more than a few electron volts greater than this value only a single line is seen in the spectrum. This line is of wavelength 2536 Å, which corresponds exactly to a photon energy of 4.9 eV.

The Franck-Hertz experiment provided striking evidence for the quantization of the energy of atoms. It also provided a method for the direct measurement of the energy differences between the quantum states of an atom: the answers appear on the dial of a voltmeter! When the curve of $I$ versus $V$ is extended to higher voltages, additional breaks are found. Some are due to electrons exciting the first excited state of the atoms on several separate occasions in their trip from C to A; but some are due to excitation of the higher excited states and, from the position of these breaks, the energy differences between the higher excited states and the ground state can be directly measured.

Another experimental method of determining the separations between the energy states of an atom is to measure its atomic spectrum and then empirically to construct a set of energy states which would lead to such a spectrum.
In practice this is often quite difficult to do since the set of lines constituting the spectrum, as well as the set of energy states, is often very complicated; however, in common with all spectroscopic techniques, it is a very accurate method. In all cases in which determinations of the separations between the energy states of a certain atom have been made, using both this technique and the Franck-Hertz technique, the results have been found to be in excellent agreement.

In order to illustrate the preceding discussion, we show in Figure 4-15 a considerably simplified representation of the energy states of Hg in terms of an energy-level diagram. The separations between the ground state and the first and second excited states are known, from the Franck-Hertz experiment, to be 4.9 eV and 6.7 eV. These numbers can be confirmed, and in fact determined with much higher accuracy, by measuring the wavelengths of the two spectral lines corresponding to transitions of an electron in the Hg atom from these two states to the ground state. The energy $\mathcal{E} = -10.4$ eV, of the ground state relative to a state of zero total energy, is not determined by the Franck-Hertz experiment. However, it can be found by measuring the wavelength of the line corresponding to a transition of an atomic electron from a state of zero total energy to the ground state. This is the series limit of the series terminating on the ground state. The energy $\mathcal{E}$ can also be measured by measuring the energy which must be supplied to an Hg atom in order to send one of its electrons from the ground state to a state of zero total energy. Since an electron of zero total energy is no longer bound to the atom, $|\mathcal{E}|$ is the energy required to ionize the atom and is therefore called the ionization energy.

Lying above the highest discrete state at $E = 0$ are the energy states of the system consisting of an unbound electron plus an ionized Hg atom.
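The conversions between level separations and spectral wavelengths quoted for mercury can be checked with the relation $E = hc/\lambda$. The sketch below is a minimal illustration: the function names are hypothetical, and the value $hc \approx 12398.4$ eV·Å is an assumed constant that does not appear in the text.

```python
# Photon energy <-> wavelength, E = h c / lambda, with h*c expressed in eV*angstrom.
HC_EV_ANGSTROM = 12398.4   # assumed value of h*c, approximately 12398.4 eV*angstrom


def photon_energy_ev(wavelength_in_angstrom):
    """Energy (eV) of a photon with the given wavelength (angstrom)."""
    return HC_EV_ANGSTROM / wavelength_in_angstrom


def photon_wavelength_angstrom(energy_in_ev):
    """Wavelength (angstrom) of a photon with the given energy (eV)."""
    return HC_EV_ANGSTROM / energy_in_ev


# The single line seen by Franck and Hertz just above threshold.
print(f"2536 angstrom line -> {photon_energy_ev(2536.0):.2f} eV")

# Wavelengths implied by the level separations quoted for Hg.
for label, gap_ev in [("first excited state -> ground state", 4.9),
                      ("second excited state -> ground state", 6.7),
                      ("series limit (ionization energy)", 10.4)]:
    print(f"{label}: {photon_wavelength_angstrom(gap_ev):.0f} angstrom")
```

All three wavelengths fall in the ultraviolet, and the series limit marks the wavelength below which the continuum discussed next begins.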
The total energy of an unbound electron (a free electron with $E > 0$) is not quantized. Thus any energy $E > 0$ is possible for the electron, and the energy states form a continuum. The electron can be excited from its ground state to a continuum state if the Hg atom receives an energy greater than 10.4 eV. Conversely, it is possible for an ionized Hg atom to capture a free electron into one of the quantized energy states of the neutral atom. In this process, radiation of frequency greater than the series limit corresponding to that state will be emitted. The exact value of the frequency depends on the initial energy $E$ of the free electron. Since $E$ can have any value, the spectrum of Hg should have a continuum extending beyond every series limit in the direction of increasing frequency. This can actually be seen experimentally, although with some difficulty. These comments concerning the continuum of energy states for $E > 0$, and its consequences, have been made in reference to the Hg atom, but they are equally true for all atoms.

4-9 INTERPRETATION OF THE QUANTIZATION RULES

The success of the Bohr model, as measured by its agreement with experiment, was certainly very striking; but it only accentuated the mysterious nature of the postulates on which the model was based. One of the biggest mysteries was the question of the relation between Bohr's quantization of the angular momentum of an electron moving in a circular orbit and Planck's quantization of the total energy of an entity, such as an electron, executing simple harmonic motion. In 1916 some light was shed upon this by Wilson and Sommerfeld, who enunciated a set of rules for the quantization of any physical system for which the coordinates are periodic functions of time. These rules included both the Planck and the Bohr quantization as special cases. They were also of considerable use in broadening the range of applicability of the quantum theory.
These rules can be stated as follows: For any physical system in which the coordinates are periodic functions of time, there exists a quantum condition for each coordinate. These quantum conditions are

$$\oint p_q\,dq = n_q h \tag{4-23}$$

where $q$ is one of the coordinates, $p_q$ is the momentum associated with that coordinate, $n_q$ is a quantum number which takes on integral values, and $\oint$ means that the integration is taken over one period of the coordinate $q$. The meaning of these rules can best be illustrated in terms of some specific examples.

Consider a one-dimensional simple harmonic oscillator. Its total energy can be written, in terms of position and momentum, as

$$E = K + V = \frac{p_x^2}{2m} + \frac{kx^2}{2}$$

or

$$\frac{p_x^2}{2mE} + \frac{x^2}{2E/k} = 1$$

The quantization integral $\oint p_x\,dx$ is most easily evaluated, for the relation between $p_x$ and $x$ that is imposed by this equation, if we consider a geometric interpretation. The relation between $p_x$ and $x$ is the equation of an ellipse. Any instantaneous state of motion of the oscillator is represented by some point in a plot of this equation on a two-dimensional space having coordinates $p_x$ and $x$. We call such a space (the $p$-$q$ plane) phase space, and the plot is a phase diagram of the linear oscillator, shown in Figure 4-16. During one cycle of oscillation the point representing the position and momentum of the particle travels once around the ellipse. The semiaxes $a$ and $b$ of the ellipse $p_x^2/b^2 + x^2/a^2 = 1$ are seen, by comparison with our equation, to be

$$b = \sqrt{2mE} \qquad \text{and} \qquad a = \sqrt{2E/k}$$

Now the area of an ellipse is $\pi ab$. Furthermore, the value of the integral $\oint p_x\,dx$ is just equal to that area. (To see this note that the integral over a complete oscillation equals an integral in which the representative point travels from $x = -a$ to $x = +a$ over the upper half of the ellipse plus an integral in which the point travels back to $x = -a$ over the lower half. In the first integral both $p_x$ and $dx$ are positive and its value equals the area enclosed between the upper half and the $x$ axis; in the second both $p_x$ and $dx$ are negative so the value of the integral is positive and equals the area enclosed between the lower half of the ellipse and the $x$ axis.) Thus we obtain

$$\oint p_x\,dx = \pi ab$$

In our case

$$\oint p_x\,dx = \frac{2\pi E}{\sqrt{k/m}}$$

but $\sqrt{k/m} = 2\pi\nu$, where $\nu$ is the frequency of the oscillation, so that

$$\oint p_x\,dx = E/\nu$$

If we now use (4-23), the Wilson-Sommerfeld quantization rule, we have

$$\oint p_x\,dx = E/\nu = n_x h \equiv nh$$

or

$$E = nh\nu$$

which is identical with Planck's quantization law. Note that the allowed states of oscillation are represented by a series of ellipses in phase space, the area enclosed between successive ellipses always being $h$ (see Figure 4-16). Again we find that the classical situation corresponds to $h \to 0$, all values of $E$ and hence all ellipses being allowed if that were true.

Figure 4-16 Top: A phase space diagram of the motion of the representative point for a linear simple harmonic oscillator. Bottom: The allowed energy states of the oscillator are represented by ellipses whose areas in phase space are given by $nh$. The space between adjacent ellipses (for example the shaded area) has an area $h$.

The quantity $\oint p_x\,dx$ is sometimes called a phase integral; in classical physics it is the integral of the dynamical quantity called the action over one oscillation of the motion. Hence, the Planck energy quantization is equivalent to the quantization of action.

We can also deduce the Bohr quantization of angular momentum from the Wilson-Sommerfeld rule, (4-23). An electron moving in a circular orbit of radius $r$ has an angular momentum, $mvr = L$, which is constant. The angular coordinate is $\theta$, which is a periodic function of the time. That is, $\theta$ versus $t$ is a saw-tooth function, increasing linearly from zero to $2\pi$ rad in one period and repeating this pattern in each succeeding period.
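Before the circular-orbit case is carried through, the harmonic-oscillator result $\oint p_x\,dx = E/\nu$ obtained above can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the mass, force constant, and energy are arbitrary numbers in consistent units, and a simple midpoint rule stands in for the exact ellipse area $\pi ab$.

```python
import math


def oscillator_frequency(mass, spring_k):
    """Frequency nu of the oscillator, from sqrt(k/m) = 2 pi nu."""
    return math.sqrt(spring_k / mass) / (2.0 * math.pi)


def phase_integral(mass, spring_k, energy, steps=100_000):
    """Midpoint-rule estimate of the closed integral of p_x dx around the
    ellipse p_x^2 / (2 m E) + x^2 / (2 E / k) = 1."""
    a = math.sqrt(2.0 * energy / spring_k)     # semiaxis along x (the amplitude)
    dx = 2.0 * a / steps
    upper_half = 0.0
    for i in range(steps):
        x = -a + (i + 0.5) * dx
        p_squared = 2.0 * mass * (energy - 0.5 * spring_k * x * x)
        upper_half += math.sqrt(max(0.0, p_squared)) * dx
    # The return trip along the lower half contributes the same positive area.
    return 2.0 * upper_half


mass, spring_k, energy = 2.0, 8.0, 3.0         # arbitrary illustrative values
numerical = phase_integral(mass, spring_k, energy)
expected = energy / oscillator_frequency(mass, spring_k)
print(f"numerical phase integral = {numerical:.6f}")
print(f"E / nu                   = {expected:.6f}")
```

Setting this integral equal to $nh$ is what singles out the ellipses of area $nh$ drawn in Figure 4-16.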
The quantization rule $\oint p_q\,dq = n_q h$ becomes, in this case

$$\oint L\,d\theta = nh$$

and

$$\oint L\,d\theta = L\int_0^{2\pi} d\theta = 2\pi L$$

so that

$$2\pi L = nh$$

or

$$L = nh/2\pi \equiv n\hbar$$

which is identical with Bohr's quantization law.

A more physical interpretation of the Bohr quantization rule was given in 1924 by de Broglie. The Bohr quantization of angular momentum can be written as in (4-15) as

$$mvr = pr = nh/2\pi \qquad n = 1, 2, 3, \ldots$$

where $p$ is the linear momentum of an electron in an allowed orbit of radius $r$. If we substitute into this equation the expression for $p$ in terms of the corresponding de Broglie wavelength

$$p = h/\lambda$$

the Bohr equation becomes

$$hr/\lambda = nh/2\pi$$

or

$$2\pi r = n\lambda \qquad n = 1, 2, 3, \ldots \tag{4-24}$$

Thus the allowed orbits are those in which the circumference of the orbit can contain exactly an integral number of de Broglie wavelengths.

Imagine the electron to be moving in a circular orbit at constant speed, with the associated wave following the electron. The wave, of wavelength $\lambda$, is then wrapped repeatedly around the circular orbit. The resultant wave that is produced will have zero intensity at any point unless the wave at each traversal is exactly in phase at that point with the wave in other traversals. If the waves in each traversal are exactly in phase, they join on perfectly in orbits that accommodate integral numbers of de Broglie wavelengths, as illustrated in Figure 4-17. But the condition that this happens is just the condition that (4-24) be satisfied. If this equation were violated, then in a large number of traversals the waves would interfere with each other in such a way that their average intensity would be zero. Since the average intensity of the waves, $\overline{\Psi^2}$, is supposed to be a measure of where the particle is located, we interpret this as meaning that an electron cannot be found in such an orbit.

This wave picture gives no suggestion of progressive motion. Rather, it suggests standing waves, as in a stretched string of a given length.
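The statement that each Bohr orbit holds a whole number of de Broglie wavelengths can be verified directly. In the sketch below (hypothetical function names; constants taken from the book's table of useful constants) the orbit radius comes from (4-16) with an infinitely heavy nucleus, the speed comes independently from the mechanical stability condition, and the ratio $2\pi r/\lambda$ is then computed.

```python
import math

H = 6.626e-34            # Planck's constant, joule-sec
HBAR = H / (2.0 * math.pi)
M_E = 9.109e-31          # electron rest mass, kg
E_CHARGE = 1.602e-19     # electron charge magnitude, coul
K_COULOMB = 8.988e9      # 1/(4 pi eps0), nt-m^2/coul^2


def bohr_radius_n(n, z=1):
    """Radius (m) of the n-th circular Bohr orbit, as in (4-16)."""
    return n**2 * HBAR**2 / (K_COULOMB * M_E * z * E_CHARGE**2)


def orbital_speed(r, z=1):
    """Speed (m/sec) from the stability condition m v^2 / r = K Z e^2 / r^2."""
    return math.sqrt(K_COULOMB * z * E_CHARGE**2 / (M_E * r))


def wavelengths_per_orbit(n, z=1):
    """Number of de Broglie wavelengths that fit around the n-th orbit."""
    r = bohr_radius_n(n, z)
    de_broglie = H / (M_E * orbital_speed(r, z))
    return 2.0 * math.pi * r / de_broglie


for n in range(1, 5):
    print(f"n = {n}: r = {bohr_radius_n(n):.3e} m, "
          f"2 pi r / lambda = {wavelengths_per_orbit(n):.6f}")
```

The ratio comes out equal to $n$ for every orbit, which is (4-24) restated.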
In a stretched string only certain wavelengths, or frequencies of vibration, are permitted. Once such modes are excited, the vibration goes on indefinitely if there is no damping. To get standing waves, however, we need oppositely directed traveling waves of equal amplitude. For the atom this requirement is presumably satisfied by the fact that the electron can traverse an orbit in either direction and still have the magnitude of angular momentum required by Bohr. The de Broglie standing wave interpretation, illustrated in Figure 4-17, therefore provides a satisfying basis for Bohr's quantization rule and, for this case, of the more general Wilson-Sommerfeld rule.

Figure 4-17 Illustrating standing de Broglie waves set up in the first three Bohr orbits. The locations of the nodes can, of course, be found anywhere on each orbit provided that their spacings are as shown.

There is another example of a system in which the origin of the Wilson-Sommerfeld quantization rule can be understood in terms of the requirement that the de Broglie waves associated with a particle undergoing periodic motion form a set of standing waves. Consider a particle which moves freely along the $x$ axis from $x = -a/2$ to $x = +a/2$, but which does not penetrate into the regions outside these limits. This system can be thought of as representing approximately the motion of a conduction electron in a one-dimensional piece of metal that extends from $-a/2$ to $+a/2$. The particle bounces back and forth between the ends of the region with momentum $p_x$.
The momentum changes sign at each bounce but maintains the constant magnitude $p$, so the Wilson-Sommerfeld equation reads

$$\oint p_x\,dx = p\,2a = nh$$

or

$$n\frac{h}{p} = 2a \tag{4-25}$$

But $h/p$ is just the de Broglie wavelength $\lambda$ of the particle, so we have

$$n\lambda = 2a$$

Thus an integral number of de Broglie wavelengths just fits into the distance covered by the particle in one traversal of the region, and this allows the waves associated with successive traversals to be in phase and so set up a standing wave.

We shall see in the following chapters that the properties of standing waves are equally important in the quantization conditions of Schroedinger's quantum mechanics. And the time-independent features of the standing wave associated with an electron in the ground state of an atom will make it possible to understand in a simple way the fundamental question of why the electron does not emit electromagnetic radiation and spiral into the nucleus.

4-10 SOMMERFELD'S MODEL

One of the important applications of the Wilson-Sommerfeld quantization rules is to the case of a hydrogen atom in which it was assumed that the electron could move in elliptical orbits. This was done by Sommerfeld in an attempt to explain the fine structure of the hydrogen spectrum. The fine structure is a splitting of the spectral lines, into several distinct components, which is found in all atomic spectra. It can be observed only by using equipment of very high resolution since the separation, in terms of reciprocal wavelength, between adjacent components of a single spectral line is of the order of $10^{-4}$ times the separation between adjacent lines. According to the Bohr model, this must mean that what we had thought was a single energy state of the hydrogen atom actually consists of several states which are very close together in energy.

Sommerfeld first evaluated the size and shape of the allowed elliptical orbits, as well as the total energy of an electron moving in such an orbit, using the formulas of classical mechanics.
Describing the motion in terms of the polar coordinates $r$ and $\theta$, he applied the two quantum conditions

$$\oint L\,d\theta = n_\theta h$$

$$\oint p_r\,dr = n_r h$$

The first condition yields the same restriction on the orbital angular momentum

$$L = n_\theta\hbar \qquad n_\theta = 1, 2, 3, \ldots$$

that it does for the circular orbit theory. The second condition (which was not applicable in the limiting case of purely circular orbits) leads to the following relation between $L$ and $a/b$, the ratio of the semimajor axis to the semiminor axis of the ellipse

$$L(a/b - 1) = n_r\hbar \qquad n_r = 0, 1, 2, 3, \ldots$$

By applying the condition of mechanical stability analogous to (4-14), a third equation is obtained. From these equations Sommerfeld evaluated the semimajor and semiminor axes $a$ and $b$, which give the size and shape of the elliptical orbits, and also the total energy $E$ of an electron in such an orbit. The results are

$$a = \frac{4\pi\epsilon_0 n^2\hbar^2}{\mu Ze^2} \tag{4-26a}$$

$$b = a\frac{n_\theta}{n} \tag{4-26b}$$

$$E = -\left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{\mu Z^2e^4}{2n^2\hbar^2} \tag{4-26c}$$

where $\mu$ is the reduced mass of the electron, and where the quantum number $n$ is defined by

$$n \equiv n_\theta + n_r$$

Since $n_\theta = 1, 2, 3, \ldots$ and $n_r = 0, 1, 2, 3, \ldots$, $n$ can take on the values

$$n = 1, 2, 3, 4, \ldots$$

For a given value of $n$, $n_\theta$ can assume only the values

$$n_\theta = 1, 2, 3, \ldots, n$$

The integer $n$ is called the principal quantum number, and $n_\theta$ is called the azimuthal quantum number. Equation (4-26b) shows that the shape of the orbit (the ratio of the semimajor to the semiminor axes) is determined by the ratio of $n_\theta$ to $n$. For $n_\theta = n$ the orbits are circles of radius $a$. Note that the equation giving $a$ in terms of $n$ is identical with (4-16), the equation giving the radius of the circular Bohr orbits. (Remember that (4-16) will have $m$ replaced by $\mu$ if proper account is taken of the finite nuclear mass.)

Figure 4-18 shows, to scale, the possible orbits corresponding to the first three values of the principal quantum number. Corresponding to each value of the principal quantum number $n$ there are $n$ different allowed orbits. One of these, the circular orbit, is just the orbit described by the original Bohr model. The others are elliptical. But despite the very different paths followed by an electron moving in the different possible orbits for a given $n$, (4-26c) tells us that the total energy of the electron is the same. The total energy of the electron depends only on $n$. The several orbits characterized by a common value of $n$ are said to be degenerate. The energies of different states of motion "degenerate" to the same total energy.

Figure 4-18 Some elliptical Bohr-Sommerfeld orbits. The nucleus is located at the common focus of the ellipses, indicated by the dot.

This degeneracy in the total energy of an electron, following the orbits of very different shape but common $n$, is the result of a very delicate balance between potential and kinetic energy, which is characteristic of treating the inverse square Coulomb force by the methods of classical mechanics. Exactly the same phenomenon is found in planetary or satellite motion, which is governed by the inverse square gravitational force. For instance, a satellite may be launched into any one of a whole family of elliptical orbits, all of which correspond to the same total energy and have the same semimajor axis. Of course there is effectively no quantization of the orbit parameters in these macroscopic cases, but as far as degeneracy is concerned they are completely analogous to the case of a hydrogen atom.

Sommerfeld "removed the degeneracy" in the hydrogen atom by next treating the problem relativistically. In the discussion following (4-17) we showed that, for an electron in a hydrogen atom, $v/c \simeq 10^{-2}$ or less.
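The nonrelativistic orbit parameters of (4-26a) through (4-26c) can be tabulated to make the degeneracy explicit. The sketch below is an illustration only: `sommerfeld_orbits` is a hypothetical name, and it assumes an infinitely heavy nucleus so that the Bohr radius 0.529 Å and the Bohr energy $-13.6$ eV from the table of constants can be used in place of the reduced-mass expressions.

```python
A0_ANGSTROM = 0.529   # Bohr radius in angstrom (infinitely heavy nucleus assumed)
E1_EV = -13.6         # Bohr ground-state energy in eV (same assumption)


def sommerfeld_orbits(n, z=1):
    """List (n_theta, n_r, a, b, E) for every allowed orbit with principal
    quantum number n, using (4-26a), (4-26b), and (4-26c)."""
    a = n**2 * A0_ANGSTROM / z             # semimajor axis, (4-26a)
    energy = E1_EV * z**2 / n**2           # total energy, (4-26c)
    orbits = []
    for n_theta in range(1, n + 1):
        n_r = n - n_theta                  # from n = n_theta + n_r
        b = a * n_theta / n                # semiminor axis, (4-26b)
        orbits.append((n_theta, n_r, a, b, energy))
    return orbits


for n in (1, 2, 3):
    for n_theta, n_r, a, b, energy in sommerfeld_orbits(n):
        shape = "circle" if n_theta == n else "ellipse"
        print(f"n={n} n_theta={n_theta} n_r={n_r}: a={a:.3f} A, b={b:.3f} A, "
              f"E={energy:.2f} eV ({shape})")
```

Each value of $n$ yields $n$ orbits sharing one semimajor axis and one energy, which is the degeneracy that the relativistic treatment goes on to remove.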
Thus we would expect the relativistic corrections to the total energy, due to the relativistic variation of the electron mass, which will be of the order of $(v/c)^2$, to be only of the order of $10^{-4}$; however, this is just the order of magnitude of the splitting in the energy states of hydrogen that would be needed to explain the fine structure of the hydrogen spectrum. The actual size of the correction depends on the average velocity of the electron which, in turn, depends on the ellipticity of the orbit. After a calculation which is much too tedious to reproduce here, Sommerfeld showed that the total energy of an electron in an orbit characterized by the quantum numbers $n$ and $n_\theta$ is equal to

$$E = -\frac{\mu Z^2e^4}{(4\pi\epsilon_0)^2\,2n^2\hbar^2}\left[1 + \frac{\alpha^2Z^2}{n}\left(\frac{1}{n_\theta} - \frac{3}{4n}\right)\right] \tag{4-27a}$$

The quantity $\alpha$ is a pure number called the fine structure constant. Its value is

$$\alpha \equiv \frac{1}{4\pi\epsilon_0}\frac{e^2}{\hbar c} = 7.297 \times 10^{-3} \simeq \frac{1}{137} \tag{4-27b}$$

In Figure 4-19 we represent the first few energy states of the hydrogen atom in terms of an energy-level diagram. The separation between the several levels with a common value of $n$ has been greatly exaggerated for the sake of clarity. Arrows indicate transitions between the various energy states which produce the lines of the atomic spectrum. Lines corresponding to the transitions represented by the solid arrows are observed in the hydrogen spectrum. The wavelengths of these lines are in very good agreement with the predictions derived from (4-27a). However, the lines corresponding to the transitions represented by dashed arrows in Figure 4-19 are not found in the spectrum. The transitions concerned do not take place. Inspection of the figure will demonstrate that transitions only occur if

$$n_{\theta_i} - n_{\theta_f} = \pm 1 \tag{4-28}$$

(Levels labeled in the diagram: $n = 1$, $n_\theta = 1$; $n = 2$, $n_\theta = 1, 2$; $n = 3$, $n_\theta = 1, 2, 3$; $n = 4$.)

Figure 4-19 The fine-structure splitting of some energy levels of the hydrogen atom. The splitting is greatly exaggerated.
Transitions which produce observed lines of the hydrogen spectrum are indicated by solid arrows.

This is called a selection rule. It selects from all the transitions those that actually occur.

4-11 THE CORRESPONDENCE PRINCIPLE

A justification of selection rules could sometimes be found with the aid of an auxiliary postulate known as the correspondence principle. This principle, enunciated by Bohr in 1923, consists of two parts:

1. The predictions of the quantum theory for the behavior of any physical system must correspond to the prediction of classical physics in the limit in which the quantum numbers specifying the state of the system become very large.

2. A selection rule holds true over the entire range of the quantum number concerned. Thus any selection rules which are necessary to obtain the required correspondence in the classical limit (large $n$) also apply in the quantum limit (small $n$).

Concerning the first part, it is obvious that the quantum theory must correspond to the classical theory in the limit in which the system behaves classically. The only question is: Where is the classical limit? Bohr's assumption is that the classical limit is always to be found in the limit of large quantum numbers. In making this assumption he was guided by certain evidence available at the time. For instance, the classical Rayleigh-Jeans theory of the blackbody spectrum agrees with experiment in the limit of small $\nu$. Since Planck's quantum theory agrees with experiment everywhere, we see that correspondence between the quantum and classical theories is found, in this case, in the limit of small $\nu$. But it is easy to see that as $\nu$ becomes small the average value $\bar{n}$ of the quantum number specifying the energy state of blackbody electromagnetic waves of frequency $\nu$ will become large. (Since $\mathcal{E} = nh\nu$, we have $\bar{\mathcal{E}} = \bar{n}h\nu$. But as $\nu \to 0$, $\bar{\mathcal{E}} \to kT$, so in this limit $\bar{n}h\nu = kT$, which is a constant.
Thus $\bar{n} \to \infty$ as $\nu \to 0$ in the classical limit. Note also that if we fix $\nu$ in the relation $\bar{n}h\nu = kT = \text{const}$, and take $h \to 0$ as we frequently have in considering the classical limit, we again find $\bar{n} \to \infty$ in that limit.) The second part of the correspondence principle was purely an assumption, but certainly a reasonable one.

Let us illustrate the correspondence principle by applying it to a simple harmonic oscillator, such as a pendulum oscillating at frequency $\nu$. One prediction of quantum theory for this system is that the allowed energy states are given by $E = nh\nu$. In the discussion in Chapter 1, we saw that, in the limit of large $n$, this prediction is not in disagreement with what we actually know about the energy states of a classical pendulum. In this case of a simple harmonic oscillator, the quantum and classical theories do correspond for $n \to \infty$ insofar as the energy states are concerned. Next assume that the pendulum bob carries an electric charge, so that we can compare the predictions of the two theories concerning the emission and absorption of electromagnetic radiation by such a system. Classically the system would emit radiation due to the accelerated motion of the charge, and the frequency of the emitted radiation would be exactly $\nu$. According to the quantum physics, radiation is emitted as a result of the system making a transition from quantum state $n_i$ to quantum state $n_f$. The energy emitted in such a transition is equal to $E_i - E_f = (n_i - n_f)h\nu$. This energy is carried away by a photon of frequency $(E_i - E_f)/h = (n_i - n_f)\nu$. Thus, in order to obtain correspondence between the classical and quantum predictions of the frequency of the emitted radiation, we must require that the selection rule $n_i - n_f = 1$ be valid in the classical limit of large $n$. A similar argument concerning the absorption of radiation by the charged pendulum shows that in the classical limit there is also the possibility of a transition in which $n_i - n_f = -1$.
The validity of these selection rules in the quantum limit of small $n$ can be tested by investigating the spectrum of radiation emitted by a vibrating diatomic molecule. The vibrational energy states for such a system are just those of a simple harmonic oscillator, since the force which leads to the equilibrium separation of the two atoms has the same form as a harmonic restoring force. From the vibrational spectrum it can be determined that the selection rule $n_i - n_f = \pm 1$ actually is in operation in the limit of small quantum numbers, in agreement with the second part of the correspondence principle. A number of other selection rules were discovered empirically in the analysis of atomic and molecular spectra. Sometimes, but not always, it was possible to understand these selection rules in terms of a correspondence principle argument.

Table 4-2 The Correspondence Principle for Hydrogen

n         ν0 (sec^-1)       ν (sec^-1)        % Difference
5         5.26 × 10^13      7.38 × 10^13      29
10        6.57 × 10^12      7.72 × 10^12      14
100       6.578 × 10^9      6.677 × 10^9      1.5
1,000     6.5779 × 10^6     6.5878 × 10^6     0.15
10,000    6.5779 × 10^3     6.5789 × 10^3     0.015

Example 4-11. Apply the correspondence principle to hydrogen atom radiation in the classical limit.

The frequency of revolution $\nu_0$ of an electron in a Bohr orbit follows from (4-16) and (4-17) and is given by

$$\nu_0 = \frac{v}{2\pi r} = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3}\,\frac{2}{n^3}$$

According to classical physics the frequency of the light emitted in such a case is equal to $\nu_0$, the frequency of revolution. Quantum physics predicts that the frequency $\nu$ of the emitted light is, from (4-19)

$$\nu = \frac{c}{\lambda} = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3}\left[\frac{1}{n_f^2} - \frac{1}{n_i^2}\right]$$

But, if this is to agree with $\nu_0$, we must have $n_i - n_f = 1$ as a selection rule for large quantum numbers. To see this, take $n_i - n_f = 1$ and obtain

$$\nu = c\kappa = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3}\left[\frac{1}{(n-1)^2} - \frac{1}{n^2}\right] = \left(\frac{1}{4\pi\epsilon_0}\right)^2\frac{me^4}{4\pi\hbar^3}\left[\frac{2n-1}{(n-1)^2n^2}\right]$$

where $n_i = n$ and $n_f = n - 1$.
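The two frequencies just written down can be compared numerically, which is how the entries of Table 4-2 arise. The sketch below uses hypothetical function names and assumes the value $3.2898 \times 10^{15}\ \text{sec}^{-1}$ for the common prefactor $(1/4\pi\epsilon_0)^2 me^4/4\pi\hbar^3$ (that is, $R_\infty c$), a number not quoted in the text; the percent difference does not depend on that choice. Computed values may differ from the rounded table entries in the last digit.

```python
RYDBERG_FREQUENCY = 3.2898e15   # assumed value of R_inf * c, in sec^-1


def classical_frequency(n):
    """Frequency of revolution nu_0 in the n-th Bohr orbit."""
    return 2.0 * RYDBERG_FREQUENCY / n**3


def quantum_frequency(n_i, n_f):
    """Frequency of the photon emitted in the transition n_i -> n_f."""
    return RYDBERG_FREQUENCY * (1.0 / n_f**2 - 1.0 / n_i**2)


def percent_difference(n):
    """Percent by which nu (for n -> n - 1) exceeds nu_0, relative to nu."""
    nu_0 = classical_frequency(n)
    nu = quantum_frequency(n, n - 1)
    return 100.0 * (nu - nu_0) / nu


for n in (5, 10, 100, 1000, 10000):
    print(f"n = {n:>6}: nu_0 = {classical_frequency(n):.4e}, "
          f"nu = {quantum_frequency(n, n - 1):.4e}, "
          f"difference = {percent_difference(n):.3f} %")
```

Replacing `n - 1` by `n - 2` in `quantum_frequency` gives a frequency that tends to $2\nu_0$ rather than $\nu_0$ at large $n$, which is why only $n_i - n_f = 1$ reproduces the classical frequency of revolution.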
Then as $n \to \infty$ the expression in the square brackets above approaches $2/n^3$, so that $\nu \to \nu_0$ as $n \to \infty$. In Table 4-2 we illustrate the correspondence for large $n$.

It is instructive to note that although both parts of the correspondence principle lead to agreement with experiment for the simple harmonic oscillator, only the first part agrees with experiment in the hydrogen atom considered in the preceding example. For experiment shows that the selection rule $n_i - n_f = 1$, which was necessary to satisfy the first part of the principle for large $n$, does not apply to the hydrogen atom for small $n$. Transitions are observed to occur between states of low $n$, in which the quantum numbers differ in value by more than one. This illustrates the fact that the old quantum theory cannot always be made to agree with experiment, however it is patched up.

4-12 A CRITIQUE OF THE OLD QUANTUM THEORY

In the past four chapters we have discussed some of the developments which led to modern quantum mechanics. These developments are now referred to as the old quantum theory. In many respects this theory was very successful, even more so than may be apparent to the student because we have not mentioned a number of successful applications of the old quantum theory to phenomena, such as the heat capacity of solids at low temperature, which were inexplicable in terms of the classical theories. However, the old quantum theory certainly was not free of criticism. To complete our discussion of this theory we must indicate some of its undesirable aspects:

1. The theory only tells us how to treat systems which are periodic, by using the Wilson-Sommerfeld quantization rules, but there are many systems of physical interest which are not periodic. And the number of periodic systems for which a physical basis of these rules can be found in the de Broglie relation is very small.
2. Although the theory does tell us how to calculate the energies of the allowed states of certain systems, and the frequency of the photons emitted or absorbed when a system makes a transition between allowed states, it does not tell us how to calculate the rate at which such transitions take place. For example, it does not tell us how to calculate the intensities of spectral lines. And we have seen that the theory cannot always tell us even which transitions actually are observed to occur and which are not.
3. When applied to atoms, the theory is really only successful for one-electron atoms. The alkali elements (Li, Na, K, Rb, Cs) can be treated approximately, but only because they are in many respects similar to a one-electron atom. The theory fails badly even when applied to the neutral He atom, which contains only two electrons.
4. Finally we might mention the subjective criticism that the entire theory seems somehow to lack coherence, to be intellectually unsatisfying.

That some of these objections are really of a very fundamental nature was realized by everyone concerned, and much effort was expended in attempts to develop a quantum theory which would be free of these and other objections. The effort was well rewarded. In 1925 Erwin Schroedinger developed his theory of quantum mechanics. Although it is a generalization of the de Broglie postulate, the Schroedinger theory is in some respects very different from the old quantum theory. For instance, the picture of atomic structure provided by quantum mechanics is the antithesis of the picture, used in the old quantum theory, of electrons moving in well-defined orbits. Nevertheless, the old quantum theory is still frequently employed as a first approximation to the more accurate description of quantum phenomena provided by quantum mechanics. The reasons are that the old quantum theory is often capable of giving numerically correct results with mathematical procedures which are considerably less complicated than those used in quantum mechanics, and that the old quantum theory is often helpful in visualizing processes which are difficult to visualize in terms of the rather abstract language of quantum mechanics.

QUESTIONS

1. In a collision between an α particle and an electron, what general considerations limit the momentum transfer? Does the fact that the force is Coulombic play any role in this respect?
2. How does the Thomson atom differ from a random distribution of protons and electrons in a spherical region?
3. List objections to the Thomson model of the atom.
4. Why do we specify that the foil be thin in experiments intended to check the Rutherford scattering formula?
5. The scattering of α particles at very small angles disagrees with the Rutherford formula for such angles. Explain.
6. How does the deduction of (4-3), which gives the trajectory of a particle moving under the influence of a repulsive inverse square Coulomb force, differ from the deduction of the trajectory of a planet moving under the influence of the gravitational field of the sun?
7. Could a differential scattering cross section, defined as in (4-8), be used to describe very small angle α-particle scattering?
8. Did Bohr postulate the quantization of energy? What did he postulate?
9. For the Bohr hydrogen atom orbits, the potential energy is negative and greater in magnitude than the kinetic energy. What does this imply?
10. If only lines in the absorption spectrum of hydrogen need to be calculated, how would you modify (4-19) to obtain them?
11. On emitting a photon, the hydrogen atom recoils to conserve momentum. Explain the fact that the energy of the emitted photon is less than the energy difference between the energy levels involved in the emission process.
12. Can a hydrogen atom absorb a photon whose energy exceeds its binding energy, 13.6 eV?
13. Is it possible to get a continuous emission spectrum from hydrogen?
14. What minimum energy must a photon have to initiate the photoelectric effect in hydrogen gas? (Careful!)
15. Would you expect to observe all the lines of atomic hydrogen if such a gas were excited by electrons of energy 13.6 eV? Explain.
16. Assume that electron-positron annihilation takes place from the ground state of positronium. How, if at all, does this alter the γ-ray energies of the two-photon decay calculated in Chapter 2 by ignoring the bound system?
17. Is the ionization energy of deuterium different from that of hydrogen? Explain.
18. Why is the structure of the Franck-Hertz current versus voltage curve, Figure 4-14, not sharp?
19. Is the peak in Figure 4-14 just below 10 eV due to two consecutive excitations of the first excited state of mercury or to one excitation of the second excited state?
20. What examples of degeneracy in classical physics, other than planetary motion, can you think of?
21. The fine-structure constant α is dimensionless and relates e, h, and c, three of the fundamental constants of physics. Is any other combination of these constants dimensionless (other than powers of the same combination, of course)?
22. How can the correspondence principle be applied to the phase diagram of a linear oscillator, Figure 4-16?
23. According to classical mechanics, an electron moving in an atom should be able to do so with any angular momentum whatever. According to Bohr's theory of the hydrogen atom, however, the angular momentum is quantized to $L = nh/2\pi$. Can the correspondence principle reconcile these two statements?

PROBLEMS

1.
Show, for a Thomson atom, that an electron moving in a stable circular orbit rotates with the same frequency at which it would oscillate in an oscillation through the center along a diameter.
2. What radius must the Thomson model of a one-electron atom have if it is to radiate a spectral line of wavelength $\lambda = 6000$ Å? Comment on your results.
3. Assume that the density of positive charge in any Thomson atom is the same as for the hydrogen atom. Find the radius $R$ of a Thomson atom of atomic number $Z$ in terms of the radius $R_H$ of the hydrogen atom.
4. (a) An α particle of initial velocity $v$ collides with a free electron at rest. Show that, assuming the mass of the α particle to be about 7400 electronic masses, the maximum deflection of the α particle is about $10^{-4}$ rad. (b) Show that the maximum deflection of an α particle that interacts with the positive charge of a Thomson atom of radius 1.0 Å is also about $10^{-4}$ rad. Hence, argue that $\theta < 10^{-4}$ rad for the scattering of an α particle by a Thomson atom.
5. Derive (4-5) relating the distance of closest approach and the impact parameter to the scattering angle.
6. A 5.30 MeV α particle is scattered through 60° in passing through a thin gold foil. Calculate (a) the distance of closest approach, $D$, for a head-on collision and (b) the impact parameter, $b$, corresponding to the 60° scattering.
7. What is the distance of closest approach of a 5.30 MeV α particle to a copper nucleus in a head-on collision?
8. Show that the number of α particles scattered by an angle $\Theta$ or greater in Rutherford scattering is

$$N(\Theta) = \left(\frac{1}{4\pi\epsilon_0}\right)^2 \pi I \rho t \left(\frac{zZe^2}{Mv^2}\right)^2 \cot^2(\Theta/2)$$

9. The fraction of 6.0 MeV protons scattered by a thin gold foil, of density 19.3 g/cm$^3$, from the incident beam into a region where scattering angles exceed 60° is equal to $2.0 \times 10^{-5}$. Calculate the thickness of the gold foil, using results of the previous problem.
10. A beam of α particles, of kinetic energy 5.30 MeV and intensity $10^4$ particles/sec, is incident normally on a gold foil of density 19.3 g/cm$^3$, atomic weight 197, and thickness $1.0 \times 10^{-5}$ cm. An α particle counter of area 1.0 cm$^2$ is placed at a distance 10 cm from the foil. If $\Theta$ is the angle between the incident beam and a line from the center of the foil to the center of the counter, use the Rutherford scattering differential cross section, (4-9), to find the number of counts per hour for $\Theta = 10°$ and for $\Theta = 45°$. The atomic number of gold is 79.
11. In the previous problem, a copper foil of density 8.9 g/cm$^3$, atomic weight 63.6 and thickness $1.0 \times 10^{-5}$ cm is used instead of gold. When $\Theta = 10°$ we get 820 counts per hour. Find the atomic number of copper.
12. Prove that Planck's constant has the dimensions of angular momentum.
13. The angular momentum of the electron in a hydrogen-like atom is $7.382 \times 10^{-34}$ joule-sec. What is the quantum number of the level occupied by the electron?
14. Compare the gravitational attraction of an electron and proton in the ground state of a hydrogen atom to the Coulomb attraction. Are we justified in ignoring the gravitational force?
15. Show that the frequency of revolution of the electron in the Bohr model hydrogen atom is given by $\nu = 2|E|/hn$ where $E$ is the total energy of the electron.
16. Show that for all Bohr orbits the ratio of the magnetic dipole moment of the electronic orbit to its orbital angular momentum has the same value.
17. (a) Show that in the ground state of the hydrogen atom the speed of the electron can be written as $v = \alpha c$ where $\alpha$ is the fine-structure constant. (b) From the value of $\alpha$ what can you conclude about the neglect of relativistic effects in the Bohr calculations?
18. Calculate the speed of the proton in a ground state hydrogen atom.
19. What is the energy, momentum, and wavelength of a photon that is emitted by a hydrogen atom making a direct transition from an excited state with $n = 10$ to the ground state? Find the recoil speed of the hydrogen atom in this process.
20. (a) Using Bohr's formula, calculate the three longest wavelengths in the Balmer series. (b) Between what wavelength limits does the Balmer series lie?
21. Calculate the shortest wavelength of the Lyman series lines in hydrogen. Of the Paschen series. Of the Pfund series. In what region of the electromagnetic spectrum does each lie?
22. (a) Using Balmer's generalized formula, show that a hydrogen series identified by the integer $m$ of the lowest level occupies a frequency interval range given by $\Delta\nu = cR_H/(m + 1)^2$. (b) What is the ratio of the range of the Lyman series to that of the Pfund series?
23. In the ground state of the hydrogen atom, according to Bohr's model, what are (a) the quantum number, (b) the orbit radius, (c) the angular momentum, (d) the linear momentum, (e) the angular velocity, (f) the linear speed, (g) the force on the electron, (h) the acceleration of the electron, (i) the kinetic energy, (j) the potential energy, and (k) the total energy? How do the quantities (b) and (k) vary with the quantum number?
24. How much energy is required to remove an electron from a hydrogen atom in a state with $n = 8$?
25. A photon ionizes a hydrogen atom from the ground state. The liberated electron recombines with a proton into the first excited state, emitting a 466 Å photon. What are (a) the energy of the free electron and (b) the energy of the original photon?
26. A hydrogen atom is excited from a state with $n = 1$ to one with $n = 4$. (a) Calculate the energy that must be absorbed by the atom. (b) Calculate and display on an energy-level diagram the different photon energies that may be emitted if the atom returns to its $n = 1$ state. (c) Calculate the recoil speed of the hydrogen atom, assumed initially at rest, if it makes the transition from $n = 4$ to $n = 1$ in a single quantum jump.
27.
A hydrogen atom in a state having a binding energy (this is the energy required to remove an electron) of 0.85 eV makes a transition to a state with an excitation energy (this is the difference in energy between the state and the ground state) of 10.2 eV. (a) Find the energy of the emitted photon. (b) Show this transition on an energy-level diagram for hydrogen, labeling the appropriate quantum numbers.
28. Show on an energy-level diagram for hydrogen the quantum numbers corresponding to a transition in which the wavelength of the emitted photon is 1216 Å.
29. (a) Show that when the recoil kinetic energy of the atom, $p^2/2M$, is taken into account the frequency of a photon emitted in a transition between two atomic levels of energy difference $\Delta E$ is reduced by a factor which is approximately $(1 - \Delta E/2Mc^2)$. (Hint: The recoil momentum is $p = h\nu/c$.) (b) Compare the wavelength of the light emitted from a hydrogen atom in the $3 \to 1$ transition when the recoil is taken into account to the wavelength without accounting for recoil.
30. What is the wavelength of the most energetic photon that can be emitted from a muonic atom with $Z = 1$?
31. A hydrogen atom in the ground state absorbs a 20.0 eV photon. What is the speed of the liberated electron?
32. Apply Bohr's model to singly ionized helium, that is, to a helium atom with one electron removed. What relationships exist between this spectrum and the hydrogen spectrum?
33. Using Bohr's model, calculate the energy required to remove the electron from singly ionized helium.
34. An electron traveling at 1.2 x 10' m/sec combines with an alpha particle to form a singly ionized helium atom. If the electron combined directly into the ground level, find the wavelength of the single photon emitted.
35. A 3.00 eV electron is captured by a bare nucleus of helium. If a 2400 Å photon is emitted, into what level was the electron captured?
36.
In a Franck-Hertz type of experiment atomic hydrogen is bombarded with electrons, and excitation potentials are found at 10.21 V and 12.10 V. (a) Explain the observation that three different lines of spectral emission accompany these excitations. (Hint: Draw an energy-level diagram.) (b) Now assume that the energy differences can be expressed as $h\nu$ and find the three allowed values of $\nu$. (c) Assume that $\nu$ is the frequency of the emitted radiation and determine the wavelengths of the observed spectral lines.
37. Assume, in the Franck-Hertz experiment, that the electromagnetic energy emitted by an Hg atom, in giving up the energy absorbed from 4.9 eV electrons, equals $h\nu$, where $\nu$ is the frequency corresponding to the 2536 Å mercury resonance line. Calculate the value of $h$ according to the Franck-Hertz experiment and compare with Planck's value.
38. Radiation from a helium ion He$^+$ is nearly equal in wavelength to the H$_\alpha$ line (the first line of the Balmer series). (a) Between what states (values of $n$) does the transition in the helium ion occur? (b) Is the wavelength greater or smaller than that of the H$_\alpha$ line? (c) Compute the wavelength difference.
39. In stars the Pickering series is found in the He$^+$ spectrum. It is emitted when the electron in He$^+$ jumps from higher levels into the level with $n = 4$. (a) State the exact formula for the wavelength of lines belonging to this series. (b) In what region of the spectrum is the series? (c) Find the wavelength of the series limit. (d) Find the ionization potential, if He$^+$ is in the ground state, in electron volts.
40. Assuming that an amount of hydrogen of mass number three (tritium) sufficient for spectroscopic examination can be put into a tube containing ordinary hydrogen, determine the separation from the normal hydrogen line of the first line of the Balmer series that should be observed. Express the result as a difference in wavelength.
41. A gas discharge tube contains H$^1$, H$^2$, He$^3$, He$^4$, Li$^6$, and Li$^7$ ions and atoms (the superscript is the atomic mass), with the last four ionized so as to have only one electron. (a) As the potential across the tube is raised from zero, which spectral line should appear first? (b) Give, in order of increasing frequency, the origin of the lines corresponding to the first line of the Lyman series of H$^1$.
42. Consider a body rotating freely about a fixed axis. Apply the Wilson-Sommerfeld quantization rules, and show that the possible values of the total energy are predicted to be

$$E = \frac{\hbar^2 n^2}{2I} \qquad n = 0, 1, 2, 3, \ldots$$

where $I$ is its rotational inertia, or moment of inertia, about the axis of rotation.
43. Assume the angular momentum of the earth of mass $6.0 \times 10^{24}$ kg due to its motion around the sun at radius $1.5 \times 10^{11}$ m to be quantized according to Bohr's relation $L = nh/2\pi$. What is the value of the quantum number $n$? Could such quantization be detected?

5 SCHROEDINGER'S THEORY OF QUANTUM MECHANICS

5-1 INTRODUCTION
role of Schroedinger theory; limitations of de Broglie postulate; need for differential wave equation

5-2 PLAUSIBILITY ARGUMENT LEADING TO SCHROEDINGER'S EQUATION
required consistency with de Broglie postulate and classical energy equation; required linearity; assumed sinusoidal solution for free particle; failure of real solution; success of complex solution; postulated generality; relation to Dirac theory; simple harmonic oscillator wave function

5-3 BORN'S INTERPRETATION OF WAVE FUNCTIONS
complex character of wave functions; wave functions as computational devices; probability density; Born's postulate; quantum and classical simple harmonic oscillator probability densities; normalization; statistical predictions of quantum mechanics

5-4 EXPECTATION VALUES
repeated measurements and position expectation value; simple harmonic oscillator position expectation value; momentum expectation value; differential operators; operator equations; variable-operator associations; general prescription for expectation values; particle in a box

5-5 THE TIME-INDEPENDENT SCHROEDINGER EQUATION
separation of variables; time dependence of wave functions; discussion of time-independent equation; eigenfunctions; plausibility argument for time-independent equation

5-6 REQUIRED PROPERTIES OF EIGENFUNCTIONS
finiteness, single valuedness, and continuity of acceptable solutions and their first derivatives; justification

5-7 ENERGY QUANTIZATION IN THE SCHROEDINGER THEORY
geometrical properties of differential equation solutions; curvature; difficulty with finiteness of time-independent Schroedinger equation solutions; discrete total energies for bound solutions; continuum for unbound solutions; qualitative forms of simple harmonic oscillator eigenfunctions

5-8 SUMMARY
eigenvalues, eigenfunctions, wave functions, quantum numbers, and quantum states; general solution to Schroedinger equation; static or oscillating probability densities and radiation emission by atoms

QUESTIONS

PROBLEMS

5-1 INTRODUCTION

We have presented experimental evidence which shows conclusively that the particles of microscopic systems move according to the laws of some form of wave motion, and not according to the Newtonian laws of motion obeyed by the particles of macroscopic systems. Thus a microscopic particle acts as if certain aspects of its behavior are governed by the behavior of an associated de Broglie wave, or wave function. The experiments considered dealt only with simple cases (such as free particles, or simple harmonic oscillators, etc.) that can be analyzed with simple procedures (involving direct applications of the de Broglie postulate, Planck's postulate, etc.). But we certainly want to be able to treat the more complicated cases that occur in nature because they are interesting and important. To be able to do this we must have a more general procedure that can be used to treat the behavior of the particles of any microscopic system. Schroedinger's theory of quantum mechanics provides us with such a procedure. The theory specifies the laws of wave motion that the particles of any microscopic system obey.
This is done by specifying, for each system, the equation that controls the behavior of the wave function, and also by specifying the connection between the behavior of the wave function and the behavior of the particle. The theory is an extension of the de Broglie postulate. Furthermore, there is a close relation between it and Newton's theory of the motion of particles in macroscopic systems. Schroedinger's theory is a generalization that includes Newton's theory as a special case (in the macroscopic limit), much as Einstein's theory of relativity is a generalization that includes Newton's theory as a special case (in the low velocity limit). We shall develop the essential points of the Schroedinger theory and use them to treat a number of important microscopic systems. For instance, we shall use the theory to obtain a detailed understanding of the properties of atoms. These properties form the basis of much of chemistry and solid state physics, and they are closely related to the properties of nuclei. After we have applied Schroedinger's theory to a number of cases, the student should find that he is beginning to develop an intuition concerning the behavior of quantum mechanical systems, just as he has developed an intuitive feeling for classical systems from his study of Newton's theory and its applications to a number of cases. Actually, a better comparison can be made between the Schroedinger theory and Maxwell's theory of electromagnetism. The reason for this is that electromagnetic waves behave in a manner which is very analogous to the behavior of the wave functions of the Schroedinger theory. We shall use this analogy, when appropriate, to show how quantum mechanical results are related to results that are familiar from the study of electromagnetism, or of other forms of classical wave motion. 
We shall also discuss many experiments which directly confirm the quantum mechanical results that we obtain, just as we have discussed many experiments which set the stage for the theory. But the student will have to exercise a little patience because there is much to be done in developing the theory, and in working out its consequences, before we can make many comparisons between these consequences and experiment.

Now, we have seen that de Broglie's postulate provides a fundamental step in the development of Schroedinger's general theory of the behavior of microscopic particles. However, it is only a step. The postulate says the motion of a microscopic particle is governed by the propagation of an associated wave, but the postulate does not tell us how the wave propagates. The postulate does predict successfully the wavelength of the wave inferred from measurements of the diffraction pattern observed in the motion of the particle, but only in cases in which the wavelength is essentially constant. Furthermore, we must have a quantitative relation between the properties of the particle and the properties of the wave function that describes the wave. That is, we must know exactly how the wave governs the particle.

In this chapter we shall first study the equation, developed by Erwin Schroedinger in 1925, which tells us the behavior of any wave function of interest. Then we shall study the relation, developed by Max Born in the following year, which connects the behavior of the wave function to the behavior of the associated particle.
Detailed solutions of the Schroedinger equation are deferred to the following chapters, but in this chapter we shall look at its solutions in a general way, and we shall see how they lead very naturally to the quantization of energy and other important phenomena.

We can appreciate some of the problems concerning the applicability of the de Broglie postulate, and also get some clues about what will have to be done to remove the problems, by considering again the case of a free particle. In this case we have been successful in doing much with the postulate. When, in Chapter 3, it was necessary to have a mathematical expression for a wave function, we used a simple sinusoidal traveling wave, such as

$$\Psi(x,t) = \sin 2\pi\left(\frac{x}{\lambda} - \nu t\right) \qquad \text{(5-1)}$$

or else a wave function formed by adding several simple sinusoidals. The form in (5-1) was obtained essentially by guessing, with the guess being based on the fact that a free particle has a linear momentum $p$ of constant magnitude, since it is not acted on by a force, and therefore it has an associated de Broglie wavelength $\lambda = h/p$ of constant magnitude. Equation (5-1) is just the familiar form for a sinusoidal traveling wave of constant wavelength $\lambda$. It also has a constant frequency $\nu$, which we evaluated from the Einstein relation $\nu = E/h$, where $E$ is the total energy of the associated particle. In Chapter 4 we were able to extend the use of a wave function like (5-1) to the case of a particle moving in a circular Bohr orbit by imagining such a sinusoidal wrapped around the orbit. But this was possible only because in a circular orbit the magnitude $p$ of the linear momentum remains constant so that $\lambda = h/p$, the de Broglie wavelength, is also constant, even though the particle is acted on by a force. We shall not be able to make such simple extensions to treat cases where the linear momentum of the particle is of changing magnitude, and, of course, these cases are typical of what happens when a particle is acted on by a force.
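The free-particle relations $\lambda = h/p$ and $\nu = E/h$ that fix the sinusoidal wave (5-1) can be evaluated for a concrete case. The following is a minimal sketch under stated assumptions: the particle is nonrelativistic with zero potential energy, so that $E = p^2/2m$; the function names are hypothetical; and the constants are the rounded values from the table of useful constants.

```python
import math

H = 6.626e-34    # Planck's constant, joule-sec
M_E = 9.109e-31  # electron rest mass, kg
EV = 1.602e-19   # joules per eV


def free_particle_wave(energy_ev, mass=M_E):
    """Return (wavelength in m, frequency in 1/sec) of a free particle.

    Assumption: nonrelativistic particle with V = 0, so E = p^2 / 2m.
    """
    energy = energy_ev * EV                  # total energy E in joules
    momentum = math.sqrt(2.0 * mass * energy)
    wavelength = H / momentum                # de Broglie relation
    frequency = energy / H                   # Einstein relation
    return wavelength, frequency


def psi(x, t, wavelength, frequency):
    """Sinusoidal traveling wave of the form (5-1)."""
    return math.sin(2.0 * math.pi * (x / wavelength - frequency * t))


if __name__ == "__main__":
    lam, nu = free_particle_wave(100.0)      # a 100 eV electron
    print("wavelength (m):", lam)
    print("frequency (1/sec):", nu)
    print("psi at x = lam/4, t = 0:", psi(lam / 4.0, 0.0, lam, nu))
```

Because $p$ is constant for the free particle, a single pair $(\lambda, \nu)$ describes the wave everywhere, which is exactly what fails once a force makes $p$ vary with position.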
The point is that the de Broglie postulate, $\lambda = h/p$, says the wavelength $\lambda$ will change if $p$ changes; but a wavelength is not even well defined if it changes very rapidly. We illustrate this with the nonsinusoidal wave shown in Figure 5-1. For this wave it is difficult to define even a variable wavelength since the separation between adjacent maxima is not equal to the separation between adjacent minima. To put the point another way, if the linear momentum of a particle is not of constant magnitude because the particle is acted on by a force, functions which are more complicated than the sinusoidal of (5-1) are required to describe the associated wave. We shall need help to find these more complicated wave functions.

Figure 5-1 A non-sinusoidal wave $\Psi(x,t)$, plotted against $x$ at fixed $t$. Inspection will show that the separation between an adjacent pair of maxima differs from that between the closest adjacent pair of minima. Therefore it is difficult to define a wavelength even for a single oscillation.

The Schroedinger equation will provide the required assistance. This is the equation that tells us the form of the wave function $\Psi(x,t)$, if we tell it about the force acting on the associated particle by specifying the potential energy corresponding to the force. In other words, the wave function is a solution to the Schroedinger equation for that potential energy. The most common type of equation which has a function for a solution is a differential equation. In fact, the Schroedinger equation is a differential equation. That is, the equation is a relation between its solution $\Psi(x,t)$ and certain derivatives of $\Psi(x,t)$ with respect to the independent space and time variables $x$ and $t$. As there is more than one independent variable, these must be partial derivatives, such as

$$\frac{\partial \Psi(x,t)}{\partial x} \quad \text{or} \quad \frac{\partial \Psi(x,t)}{\partial t} \quad \text{or} \quad \frac{\partial^2 \Psi(x,t)}{\partial x^2} \quad \text{or} \quad \frac{\partial^2 \Psi(x,t)}{\partial t^2} \qquad \text{(5-2)}$$

Example 5-1. Evaluate the partial derivatives listed above of the sinusoidal function, (5-1).
A partial derivative is a derivative of a function of several independent variables, which is evaluated by allowing one of the variables to vary, while holding all the others temporarily fixed. This is indicated by using a symbol such as $\partial\Psi(x,t)/\partial x$ instead of the usual symbol for the ordinary derivative $d\Psi(x,t)/dx$. The symbol means, for instance,

$$\frac{\partial \Psi(x,t)}{\partial x} \equiv \left[\frac{d\Psi(x,t)}{dx}\right]_{\text{evaluated by treating } t \text{ as a constant}} \qquad \text{(5-3)}$$

or

$$\frac{\partial \Psi(x,t)}{\partial t} \equiv \left[\frac{d\Psi(x,t)}{dt}\right]_{\text{evaluated by treating } x \text{ as a constant}}$$

Before applying this procedure on the sinusoidal function of (5-1), it is convenient to rewrite it in terms of the quantities $k = 2\pi/\lambda$ and $\omega = 2\pi\nu$. We obtain

$$\Psi(x,t) = \sin 2\pi\left(\frac{x}{\lambda} - \nu t\right) = \sin(kx - \omega t)$$

The partial differentiations then yield

$$\frac{\partial \Psi(x,t)}{\partial x} = \frac{\partial \sin(kx - \omega t)}{\partial x} = k\cos(kx - \omega t)$$

$$\frac{\partial^2 \Psi(x,t)}{\partial x^2} = k\,\frac{\partial \cos(kx - \omega t)}{\partial x} = -k^2 \sin(kx - \omega t)$$

$$\frac{\partial \Psi(x,t)}{\partial t} = \frac{\partial \sin(kx - \omega t)}{\partial t} = -\omega\cos(kx - \omega t)$$

$$\frac{\partial^2 \Psi(x,t)}{\partial t^2} = -\omega\,\frac{\partial \cos(kx - \omega t)}{\partial t} = -\omega^2 \sin(kx - \omega t) \qquad \text{(5-5)}$$

since $t$ can be treated as a constant in the first two differentiations, whereas $x$ can be treated as a constant in the last two. These results will prove to be useful shortly.

The Schroedinger equation is a partial differential equation. We shall, in due course, study solutions of this equation, and we shall see that it is generally quite easy to decompose it into a set of ordinary differential equations (i.e., differential equations involving only ordinary derivatives). These ordinary differential equations will then be handled by the application of straightforward techniques. In all this work we shall assume no previous knowledge about differential equations of any type on the part of the student. We shall assume only that he knows how to differentiate and integrate. Of course, the student very probably has had some experience with ordinary differential equations in connection with his study of classical mechanics.
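The four partial derivatives obtained in Example 5-1 can be checked numerically by varying one variable while holding the other fixed, which is exactly the definition of a partial derivative. The following is a minimal sketch using central finite differences; the helper names, step sizes, and sample values of $k$, $\omega$, $x$, and $t$ are illustrative assumptions, not part of the original example.

```python
import math


def make_psi(k, w):
    """Return the sinusoidal wave function psi(x, t) = sin(k x - w t)."""
    return lambda x, t: math.sin(k * x - w * t)


def d_dx(f, x, t, h=1e-5):
    """First partial derivative with respect to x (t held fixed)."""
    return (f(x + h, t) - f(x - h, t)) / (2.0 * h)


def d_dt(f, x, t, h=1e-5):
    """First partial derivative with respect to t (x held fixed)."""
    return (f(x, t + h) - f(x, t - h)) / (2.0 * h)


def d2_dx2(f, x, t, h=1e-4):
    """Second partial derivative with respect to x (t held fixed)."""
    return (f(x + h, t) - 2.0 * f(x, t) + f(x - h, t)) / h**2


def d2_dt2(f, x, t, h=1e-4):
    """Second partial derivative with respect to t (x held fixed)."""
    return (f(x, t + h) - 2.0 * f(x, t) + f(x, t - h)) / h**2


if __name__ == "__main__":
    k, w, x, t = 2.0, 3.0, 0.7, 0.2   # arbitrary sample values
    psi = make_psi(k, w)
    phase = k * x - w * t
    # Each line prints the numerical estimate next to the analytic result
    print(d_dx(psi, x, t), k * math.cos(phase))
    print(d2_dx2(psi, x, t), -k**2 * math.sin(phase))
    print(d_dt(psi, x, t), -w * math.cos(phase))
    print(d2_dt2(psi, x, t), -w**2 * math.sin(phase))
```

The numerical and analytic values should agree to within the finite-difference error, including the minus signs that appear in the time derivatives.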
He has probably even had a little experience with partial differential equations because the Schroedinger equation is a member of the class of partial differential equations called wave equations, which arise in many fields of classical as well as quantum physics. Examples from the former field are the wave equation for vibrations in a stretched string and the wave equation for electromagnetic radiation. We shall see that the quantum mechanical wave equation has many properties in common with the classical wave equation, and also that it has some very interesting differences.

5-2 PLAUSIBILITY ARGUMENT LEADING TO SCHROEDINGER'S EQUATION

Now the first problem at hand is not how to solve a certain differential equation; instead, the problem is how to find the equation. That is, we are in the position of Newton when he was looking for the differential equation

$$F = \frac{dp}{dt} = m\frac{d^2x}{dt^2} \qquad \text{(5-6)}$$

which is the basic equation of classical mechanics, or of Maxwell, when he was looking for the differential equations such as

$$\frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \frac{\rho}{\epsilon_0} \qquad \text{(5-7)}$$

that form the basis of classical electromagnetism. The wave equation for a stretched string can be derived from Newton's law, and the electromagnetic wave equation can be derived from Maxwell's equations; but we cannot expect to be able to derive the quantum mechanical wave equation from any of the equations of classical physics. However, we can expect to receive some help from the de Broglie-Einstein postulates

$$\lambda = h/p \quad \text{and} \quad \nu = E/h \qquad \text{(5-8)}$$

which connect the wavelength $\lambda$ of the wave function with the linear momentum $p$ of the associated particle, and also connect the frequency $\nu$ of the wave function with the total energy $E$ of the particle, for the case of a particle with essentially constant $p$ and $E$. That is, the quantum mechanical wave equation we seek must be consistent with these postulates, and we shall use this required consistency in our search.
Equations (5-8), plus others that we shall have reason to accept, will be woven into an argument that is designed to make the quantum mechanical wave equation seem very plausible, but it must be emphasized that this plausibility argument will not constitute a derivation. In the final analysis, the quantum mechanical wave equation will be obtained by a postulate, whose justification is not that it has been deduced entirely from information already known experimentally, but that it correctly predicts results which can be verified experimentally.

We begin our plausibility argument by listing four reasonable assumptions concerning the properties of the desired quantum mechanical wave equation:

1. It must be consistent with the de Broglie-Einstein postulates, (5-8)

$$\lambda = h/p \quad\text{and}\quad \nu = E/h$$

2. It must be consistent with the equation

$$E = p^2/2m + V \tag{5-9}$$

relating the total energy E of a particle of mass m to its kinetic energy $p^2/2m$ and its potential energy V.

3. It must be linear in $\Psi(x,t)$. That is, if $\Psi_1(x,t)$ and $\Psi_2(x,t)$ are two different solutions to the equation for a given potential energy V (we shall see that partial differential equations have many solutions), then any arbitrary linear combination of these solutions, $\Psi(x,t) = c_1\Psi_1(x,t) + c_2\Psi_2(x,t)$, is also a solution. This combination is said to be linear since it involves the first (linear) power of $\Psi_1(x,t)$ and $\Psi_2(x,t)$; it is said to be arbitrary since the constants $c_1$ and $c_2$ can have any (arbitrary) values. This linearity requirement ensures that we shall be able to add together wave functions to produce the constructive and destructive interferences that are so characteristic of waves. Interference phenomena are commonplace for electromagnetic waves; all the diffraction patterns of physical optics are understood in terms of the addition of electromagnetic waves. But the Davisson-Germer experiment, and others, show that diffraction patterns are also found in the motion of electrons, and other particles. Therefore, their wave functions also exhibit interferences, and so they should be capable of being added.

4. The potential energy V is generally a function of x, and possibly even t. However, there is an important special case where

$$V(x,t) = V_0 \tag{5-10}$$

This is just the case of the free particle since the force acting on the particle is given by

$$F = -\partial V(x,t)/\partial x$$

which yields F = 0 if $V_0$ is a constant. In this case Newton's law of motion tells us that the linear momentum p of the particle will be constant, and we also know that its total energy E will be constant. We have here the situation of a free particle with constant values of $\lambda = h/p$ and $\nu = E/h$, discussed in Chapter 3. We therefore assume that, in this case, the desired differential equation will have sinusoidal traveling wave solutions of constant wavelength and frequency, similar to the sinusoidal wave function, (5-1), considered in that chapter.

Using the de Broglie-Einstein relations of assumption 1 to write the energy equation of assumption 2 in terms of $\lambda$ and $\nu$, we obtain

$$\frac{h^2}{2m\lambda^2} + V(x,t) = h\nu$$

Before proceeding, it is convenient to introduce the quantities

$$k = 2\pi/\lambda \quad\text{and}\quad \omega = 2\pi\nu \tag{5-11}$$

As in Example 5-1, they are useful because they keep variables out of denominators and because they "absorb" a factor of $2\pi$ that would otherwise appear every time we write a sinusoidal wave function. The quantity k is called the wave number; the quantity $\omega$ is called the angular frequency. Introducing them, we obtain

$$\frac{\hbar^2k^2}{2m} + V(x,t) = \hbar\omega \tag{5-12}$$

where $\hbar \equiv h/2\pi$ is Planck's constant divided by $2\pi$. To satisfy assumptions 1 and 2, the wave equation we seek must be consistent with (5-12).

In order to satisfy the linearity assumption 3, it is necessary that every term in the differential equation be linear in $\Psi(x,t)$, i.e., be proportional to the first power of $\Psi(x,t)$.
Note that any derivative of $\Psi(x,t)$ has this property. For instance, if we consider the change in the magnitude of $\partial^2\Psi(x,t)/\partial x^2$ that results if we change the magnitude of $\Psi(x,t)$, say by a factor of c, we see that the derivative increases by the same factor and thus is proportional to the first power of the function. This is true since

$$\frac{\partial^2[c\Psi(x,t)]}{\partial x^2} = c\,\frac{\partial^2\Psi(x,t)}{\partial x^2}$$

where c is any constant. In order that the differential equation itself be linear in $\Psi(x,t)$, it cannot contain any term which is independent of $\Psi(x,t)$, i.e., which is proportional to $[\Psi(x,t)]^0$, or which is proportional to $[\Psi(x,t)]^2$ or any higher power. After obtaining the equation, we shall demonstrate explicitly that it is linear in $\Psi(x,t)$, and in the process the validity of these statements will become apparent.

Now let us use assumption 4, which concerns the form of the free particle solution. As suggested by that assumption, we shall first try to write an equation containing the sinusoidal wave function, (5-1), and/or derivatives of that wave function. We have already evaluated some of the derivatives in Example 5-1. Inspecting these, we see that the effect of taking the second space derivative is to introduce a factor of $-k^2$, and the effect of taking the first time derivative is to introduce a factor of $-\omega$. Since the differential equation we seek must be consistent with (5-12), which contains a factor of $k^2$ in one term and a factor of $\omega$ in another, these facts suggest that the differential equation should contain a second space derivative of $\Psi(x,t)$ and a first time derivative of $\Psi(x,t)$. But there must also be a term containing a factor of V(x,t) because it is present in (5-12). In order to ensure linearity, this term must contain a factor of $\Psi(x,t)$. Putting all these ideas together, we try the following form for the differential equation

$$\alpha\,\frac{\partial^2\Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t) = \beta\,\frac{\partial\Psi(x,t)}{\partial t} \tag{5-13}$$

The constants $\alpha$ and $\beta$ have values which remain to be determined.
They are used to provide flexibility which, we might guess, will be needed in fitting (5-13) to the various requirements it must satisfy.

The form of (5-13) seems reasonable in general, but will it work in detail? To find out we consider the case of a constant potential, $V(x,t) = V_0$, and evaluate $\Psi(x,t)$ and its derivatives from (5-1) and (5-5). We obtain immediately

$$-\alpha\sin(kx - \omega t)\,k^2 + \sin(kx - \omega t)\,V_0 = -\beta\cos(kx - \omega t)\,\omega \tag{5-14}$$

Even though the constants $\alpha$ and $\beta$ are at our disposal, we cannot make this agree with (5-12), and thus satisfy assumptions 1 and 2, except for special combinations of the independent variables x and t for which $\sin(kx - \omega t) = \cos(kx - \omega t)$. It is true that we could obtain agreement if $\alpha$ and $\beta$ were not constants, but we reject this possibility in favor of the very much simpler one presented next.

The difficulty at hand arises because differentiation changes cosines into sines, and vice versa. This fact suggests that we try using for the free particle wave function not the single sinusoidal function of (5-1), but instead the combination

$$\Psi(x,t) = \cos(kx - \omega t) + \gamma\sin(kx - \omega t) \tag{5-15}$$

where $\gamma$ is a constant, of as yet undetermined value, which is introduced for the purpose of providing additional flexibility. We hope to find the proper mixture of a cosine and a sine that will remove the difficulty.
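The difficulty just described can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the left side minus the right side of the trial equation (5-13) for a wave function $a\cos(kx - \omega t) + b\sin(kx - \omega t)$ in a constant potential. The numerical values ($\hbar = m = 1$, $k = 2$, $V_0 = 0.5$), the helper name `residual`, and the sample points are illustrative assumptions; $\omega$ is fixed by (5-12). The complex case uses $\alpha = -\hbar^2/2m$, $\beta = i\hbar$, and a mixing coefficient of $i$, which are the values that the argument in the text goes on to obtain in (5-19) through (5-21).

```python
import math

# Illustrative values only (assumptions, not taken from the text).
HBAR = 1.0
M = 1.0
K = 2.0                                        # wave number k
V0 = 0.5                                       # constant potential energy
OMEGA = (HBAR * K**2 / (2 * M) + V0) / HBAR    # angular frequency from (5-12)


def residual(a, b, alpha, beta, x, t):
    """Left side minus right side of (5-13) for
    Psi = a*cos(kx - wt) + b*sin(kx - wt) with V(x,t) = V0."""
    phase = K * x - OMEGA * t
    c, s = math.cos(phase), math.sin(phase)
    psi = a * c + b * s
    d2psi_dx2 = -K**2 * psi                 # second space derivative
    dpsi_dt = OMEGA * (a * s - b * c)       # first time derivative
    return alpha * d2psi_dx2 + V0 * psi - beta * dpsi_dt


# Hypothetical sample points (x, t) at which the equation is tested.
points = [(0.0, 0.0), (0.3, 0.1), (1.0, 0.7), (2.5, 1.9)]
alpha = -HBAR**2 / (2 * M)

# Purely real sine of (5-1) with a real trial value beta = hbar.
worst_real = max(abs(residual(0.0, 1.0, alpha, HBAR, x, t)) for x, t in points)

# Mixture cos + i*sin with beta = i*hbar.
worst_complex = max(
    abs(residual(1.0, 1j, alpha, 1j * HBAR, x, t)) for x, t in points
)

print("largest residual, real sine:      ", worst_real)
print("largest residual, complex mixture:", worst_complex)
```

With the real sine and the real trial value of $\beta$ used here, the residual is far from zero at some of the sample points, whereas for the complex mixture it vanishes to within rounding error at all of them.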
Evaluating the required derivatives, we find

$$\frac{\partial\Psi(x,t)}{\partial x} = -k\sin(kx - \omega t) + k\gamma\cos(kx - \omega t)$$

$$\frac{\partial^2\Psi(x,t)}{\partial x^2} = -k^2\cos(kx - \omega t) - k^2\gamma\sin(kx - \omega t) \tag{5-16}$$

$$\frac{\partial\Psi(x,t)}{\partial t} = \omega\sin(kx - \omega t) - \omega\gamma\cos(kx - \omega t)$$

Then we try again; substituting (5-15) and (5-16) into the same assumed form, (5-13), for the differential equation, and setting $V(x,t) = V_0$, we obtain

$$-\alpha k^2\cos(kx - \omega t) - \alpha k^2\gamma\sin(kx - \omega t) + V_0\cos(kx - \omega t) + V_0\gamma\sin(kx - \omega t) = \beta\omega\sin(kx - \omega t) - \beta\omega\gamma\cos(kx - \omega t)$$

or

$$[-\alpha k^2 + V_0 + \beta\omega\gamma]\cos(kx - \omega t) + [-\alpha k^2\gamma + V_0\gamma - \beta\omega]\sin(kx - \omega t) = 0$$

In order that the last equality hold for all possible combinations of the independent variables x and t, it is necessary that the coefficients of both the cosine and the sine be zero. Thus we obtain

$$-\alpha k^2 + V_0 = -\beta\gamma\omega \tag{5-17}$$

and

$$-\alpha k^2 + V_0 = \beta\omega/\gamma \tag{5-18}$$

Now we have a problem that is easily handled; there are three algebraic equations that we must satisfy, (5-12), (5-17), and (5-18), but we have three free constants, $\alpha$, $\beta$, and $\gamma$, at our disposal. Subtracting (5-18) from (5-17), we find

$$0 = -\beta\gamma\omega - \beta\omega/\gamma$$

or

$$\gamma = -1/\gamma$$

so that

$$\gamma^2 = -1$$

or

$$\gamma = \pm\sqrt{-1} \equiv \pm i \tag{5-19}$$

where i is the imaginary number (see Appendix F). Substituting this result into (5-17) we find

$$-\alpha k^2 + V_0 = \mp i\beta\omega$$

This can be compared directly with (5-12)

$$\frac{\hbar^2k^2}{2m} + V_0 = \hbar\omega$$

to yield

$$\alpha = -\hbar^2/2m \tag{5-20}$$

and

$$\mp i\beta = \hbar$$

or

$$\beta = \pm i\hbar \tag{5-21}$$

There are two possible choices of the sign in (5-19). It turns out to be of no significant consequence which choice is made, and therefore we follow conventional usage and choose the plus sign. Then (5-21) yields $\beta = +i\hbar$ and, with (5-20), we finally can evaluate all the constants in the assumed form of the differential equation. Thus (5-13) becomes

$$-\frac{\hbar^2}{2m}\frac{\partial^2\Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t) = i\hbar\,\frac{\partial\Psi(x,t)}{\partial t} \tag{5-22}$$

This differential equation satisfies all four of our assumptions concerning the quantum mechanical wave equation.

It should be emphasized that we have been led to (5-22) by treating a special case: the case of a free particle where $V(x,t) = V_0$, a constant. At this point it seems plausible to argue that the quantum mechanical wave equation might be expected to have the same form as (5-22) in the general case where the potential energy V(x,t) does actually vary as a function of x and/or t (i.e., where the force is not zero); but we cannot prove this to be true. We can, however, postulate it to be true. We do this, and therefore take (5-22) as the quantum mechanical wave equation whose solutions $\Psi(x,t)$ give us the wave function which is to be associated with the motion of a particle of mass m under the influence of forces which are described by the potential energy function V(x,t). The validity of the postulate must be judged by comparing its implications with experiment, and we shall make many such comparisons later.

Equation (5-22) was first obtained in 1926 by Erwin Schroedinger, and it is therefore called the Schroedinger equation. Schroedinger was led to his equation by an argument different from ours (and more esoteric). We shall see the essential ideas of his argument in Section 5-4. However, he was as strongly influenced by the de Broglie postulate in his work as we have been in ours. This can be seen in the following quotation, in which the physicist Debye describes the circumstances surrounding Schroedinger's development of his equation.

"Then de Broglie published his paper.
At that time Schroedinger was my successor at the University in Zurich, and I was at the Technical University, which is a Federal Institute, and we had a colloquium together. We were talking about de Broglie's theory and agreed that we did not understand it, and that we should really think about his formulations and what they mean. So I called Schroedinger to give us a colloquium. And the preparation of that really got him started. There were only a few months between his talk and his publications."

It should be pointed out that we cannot expect the Schroedinger equation to be valid when applied to particles moving at relativistic velocities. This is the case because the equation has been designed to be consistent with (5-9), the classical energy equation, which is incorrect for velocities comparable to the velocity of light. In 1928 Dirac developed a relativistic theory of quantum mechanics utilizing essentially the same postulates as the Schroedinger theory, except that (5-9) was replaced by its relativistic analogue

$$E = \sqrt{c^2p^2 + (m_0c^2)^2} + V$$

The Dirac theory reduces to the Schroedinger theory, of course, in the low-velocity limit. Because of the serious complications introduced by the square root in the relativistic energy equation, a quantitative treatment of the Dirac theory would not be appropriate in this book. However, some of the more interesting features of the Dirac theory will be described qualitatively in the following chapters on occasions when relativistic quantum phenomena must be discussed; and one feature, pair production, has already been described. Fortunately, most of the interesting quantum phenomena can be studied in cases which are nonrelativistic.

Example 5-2. Verify that the Schroedinger equation is linear in the wave function $\Psi(x,t)$; i.e., that it is consistent with the linearity assumption 3.

■ We must show that, if $\Psi_1(x,t)$ and $\Psi_2(x,t)$ are two solutions to (5-22) for a particular V(x,t), then

$$\Psi(x,t) = c_1\Psi_1(x,t) + c_2\Psi_2(x,t)$$
is also a solution to that equation, where $c_1$ and $c_2$ are constants of arbitrary value. Transposing (5-22), we have for the Schroedinger equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial x^2} + V\Psi - i\hbar\,\frac{\partial\Psi}{\partial t} = 0$$

Now we check the validity of the linear combination by substituting it into this equation it is supposed to satisfy. We obtain

$$-\frac{\hbar^2}{2m}\left(c_1\frac{\partial^2\Psi_1}{\partial x^2} + c_2\frac{\partial^2\Psi_2}{\partial x^2}\right) + V(c_1\Psi_1 + c_2\Psi_2) - i\hbar\left(c_1\frac{\partial\Psi_1}{\partial t} + c_2\frac{\partial\Psi_2}{\partial t}\right) = 0$$

which can be rewritten as

$$c_1\left[-\frac{\hbar^2}{2m}\frac{\partial^2\Psi_1}{\partial x^2} + V\Psi_1 - i\hbar\,\frac{\partial\Psi_1}{\partial t}\right] + c_2\left[-\frac{\hbar^2}{2m}\frac{\partial^2\Psi_2}{\partial x^2} + V\Psi_2 - i\hbar\,\frac{\partial\Psi_2}{\partial t}\right] = 0$$

If the linear combination actually is a solution to the Schroedinger equation then the last equality should be satisfied. It is, for all values of $c_1$ and $c_2$, because the Schroedinger equation says each bracket equals zero since $\Psi_1$ and $\Psi_2$ are solutions to that equation for the same V. A little thought should convince the student that this essential result would not be obtained if the Schroedinger equation contained any terms which are not proportional to the first power of $\Psi(x,t)$. •

In following chapters we shall solve in a methodical way Schroedinger's equation for a number of important systems, and we shall obtain thereby the wave functions that describe the systems. But in this chapter we must use some of these wave functions in order to illustrate various properties of the Schroedinger theory. These wave functions will be "pulled out of the hat," as required. However, we shall give the student confidence in their validity by verifying that each is a solution to the Schroedinger equation, for the system it is supposed to describe, by the simple procedure of substituting it into that equation. In Example 5-3 we do this for a wave function which is particularly useful for illustrative purposes.

Example 5-3. The wave function $\Psi(x,t)$ for the lowest energy state of a simple harmonic oscillator, consisting of a particle of mass m acted on by a linear restoring force of force constant C, can be expressed as

$$\Psi(x,t) = A\,e^{-(\sqrt{Cm}/2\hbar)x^2}\,e^{-(i/2)\sqrt{C/m}\,t}$$

where the real constant A can have any value. Verify that this expression is a solution to the Schroedinger equation for the appropriate potential. (The time-dependent term is a complex exponential; see Appendix F.)

■ The expression applies to the case in which the equilibrium point of the oscillator (the point at which the classical particle would rest if it were not oscillating) is at the origin of the x axis (x = 0). In this case the time-independent potential energy is

$$V(x,t) = V(x) = Cx^2/2$$

as can be verified by noting that the corresponding force, $F = -dV(x)/dx = -Cx$, is a linear restoring force of force constant C. The Schroedinger equation for this potential is

$$-\frac{\hbar^2}{2m}\frac{\partial^2\Psi}{\partial x^2} + \frac{C}{2}x^2\Psi = i\hbar\,\frac{\partial\Psi}{\partial t}$$

To check the validity of the solution quoted, we evaluate its derivatives. We find

$$\frac{\partial\Psi}{\partial t} = -\frac{i}{2}\sqrt{\frac{C}{m}}\,\Psi$$

and

$$\frac{\partial\Psi}{\partial x} = -\frac{\sqrt{Cm}}{2\hbar}\,2x\Psi = -\frac{\sqrt{Cm}}{\hbar}\,x\Psi$$

$$\frac{\partial^2\Psi}{\partial x^2} = -\frac{\sqrt{Cm}}{\hbar}\Psi - \frac{\sqrt{Cm}}{\hbar}\,x\left(-\frac{\sqrt{Cm}}{\hbar}\,x\Psi\right) = -\frac{\sqrt{Cm}}{\hbar}\Psi + \frac{Cm}{\hbar^2}x^2\Psi$$

Substituting into the Schroedinger equation yields

$$\frac{\hbar^2}{2m}\frac{\sqrt{Cm}}{\hbar}\Psi - \frac{\hbar^2}{2m}\frac{Cm}{\hbar^2}x^2\Psi + \frac{C}{2}x^2\Psi = i\hbar\left(-\frac{i}{2}\sqrt{\frac{C}{m}}\right)\Psi$$

or

$$\frac{\hbar}{2}\sqrt{\frac{C}{m}}\,\Psi - \frac{C}{2}x^2\Psi + \frac{C}{2}x^2\Psi = \frac{\hbar}{2}\sqrt{\frac{C}{m}}\,\Psi$$

Since the last equality is obviously satisfied, the solution must be valid. The general solution to the simple harmonic oscillator Schroedinger equation is treated in the following chapter. •

5-3 BORN'S INTERPRETATION OF WAVE FUNCTIONS

A very interesting and important property of wave functions can be seen by setting $\gamma = i$ in (5-15), which specifies the form of the free particle wave function. We obtain

$$\Psi(x,t) = \cos(kx - \omega t) + i\sin(kx - \omega t) \tag{5-23}$$

The wave function is complex. That is, it contains the imaginary number i. Recall that this behavior was forced upon us. We first tried to find a way of satisfying our four assumptions concerning the Schroedinger equation by using a purely real free particle wave function, (5-1), and we found that there was no reasonable way of doing this.
Only when we allowed the free particle wave function to have an imaginary part, by using the free particle wave function of (5-15) in which y turned out to be equal to i, did we succeed. In this process, we also ended up with an i in the Schroedinger equation, (5-22). If the student looks carefully at our plausibility argument, it will become apparent that the equation contains an i because it relates a first time derivative to a second space derivative. This is due, in turn, to the fact that the Schroedinger equation is based on the energy equation which relates the first power of total energy to the second power of momentum. The presence of an i in the Schroedinger equation implies that in the general case (for any potential energy function) the wave functions which are its solutions will be complex. We shall shortly see that this is true. Since a wave function of quantum mechanics is complex, it specifies simultaneously two real functions, its real part and its imaginary part (see Appendix F). This is in contrast to a "wave function" of classical mechanics. For instance, a wave in a string can be specified by one real function which gives the displacement of various elements of the string at various times. This classical wave function is not complex because the classical wave equation does not contain an i since it relates a second time derivative to a second space derivative. The fact that wave functions are complex functions should not be considered a weak point of the quantum mechanical theory. Actually, it is a desirable feature because it makes it immediately apparent that we should not attempt to give to wave functions a physical existence in the same sense that water waves have a physical existence. The reason is that a complex quantity cannot be measured by any actual physical instrument. The "real" world (using the term in its nonmathematical sense) is the world of "real" quantities (using the term in its mathematical sense). 
Therefore, we should not try to answer, or even pose, the question: Exactly what is waving, and what is it waving in? The student will remember that consideration of just such questions concerning the nature of electromagnetic waves led the nineteenth century physicists to the fallacious concept of the ether. As the wave functions are complex, there is no temptation to make the same mistake again. Instead, it is apparent from the outset that the wave functions are computational devices which have a significance only in the context of the Schroedinger theory of which they are a part. These comments should not be taken to imply that the wave functions have no physical interest. We shall see in this and the next sections that a wave function actually contains all the information which the uncertainty principle allows us to know about the associated particle.

The basic connection between the properties of the wave function $\Psi(x,t)$ and the behavior of the associated particle is expressed in terms of the probability density P(x,t). This quantity specifies the probability, per unit length of the x axis, of finding the particle near the coordinate x at time t. According to a postulate, first stated in 1926 by Max Born, the relation between the probability density and the wave function is

$$P(x,t) = \Psi^*(x,t)\Psi(x,t) \tag{5-24}$$

where the symbol $\Psi^*(x,t)$ represents the complex conjugate of $\Psi(x,t)$ (see Appendix F). For emphasis, and clarification, we shall restate Born's postulate as follows:

If, at the instant t, a measurement is made to locate the particle associated with the wave function $\Psi(x,t)$, then the probability P(x,t) dx that the particle will be found at a coordinate between x and x + dx is equal to $\Psi^*(x,t)\Psi(x,t)\,dx$.

Justification of the postulate can be found in the following considerations. Since the motion of a particle is connected with the propagation of an associated wave function (the de Broglie condition), these two entities must be associated in space. That is, the particle must be at some location where the waves have an appreciable amplitude. Therefore P(x,t) must have an appreciable value where $\Psi(x,t)$ has an appreciable value. We attempt to illustrate schematically the situation in Figure 5-2. If the situation were otherwise, there would be serious difficulties with the theory. For instance, if the particle were separated in space from the wave, relativistic problems would arise because of the time required to transmit information between the two entities that are required to follow each other.

Figure 5-2 A very schematic picture of a wave function and its associated particle. The particle must be at some location where the wave function has an appreciable amplitude.

Since the measurable quantity probability density P(x,t) is real and non-negative, whereas the wave function $\Psi(x,t)$ is complex, it is obviously not possible to equate P(x,t) to $\Psi(x,t)$. However, since $\Psi^*(x,t)\Psi(x,t)$ is always real and non-negative, Born was not inconsistent in equating it to P(x,t).

Example 5-4. Prove that $\Psi^*(x,t)\Psi(x,t)$ is necessarily real, and either positive or zero.

■ Any complex function, such as $\Psi(x,t)$, can always be written

$$\Psi(x,t) = R(x,t) + iI(x,t) \tag{5-25a}$$

where R(x,t) and I(x,t) are both real functions that are called, respectively, its real and imaginary parts. The complex conjugate of $\Psi(x,t)$ is defined as

$$\Psi^*(x,t) = R(x,t) - iI(x,t) \tag{5-25b}$$

Multiplying the two together, we obtain

$$\Psi^*\Psi = (R - iI)(R + iI)$$

or, since $i^2 = -1$,

$$\Psi^*\Psi = R^2 - i^2I^2 = R^2 + I^2$$

Thus

$$\Psi^*(x,t)\Psi(x,t) = [R(x,t)]^2 + [I(x,t)]^2 \tag{5-26}$$

That is, it equals the sum of the squares of two real functions. Thus $\Psi^*(x,t)\Psi(x,t)$ must be real, and either positive or zero. •

Of course, there are other possible functions that can be generated from $\Psi(x,t)$ that are real.
An example is the absolute value, or modulus, $|\Psi(x,t)|$. However, all these other possibilities can be ruled out by arguments, too lengthy to reproduce here, which show that they would lead to an unphysical behavior for P(x,t).

It is worthwhile for us to consider again an analogy between electromagnetism and quantum mechanics, discussed in Section 3-2. The connection between the density of photons in a field of electromagnetic radiation and the square of the electric field vector is analogous to the connection between the probability density and the wave function multiplied by its complex conjugate. Consider, for instance, that the electric field vector is a solution to the electromagnetic wave equation, while the wave function is a solution to the quantum mechanical wave equation. Both quantities specify the amplitudes of waves, although the electric vector is real whereas the wave function is complex. Therefore, the square of the amplitude of the waves, $\mathcal{E}^2$, gives the intensity of the waves in the electromagnetic case, while it is necessary to take the amplitude times its complex conjugate, $\Psi^*\Psi$, to obtain a real intensity in the quantum mechanical case. In the electromagnetic case the intensity of the waves is proportional to their energy density. Since each photon in the electromagnetic field carries energy $h\nu$, the energy density is, in turn, proportional to the density of photons. For one dimension, this is the probability per unit length of finding a photon. In the quantum mechanical case the intensity of the waves gives directly the probability density which is, in one dimension, the probability per unit length of finding a particle.

Example 5-5. Evaluate the probability density for the simple harmonic oscillator lowest energy state wave function quoted in Example 5-3.

■ The wave function is
$$\Psi(x,t) = A\,e^{-(\sqrt{Cm}/2\hbar)x^2}\,e^{-(i/2)\sqrt{C/m}\,t}$$

The probability density is therefore (see Appendix F for the evaluation of $\Psi^*$)

$$P = \Psi^*\Psi = A\,e^{-(\sqrt{Cm}/2\hbar)x^2}\,e^{+(i/2)\sqrt{C/m}\,t}\;A\,e^{-(\sqrt{Cm}/2\hbar)x^2}\,e^{-(i/2)\sqrt{C/m}\,t}$$

or

$$P = A^2\,e^{-(\sqrt{Cm}/\hbar)x^2}$$

Note that the probability density is independent of time, even though the wave function depends on time. We shall see later that this is true in any case in which the particle associated with the wave function is in a single energy state. The probability density P predicted by quantum mechanics is plotted as a function of x by the solid curve in the upper part of Figure 5-3. The probability that a measurement of the location of the oscillating particle will find it in an element of the x axis between x and x + dx is equal to P dx. Since P has a maximum at x = 0, the equilibrium point of the oscillator, quantum mechanics predicts that the particle is most likely to be found in an element dx located at the equilibrium point. Proceeding in either direction from that location, the chances of finding it in an element of the same length dx decrease rather rapidly, but there are no well-defined limits beyond which the probability of finding the particle in an element of the x axis is precisely zero. In the following example we shall find that these predictions are very different from what would be expected for the oscillating particle according to classical mechanics. •

Example 5-6. Evaluate the predictions of classical mechanics for the probability density of the simple harmonic oscillator of Example 5-5, and compare them with the quantum mechanical predictions found in that example.

■ In classical mechanics the oscillating particle has a definite momentum p, and therefore a definite velocity v, at every value of its displacement x from the equilibrium point.
The probability of finding it in an element of the x axis of fixed length is proportional to the amount of time it spends in the element, and this is inversely proportional to its velocity when it passes through the element. That is

$$P = \frac{B^2}{v}$$

where $B^2$ is some constant. We obtain an expression for v in terms of x most simply by considering the energy equation

$$E = K + V = \frac{mv^2}{2} + \frac{Cx^2}{2}$$

where E, K, and V are total, kinetic, and potential energies, and where the latter has been evaluated in terms of x and the oscillator force constant C from an equation justified in Example 5-3. We have then

$$\frac{mv^2}{2} = E - \frac{Cx^2}{2}$$

or

$$v = \sqrt{\frac{2}{m}}\sqrt{E - \frac{Cx^2}{2}}$$

So

$$P = \frac{B^2}{\sqrt{2/m}\,\sqrt{E - Cx^2/2}}$$

This expression for the classical probability density P is plotted as the curve in the lower part of Figure 5-3. It has a minimum value at the equilibrium point x = 0, and it rises rapidly near the limits of the oscillation. The limits occur at values of x where the particle has no kinetic energy so the potential energy equals its total energy

$$E = \frac{Cx^2}{2}$$

or

$$x = \pm\sqrt{\frac{2E}{C}}$$

Of course, the classical probability density drops abruptly to zero outside these limits of the particle's motion, as indicated by the straight lines in the figure.

Figure 5-3 Quantum mechanical (top) and classical (bottom) probability densities P(x) for a particle in the lowest energy state of a simple harmonic oscillator. The quantum mechanical probability density peaks near the equilibrium point and extends beyond the sharp limits of motion, at $x = \pm\sqrt{2E/C}$, predicted by classical physics. The classical probability density is inversely proportional to the classical velocity and is greatest at the endpoints of the motion, where the velocity vanishes.
Simply put, the probability of finding the oscillating classical particle in an element of the x axis of a given length is smallest near the equilibrium point, where it spends the least time, and it rises rapidly near the limits of its motion, where it lingers.

The value of the constant $B^2$ in the expression for the classical probability density can be determined by imposing the requirement that the total probability of finding the particle somewhere must equal one. The total probability is just the integral over all x of P, so the expression

$$\int_{-\infty}^{\infty} P\,dx = \frac{B^2}{\sqrt{2/m}}\int_{-\sqrt{2E/C}}^{+\sqrt{2E/C}} \frac{dx}{\sqrt{E - Cx^2/2}} = 1$$

can be used to evaluate $B^2$. We shall not bother to carry out this so-called normalization procedure for the classical probability density, although it is not difficult to do after expressing E in terms of C; but we shall carry out such a procedure in Example 5-7 to determine the value of the corresponding constant $A^2$ that occurs in the quantum mechanical probability density.

Figure 5-3 shows that the classical prediction for the probability density is very different from the quantum mechanical prediction. According to classical mechanics, measurements of the location of the particle in the simple harmonic oscillator will always find it within two well-defined limits, and they will usually find it near one or the other of these limits. According to quantum mechanics, when the simple harmonic oscillator is in the lowest energy state, measurements will usually find the particle to be near the equilibrium point, but there are no well-defined limits beyond which the particle will never be found. When the oscillator is in its lowest energy state we are very far from the range of validity of classical physics. Thus we expect that, of the two predictions, the one made by quantum mechanics is correct.
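The classical normalization that the text leaves undone, and the size of the quantum mechanical "tail" beyond the classical limits, can be estimated with a short numerical sketch. Several things here are assumptions rather than statements from the text: the units $\hbar = m = C = 1$; the lowest-state energy $E = (\hbar/2)\sqrt{C/m}$, which is the value implied by the time factor of the Example 5-3 wave function; the helper names; and the closed form $B^2 = \sqrt{C/m}/\pi$, which was worked out for this check and is not quoted from the text. The midpoint rule is used so that the integrand is never evaluated at the endpoints, where it diverges.

```python
import math

# Illustrative units (assumption): hbar = m = C = 1.
HBAR, M, C = 1.0, 1.0, 1.0
E = 0.5 * HBAR * math.sqrt(C / M)     # assumed lowest-state energy
X_MAX = math.sqrt(2 * E / C)          # classical limit of the motion


def inverse_velocity(x):
    """1/v from the classical energy equation; valid only for |x| < X_MAX."""
    return 1.0 / (math.sqrt(2.0 / M) * math.sqrt(E - C * x * x / 2.0))


def midpoint_integral(f, lo, hi, n):
    """Midpoint rule; never samples the singular endpoints lo and hi."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Classical normalization: B^2 times the integral of 1/v must equal one.
integral = midpoint_integral(inverse_velocity, -X_MAX, X_MAX, 200_000)
B2_numeric = 1.0 / integral
B2_closed = math.sqrt(C / M) / math.pi    # closed form derived for this check

# Quantum probability of finding the particle outside the classical limits.
# For a density proportional to exp(-b x^2), the fraction with |x| > X is
# erfc(sqrt(b) * X); here b = sqrt(C m) / hbar as in Example 5-5.
b = math.sqrt(C * M) / HBAR
tail = math.erfc(math.sqrt(b) * X_MAX)

print("B^2 (numerical):", B2_numeric)
print("B^2 (closed form):", B2_closed)
print("quantum probability beyond classical limits:", tail)
```

Under these assumptions the quantum mechanical particle is found outside the classical limits in roughly one measurement out of six, a region in which the classical probability density is exactly zero. The slow convergence of the midpoint rule near the endpoints is the price of the integrable singularity at $x = \pm\sqrt{2E/C}$.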
As we shall see in Chapter 12, this can be confirmed by measuring properties of diatomic molecules that depend on the interatomic spacing, since in low-energy states the two atoms in such a molecule feel the linear restoring force characteristic of simple harmonic motion. Of course, the trouble with the classical calculation is that it neglects the uncertainty principle in associating a definite value of the velocity, or momentum, of the particle with a definite value of its position. In Example 5-12 we shall make a comparison between the classical and quantum mechanical predictions of the probability density function for a particle in a high-energy state of a simple harmonic oscillator, where the range of validity of classical physics is approached because the uncertainty principle is of no consequence. There we shall find the predictions of the two theories to be very similar, as would be expected from the correspondence principle. •

In Example 5-5 we saw one of the predictions of quantum mechanics concerning the behavior of a particle in a simple harmonic oscillator. The prediction is typical of the type of information that the theory can provide. It cannot tell us that a particle in a given energy state will be found in a precise location at a certain time, but only the relative probabilities that the particle will be found in various locations at that time. The predictions of quantum mechanics are statistical.

The uncertainty principle provides the fundamental reason why quantum mechanics expresses itself in probabilities, and not in certainties. For instance, consider investigating a harmonic oscillator in some typical energy state. In order to really know that the system is in a particular state, we must make a measurement of its energy. The measurement necessarily disturbs the system in a way that cannot be completely determined, so it is not surprising that we cannot predict with certainty where the particle will be found when we make a position measurement. In classical mechanics, even though the energy of the system is microscopic, we can make the energy measurement, plus any other measurements, without disturbing the system. So classical mechanics says we can predict precisely where the particle will be found in a subsequent measurement, if we wish. But, when applied to a microscopic system, classical mechanics is wrong. Not only is it impossible to predict from classical mechanics precisely where a particle in a microscopic system will be in a subsequent measurement, it is, as we found in Example 5-6, impossible even to predict accurately from that theory the relative probabilities of finding the particle in various locations. Quantum mechanics does allow us to make accurate predictions about these relative probabilities because it takes into account quantitatively the fundamental fact of life of the microscopic world: the uncertainty principle. Born has expressed the situation as follows:

"We describe the instantaneous state of the system by a quantity $\Psi$, which satisfies a differential equation, and therefore changes with time in a way which is completely determined by its form at a time t = 0, so that its behavior is rigorously causal. Since, however, physical significance is confined to the quantity $\Psi^*\Psi$, and to other similarly constructed quadratic expressions, which only partially define $\Psi$, it follows that, even when the physically determinable quantities are completely known at time t = 0, the initial value of the $\Psi$-function is necessarily not completely definable. This view of the matter is equivalent to the assertion that events happen indeed in a strictly causal way, but that we do not know the initial state exactly. In this sense the law of causation is therefore empty; physics is in the nature of the case indeterminate, and therefore the affair of statistics."

The first point that Born makes, about the space dependence of $\Psi$ at some initial time being sufficient to completely determine its space dependence at any subsequent time, is a consequence of the fact that $\Psi$ satisfies the Schroedinger equation which contains only a first time derivative. His second point, about not being able to completely define the space dependence of the wave function at the initial time, can be seen by inspecting (5-25a) and (5-26). These show that if we know a probability density from an initial set of measurements on a system, we still cannot determine uniquely an initial wave function to associate with the system. All we can determine is the sum of the squares of the real and imaginary parts of the wave function.

We can summarize the ideas of the last few paragraphs by saying that the behavior of a given wave function of a system is predictable in the sense that the Schroedinger equation for the corresponding potential energy will determine exactly its form at some later time in terms of its form at some initial time; but its initial form cannot be specified completely by an initial set of measurements and its final form predicts only the relative probabilities of the results of the final set of measurements. Again quoting Born: "The motion of particles conforms to the laws of probability, but the probability itself is propagated in accordance with the law of causality."

Example 5-7. Normalize the wave function of Example 5-3, by determining the value of the arbitrary constant A in that wave function for which the total probability of finding the associated particle somewhere on the x axis equals one.

■ The total probability of finding the particle somewhere on the entire range of the x axis is necessarily equal to one if the particle exists. This total probability can be obtained mathematically by integrating the probability density function P over all x. Doing this, and setting the result equal to one, we have

$$\int_{-\infty}^{\infty} P\,dx = \int_{-\infty}^{\infty} \Psi^*\Psi\,dx = A^2\int_{-\infty}^{\infty} e^{-(\sqrt{Cm}/\hbar)x^2}\,dx = 1$$

Since the integrand $e^{-(\sqrt{Cm}/\hbar)x^2}$ depends on $x^2$, it is an even function of x. That is, its value for a certain x equals its value for $-x$, as can be seen in Figure 5-4. Thus the contribution to the total value of the integral obtained in the range $-\infty$ to 0 equals the contribution obtained in the range 0 to $+\infty$, and we have

$$A^2\int_{-\infty}^{\infty} e^{-(\sqrt{Cm}/\hbar)x^2}\,dx = 2A^2\int_{0}^{\infty} e^{-(\sqrt{Cm}/\hbar)x^2}\,dx = 1$$

The definite integral can be evaluated by consulting appropriate tables, and yields

$$\int_{0}^{\infty} e^{-(\sqrt{Cm}/\hbar)x^2}\,dx = \frac{(\pi\hbar)^{1/2}}{2(Cm)^{1/4}}$$

Then we find immediately that the required value of A is

$$A = \frac{(Cm)^{1/8}}{(\pi\hbar)^{1/4}}$$

With this value of A, the wave function becomes

$$\Psi(x,t) = \frac{(Cm)^{1/8}}{(\pi\hbar)^{1/4}}\,e^{-(\sqrt{Cm}/2\hbar)x^2}\,e^{-(i/2)\sqrt{C/m}\,t}$$

•

The procedure gone through in Example 5-7 is called normalization of a wave function, and the wave function quoted at the end of the example is said to be normalized. Before the procedure is carried out, the amplitude of a wave function is arbitrary because the linearity of the Schroedinger equation allows a wave function to be multiplied by a constant of arbitrary magnitude and still remain a solution to the equation. Normalizing has the effect of fixing the amplitude by fixing the value of the multiplicative constant, such as A in Example 5-7. It is not always necessary to really carry through the calculation that leads to the value of the amplitude constant because useful results can often be obtained in terms of relative probabilities that are independent of the actual values of the amplitudes. But it should always be remembered that

$$\int_{-\infty}^{\infty} P\,dx = \int_{-\infty}^{\infty} \Psi^*\Psi\,dx = 1 \tag{5-27}$$

Figure 5-4 A plot of the even function $e^{-(\sqrt{Cm}/\hbar)x^2}$. Since the function depends on $x^2$, its value for any particular $x_1$ equals its value for $-x_1$.
since these integrals give the total probability of finding somewhere the particle described by the wave function, and the probability must equal one if there is a particle.

5-4 EXPECTATION VALUES

In the previous section we saw that the wave function contains information about the behavior of the associated particle in that it specifies the probability density for the particle. In this section we shall see how to extract from the wave function a wide variety of additional information concerning the particle. That is, we shall learn how to obtain from the wave function detailed numerical information not only about the position of the particle but also about its momentum, energy, and all other quantities that characterize its behavior. For instance, we shall find out how to give quantitative evaluations of the terms $\Delta x$ and $\Delta p$ in the uncertainty principle. Wave functions are useful because they contain so much information about the behavior of the associated particle.

Consider a particle and its associated wave function $\Psi(x,t)$. In a measurement of the position of the particle in the system described by the wave function, there would be a finite probability of finding it at any x coordinate in the interval $x$ to $x + dx$, as long as the wave function is nonzero in that interval. In general, the wave function is nonzero over an extended range of the x axis. Thus we are generally not able to state that the x coordinate of the particle has a certain definite value. However, it is possible to specify some sort of average position of the particle in the following way. Let us imagine making a measurement of the position of the particle at the instant $t$.
The probability of finding it between $x$ and $x + dx$ is, according to Born's postulate, (5-24)

$$P(x,t)\,dx = \Psi^*(x,t)\Psi(x,t)\,dx$$

Imagine performing this measurement a number of times on identical systems described by the same wave function $\Psi(x,t)$, always at the same value of $t$, and recording the observed values of $x$ at which we find the particle. An example would be a set of measurements of the x coordinates of particles in the lowest energy states of identical simple harmonic oscillators. In three dimensions, an example would be a set of measurements of the positions of electrons in hydrogen atoms, with all the atoms in their lowest energy states. We can use the average of the observed values to characterize the position at time $t$ of a particle associated with the wave function $\Psi(x,t)$. This average value we call the expectation value of the x coordinate of the particle at the instant $t$. It is easy to see that the expectation value of $x$, which is written $\bar{x}$, will be given by

$$\bar{x} = \int_{-\infty}^{\infty} x\,P(x,t)\,dx$$

The reason is that the integrand in this expression is just the value of the x coordinate weighted by the probability of observing that value. Therefore, upon integrating we obtain the average of the observed values. Using Born's postulate to evaluate the probability density in terms of the wave function, we obtain

$$\bar{x} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,x\,\Psi(x,t)\,dx$$ (5-28)

The terms of the integrand are written in the order shown to preserve symmetry with a notation which will be developed later.

Figure 5-5 A plot of the odd function $x\,e^{-(\sqrt{Cm}/\hbar)x^2}$. The value of the function for any particular $x_1$ equals the negative of its value for $-x_1$.

Some students may find these equations more familiar if they are written in the form

$$\bar{x} = \frac{\int_{-\infty}^{\infty} x\,P(x,t)\,dx}{\int_{-\infty}^{\infty} P(x,t)\,dx} = \frac{\int_{-\infty}^{\infty} \Psi^*(x,t)\,x\,\Psi(x,t)\,dx}{\int_{-\infty}^{\infty} \Psi^*(x,t)\Psi(x,t)\,dx}$$

but these are actually equivalent to the forms we use since (5-27) shows that the denominators equal one.
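The normalization condition (5-27) and the expectation value (5-28) can be illustrated numerically. The following is only a sketch, using the Gaussian probability density of Example 5-7 and assuming units in which $\sqrt{Cm}/\hbar = 1$; the helper `integrate`, the variable `alpha`, and the cutoff `L` are illustrative names that do not appear in the text.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] using n equal steps."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step

# Assumption: units chosen so that alpha = sqrt(C*m)/hbar equals 1.
alpha = 1.0
# Normalization constant of Example 5-7, A = (Cm)^(1/8) / (pi*hbar)^(1/4),
# which in these units is (alpha/pi)^(1/4).
A = (alpha / math.pi) ** 0.25

def P(x):
    """Probability density Psi* Psi; the time-dependent phase factors cancel."""
    return A ** 2 * math.exp(-alpha * x * x)

# Assumption: the density is negligible beyond |x| = L, so the infinite
# integration range is replaced by [-L, L].
L = 10.0
norm = integrate(P, -L, L)                     # left side of (5-27)
x_bar = integrate(lambda x: x * P(x), -L, L)   # expectation value (5-28)

print("total probability:", norm)
print("expectation value of x:", x_bar)
```

Under these assumptions the total probability comes out very close to one and $\bar{x}$ very close to zero, since the integrand of (5-28) is odd for this density.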
Example 5-8. Determine $\bar{x}$ for a particle in the lowest energy state of a simple harmonic oscillator, using the wave function and probability density considered in the preceding examples.

We can see immediately from Figures 5-3 and 5-4 that $\bar{x} = 0$. The reason is that $\bar{x}$ is the average value of $x$, with the average computed using a weighting factor $\Psi^*\Psi$ which is symmetrical about $x = 0$; for every chance of observing a certain positive value of $x$ there is an exactly compensating chance of observing a negative value of $x$ of the same magnitude. The behavior of the particle in the oscillator is symmetrical about its equilibrium point at $x = 0$, so $\bar{x} = 0$. More formally, we have

$$\bar{x} = \int_{-\infty}^{\infty} \Psi^*\,x\,\Psi\,dx$$

where the factor $\Psi^*\Psi$ in the integrand is plotted in Figures 5-3 and 5-4. Now this factor is an even function of $x$, and the remaining factor in the integrand is $x$ itself, which is an odd function of $x$. So the entire integrand is an odd function of $x$. That is, its value at a particular $x$ is exactly equal to the negative of its value at $-x$, as illustrated in Figure 5-5. From this it follows that the integral yields zero, since for every contribution to its total value obtained from an element of the x axis at some $x$ there is a compensating contribution of the opposite sign from the corresponding element at $-x$.

From arguments using a coordinate system in which the origin of the x axis is chosen at the equilibrium point of the oscillator, we have concluded that $\bar{x}$ lies at the equilibrium point, as indicated in Figure 5-6a; but this conclusion is true, independent of the choice of the origin. That is, if the equilibrium point of the oscillator is located to the right of the origin, $\Psi^*\Psi$ is still centered on the equilibrium point, so $\bar{x}$ is still located at that point, as indicated in Figure 5-6b. The reason is that the behavior of the oscillator is still symmetrical about its equilibrium point.
If the oscillator is distorted by making the restoring force stronger in one direction than in the other, this symmetry is destroyed. (It will no longer be a simple harmonic oscillator.) Then $\Psi^*\Psi$ will lose its symmetry, and $\bar{x}$ will be displaced from the equilibrium point. Examples are shown in Figures 5-6c and 5-6d. •

Figure 5-6 (a) The probability density for the ground state of a harmonic oscillator whose equilibrium point (marked with a triangle) lies at the origin. The expectation value $\bar{x}$ (marked with an arrow) also lies at the origin. (b) The oscillator is displaced along the x axis, but the expectation value $\bar{x}$ remains coincident with the equilibrium point. (c) The restoring force is made weaker for positive displacements than for negative displacements, destroying the symmetry of the oscillator. The particle now would more likely be found to the right of the equilibrium point than to the left, so the expectation value $\bar{x}$ now lies to the right of that point. But the equilibrium point is still the location where the particle would most likely be found, because it is still where the probability density maximizes. (d) As the restoring force is made even more asymmetric, $\bar{x}$ is further displaced to the right. In all figures the short vertical marks on the x axis indicate the limits of the classical oscillation for the appropriate potential, or restoring force, and total energy.

It is apparent that an expression of the same form as (5-28) would be appropriate for the evaluation of the expectation value of any function of $x$. That is

$$\overline{x^2} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,x^2\,\Psi(x,t)\,dx$$

and

$$\overline{f(x)} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,f(x)\,\Psi(x,t)\,dx$$

where $f(x)$ is any function of $x$.
Even for a function which may explicitly depend on the time, such as a potential energy $V(x,t)$, we may still write

$$\overline{V(x,t)} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,V(x,t)\,\Psi(x,t)\,dx$$ (5-29)

because all measurements made to evaluate $\overline{V(x,t)}$ are made at the same value of $t$, and so the preceding arguments would still hold.

The coordinate $x$ and the potential energy $V(x,t)$ are two examples of the dynamical quantities which can be used to characterize the behavior of the particle. Examples of other dynamical quantities are the momentum $p$ and the total energy $E$. The expectation values of these quantities are always given by the same type of expression. For example, the expectation value of the momentum is given by

$$\bar{p} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,p\,\Psi(x,t)\,dx$$ (5-30)

However, in order to evaluate the integral in (5-30), the integrand $\Psi^*(x,t)\,p\,\Psi(x,t)$ must be expressed as a function of the variables $x$ and $t$. In classical mechanics, $p$ can always be written as a function of the variables $x$ and/or $t$. For instance, for a particle moving in a time-independent potential, $p$ can be written as a function of $x$ alone since its momentum is precisely known at every point on its path (after the problem has been solved). A moment's consideration of the behavior of a classical simple harmonic oscillator will verify this. But in quantum mechanics the uncertainty principle tells us that it is not possible to write $p$ as a function of $x$, because $p$ and $x$ cannot be simultaneously known with complete precision. Nor is it possible to write $p$ as a function of $t$. We must find some other way of expressing the integrand of (5-30) in terms of $x$ and $t$.
A clue can be found by considering the free particle wave function, (5-23), which is

$$\Psi(x,t) = \cos(kx - \omega t) + i\sin(kx - \omega t)$$

Differentiating with respect to $x$, we have

$$\frac{\partial \Psi(x,t)}{\partial x} = -k\sin(kx - \omega t) + ik\cos(kx - \omega t) = ik[\cos(kx - \omega t) + i\sin(kx - \omega t)]$$

Since $k = p/\hbar$, this is

$$\frac{\partial \Psi(x,t)}{\partial x} = i\frac{p}{\hbar}\Psi(x,t)$$

which can be written

$$p[\Psi(x,t)] = -i\hbar\frac{\partial}{\partial x}[\Psi(x,t)]$$

This indicates that there is an association between the dynamical quantity $p$ and the differential operator $-i\hbar(\partial/\partial x)$. That is, the effect of multiplying the function $\Psi(x,t)$ by $p$ is the same as the effect of operating on it with the differential operator $-i\hbar(\partial/\partial x)$ (that is, of taking $-i\hbar$ times the partial derivative of the function with respect to $x$).

A similar association can be found between the dynamical quantity $E$ and the differential operator $i\hbar(\partial/\partial t)$ by differentiating the free particle wave function $\Psi(x,t)$ with respect to $t$. We obtain

$$\frac{\partial \Psi(x,t)}{\partial t} = +\omega\sin(kx - \omega t) - i\omega\cos(kx - \omega t) = -i\omega[\cos(kx - \omega t) + i\sin(kx - \omega t)]$$

Since $\omega = E/\hbar$, this can be written

$$E[\Psi(x,t)] = i\hbar\frac{\partial}{\partial t}[\Psi(x,t)]$$

Are these relations restricted to the case of free particle wave functions? No! Consider (5-9), which relates the total energy $E$ to the momentum $p$ and the potential energy $V(x,t)$

$$\frac{p^2}{2m} + V(x,t) = E$$

Let us replace the dynamical quantities $p$ and $E$ by their associated differential operators. Then we have

$$\frac{1}{2m}\left(-i\hbar\frac{\partial}{\partial x}\right)^2 + V(x,t) = i\hbar\frac{\partial}{\partial t}$$

Since $(-i\hbar)^2 = -\hbar^2$, and $(\partial/\partial x)^2 = (\partial/\partial x)(\partial/\partial x) = \partial^2/\partial x^2$, we obtain

$$-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x,t) = i\hbar\frac{\partial}{\partial t}$$ (5-31)

This is an operator equation. It has significance when applied to any wave function $\Psi(x,t)$, in the sense that identical results are obtained after performing on the wave function the operations indicated on either side of the equal sign. That is, (5-31) implies

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t) = i\hbar\frac{\partial \Psi(x,t)}{\partial t}$$

where $\Psi(x,t)$ is any wave function. Of course, this is just the Schroedinger equation. Therefore, we conclude that postulating the associations

$$p \leftrightarrow -i\hbar\frac{\partial}{\partial x} \qquad \text{and} \qquad E \leftrightarrow i\hbar\frac{\partial}{\partial t}$$ (5-32)

is equivalent to postulating the Schroedinger equation. The validity of these associations is unrestricted.

The procedure used in the last paragraph is essentially the one originally followed by Schroedinger in obtaining his equation. It provides us with a powerful method for obtaining the quantum mechanical wave equation for more complicated cases than the one-particle, one-dimensional case we treat in this chapter. We shall use it later to treat the systems we ultimately must deal with.

Now let us use the first of the operator associations to obtain an integrable expression for the expectation value of the momentum. We take (5-30), which is

$$\bar{p} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,p\,\Psi(x,t)\,dx$$

and replace the $p$ in the integrand by $-i\hbar(\partial/\partial x)$. We obtain

$$\bar{p} = \int_{-\infty}^{\infty} \Psi^*(x,t)\left(-i\hbar\frac{\partial}{\partial x}\right)\Psi(x,t)\,dx$$

or

$$\bar{p} = -i\hbar\int_{-\infty}^{\infty} \Psi^*(x,t)\frac{\partial \Psi(x,t)}{\partial x}\,dx$$ (5-33)

We thus obtain an expression which can be integrated immediately if we know $\Psi(x,t)$.

At this point we can see the reason for the ordering of the terms in the integrands of (5-30) and (5-33). It would not be possible to have

$$\bar{p} = -i\hbar\int_{-\infty}^{\infty} \Psi^*(x,t)\Psi(x,t)\frac{\partial}{\partial x}\,dx$$

since this is meaningless. Nor would it be possible to have

$$\bar{p} = -i\hbar\int_{-\infty}^{\infty} \frac{\partial}{\partial x}[\Psi^*(x,t)\Psi(x,t)]\,dx = -i\hbar\,[\Psi^*(x,t)\Psi(x,t)]_{-\infty}^{\infty}$$

because the right-hand side of the last equation always equals zero. This is true because, in any realistic situation, the particle would never be found at either $x = +\infty$ or $x = -\infty$, and therefore the probability density vanishes at both these limits. It should also be mentioned that using the expression

$$\bar{p} = -i\hbar\int_{-\infty}^{\infty} \Psi(x,t)\frac{\partial \Psi^*(x,t)}{\partial x}\,dx$$

is equivalent to using the minus sign in (5-19), and it adds nothing new to the theory.

The ordering of terms is of no consequence in integrands that occur in expressions for the expectation values of quantities that are functions of position and/or time, such as (5-28) and (5-29), because no derivatives are involved.
Nevertheless, it is conventional to use the same ordering as is required in the expressions for the expectation value of the momentum.

Using the second of the operator associations of (5-32), we can evaluate the expectation value of the total energy $E$ of a particle in a state described by the wave function $\Psi(x,t)$, as follows

$$\bar{E} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,i\hbar\frac{\partial \Psi(x,t)}{\partial t}\,dx$$

But note that we can also use the energy equation, (5-9), to write $E$ in terms of $p$ and $V(x,t)$, and then employ the first of the operator associations of (5-32) to convert $p$ into an operator, obtaining

$$\bar{E} = \int_{-\infty}^{\infty} \Psi^*(x,t)\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x,t)\right]\Psi(x,t)\,dx$$

In fact, the expectation value of any dynamical quantity can be evaluated by using only the first of the operator associations of (5-32). That is, if $f(x,p,t)$ is any dynamical quantity which is a function of $x$, $p$, and possibly $t$, useful in describing the state of motion of the particle associated with the wave function $\Psi(x,t)$, then its expectation value $\overline{f(x,p,t)}$ is given by

$$\overline{f(x,p,t)} = \int_{-\infty}^{\infty} \Psi^*(x,t)\,f_{op}\!\left(x, -i\hbar\frac{\partial}{\partial x}, t\right)\Psi(x,t)\,dx$$ (5-34)

where the operator $f_{op}(x, -i\hbar\,\partial/\partial x, t)$ is obtained from the function $f(x,p,t)$ by everywhere replacing $p$ by $-i\hbar\,\partial/\partial x$.

We have found that the wave function $\Psi(x,t)$ contains more information than just the probability density $P(x,t) = \Psi^*(x,t)\Psi(x,t)$. The wave function also contains, through (5-34), the expectation value of the coordinate $x$, the potential energy $V$, the momentum $p$, the total energy $E$, and, in general, the expectation value of any dynamical quantity $f(x,p,t)$. In fact, the wave function contains all the information that the uncertainty principle will allow us to learn about the associated particle.

Example 5-9. Consider a particle of mass $m$ which can move freely along the x axis anywhere from $x = -a/2$ to $x = +a/2$, but which is strictly prohibited from being found outside this region. The particle bounces back and forth between the walls at $x = \pm a/2$ of a (one-dimensional) box. The walls are assumed to be completely impenetrable, no matter how energetic the particle is. Of course, this assumption is an idealization, but it is a very useful one. We shall study this problem in the following chapter, and we shall find that the wave function for the lowest energy state of the particle is

$$\Psi(x,t) = \begin{cases} A\cos\dfrac{\pi x}{a}\,e^{-iEt/\hbar} & -a/2 < x < +a/2 \\ 0 & x \le -a/2 \ \text{or}\ x \ge +a/2 \end{cases}$$

where $A$ is an arbitrary real constant, and $E$ is the total energy of the particle. This wave function is another one which is convenient for us to use in this chapter for illustrative purposes. Justify its use here by verifying that it is a solution to the Schroedinger equation in the region $-a/2 < x < +a/2$, and determine the value of $E$ for this lowest energy state.

If there are no forces acting on the particle in the region in question, the potential energy function must be constant in the region. As potential energies are always undefined to within an additive constant, we can take the value of the potential energy to be zero in the region. Then the Schroedinger equation in the region reads

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi}{\partial x^2} = i\hbar\frac{\partial \Psi}{\partial t} \qquad -a/2 < x < +a/2$$

We verify the wave function by substituting its derivatives into the equation. With

$$\Psi = A\cos\frac{\pi x}{a}\,e^{-iEt/\hbar}$$

we obtain

$$\frac{\partial \Psi}{\partial x} = -\frac{\pi}{a}A\sin\frac{\pi x}{a}\,e^{-iEt/\hbar}$$

$$\frac{\partial^2 \Psi}{\partial x^2} = -\left(\frac{\pi}{a}\right)^2 A\cos\frac{\pi x}{a}\,e^{-iEt/\hbar} = -\left(\frac{\pi}{a}\right)^2\Psi$$

and

$$\frac{\partial \Psi}{\partial t} = -\frac{iE}{\hbar}A\cos\frac{\pi x}{a}\,e^{-iEt/\hbar} = -\frac{iE}{\hbar}\Psi$$

Substitution yields

$$+\frac{\hbar^2}{2m}\frac{\pi^2}{a^2}\Psi = i\hbar\left(-\frac{iE}{\hbar}\right)\Psi$$

or

$$\frac{\pi^2\hbar^2}{2ma^2}\Psi = E\Psi$$

This is satisfied identically, providing $E$ has the value

$$E = \frac{\pi^2\hbar^2}{2ma^2}$$

Thus we have determined the required value of $E$ corresponding to the wave function we are dealing with, and have also verified that the wave function is a solution of the Schroedinger equation.

Figure 5-7 illustrates the wave function by a plot of its space dependence. Note that the interior (inside the box) values of $\Psi(x,t)$ join onto the exterior (outside the box) values of zero at the boundaries of the region at $x = -a/2$ and $x = +a/2$ (walls of the box) because the cosine function goes to zero when $x$ approaches $\pm a/2$. The exterior values of $\Psi(x,t)$ are zero, of course, because the wave function describes a particle which is strictly prohibited from being found outside the region. •

Figure 5-7 The x dependence, at fixed $t$, of a wave function for the lowest energy state of a particle strictly confined to a region of length $a$, but moving freely therein. Everywhere outside the region the value of the wave function is zero.

Example 5-10. Use the "particle-in-a-box" wave function treated in Example 5-9 to evaluate the expectation values of $x$, $p$, $x^2$, and $p^2$ for the particle associated with the wave function.

To evaluate $\bar{x}$ we must evaluate

$$\bar{x} = \int_{-\infty}^{\infty} \Psi^*\,x\,\Psi\,dx$$

Using the wave function of Example 5-9, this is

$$\bar{x} = \int_{-a/2}^{+a/2} A\cos\frac{\pi x}{a}\,e^{+iEt/\hbar}\,x\,A\cos\frac{\pi x}{a}\,e^{-iEt/\hbar}\,dx = A^2\int_{-a/2}^{+a/2} x\cos^2\frac{\pi x}{a}\,dx$$

where the integration has been restricted to the region from $-a/2$ to $+a/2$ since $\Psi(x,t)$ is zero outside this region. Now note that the integrand is a product of $\cos^2(\pi x/a)$, which is an even function of $x$, times $x$ itself, which is an odd function of $x$. The integrand is therefore an odd function of $x$. From this conclusion it follows that

$$\int_{-a/2}^{+a/2} x\cos^2\frac{\pi x}{a}\,dx = 0$$

because the integral of an integrand which is an odd function of the variable of integration is zero if the integration is taken over a range which is centered about its origin (see Example 5-8). Thus we obtain

$$\bar{x} = 0$$

A moment's thought should make it clear why measurements of the location of the particle which moves freely between $-a/2$ and $+a/2$ would be expected to average out to zero.
To evaluate $\bar{p}$, we evaluate

$$\bar{p} = \int_{-\infty}^{\infty} \Psi^*\left(-i\hbar\frac{\partial \Psi}{\partial x}\right)dx$$

Using the given $\Psi(x,t)$, and its x derivative which has been calculated in Example 5-9, we obtain

$$\bar{p} = -i\hbar\int_{-a/2}^{+a/2} A\cos\frac{\pi x}{a}\,e^{+iEt/\hbar}\left(-\frac{\pi}{a}\right)A\sin\frac{\pi x}{a}\,e^{-iEt/\hbar}\,dx$$

or

$$\bar{p} = i\hbar\frac{\pi}{a}A^2\int_{-a/2}^{+a/2} \cos\frac{\pi x}{a}\sin\frac{\pi x}{a}\,dx$$

Again, the integrand is, in total, an odd function of the variable of integration since it is the product of an even function $\cos(\pi x/a)$ times an odd function $\sin(\pi x/a)$. Thus we obtain

$$\bar{p} = 0$$

because the integral is taken over a range centered on the origin, and consequently it yields zero. Physically, the expectation value of the momentum of the particle is zero because, if the particle is confined to the region from $-a/2$ to $+a/2$ and moving with total energy $E$, it must be bouncing back and forth between the ends of the region and constantly reversing the sign (i.e., the direction) of its momentum. That is, the magnitude of its momentum must be such that $p^2/2m = E$ but, since it is equally probable that the sign of the momentum will be either positive or negative, measurements of this quantity will average out to zero.

In evaluating $\overline{x^2}$, we must evaluate the integral

$$\overline{x^2} = \int_{-\infty}^{\infty} \Psi^*\,x^2\,\Psi\,dx = \int_{-a/2}^{+a/2} A\cos\frac{\pi x}{a}\,e^{+iEt/\hbar}\,x^2\,A\cos\frac{\pi x}{a}\,e^{-iEt/\hbar}\,dx = A^2\int_{-a/2}^{+a/2} x^2\cos^2\frac{\pi x}{a}\,dx$$

This will not yield zero because the integrand is an even function of $x$. For the same reason we may, as in Example 5-7, immediately simplify the integral to obtain

$$\overline{x^2} = 2A^2\int_{0}^{+a/2} x^2\cos^2\frac{\pi x}{a}\,dx$$

If we multiply and divide by $(\pi/a)^3$, this can be written

$$\overline{x^2} = 2A^2\left(\frac{a}{\pi}\right)^3\int_{0}^{+\pi/2}\left(\frac{\pi x}{a}\right)^2\cos^2\frac{\pi x}{a}\;d\!\left(\frac{\pi x}{a}\right)$$

The integral can now be evaluated by consulting appropriate tables. We find

$$\overline{x^2} = \frac{A^2a^3}{4\pi^2}\left(\frac{\pi^2}{6} - 1\right)$$

In order to fully determine $\overline{x^2}$, we must also know the value of the constant $A$ that determines the amplitude of the wave function. As in Example 5-7, we can find the proper value by demanding that the wave function be normalized.
That is, we adjust $A$ so that the total probability of finding the particle somewhere is equal to one. The condition gives

$$\int_{-\infty}^{\infty} \Psi^*\Psi\,dx = A^2\int_{-a/2}^{+a/2} \cos^2\frac{\pi x}{a}\,dx = 2A^2\frac{a}{\pi}\int_{0}^{+\pi/2} \cos^2\frac{\pi x}{a}\;d\!\left(\frac{\pi x}{a}\right) = 1$$

Integrating, we obtain

$$2A^2\frac{a}{\pi}\frac{\pi}{4} = 1$$

or

$$A = \sqrt{\frac{2}{a}}$$

Thus we have

$$\overline{x^2} = \frac{2}{a}\frac{a^3}{4\pi^2}\left(\frac{\pi^2}{6} - 1\right) = \frac{a^2}{2\pi^2}\left(\frac{\pi^2}{6} - 1\right) = 0.033a^2$$

The quantity $\overline{x^2}$ is not zero, even though $\bar{x} = 0$, because any measurement of $x^2$ must necessarily yield a positive result. This quantity, or its square root $\sqrt{\overline{x^2}}$ (the root-mean-square position of statistical theory), can be taken as a measure of the fluctuations about the average, $\bar{x} = 0$, that would be observed in determinations of the position of the particle. The latter quantity has the value

$$\sqrt{\overline{x^2}} = 0.18a$$

The fluctuations arise because the particle is not always found at the same location, but instead at various locations, since the particle can be found wherever $\Psi^*\Psi$ has an appreciable value. (In this case where $\bar{x} = 0$, the quantity $\sqrt{\overline{x^2}}$ is a measure of the fluctuations. In a case where $\bar{x} \ne 0$, the quantity $\sqrt{\overline{x^2} - \bar{x}^2}$ is a measure of the fluctuations. Analogous comments apply to the momentum $p$.)

Finally, let us evaluate $\overline{p^2}$ from the expression

$$\overline{p^2} = \int_{-\infty}^{\infty} \Psi^*(-i\hbar)^2\frac{\partial^2 \Psi}{\partial x^2}\,dx = -\hbar^2\int_{-\infty}^{\infty} \Psi^*\frac{\partial^2 \Psi}{\partial x^2}\,dx$$

Using the value of $\partial^2\Psi/\partial x^2$ calculated in Example 5-9, we have

$$\overline{p^2} = \hbar^2\left(\frac{\pi}{a}\right)^2\int_{-\infty}^{\infty} \Psi^*\Psi\,dx$$

Of course the integral equals one since it is just the probability of finding the particle somewhere. If we were interested only in evaluating $\overline{p^2}$, we would not find it necessary to actually carry through the normalization procedure to evaluate $A$ since we can make this statement and immediately conclude that

$$\overline{p^2} = \left(\frac{\pi\hbar}{a}\right)^2$$

The square root of this quantity (the root-mean-square momentum)

$$\sqrt{\overline{p^2}} = \frac{\pi\hbar}{a}$$

is a measure of the fluctuations about the average, $\bar{p} = 0$, that would be observed in determinations of the momentum of the particle.
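The expectation values obtained so far in this example can be checked numerically. The sketch below assumes units in which $a = 1$ and $\hbar = 1$, evaluates the integrals with a simple trapezoidal rule, and applies the operator $-\hbar^2\,\partial^2/\partial x^2$ through a central finite difference; the helper names `integrate`, `psi`, and `d2psi` are illustrative only and are not part of the text.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Composite trapezoidal rule on [lo, hi] using n equal steps."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * step)
    return total * step

# Assumption: units with box length a = 1 and hbar = 1.
a = 1.0
hbar = 1.0
A = math.sqrt(2.0 / a)  # normalization constant found in this example

def psi(x):
    """Space dependence of the box wave function inside the box.
    The factors exp(+iEt/hbar) and exp(-iEt/hbar) cancel in every integrand."""
    return A * math.cos(math.pi * x / a)

def d2psi(x, h=1e-4):
    """Central finite-difference estimate of the second x derivative,
    using the interior cosine form of psi."""
    return (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / (h * h)

lo, hi = -a / 2, a / 2
norm = integrate(lambda x: psi(x) ** 2, lo, hi)
x2_bar = integrate(lambda x: x * x * psi(x) ** 2, lo, hi)
# p^2 is replaced by the operator (-i*hbar d/dx)^2 = -hbar^2 d^2/dx^2.
p2_bar = integrate(lambda x: psi(x) * (-hbar ** 2) * d2psi(x), lo, hi)

x_rms = math.sqrt(x2_bar)
p_rms = math.sqrt(p2_bar)
print("norm =", norm)
print("x2_bar / a^2 =", x2_bar / a ** 2, " x_rms / a =", x_rms / a)
print("p_rms * a / (pi*hbar) =", p_rms * a / (math.pi * hbar))
```

Under these assumptions the computed values should agree with $\overline{x^2} = 0.033a^2$, $\sqrt{\overline{x^2}} = 0.18a$, and $\sqrt{\overline{p^2}} = \pi\hbar/a$ to the accuracy quoted in the example.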
The fluctuations arise, as discussed above, because the particle can sometimes be found with momentum $p = +\sqrt{2mE}$ and sometimes with momentum $p = -\sqrt{2mE}$. If we evaluate

$$\sqrt{2mE} = \sqrt{\frac{2m\pi^2\hbar^2}{2ma^2}} = \frac{\pi\hbar}{a}$$

from Example 5-9, we note that $\sqrt{\overline{p^2}}$ is just equal to the magnitude of $p$.

If we define $\sqrt{\overline{x^2}}$ and $\sqrt{\overline{p^2}}$ as the uncertainties $\Delta x$ and $\Delta p$ in the position and momentum of the particle in the energy state we have been dealing with, we obtain

$$\Delta x\,\Delta p = \sqrt{\overline{x^2}}\sqrt{\overline{p^2}} = 0.18a\,\frac{\pi\hbar}{a} = 0.57\hbar$$

This is certainly consistent with the lower limit $\hbar/2$ set by the uncertainty principle. Note that this is the first time we have been able to become really quantitative when referring to the uncertainty principle. Expectation values calculated from wave functions make it possible to give quantitative definitions to the uncertainties. •

5-5 THE TIME-INDEPENDENT SCHROEDINGER EQUATION

The usefulness of wave functions more than justifies the work that is required to obtain them. This is done by solving Schroedinger's equation, (5-22)

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi(x,t)}{\partial x^2} + V(x,t)\Psi(x,t) = i\hbar\frac{\partial \Psi(x,t)}{\partial t}$$

using the potential energy function $V(x,t)$ that properly describes the forces acting on the particle of interest. We shall now take the first step in solving this partial differential equation. As we promised, we shall carefully develop the required mathematical procedures, assuming no previous knowledge of differential equations on the part of the student.

The standard technique for solving partial differential equations consists of searching for solutions in the form of products of functions, each of which contains only a single one of the independent variables that are involved in the equation. The technique, called the separation of variables, is used because it immediately reduces the partial differential equation to a set of ordinary differential equations. As we shall see, this is a significant simplification.

Here we are dealing with a partial differential equation involving a single space variable $x$ plus the time variable $t$. Thus the technique consists in searching for solutions in which the wave function $\Psi(x,t)$ can be written as the product

$$\Psi(x,t) = \psi(x)\varphi(t)$$ (5-35)

where the first term on the right side is a function of $x$ alone and the second term is a function of $t$ alone. We shall assume the existence of solutions of this form, substitute these solutions into the Schroedinger equation that they are supposed to satisfy, and see what happens. If our assumed form is invalid we shall, of course, soon find out. However, we shall actually find that solutions of the assumed form do exist, provided that the potential energy does not depend explicitly on the time $t$ so that the function can be written as $V(x)$. Since in quantum mechanics, as in classical mechanics, almost all systems have potential energies of this form, the condition is not a very serious restriction.

Separation of variables will lead to the conclusion that the function $\psi(x)$, which specifies the space dependence of the wave function $\Psi(x,t) = \psi(x)\varphi(t)$, is a solution to the differential equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x)$$

called the time-independent Schroedinger equation. Note that this equation is simpler than the Schroedinger equation for the same potential energy because it involves only one independent variable, $x$, and it is therefore an ordinary differential equation instead of a partial differential equation. The technique will give us even more information about the function $\varphi(t)$ specifying the time dependence of the wave function. In fact, it will show that $\varphi(t)$ satisfies a simple ordinary differential equation that can be solved immediately to yield the simple expression $\varphi(t) = e^{-iEt/\hbar}$, where $E$ is the total energy of the particle in the system.

Separation of variables is such a useful technique that we shall employ it on a number of occasions in the remainder of this book. Let us now carry through the details of its application to the Schroedinger equation. Substituting the assumed form of the solution, $\Psi(x,t) = \psi(x)\varphi(t)$, into the Schroedinger equation, and also restricting ourselves to time-independent potential energies that can be written as $V(x)$, we obtain

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \psi(x)\varphi(t)}{\partial x^2} + V(x)\psi(x)\varphi(t) = i\hbar\frac{\partial \psi(x)\varphi(t)}{\partial t}$$

Now

$$\frac{\partial^2 \psi(x)\varphi(t)}{\partial x^2} = \varphi(t)\frac{\partial^2 \psi(x)}{\partial x^2} = \varphi(t)\frac{d^2\psi(x)}{dx^2}$$

the notation $\partial^2\psi(x)/\partial x^2$ being redundant with $d^2\psi(x)/dx^2$ since $\psi(x)$ is a function of $x$ alone. Similarly

$$\frac{\partial \psi(x)\varphi(t)}{\partial t} = \psi(x)\frac{\partial \varphi(t)}{\partial t} = \psi(x)\frac{d\varphi(t)}{dt}$$

Therefore, we have

$$-\frac{\hbar^2}{2m}\varphi(t)\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x)\varphi(t) = i\hbar\,\psi(x)\frac{d\varphi(t)}{dt}$$

Dividing both sides of this equation by $\psi(x)\varphi(t)$, we obtain

$$\frac{1}{\psi(x)}\left[-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x)\right] = i\hbar\frac{1}{\varphi(t)}\frac{d\varphi(t)}{dt}$$ (5-36)

Note that the right side of (5-36) does not depend on $x$, while the left side does not depend on $t$. Consequently, their common value cannot depend on either $x$ or $t$. In other words, the common value must be a constant, which we shall call $G$. The result of this consideration is that (5-36) leads to two separate equations. One equation is obtained by setting the left side equal to the common value

$$\frac{1}{\psi(x)}\left[-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x)\right] = G$$ (5-37)

The other equation is obtained by setting the right side equal to the common value

$$i\hbar\frac{1}{\varphi(t)}\frac{d\varphi(t)}{dt} = G$$ (5-38)

The constant $G$ is called the separation constant, for the same reason that this technique for solving partial differential equations is called the separation of variables.
In retrospect, we see that the effect of employing the technique has been to convert the single partial differential equation, involving two independent variables $x$ and $t$, into a pair of ordinary differential equations, one involving $x$ alone and the other involving $t$ alone. These equations are coupled in the sense that they both contain the same separation constant $G$, but this type of coupling does not lead to any difficulty in obtaining solutions to the equations. We shall find that the time equation, (5-38), has a very simple solution. Furthermore, when we demand that this solution agree with the de Broglie-Einstein postulate, we shall see that the value of the separation constant $G$ becomes determined. Substituting this value of $G$ into the space equation, (5-37), we then have an ordinary differential equation, whose solutions can be obtained by employing one of the several standard techniques that have been developed for solving such equations.

What we have done, in effect, is to reduce the problem from that of solving the partial differential space-time Schroedinger equation, (5-22), to that of solving the ordinary differential space equation. The product of the solution of that equation and the solution of the time equation is the desired solution of the Schroedinger equation. We can see that the product form $\Psi(x,t) = \psi(x)\varphi(t)$, which we assumed for the wave function, is justified because we shall be able to carry out the procedure just outlined. We can also see that we cannot carry through the separation of (5-36), into the pair of equations that follow from it, if the potential energy function depends on both $x$ and $t$, as stated earlier. The reason is that we cannot then separate terms so that one side of the equation does not depend on $x$ while the other side does not depend on $t$.

The time equation, (5-38), is a simple first-order ordinary differential equation for $\varphi$ as a function of $t$.
There are several general techniques available for finding the solutions to such equations. All these techniques have a common feature; they involve assuming a general form for the solution, substituting this form into the differential equation and, from the resulting equation, determining the specific form required for the solution. After studying these techniques, it is often possible to develop enough intuition to be able to guess the specific form of the solution in the first instance, at least for fairly simple differential equations. This is a time saving and perfectly legitimate procedure, providing the guess is verified by substituting it into the differential equation and showing that the equation is satisfied, and this is the procedure that will usually be employed in this book. Consider (5-38) which, upon transposition, can be written as

$$\frac{d\varphi(t)}{dt} = -\frac{iG}{\hbar}\varphi(t) \tag{5-39}$$

This differential equation tells us that the function $\varphi(t)$, which is its solution, has the property that its first derivative is proportional to the function itself. Anyone with much experience in differentiating would not have difficulty in guessing that $\varphi(t)$ must be an exponential function. Therefore, let us assume that the solution to the differential equation is of the form

$$\varphi(t) = e^{\alpha t}$$

where $\alpha$ is a constant that will be determined shortly. We verify this assumed solution by differentiating it, to obtain

$$\frac{d\varphi(t)}{dt} = \alpha e^{\alpha t} = \alpha\varphi(t)$$

which we then substitute into (5-39). This yields

$$\alpha\varphi(t) = -\frac{iG}{\hbar}\varphi(t)$$

If we set

$$\alpha = -\frac{iG}{\hbar}$$

the assumed solution obviously satisfies the equation. Therefore

$$\varphi(t) = e^{-iGt/\hbar} \tag{5-40}$$

is a solution to (5-38) or (5-39). The solution $\varphi(t)$ is written in (5-40) as a complex exponential, but it can be written as

$$\varphi(t) = e^{-iGt/\hbar} = \cos\frac{Gt}{\hbar} - i\sin\frac{Gt}{\hbar} \tag{5-41a}$$

or

$$\varphi(t) = \cos 2\pi\frac{G}{h}t - i\sin 2\pi\frac{G}{h}t \tag{5-41b}$$

We see that $\varphi(t)$ is an oscillatory function of time of frequency $\nu = G/h$. But, according to the de Broglie-Einstein postulates of (5-8), the frequency must also be given by $\nu = E/h$, where $E$ is the total energy of the particle associated with the wave function corresponding to $\varphi(t)$. The reason is, of course, that $\varphi(t)$ is the function that specifies the time dependence of the wave function. Comparing these expressions, we see that the separation constant must be equal to the total energy of the particle. That is

$$G = E \tag{5-42}$$

Using this value of $G$ in the space equation, (5-37), that we obtained from the separation of variables, we have

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x) \tag{5-43}$$

Using this value of $G$ in the solution (5-40) to the time equation, so that we complete the specification of $\varphi(t)$, the product form of the wave function becomes

$$\Psi(x,t) = \psi(x)e^{-iEt/\hbar} \tag{5-44}$$

where $E$ is the total energy of the particle. Equation (5-43) is called the time-independent Schroedinger equation, because the time variable $t$ does not enter the equation. Its time-independent solutions $\psi(x)$ determine, through (5-44), the space dependence of the solutions $\Psi(x,t)$ to the Schroedinger equation. For the one-dimensional cases that we have been treating in this chapter, the time-independent Schroedinger equation can involve only one independent variable $x$, and it must, therefore, be an ordinary differential equation. However, if there are more space dimensions, the time-independent Schroedinger equation will involve more independent variables and will therefore be a partial differential equation. (It can usually be reduced to a set of ordinary differential equations, in such cases, by applying the technique of separation of variables.)
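The guess-and-verify procedure used for the time equation can also be checked numerically. The sketch below is only an illustration: it differentiates $\varphi(t) = e^{-iGt/\hbar}$ by a central difference and compares the result with the right side of (5-39), then confirms that $\varphi$ returns to its starting value after one period $h/G$. The value $G = 13.6$ eV, the time $t$, the step size, and the function names are assumptions chosen for the demonstration and do not come from the text.

```python
import cmath
import math

HBAR = 0.6582e-15  # eV-sec, value quoted in the table of constants

def phi(t, G):
    """Time factor of (5-40): phi(t) = exp(-i G t / hbar)."""
    return cmath.exp(-1j * G * t / HBAR)

def residual(t, G, dt=1e-20):
    """Relative mismatch between a central-difference estimate of
    d(phi)/dt and the right side of (5-39), -(i G / hbar) * phi."""
    dphi_dt = (phi(t + dt, G) - phi(t - dt, G)) / (2.0 * dt)
    rhs = -1j * G / HBAR * phi(t, G)
    return abs(dphi_dt - rhs) / abs(rhs)

G = 13.6      # eV, an illustrative value for the separation constant
T = 1.0e-16   # sec, an arbitrary time at which to test the equation

# One full oscillation takes 1/nu = h/G = 2*pi*hbar/G.
PERIOD = 2.0 * math.pi * HBAR / G

print("relative residual of (5-39):", residual(T, G))
print("|phi(t)|:", abs(phi(T, G)))
print("phi after one period:", phi(PERIOD, G))
```

The residual is limited only by the finite-difference step, and the unit modulus of $\varphi(t)$ shows that the time factor changes the phase of the wave function but not its magnitude.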
In all cases the time-independent Schroedinger equation does not contain the imaginary number $i$, and its solutions $\psi(x)$ are therefore not necessarily complex functions. (That is, $\psi(x)$ need not be complex, but it can be if convenience dictates.) This equation, and its solutions, are essentially identical to the time-independent differential equation for classical wave motion, and its solutions.

The functions $\psi(x)$ are called eigenfunctions. The first part, eigen, is the German word for characteristic. We shall subsequently get a better idea of why characteristic is appropriate terminology. Here it will suffice to say that its use is conventional. It is also conventional not to translate it into English, perhaps in honor of the dominant role played by German speaking physicists in the development of quantum mechanics. The student is cautioned to keep clearly in mind the difference between the eigenfunctions $\psi(x)$ and the wave functions $\Psi(x,t)$, and also the difference between the time-independent Schroedinger equation and the Schroedinger equation itself. Wave functions will always be represented by a capital letter $\Psi$; eigenfunctions will always be represented by a lower case letter $\psi$.

Example 5-11. Develop a plausibility argument, similar to the one given in Section 5-2, which leads directly to the time-independent Schroedinger equation.

We assume the equation must be consistent with the classical energy equation

$$\frac{p^2}{2m} + V = E$$

and also with the de Broglie postulate

$$p = \frac{h}{\lambda} = \hbar k$$

These two relations combine to yield

$$\frac{\hbar^2 k^2}{2m} + V = E \qquad \text{or} \qquad k^2 = \frac{2m}{\hbar^2}(E - V)$$

Then we assume that the space dependence of the wave function for a free particle is given by the sinusoidal $\psi(x)$

$$\psi(x) = \sin\frac{2\pi x}{\lambda} = \sin kx$$

The wave number $k$ is constant since the potential energy $V$ is constant for the case of a free particle, and since the total energy is constant also.
Differentiating $\psi(x)$ twice with respect to its only independent variable, we obtain

$$\frac{d\psi(x)}{dx} = k\cos kx$$

$$\frac{d^2\psi(x)}{dx^2} = -k^2\sin kx = -k^2\psi(x)$$

since $k$ is a constant. Now we substitute for $k^2$ the value found above, and obtain

$$\frac{d^2\psi(x)}{dx^2} = -\frac{2m}{\hbar^2}(E - V)\psi(x)$$

or

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V\psi(x) = E\psi(x)$$

This is the time-independent Schroedinger equation, but we have obtained it from an argument specific to the case of a free particle where $V$ is a constant. If, as in Section 5-2, we postulate that the equation is valid even in the general case where $V = V(x)$, we obtain the time-independent Schroedinger equation for a particle acted on by a force. We have followed a much longer route in the text to obtain the same equation, but we have, of course, learned much along the way that is not contained in the time-independent Schroedinger equation. For instance, we know about the time dependence of the wave function $\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$, which is responsible for its necessarily complex character and the many consequences resulting therefrom.

5-6 REQUIRED PROPERTIES OF EIGENFUNCTIONS

In the following section we shall consider, in a very general way, the problem of finding solutions to the time-independent Schroedinger equation. These considerations will show that energy quantization appears quite naturally in the Schroedinger theory. We shall see that this extremely significant property results from the fact that acceptable solutions to the time-independent Schroedinger equation can be found only for certain values of the total energy $E$.

To be an acceptable solution, an eigenfunction $\psi(x)$ and its derivative $d\psi(x)/dx$ are required to have the following properties:

$\psi(x)$ must be finite. $d\psi(x)/dx$ must be finite.
$\psi(x)$ must be single valued. $d\psi(x)/dx$ must be single valued.
$\psi(x)$ must be continuous. $d\psi(x)/dx$ must be continuous.

These requirements are imposed in order to ensure that the eigenfunction be a mathematically "well-behaved" function so that measurable quantities which can be evaluated from the eigenfunction will also be well-behaved. Figure 5-8 illustrates the meaning of these properties by plotting functions which are not finite, not single valued, or not continuous, at the point $x_0$. If $\psi(x)$ or $d\psi(x)/dx$ were not finite, or not single valued, then the same would be true for $\Psi(x,t) = e^{-iEt/\hbar}\psi(x)$ or $\partial\Psi(x,t)/\partial x = e^{-iEt/\hbar}\,d\psi(x)/dx$. Since the general formula for calculating expectation values of position or momentum, etc., (5-34), contains $\Psi(x,t)$ and $\partial\Psi(x,t)/\partial x$, we see that in any of these cases we might not obtain finite and definite values when we evaluate measurable quantities. This would be completely unacceptable because measurable quantities, like the expectation value of position $x$, or of momentum $p$, do not behave in unreasonable ways. (In very rare circumstances, which we shall not encounter, $\psi(x)$ may actually go to infinity at a point, providing it does so slowly enough to keep finite the integral of $\psi^*(x)\psi(x)$ over a region containing that point.)

Figure 5-8 Illustrating functions which are not finite, not single valued, or not continuous at a point $x_0$.

In order that $d\psi(x)/dx$ be finite, it is necessary that $\psi(x)$ be continuous. The reason is that any function always has an infinite first derivative wherever it has a discontinuity. The necessity for $d\psi(x)/dx$ to be continuous can be demonstrated by considering the time-independent Schroedinger equation, which we write as

$$\frac{d^2\psi(x)}{dx^2} = \frac{2m}{\hbar^2}[V(x) - E]\psi(x)$$

For finite $V(x)$, $E$, and $\psi(x)$, we see that $d^2\psi(x)/dx^2$ must be finite. This, in turn, demands that we require $d\psi(x)/dx$ to be continuous because any function that has a discontinuity in the first derivative will have an infinite second derivative at the same point.
(Note that there are discontinuities in the first derivative of the eigenfunction for the particle in a box, considered in Example 5-9. They occur at the walls of the box, and they arise from the fact that the system is an idealization in which the walls are assumed to be completely impenetrable, no matter how high the energy of the particle. That is, the potential energy is assumed to become infinite at the walls. This is discussed at length in the next chapter.)

The importance of these requirements on the properties of acceptable solutions to the time-independent Schroedinger equation cannot be overemphasized. Differential equations have a wide variety of possible solutions. It is only when we select from all the possible solutions those that conform to these requirements that we obtain energy quantization, or other equally significant properties of the Schroedinger theory that will be treated in the following chapter. The requirements of finiteness and continuity will be used immediately; single valuedness will not be used until later, but it is of equal importance.

5-7 ENERGY QUANTIZATION IN THE SCHROEDINGER THEORY

It is educational to study the problem of obtaining acceptable solutions to the time-independent Schroedinger equation with qualitative arguments that concern the curvatures and slopes of curves obtained by plotting the solution. As we shall see, these arguments are both very general and very simple. They can teach us about many important properties of the time-independent Schroedinger equation, while avoiding any involved mathematics. In fact, the point of view that we shall use in this section is very useful for making a preliminary investigation of the properties of almost any differential equation, and it also provides an intuitive understanding of the behavior of such equations. We shall obtain only qualitative conclusions from these arguments, but they will be quite valuable. A number of quantitative solutions to the time-independent Schroedinger equation for various potentials will be found in the following chapters. We shall obtain those solutions from standard analytical techniques for solving differential equations. A quantitative solution to the time-independent Schroedinger equation will also be found in Appendix G. That solution is obtained by using a numerical technique that is based on the same ideas used in the qualitative arguments of this section, and so the student may wish to read that appendix after reading this section.

We begin our arguments by writing the time-independent Schroedinger equation as

$$\frac{d^2\psi}{dx^2} = \frac{2m}{\hbar^2}[V(x) - E]\psi \tag{5-45}$$

The properties of this differential equation depend, among other things, upon the form of the potential energy function $V(x)$. This is as it should be since $V(x)$ determines the force acting on the particle whose behavior is supposed to be described by the solutions to the differential equation. We consequently cannot say much about the properties of the differential equation until we say something about $V(x)$, so we shall do this first. In Figure 5-9 we specify the form of $V(x)$ that we shall use in our arguments by plotting $V$ versus its independent variable $x$. The form has been chosen so that it contains features which will allow us to illustrate several interesting points, but the form also has physical significance.

Figure 5-9 The potential energy $V(x)$ for an atom that can be bound to a similar atom to form a diatomic molecule, plotted as a function of the separation $x$ between the centers of the two atoms. The equilibrium separation and the dissociation separation are marked on the $x$ axis.
It represents the potential energy for an atom that can be bound to a similar atom and form a diatomic molecule. In this case the $x$ coordinate represents the separation between the centers of the two atoms. The minimum in $V(x)$ occurs at the equilibrium separation, and at the minimum the force acting on the atom is $F = -dV(x)/dx = 0$. As the separation decreases from the equilibrium value a repulsive force develops in the direction of increasing separation, and it becomes larger as the atoms get closer. As the separation increases from the equilibrium value an attractive force develops in the direction of decreasing separation. But if the separation exceeds the dissociation separation indicated in Figure 5-9, the force drops to zero since the molecule is broken and the atoms no longer interact.

With our choice of $V(x)$ the time-independent Schroedinger equation, (5-45), begins to assume a specific form. Since this differential equation contains the total energy $E$ in a crucial location, however, we must also choose its value in order that the equation have properties which are specific enough to make them easy to discuss. The value that we choose is indicated in Figure 5-10 by the horizontal line: energy $= E =$ const. This figure also replots the curve: energy $= V(x)$. We choose the total energy $E$ in such a way that the molecule is bound (classically the separation distance $x$ between the atoms must be between the values $x'$ and $x''$ shown in the figure), but the exact value of $E$ that we choose is, at this stage, arbitrary. We shall not have to say anything about the combination of parameters $2m/\hbar^2$, appearing in the differential equation, other than that it has a positive value.

Our argument will consider the differential equation, (5-45), as a prescription which determines the value of the second derivative $d^2\psi/dx^2$ of the solution, at a certain $x$, in terms of the values of $(2m/\hbar^2)[V(x) - E]$ and of the solution $\psi$ itself, at that $x$.
This will allow us to study important properties of the equation in terms of the general shape of the curve traced by a plot of $\psi$ versus $x$. Thus we shall obtain a geometrical interpretation of the differential equation. We shall be particularly concerned with the sign of $d^2\psi/dx^2$ because it is a property of second derivatives that a curve, of the dependent variable plotted versus the independent variable, is concave upwards wherever the second derivative is positive and concave downwards wherever the second derivative is negative. Students not already familiar with this property should inspect Figure 5-11, which shows a case in which the slope of the curve of $\psi$ versus $x$ is negative for small $x$, becomes less negative with increasing $x$, goes through zero, and then becomes positive as $x$ continues to increase. The slope, which is equal to $d\psi/dx$, always increases in numerical value with increasing $x$. Therefore the rate of change of slope, which is equal to $d^2\psi/dx^2$, is always positive. The curve in this figure is said to be concave upwards. Figure 5-12 shows a case in which the curve is said to be concave downwards. Similar considerations prove that in this case $d^2\psi/dx^2$ is always negative.

Figure 5-10 The potential energy $V(x)$ used in qualitative arguments concerning the solutions to the time-independent Schroedinger equation, and the total energy $E$ chosen for these arguments. The line energy $= E$ intersects the curve $V(x)$ at $x'$ and $x''$.

Figure 5-11 A curve which is concave upwards. The value of the first derivative of the function plotted by the curve increases with increasing $x$, so the second derivative is positive.

Now note that in Figure 5-10 there are two intersections of the line energy $= E$ and the curve energy $= V(x)$. These intersections occur at $x = x'$ and $x = x''$, which divide the $x$ axis into three regions: $x < x'$, $x' < x < x''$, and $x > x''$. In the first and third regions the quantity $[V(x) - E]$ is positive since the value of $V(x)$ is everywhere greater than the value of $E$ in these regions.
In the second region $[V(x) - E]$ is negative. Inspection of (5-45) then shows that the sign of $d^2\psi/dx^2$ is the same as the sign of $\psi$ in the first and third regions, and it is opposite to the sign of $\psi$ in the second region, since the sign of $2m/\hbar^2$ is positive. This means that in the first and third regions the curve of $\psi$ versus $x$ will be concave upwards if the value of $\psi$ itself is positive, and it will be concave downwards if the value of $\psi$ is negative. In the second region the curve will be concave downwards if $\psi$ is positive, and it will be concave upwards if $\psi$ is negative. The various possibilities are shown in Figure 5-13.

Figure 5-12 A curve which is concave downwards. The value of the first derivative of the function decreases with increasing $x$, so the second derivative is negative.

Figure 5-13 Illustrating the relation between the sign of $\psi$ and the sign of $d^2\psi/dx^2$ in the regions defined by the sign of $[V(x) - E]$: region 1 ($x < x'$, $[V(x) - E] > 0$), region 2 ($x' < x < x''$, $[V(x) - E] < 0$), and region 3 ($x > x''$, $[V(x) - E] > 0$). The relation can be summarized by stating that $\psi$ is concave away from the $x$ axis wherever $[V(x) - E] > 0$, and concave toward the $x$ axis wherever $[V(x) - E] < 0$.

We have now laid the groundwork for our geometrical interpretation of the time-independent Schroedinger equation. For a given form of the potential energy $V(x)$, the differential equation enforces a relation between $d^2\psi/dx^2$ and $\psi$ that determines the general behavior of $\psi$. If we also specify the value of $\psi$ and its first derivative $d\psi/dx$ at some value of the independent variable $x$, then the particular behavior of the dependent variable $\psi$ is determined for all values of $x$. The situation is completely analogous to situations found in classical mechanics.
Consider the differential equation for a classical simple harmonic oscillator

$$\frac{d^2x}{dt^2} = -\frac{C}{m}x$$

This is just Newton's law of motion, $a = F/m$, with a linear restoring force of force constant $C$. In this case $x$ is the dependent variable, and the independent variable is $t$, but otherwise the analogy is complete. The differential equation enforces a relation between $x$ and its second derivative, which determines the general behavior of $x$ as a function of $t$. And if we also specify the value of $x$ and its first derivative $dx/dt$ at some value of $t$ (the initial conditions of the motion), then the particular behavior of $x$ is determined for all values of $t$.

Thus it should be possible to use the time-independent Schroedinger equation, for the $V(x)$ and $E$ we have chosen, to determine the behavior of $\psi$ for all $x$ in terms of assumed values of $\psi$ and $d\psi/dx$ for some particular $x$. Quantitative calculations that do this are found in the next chapters and, particularly, in Appendix G. Here we shall obtain qualitative results from arguments based upon the features of the differential equation just developed. The arguments will be presented as "thought calculations," in the same spirit as the thought experiments of Einstein or Bohr.

On curve 1 of Figure 5-14 we indicate qualitatively the results of a thought calculation, which started with assumed values of $\psi$ and $d\psi/dx$ at a convenient point $x_0$ in the second region, and then traced out the behavior of $\psi$ in the direction of increasing $x$. Since we took the initial value of $\psi$ to be positive in the region $x' < x < x''$, we found the curve describing $\psi$ initially to be concave downwards. It remained concave downwards until it passed into the third region, $x > x''$, where $[V(x) - E]$ changes sign. Although the slope of the curve was negative at $x = x''$, it soon became zero, and then positive. Then $\psi$ started to increase in value, and matters rapidly went from bad to worse.
The reason is that the differential equation shows that the rate of change of slope, i.e., $d^2\psi/dx^2$, is proportional to the distance from the curve to the axis, i.e., $\psi$. This first calculation produced a $\psi$ that goes to infinity as $x$ becomes large. We found (part of) a solution to the differential equation, but it was not an acceptable solution because an acceptable eigenfunction remains finite.

Curve 2 of Figure 5-14 indicates the results of another attempt made to find an acceptable solution. There was no point in changing the assumed initial value of $\psi$ as this would only expand or contract the vertical scale of the curve because of the linearity of the differential equation. What was done was to change the assumed initial value of $d\psi/dx$. The attempt was not successful because $\psi$ became negative in the region where $[V(x) - E]$ is positive. The curve became concave downwards and went to negative infinity.

The difficulty in obtaining an acceptable eigenfunction should now be apparent. It should also be apparent that, by making exactly the right choice for the initial value of $d\psi/dx$, it is possible to find a $\psi$ whose acceptable behavior with increasing $x$ is as indicated by curve 3 of Figure 5-14. For this $\psi$ the curve is concave upwards in the third region because it remains above the $x$ axis. Nevertheless, the curve does not turn up because it gets closer and closer to the axis with increasing $x$, and the closer it gets the less concave upwards it becomes. That is, $d^2\psi/dx^2$ approaches zero as $\psi$ approaches zero because the differential equation says these two quantities are proportional.

In Figure 5-14 we also indicate with a dashed curve the results of extending the $\psi$ of curve 3 in the direction of decreasing $x$. From the preceding discussion we must expect that, in general, $\psi$ will go to either positive or negative infinity when extended to decreasing $x$.
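The three attempts just described can be imitated with a short numerical experiment. The sketch below is a rough illustration under stated assumptions, not the calculation of Appendix G: in place of the molecular potential of Figure 5-10 it uses a simple harmonic oscillator potential in dimensionless form, so that (5-45) reads $\psi'' = (x^2 - \epsilon)\psi$ with $\epsilon$ playing the role of $E$; the trial value $\epsilon = 2$, the starting point $x_0 = 0$, the end point $X = 4$, the fourth-order Runge-Kutta stepping, and all names are choices made here for the demonstration. Because the equation is linear, the value of $\psi$ at $X$ is a linear function of the initial slope, and the slope that makes $\psi(X)$ vanish stands in for the "exactly right" choice of curve 3.

```python
def integrate(eps, psi0, dpsi0, x_end, n_steps=4000):
    """Integrate psi'' = (x**2 - eps) * psi from x = 0 to x_end with
    fourth-order Runge-Kutta steps; x_end may be negative."""
    h = x_end / n_steps
    x, psi, dpsi = 0.0, psi0, dpsi0
    for _ in range(n_steps):
        xm, xe = x + 0.5 * h, x + h
        k1p, k1d = dpsi, (x * x - eps) * psi
        k2p, k2d = dpsi + 0.5 * h * k1d, (xm * xm - eps) * (psi + 0.5 * h * k1p)
        k3p, k3d = dpsi + 0.5 * h * k2d, (xm * xm - eps) * (psi + 0.5 * h * k2p)
        k4p, k4d = dpsi + h * k3d, (xe * xe - eps) * (psi + h * k3p)
        psi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        x = xe
    return psi

EPS = 2.0  # trial energy, deliberately chosen between allowed values
X = 4.0    # stands in for "large x"

# Linearity: psi(X) = a + slope * b for initial value 1 and a given slope.
a = integrate(EPS, 1.0, 0.0, X)
b = integrate(EPS, 0.0, 1.0, X)
slope3 = -a / b  # the slope for which psi(X) = 0

curve1 = integrate(EPS, 1.0, slope3 + 0.1, X)  # slope slightly too large
curve2 = integrate(EPS, 1.0, slope3 - 0.1, X)  # slope slightly too small
curve3 = integrate(EPS, 1.0, slope3, X)        # the special slope
dashed = integrate(EPS, 1.0, slope3, -X)       # same solution, decreasing x

print("curve 1 at large x:", curve1)
print("curve 2 at large x:", curve2)
print("curve 3 at large x:", curve3)
print("curve 3 extended to small x:", dashed)
```

In this sketch the first two curves run off in opposite directions at large $x$, the third stays at the axis there, and the same third solution is far from the axis when followed in the direction of decreasing $x$, which is the behavior indicated by the dashed curve.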
This cannot be prevented by adjusting the initial choice of $d\psi/dx$, as that would disturb the acceptable behavior for large $x$. Nor can the infinite value of $\psi$ at small $x$ be prevented by joining two different $\psi$ functions with different slopes at $x = x_0$. This is ruled out by the requirement that for an acceptable eigenfunction $d\psi/dx$ is everywhere continuous. For a similar reason we cannot try a discontinuity in $\psi$ itself.

We are forced to conclude that, for the particular value of the total energy $E$ that was initially chosen, there is no acceptable solution to the time-independent Schroedinger equation. The relation between $\psi$ and its second derivative $d^2\psi/dx^2$, imposed by the differential equation for the given $V(x)$ and that $E$, is such that $\psi$ will approach $\pm\infty$ at either large $x$ or small $x$ (or both). The solution to the equation is unstable, in the sense that it has a pronounced tendency to go to infinity in regions where $E < V$.

Figure 5-14 Three attempts at finding an acceptable solution to a time-independent Schroedinger equation for an assumed value of the total energy $E$. The first two (1, 2) failed because the solution became infinite at large $x$. The third (3) gave the solution with acceptable behavior at large $x$, but failed because the solution became infinite at small $x$ (dashed curve).

By repeating this procedure for many different choices of the energy $E$, however, it will eventually be possible to find a value $E_1$ for which the time-independent Schroedinger equation has an acceptable solution $\psi_1$. In fact, there will, in general, be a number of allowed values of total energy, $E_1, E_2, E_3, \ldots$ for which the time-independent Schroedinger equation has acceptable solutions $\psi_1, \psi_2, \psi_3, \ldots$. In Figure 5-15 we indicate the form of the first three acceptable solutions. The behavior of $\psi_1$ for both small and large $x$ is the same as the behavior of the function shown in curve 3 of Figure 5-14 for large $x$. For $x < x_0$, the behavior of $\psi_2$ is at first similar to the behavior of $\psi_1$, but, since its second derivative is relatively larger in magnitude, $\psi_2$ crosses the axis at some value of $x$ less than $x_0$ but greater than $x'$. When this happens, the sign of the second derivative reverses and the function becomes concave upwards. At $x = x'$ the second derivative reverses again and, for $x < x'$, the function gradually approaches the $x$ axis.

Figure 5-15 The form of the acceptable eigenfunctions corresponding to the three lowest allowed energy states for a potential with a minimum. At $x = x_0$ all three eigenfunctions have the same value, but $\psi_3$ has the largest curvature because it corresponds to the highest energy of the three. The solutions are for the potential in Figure 5-10, and they are not accurately left-right symmetric because the potential is not symmetric about its minimum.

From Figure 5-15 we can see that the allowed energy $E_2$ is larger than the allowed energy $E_1$. Consider the point $x_0$ where both $\psi_1$ and $\psi_2$ have the same value. It is apparent from the figure that at this point the rate of change of the slope for the latter exceeds the same quantity for the former, i.e.

$$\left|\frac{d^2\psi_2}{dx^2}\right| > \left|\frac{d^2\psi_1}{dx^2}\right|$$

Using this in the time-independent Schroedinger equation, (5-45), we find that

$$|V(x) - E_2| > |V(x) - E_1|$$

Consulting Figure 5-10, it is clear that if this is true at $x_0$ then $E_2 > E_1$ since $E > V(x)$ at $x_0$. From a similar argument we can show that $E_3 > E_2$. It is also apparent that the energy differences $E_2 - E_1$, $E_3 - E_2$, etc., are not infinitesimals since, for example, the difference in the first inequality above is not an infinitesimal. Thus the allowed values of energy are well separated and form a discrete set of energies.
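The procedure of repeating the calculation for many choices of $E$ until an acceptable solution appears can be sketched as a simple search. As an assumption made purely for illustration, the code below again replaces the potential of Figure 5-10 by a simple harmonic oscillator in dimensionless form, $\psi'' = (x^2 - \epsilon)\psi$, for which the allowed values are known to be $\epsilon = 1, 3, 5, \ldots$. Since that potential is symmetric, the trial solution is started at the center with either zero slope or zero value, and the energy is adjusted by bisection until $\psi$ at a large $x$ changes sign between going to $+\infty$ and to $-\infty$. The bracketing intervals, the end point, the step count, and the function names are hypothetical choices, not part of the text.

```python
def psi_at_large_x(eps, psi0, dpsi0, x_end=5.0, n_steps=2000):
    """Integrate psi'' = (x**2 - eps) * psi from x = 0 to x_end with
    fourth-order Runge-Kutta steps and return psi(x_end)."""
    h = x_end / n_steps
    x, psi, dpsi = 0.0, psi0, dpsi0
    for _ in range(n_steps):
        xm, xe = x + 0.5 * h, x + h
        k1p, k1d = dpsi, (x * x - eps) * psi
        k2p, k2d = dpsi + 0.5 * h * k1d, (xm * xm - eps) * (psi + 0.5 * h * k1p)
        k3p, k3d = dpsi + 0.5 * h * k2d, (xm * xm - eps) * (psi + 0.5 * h * k2p)
        k4p, k4d = dpsi + h * k3d, (xe * xe - eps) * (psi + h * k3p)
        psi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        x = xe
    return psi

def find_allowed_energy(lo, hi, psi0, dpsi0, tol=1e-8):
    """Bisect on eps between lo and hi until the sign change of
    psi at large x is located to within tol."""
    f_lo = psi_at_large_x(lo, psi0, dpsi0)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = psi_at_large_x(mid, psi0, dpsi0)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # same runaway direction: move lower edge
        else:
            hi = mid               # opposite direction: move upper edge
    return 0.5 * (lo + hi)

# Symmetric solutions start with psi = 1, slope 0; antisymmetric ones
# start with psi = 0, slope 1.
e1 = find_allowed_energy(0.5, 1.5, 1.0, 0.0)
e2 = find_allowed_energy(2.5, 3.5, 0.0, 1.0)
e3 = find_allowed_energy(4.5, 5.5, 1.0, 0.0)

print("three lowest allowed values of eps:", e1, e2, e3)
```

The three values found are separated by finite amounts, in line with the conclusion that the allowed energies form a discrete set.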
For a particle moving under the influence of a time-independent potential $V(x)$, acceptable solutions to the time-independent Schroedinger equation exist only if the energy of the particle is quantized, that is, restricted to a discrete set of energies $E_1, E_2, E_3, \ldots$.

This statement is true as long as the relation between the potential energy $V(x)$ and the total energy $E$ is similar to that shown in Figure 5-10, in the sense that there are two values of the coordinate, $x'$ and $x''$, with $[V(x) - E]$ positive for all $x < x'$ and also positive for all $x > x''$. But for a potential of the type illustrated in Figure 5-9, that is, a potential which has a finite limiting value $V_l$ as $x$ becomes very large, there is generally room only for a finite number of discrete allowed energy values which satisfy the relation $E < V_l$. This is illustrated in Figure 5-16.

Figure 5-16 Illustrating discretely separated allowed energies $E_n$ lying below the limiting value $V_l$ of a potential $V(x)$, and the continuum of $E_n$ lying above. Since $E_{n+1} - E_n$ decreases as $V(x)$ approaches $V_l$, if the approach is gradual enough there can be an infinite number of $E_n < V_l$. But generally there are only a finite number.

For $E > V_l$, the situation changes. Now the molecule is unbound (classically the separation distance $x$ between the atoms could be any value larger than $x'$). As far as the time-independent Schroedinger equation is concerned, there are now only two regions of the $x$ axis: $x < x'$ and $x > x'$. In the second region $[V(x) - E]$ will be negative for all values of $x$, no matter how large. But, when $[V(x) - E]$ is negative, $\psi$ is concave downwards if its value is positive, and concave upwards if its value is negative. It always tends to return to the axis and is, therefore, an oscillatory function. Consequently, there will be no problem of $\psi(x)$ going to infinity for large values of $x$.
Since we can always make $\psi(x)$ gradually approach the axis for small values of $x$ by a proper initial choice of $d\psi/dx$, we shall be able to find an acceptable eigenfunction for any value of $E > V_l$. Thus the allowed energy values for $E > V_l$ are continuously distributed, and are said to form a continuum. It is evident that if the potential $V(x)$ is restricted in value for small values of $x$, or for both large and small values of $x$, then the allowed energies will form a continuum for all energies greater than the lowest $V_l$.

The conclusion of our arguments can be stated concisely as follows: When the relation between the total energy of a particle and its potential energy is such that classically the particle would be bound to a limited region of space because the potential energy would exceed the total energy outside the region, then Schroedinger theory predicts that the total energy is quantized. When that relation is such that the particle is not bound to a limited region, then the theory predicts the total energy can have any value.

Since in classical mechanics a particle bound to a limited region would move periodically between the limits of the region, the Wilson-Sommerfeld rules of the old quantum theory would also predict a quantization of the particle's energy in such circumstances; but these quantization rules were a postulate of the old quantum theory, which had a justification in the de Broglie relation only for certain special cases. In his first paper on quantum mechanics, Schroedinger wrote: "The essential point is the fact that the mysterious 'requirement of integralness' no longer enters into the quantization rules but has been traced, so to speak, a step further back having been shown to result from the finiteness and single-valuedness of a certain space function (an eigenfunction)."
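The contrast between bound and unbound energies can be seen in a small numerical sketch. Everything specific in it is an assumption made for illustration: the potential is a square well that is zero for $|x| < 1$ and equal to a limiting value $V_l = 10$ outside, units are chosen so that $2m/\hbar^2 = 1$, the solution is started at the center with $\psi = 1$ and zero slope, and the names are hypothetical. The code follows $\psi'' = [V(x) - E]\psi$ outward and records the largest magnitude that $\psi$ reaches.

```python
V_LIMIT = 10.0    # limiting value of the potential outside the well
HALF_WIDTH = 1.0  # the well occupies |x| < HALF_WIDTH

def potential(x):
    """Square well: zero inside, V_LIMIT outside (illustrative choice)."""
    return 0.0 if abs(x) < HALF_WIDTH else V_LIMIT

def max_abs_psi(energy, x_end=10.0, n_steps=20000):
    """Integrate psi'' = (V(x) - E) * psi from x = 0 (psi = 1, slope 0)
    to x_end with Runge-Kutta steps; return the largest |psi| met."""
    h = x_end / n_steps
    x, psi, dpsi = 0.0, 1.0, 0.0
    biggest = abs(psi)
    for _ in range(n_steps):
        c0 = potential(x) - energy
        cm = potential(x + 0.5 * h) - energy
        ce = potential(x + h) - energy
        k1p, k1d = dpsi, c0 * psi
        k2p, k2d = dpsi + 0.5 * h * k1d, cm * (psi + 0.5 * h * k1p)
        k3p, k3d = dpsi + 0.5 * h * k2d, cm * (psi + 0.5 * h * k2p)
        k4p, k4d = dpsi + h * k3d, ce * (psi + h * k3p)
        psi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dpsi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        x += h
        biggest = max(biggest, abs(psi))
    return biggest

# Energies above the limiting value: psi keeps oscillating.
for energy in (11.0, 12.5, 15.0, 20.0):
    print("E =", energy, " largest |psi| =", max_abs_psi(energy))

# An arbitrary energy below the limiting value: psi runs away.
print("E = 5.0  largest |psi| =", max_abs_psi(5.0))
```

For every energy tried above the limiting value the solution stays bounded, while the arbitrarily chosen energy below it produces a runaway solution, as the argument of this section leads us to expect.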
Example 5-12. Use the arguments developed in this section to draw qualitative conclusions concerning the form of the eigenfunction for one of the higher energy states of a simple harmonic oscillator. Then compare the corresponding probability density function with what would be predicted for a classical simple harmonic oscillator of the same energy.

The potential $V(x)$ for a simple harmonic oscillator (see Example 5-3) is plotted by the curve in Figure 5-17. In the same figure one of the higher allowed values of the total energy $E$ is plotted by a horizontal line.

Figure 5-17 The potential energy $V(x)$ and one of the higher allowed values of the total energy $E$ for a simple harmonic oscillator.

According to the time-independent Schroedinger equation, (5-45)

$$\frac{d^2\psi}{dx^2} = \frac{2m}{\hbar^2}[V(x) - E]\psi$$

the eigenfunction $\psi$ will be an oscillatory function throughout the region where $[V(x) - E]$ is negative since $d^2\psi/dx^2$ will be negative (concave downwards) if $\psi$ is positive in that region, while $d^2\psi/dx^2$ will be positive (concave upwards) if $\psi$ is negative in that region. However, $\psi$ will oscillate less rapidly near the ends of the region than it does near the center since the magnitude of $d^2\psi/dx^2$, which determines the rapidity of oscillation of $\psi$, is proportional to the magnitude of $[V(x) - E]$, and the difference between $V(x)$ and $E$ becomes smaller as the ends of the region are approached. Therefore, the separation between the nodes of the oscillatory function increases near the ends of the region, in the manner indicated in Figure 5-18. The figure shows the amplitude of the oscillations in $\psi$ increasing as the ends of the region are approached. The reason is that $\psi$ must become larger in magnitude where it "bends over," if $[V(x) - E]$ becomes smaller in magnitude, in order that $d^2\psi/dx^2$, which is proportional to their product, continue to have a large enough magnitude to make it bend.
Note that Figure 5-18 indicates that $\psi$ gradually approaches the axis outside the region where $[V(x) - E]$ is negative, as is required for an acceptable bound state eigenfunction. Also note that as $\psi$ crosses the points where $[V(x) - E]$ changes sign, it has no curvature because both that quantity and $d^2\psi/dx^2$ are zero at these points.

Figure 5-18 The eigenfunction for the thirteenth allowed energy of the simple harmonic oscillator. The classical limits of motion are indicated by $x'$ and $x''$.

Figure 5-19 The solid curve is the probability density function for the thirteenth allowed energy of the simple harmonic oscillator. The dashed curve is the classical probability density function for simple harmonic motion with the same energy, and it follows closely the average value of the fluctuating quantum mechanical function. Compare with these functions for the first allowed energy shown in Figure 5-3.

The probability density function is essentially the square of $\psi$, and is indicated in Figure 5-19 by a solid curve. The dashed curve in the same figure indicates the probability density that would be expected in classical mechanics for a particle executing simple harmonic oscillations in the same potential with the same total energy. As we discussed at length in Example 5-6, the classical probability density becomes relatively large near the ends of the region where $[V(x) - E]$ is negative since the particle moves most slowly near the ends. The figure actually shows the classical and quantum mechanical probability densities for a state of only moderately large energy $E$ (actually $E_{13}$), but it makes quite apparent the nature of the correspondence between the probability densities found in the classical limit of very large values of $E$ ($E_n$ as $n \to \infty$).
In this limit the quantum mechanical probability density fluctuates within such small distances that only its average behavior, which agrees with the classical prediction, can be detected experimentally. Also, in the classical limit the quantum mechanical probability density does not penetrate a measurable distance outside the region where $[V(x) - E]$ is negative because the penetration distance is comparable to the distance in which it fluctuates. This agrees with the sharp cutoff predicted by the classical probability density. For an idealized simple harmonic oscillator, $V(x)$ remains proportional to $x^2$ even for very large values of $x^2$, and so all the allowed energies are discretely separated.

5-8 SUMMARY

A particular quantum mechanical system is described by a particular potential energy function. We have found that if the potential is time-independent, i.e., can be written $V(x)$, the Schroedinger equation for the potential leads immediately to the corresponding time-independent Schroedinger equation. We have also found that acceptable solutions to the time-independent Schroedinger equation exist only for certain values of the energy, which we list in order of increasing energy as

$$E_1, E_2, E_3, \ldots, E_n, \ldots$$

These energies are called the eigenvalues of the potential $V(x)$; a particular potential has a particular set of eigenvalues. The eigenvalues early in the list may be discretely separated in energy. However, unless the potential increases without limit for both very large and very small values of $x$, the eigenvalues become continuously distributed in energy beyond a certain energy. Corresponding to each eigenvalue is an eigenfunction

$$\psi_1(x), \psi_2(x), \psi_3(x), \ldots, \psi_n(x), \ldots$$

which is a solution to the time-independent Schroedinger equation for the potential $V(x)$. For each eigenvalue there is also a corresponding wave function

$$\Psi_1(x,t), \Psi_2(x,t), \Psi_3(x,t), \ldots, \Psi_n(x,t), \ldots$$
From (5-44) we know that these wave functions are

$$\psi_1(x)e^{-iE_1t/\hbar},\ \psi_2(x)e^{-iE_2t/\hbar},\ \psi_3(x)e^{-iE_3t/\hbar},\ \ldots,\ \psi_n(x)e^{-iE_nt/\hbar},\ \ldots$$

Each wave function is a solution to the Schroedinger equation for the potential $V(x)$. The index $n$, which takes on successive integral values, and which is employed to designate a particular eigenvalue and its corresponding eigenfunction and wave function, is called the quantum number. If the system is described by the wave function $\Psi_n(x,t)$, it is said to be in the quantum state $n$.

Each of the wave functions $\Psi_n(x,t)$ is a particular solution to the Schroedinger equation for the potential $V(x)$. Since that equation is linear in the wave function, we expect that any linear combination of these functions will also be a solution. This was verified in Example 5-2 for the case of a linear combination of two wave functions, but the proof can clearly be extended to show that an arbitrary linear combination of all wave functions which are solutions to the Schroedinger equation for a particular potential $V(x)$, i.e.,

$$\Psi(x,t) = c_1\Psi_1(x,t) + c_2\Psi_2(x,t) + \cdots + c_n\Psi_n(x,t) + \cdots \qquad (5\text{-}46)$$

is also a solution to that Schroedinger equation. In fact, this expression gives the most general form of the solution to the Schroedinger equation for a potential $V(x)$. Its generality can be appreciated by noting that it is a function which is composed of a very large number of different functions combined in proportions governed by the adjustable constants $c_n$.

It should be noted that the time-independent Schroedinger equation is also a linear equation but, in contrast to the Schroedinger equation, it contains explicitly the total energy $E$. Therefore, an arbitrary linear combination of different solutions will satisfy the equation only if they all correspond to the same value of $E$. We shall see in the next chapter that there are two different solutions to the time-independent Schroedinger equation that do correspond to the same value of $E$ because the equation involves a second derivative.
We shall also see that both solutions are not always acceptable, even for an allowed value of $E$.

Example 5-13. When a particle is in a state such that a measurement of its total energy can lead only to a single result, the eigenvalue $E$, it is described by the wave function

$$\Psi = \psi(x)e^{-iEt/\hbar}$$

An example (whose three-dimensionality makes no difference here) would be an electron in the ground state of a hydrogen atom. In this case, the probability density function

$$\Psi^*\Psi = \psi^*(x)e^{+iEt/\hbar}\,\psi(x)e^{-iEt/\hbar} = \psi^*(x)\psi(x)$$

does not depend on time, as we have seen before. Consider a particle in a state such that a measurement of its total energy could lead to either of two results, the eigenvalue $E_1$ or the eigenvalue $E_2$. Then the wave function describing the particle is

$$\Psi = c_1\psi_1(x)e^{-iE_1t/\hbar} + c_2\psi_2(x)e^{-iE_2t/\hbar}$$

An example would be an electron that is in the process of making a transition from an excited state to the ground state of the atom. Show that in this case the probability density function is an oscillatory function of time, and calculate the oscillation frequency.

We have for the probability density

$$\Psi^*\Psi = \left[c_1^*\psi_1^*(x)e^{+iE_1t/\hbar} + c_2^*\psi_2^*(x)e^{+iE_2t/\hbar}\right]\left[c_1\psi_1(x)e^{-iE_1t/\hbar} + c_2\psi_2(x)e^{-iE_2t/\hbar}\right]$$

Expanding the product gives

$$\Psi^*\Psi = c_1^*c_1\psi_1^*(x)\psi_1(x) + c_2^*c_2\psi_2^*(x)\psi_2(x) + c_2^*c_1\psi_2^*(x)\psi_1(x)e^{i(E_2-E_1)t/\hbar} + c_1^*c_2\psi_1^*(x)\psi_2(x)e^{-i(E_2-E_1)t/\hbar} \qquad (5\text{-}47)$$

The time dependences cancel in the first two terms, but not in the last two. These two terms contain complex exponentials that oscillate in time at frequency $\nu$. By rewriting the complex exponentials as in (5-41a) and (5-41b), we see immediately that

$$\nu = \frac{E_2 - E_1}{2\pi\hbar} = \frac{E_2 - E_1}{h} \qquad (5\text{-}48)$$

Some very interesting comments can be made about the results of Example 5-13. Consider an electron in the ground state of a hydrogen atom. Since the electron could be found at any location where the probability density has an appreciable value, the charge it carries would not be confined to a particular location.
Thus, when speaking of average properties of the electron in the atom, it is appropriate to speak of its charge distribution, which is proportional to its probability density. Since the probability density is independent of time in the ground state, the charge distribution is also. But even in classical electromagnetism a static distribution of charge does not emit radiation. We see that quantum mechanics provides a way of resolving the paradox of old quantum theory concerning the stability, against the emission of radiation, of atoms in their ground states.

Atoms that are excited do emit radiation, and they eventually return to their ground states. Consider an electron in the process of making a transition from an excited state to the ground state of a hydrogen atom. Its probability density, and therefore the associated charge distribution, are oscillating in time at the frequency given by (5-48)

$$\nu = \frac{E_2 - E_1}{h}$$

where $E_2$ is the energy of the excited state and $E_1$ is the energy of the ground state. According to classical electromagnetism, this charge distribution would be expected to emit radiation at the same frequency; but this is also precisely the frequency of the photon that Bohr and Einstein say should be emitted, since the energy carried by the photon is $E_2 - E_1$. Of course this cannot happen for an electron in the ground state of the atom because there is no state of lower energy for the ground state to mix with and produce an oscillatory probability density or charge distribution.

In addition to predicting correctly the frequencies of the photons emitted in atomic transitions, quantum mechanics also predicts correctly the probabilities per second that the transitions will take place. We shall obtain these predictions in Chapter 8 by a simple extension of the calculation of Example 5-13. It will be seen there that the perplexing selection rules of old quantum theory follow as an immediate consequence of these predictions.
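The oscillation of the mixed-state probability density at the frequency of (5-48) can be checked with a short numerical sketch. It is an illustration under stated assumptions rather than part of the text: the two states are taken to be the first two states of a particle strictly confined to $-a/2 < x < +a/2$, with the standard forms $\psi_1 = \sqrt{2/a}\cos(\pi x/a)$ and $\psi_2 = \sqrt{2/a}\sin(2\pi x/a)$ and energies $E_n = n^2\pi^2\hbar^2/2ma^2$, in units where $\hbar = m = a = 1$. The names `energy`, `density`, and `PERIOD` are hypothetical.

```python
import cmath
import math

HBAR = 1.0   # assumed units: hbar = m = a = 1
MASS = 1.0
A = 1.0      # width of the confining region


def energy(n):
    """Energy of the n-th state of a particle confined to a region of width A."""
    return (n * math.pi * HBAR) ** 2 / (2.0 * MASS * A ** 2)


def psi1(x):
    """Ground-state eigenfunction for -A/2 < x < +A/2."""
    return math.sqrt(2.0 / A) * math.cos(math.pi * x / A)


def psi2(x):
    """First-excited-state eigenfunction for -A/2 < x < +A/2."""
    return math.sqrt(2.0 / A) * math.sin(2.0 * math.pi * x / A)


def density(x, t, c1, c2):
    """Probability density of c1*Psi_1 + c2*Psi_2 at position x and time t."""
    wave = (c1 * psi1(x) * cmath.exp(-1j * energy(1) * t / HBAR)
            + c2 * psi2(x) * cmath.exp(-1j * energy(2) * t / HBAR))
    return abs(wave) ** 2


# Frequency from (5-48), using h = 2*pi*hbar, and the corresponding period.
NU = (energy(2) - energy(1)) / (2.0 * math.pi * HBAR)
PERIOD = 1.0 / NU

if __name__ == "__main__":
    c = 1.0 / math.sqrt(2.0)            # equal mixture, normalized
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        t = fraction * PERIOD
        print(f"t = {fraction:4.2f} T   density at x = 0.2: {density(0.2, t, c, c):.4f}")
```

With equal mixing coefficients the printed density should return to its starting value after one period $1/\nu$, while setting either coefficient to zero removes the time dependence entirely, in line with the single-state case of Example 5-13.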
Schroedinger stressed the fact that his theory provides a physical picture of the process of emission of radiation by excited atoms that is very much more appealing than that provided by the Bohr theory. In discussing the advantages of his theory, he wrote: "It is hardly necessary to point out how much more gratifying it would be to conceive of a quantum transition as an energy change from one vibrational mode to another than to regard it as a jumping of electrons."

QUESTIONS

1. Why are there difficulties in applying the de Broglie postulate, $\lambda = h/p$, to a particle whose linear momentum is of changing magnitude?
2. How does the de Broglie postulate enter into the Schroedinger theory?
3. Is the experimental evidence that the de Broglie-Einstein relation, $\nu = E/h$, applies to wave functions for material particles as firm as the evidence that it applies to electromagnetic waves and photons? Is the evidence that it applies to wave functions as firm as the evidence that $\lambda = h/p$ applies to wave functions?
4. What would be the effect on the Schroedinger theory of changing the definition of total energy in the relation $\nu = E/h$ by adding the constant rest mass energy of the particle?
5. Why is the Schroedinger equation not valid for relativistic particles?
6. Did Newton derive his laws of motion, or did he obtain them from plausibility arguments?
7. Give a reason why the Schroedinger equation is written in terms of the potential energy, and not in terms of the force.
8. Why is it so important for the Schroedinger equation to be linear in the wave function?
9. The mass $m$ of a particle appears explicitly in Schroedinger's equation, but its charge $e$ does not, even though both may affect its motion. Why?
10. The wave equations of classical physics contain a second space derivative and a second time derivative. The Schroedinger equation contains a second space derivative and a first time derivative. Use these facts to explain why the solutions to the classical wave equations can be real functions, while the solutions to the Schroedinger equation must be complex functions.
11. Why does the Schroedinger equation contain a first time derivative?
12. Explain why it is not possible to measure the value of a complex quantity.
13. In electromagnetism we compute the intensity of a wave by taking the square of its amplitude. Why do we not do exactly the same thing with quantum mechanical waves?
14. Consider a water wave traveling across the surface of the ocean. If no one were observing the wave, or even thinking about it, would you say that the wave exists? Would you automatically give the same answer for a quantum mechanical wave? If not, why not?
15. What is the basic connection between the properties of a wave function and the behavior of the associated particle?
16. Why does the probability density function have to be everywhere real, non-negative, and of finite and definite value?
17. Explain in words what is meant by normalization of a wave function.
18. If the normalization condition is not applied, why can a wave function be multiplied by any constant factor and still remain a solution to the Schroedinger equation?
19. Why does Schroedinger quantum mechanics provide only statistical information? In your opinion, does this reflect a failing of the theory, or a property of nature?
20. Since the wave function describing the behavior of a particle satisfies a differential equation, its evolution in time is perfectly predictable. How does this fact fit in with the uncertainty principle?
21. State in words the meaning of the expectation value of $x$.
22. Why is it necessary to use a differential operator in calculating the expectation value of $p$?
23. Are there other examples in science, engineering, or mathematics in which differential operators are related to physical quantities?
24. Do you think it is legitimate to say that we have solved a differential equation by guessing the form of the solution and then verifying the guess by substitution?
25. Explain briefly the meaning of a well-behaved eigenfunction in the context of Schroedinger quantum mechanics.
26. Why must an eigenfunction be well behaved in order to be acceptable in the Schroedinger theory?
27. Explain in two or three sentences how the quantization of energy is related to the well-behaved character of acceptable eigenfunctions.
28. Why is $\psi$ necessarily an oscillatory function if $V(x) < E$?
29. Why does $\psi$ tend to go to infinity if $V(x) > E$?
30. Is it ever possible for an allowed value of the total energy $E$ of a system to be less than the minimum value of its potential energy $V(x)$? Give a qualitative argument, along the lines of the arguments in Section 5-7, to justify your answer.
31. We have seen several examples of the general result that the lowest allowed value of the total energy $E$, for a particle bound in a potential $V(x)$, lies above the minimum value of $V(x)$. Use the uncertainty principle in a qualitative argument to explain why this must be so.
32. If a particle is not bound in a potential, its total energy is not quantized. Does this mean the potential has no effect on the behavior of the particle? What effect would you expect it to have?

PROBLEMS

1. If the wave functions $\Psi_1(x,t)$, $\Psi_2(x,t)$, and $\Psi_3(x,t)$ are three solutions to the Schroedinger equation for a particular potential $V(x,t)$, show that the arbitrary linear combination $\Psi(x,t) = c_1\Psi_1(x,t) + c_2\Psi_2(x,t) + c_3\Psi_3(x,t)$ is also a solution to that equation.
2. At a certain instant of time, the dependence of a wave function on position is as shown in Figure 5-20. (a) If a measurement that could locate the associated particle in an element $dx$ of the $x$ axis were made at that instant, where would it most likely be found? (b) Where would it least likely be found? (c) Are the chances better that it would be found at any positive value of $x$, or are they better that it would be found at any negative value of $x$? (d) Make a rough sketch of the potential $V(x)$ which gives rise to the wave function. (e) To which allowed energy does the wave function correspond?

Figure 5-20 The space dependence of a wave function considered in Problem 2, evaluated at a certain instant of time.

3. (a) Determine the frequency $\nu$ of the time-dependent part of the wave function, quoted in Example 5-3, for the lowest energy state of a simple harmonic oscillator. (b) Use this value of $\nu$, and the de Broglie-Einstein relation $E = h\nu$, to evaluate the total energy $E$ of the oscillator. (c) Use this value of $E$ to show that the limits of the classical motion of the oscillator, found in Example 5-6, can be written as $x = \pm\hbar^{1/2}/(Cm)^{1/4}$.
4. By evaluating the classical normalization integral in Example 5-6, determine the value of the constant $B^2$ which satisfies the requirement that the total probability of finding the particle in the classical oscillator somewhere between its limits of motion must equal one.
5. Use the results of Examples 5-5, 5-6, and 5-7 to evaluate the probability of finding a particle, in the lowest energy state of a quantum mechanical simple harmonic oscillator, within the limits of the classical motion. (Hint: (i) The classical limits of motion are expressed in a convenient form in the statement of Problem 3c. (ii) The definite integral that will be obtained can be expressed as a normal probability integral, or an error function.
It can then be evaluated immediately by consulting mathematical handbooks which tabulate these quantities. Or, the integral can easily be evaluated by expanding the exponential as an infinite series before integrating, and then integrating the first few terms in the series. Alternatively, the definite integral can be evaluated by plotting the integrand on graph paper, and counting squares to find the area enclosed between the integrand, the axis, and the limits.)
6. At sufficiently low temperature, an atom of a vibrating diatomic molecule is a simple harmonic oscillator in its lowest energy state because it is bound to the other atom by a linear restoring force. (The restoring force is linear, at least approximately, because the molecular vibrations are very small.) The force constant $C$ for a typical molecule has a value of about $C \sim 10^3$ nt/m. The mass of the atom is about $m \sim 10^{-26}$ kg. (a) Use these numbers to evaluate the limits of the classical motion from the formula quoted in Problem 3c. (b) Compare the distance between these limits to the dimensions of a typical diatomic molecule, and comment on what this comparison implies concerning the behavior of such a molecule at very low temperatures.
7. (a) Use the particle in a box wave function verified in Example 5-9, with the value of $A$ determined in Example 5-10, to calculate the probability that the particle associated with the wave function would be found in a measurement within a distance of $a/3$ from the right-hand end of the box of length $a$. The particle is in its lowest energy state. (b) Compare with the probability that would be predicted classically from a very simple calculation related to the one in Example 5-6.
8. Use the results of Example 5-9 to estimate the total energy of a neutron of mass about $10^{-27}$ kg which is assumed to move freely through a nucleus of linear dimensions of about $10^{-14}$ m, but which is strictly confined to the nucleus. Express the estimate in MeV. It will be close to the actual energy of a neutron in the lowest energy state of a typical nucleus.
9. (a) Following the procedure of Example 5-9, verify that the wave function

$$\Psi(x,t) = \begin{cases} A\sin\dfrac{2\pi x}{a}\,e^{-iEt/\hbar} & -a/2 < x < +a/2 \\ 0 & x < -a/2 \ \text{or}\ x > +a/2 \end{cases}$$

is a solution to the Schroedinger equation in the region $-a/2 < x < +a/2$ for a particle which moves freely through the region but which is strictly confined to it. (b) Also determine the value of the total energy $E$ of the particle in this first excited state of the system, and compare with the total energy of the ground state found in Example 5-9. (c) Plot the space dependence of this wave function. Compare with the ground state wave function of Figure 5-7, and give a qualitative argument relating the difference in the two wave functions to the difference in the total energies of the two states.
10. (a) Normalize the wave function of Problem 9, by adjusting the value of the multiplicative constant $A$ so that the total probability of finding the associated particle somewhere in the region of length $a$ equals one. (b) Compare with the value of $A$ obtained in Example 5-10 by normalizing the ground state wave function. Discuss the comparison.
11. Calculate the expectation value of $x$, and the expectation value of $x^2$, for the particle associated with the wave function of Problem 10.
12. Calculate the expectation value of $p$, and the expectation value of $p^2$, for the particle associated with the wave function of Problem 10.
13. (a) Use quantities calculated in the preceding two problems to calculate the product of the uncertainties in position and momentum of the particle in the first excited state of the system being considered. (b) Compare with the uncertainty product when the particle is in the lowest energy state of the system, obtained in Example 5-10. Explain why the uncertainty products differ.
14. (a) Calculate the expectation values of the kinetic energy and the potential energy for a particle in the lowest energy state of a simple harmonic oscillator, using the wave function of Example 5-7. (b) Compare with the time-averaged kinetic and potential energies for a classical simple harmonic oscillator of the same total energy.
15. In calculating the expectation value of the product of position times momentum, an ambiguity arises because it is not apparent which of the two expressions

$$\overline{xp} = \int_{-\infty}^{\infty} \Psi^*\, x\left(-i\hbar\frac{\partial}{\partial x}\right)\Psi\,dx$$

$$\overline{px} = \int_{-\infty}^{\infty} \Psi^*\left(-i\hbar\frac{\partial}{\partial x}\right) x\,\Psi\,dx$$

should be used. (In the first expression $\partial/\partial x$ operates on $\Psi$; in the second it operates on $x\Psi$.) (a) Show that neither is acceptable because both violate the obvious requirement that $\overline{xp}$ should be real since it is measurable. (b) Then show that the expression

$$\overline{xp} = \int_{-\infty}^{\infty} \Psi^*\left[\frac{x\left(-i\hbar\dfrac{\partial}{\partial x}\right) + \left(-i\hbar\dfrac{\partial}{\partial x}\right)x}{2}\right]\Psi\,dx$$

is acceptable because it does satisfy this requirement. (Hint: (i) A quantity is real if it equals its own complex conjugate. (ii) Try integrating by parts. (iii) In any realistic case the wave function will always vanish at $x = \pm\infty$.)
16. Show by direct substitution into the Schroedinger equation that the wave function $\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$ satisfies that equation if the eigenfunction $\psi(x)$ satisfies the time-independent Schroedinger equation for a potential $V(x)$.
17. (a) Write the classical wave equation for a string of density per unit length which varies with $x$. (b) Then separate it into two ordinary differential equations, and show that the equation in $x$ is very analogous to the time-independent Schroedinger equation.
18. By using an extension of the procedure leading to (5-31), obtain the Schroedinger equation for a particle of mass $m$ moving in three dimensions (described by rectangular coordinates $x$, $y$, $z$).
19. (a) Separate the Schroedinger equation of Problem 18, for a time-independent potential, into a time-independent Schroedinger equation and an equation for the time dependence of the wave function. (b) Compare to the corresponding one-dimensional equations, (5-37) and (5-38), and explain the similarities and the differences.
20. (a) Separate the time-independent Schroedinger equation of Problem 19 into three time-independent Schroedinger equations, one in each of the coordinates. (b) Compare them with (5-37). (c) Explain clearly what must be assumed about the form of the potential energy in order to make the separation possible, and what the physical significance of this assumption is. (d) Give an example of a system that would have such a potential.
21. Starting with the relativistic expression for the energy, formulate a Schroedinger equation for photons, and solve it by separation of variables, assuming $V = 0$.
22. Consider a particle moving under the influence of the potential $V(x) = C|x|$, where $C$ is a constant, which is illustrated in Figure 5-21. (a) Use qualitative arguments, very similar to those of Example 5-12, to make a sketch of the first eigenfunction and of the tenth eigenfunction for the system. (b) Sketch both of the corresponding probability density functions. (c) Then use classical mechanics to calculate, in the manner of Example 5-6, the probability density functions predicted by that theory. (d) Plot the classical probability density functions with the quantum mechanical probability density functions, and discuss briefly their comparison.

Figure 5-21 A potential function considered in Problem 22.

23. Consider a particle moving in the potential $V(x)$ plotted in Figure 5-22. For the following ranges of the total energy $E$, state whether there are any allowed values of $E$ and if so, whether they are discretely separated or continuously distributed. (a) $E < V_0$, (b) $V_0 < E < V_1$, (c) $V_1 < E < V_2$, (d) $V_2 < E < V_3$, (e) $V_3 < E$.

Figure 5-22 A potential function considered in Problem 23.

24.
Consider a particle moving in the potential $V(x)$ illustrated in Figure 5-23, that has a rectangular region of depth $V_0$, and width $a$, in which the particle can be bound. These parameters are related to the mass $m$ of the particle in such a way that the lowest allowed energy $E_1$ is found at an energy about $V_0/4$ above the "bottom." Use qualitative arguments to sketch the approximate shape of the corresponding eigenfunction $\psi_1(x)$.

Figure 5-23 A potential function considered in Problem 24.

25. Suppose the bottom of the potential function of Problem 24 is changed by adding a bump in the center of height about $V_0/10$ and width $a/4$. That is, suppose the potential now looks like the illustration of Figure 5-24. Consider qualitatively what will happen to the curvature of the eigenfunction in the region of the bump, and how this will, in turn, affect the problem of obtaining an acceptable behavior of the eigenfunction in the region outside the binding region. From these considerations predict, qualitatively, what the bump will do to the value of the lowest allowed energy $E_1$.

Figure 5-24 A rectangular bump added to the bottom of the potential of Figure 5-23; for Problem 25.

26. Because the bump in Problem 25 is small, a good approximation to the lowest allowed energy of the particle in the presence of the bump can be obtained by taking it as the sum of the energy in the absence of the bump plus the expectation value of the extra potential energy represented by the bump, taking the $\Psi$ corresponding to no bump to calculate the expectation value. Using this point of view, predict whether a bump of the same "size," but located at the edge of the bottom as in Figure 5-25, would have a larger, smaller, or equal effect on the lowest allowed energy of the particle, compared to the effect of a centered bump. (Hint: Make a rough sketch of the product of $\Psi^*\Psi$ and the potential energy function that describes the centered bump. Then consider qualitatively the effect of moving the bump to the edge on the integral of this product.)

Figure 5-25 The same rectangular bump as in Figure 5-24, but moved to the edge of the potential; for Problem 26.

27. By substitution into the time-independent Schroedinger equation for the potential illustrated in Figure 5-23, show that in the region to the right of the binding region the eigenfunction has the mathematical form

$$\psi(x) = A e^{-\left(\sqrt{2m(V_0 - E)}/\hbar\right)x} \qquad x > +a/2$$

28. Using the probability density corresponding to the eigenfunction of Problem 27, write an expression to estimate the distance $D$ outside the binding region of the potential within which there would be an appreciable probability of finding the particle. (Hint: Take $D$ to extend to the point at which $\Psi^*\Psi$ is smaller than its value at the edge of the binding region by a factor of $e^{-1}$. This $e^{-1}$ criterion is similar to one often used in the study of electrical circuits.)
29. The potential illustrated in Figure 5-23 gives a good description of the forces acting on an electron moving through a block of metal. The energy difference $V_0 - E$, for the highest energy electron, is the work function for the metal. Typically, $V_0 - E \simeq 5$ eV. (a) Use this value to estimate the distance $D$ of Problem 28. (b) Comment on the results of the estimate.

Figure 5-26 An eigenfunction (top curve) and three possible forms (bottom curves) of the potential energy function considered in Problem 30.

30. Consider the eigenfunction illustrated in the top part of Figure 5-26. (a) Which of the three potentials illustrated in the bottom part of the figure could lead to such an eigenfunction? Give qualitative arguments to justify your answer. (b) The eigenfunction shown is not the one corresponding to the lowest allowed energy for the potential.
Sketch the form of the eigenfunction which does correspond to the lowest allowed energy $E_1$. (c) Indicate on another sketch the range of energies where you would expect discretely separated allowed energy states, and the range of energies where you would expect the allowed energies to be continuously distributed. (d) Sketch the form of the eigenfunction which corresponds to the second allowed energy $E_2$. (e) To which energy level does the eigenfunction presented in Figure 5-26 correspond?
31. Estimate the lowest energy level for a one-dimensional infinite square well of width $a$ containing a cosine bump. That is, the potential $V$ is

$$V = \begin{cases} V_0\cos(\pi x/a) & -a/2 < x < +a/2 \\ \text{infinity} & x < -a/2 \ \text{or}\ x > +a/2 \end{cases}$$

where $V_0 \ll \pi^2\hbar^2/2ma^2$.
32. Using the first two normalized wave functions $\Psi_1(x,t)$ and $\Psi_2(x,t)$ for a particle moving freely in a region of length $a$, but strictly confined to that region, construct the linear combination

$$\Psi(x,t) = c_1\Psi_1(x,t) + c_2\Psi_2(x,t)$$

Then derive a relation involving the adjustable constants $c_1$ and $c_2$ which, when satisfied, will ensure that $\Psi(x,t)$ is also normalized. The normalized $\Psi_1(x,t)$ and $\Psi_2(x,t)$ are obtained in Example 5-10 and Problem 10.
33. (a) Using the normalized "mixed" wave function of Problem 32, calculate the expectation value of the total energy $E$ of the particle in terms of the energies $E_1$ and $E_2$ of the two states and of the values $c_1$ and $c_2$ of the mixing parameters. (b) Interpret carefully the meaning of your result.
34. If the particle described by the wave function of Problem 32 is a proton moving in a nucleus, it will give rise to a charge distribution which oscillates in time at the same frequency as the oscillations of its probability density. (a) Evaluate this frequency for values of $E_1$ and $E_2$ corresponding to a proton mass of $10^{-27}$ kg and a nuclear dimension of $10^{-14}$ m.
(b) Also evaluate the frequency and energy of the photon that would be emitted by this oscillating charge distribution as the proton drops from the excited state to the ground state. (c) In what region of the electromagnetic spectrum is such a photon?

6 SOLUTIONS OF TIME-INDEPENDENT SCHROEDINGER EQUATIONS

6-1 INTRODUCTION
roles of nonbinding and binding potentials

6-2 THE ZERO POTENTIAL
classical motion in potential; general solution of equation; interpretation of sinusoidal traveling wave eigenfunctions and wave functions; box normalization; group traveling waves; Newton's law from Schroedinger's equation

6-3 THE STEP POTENTIAL (ENERGY LESS THAN STEP HEIGHT)
classical motion; general eigenfunction for region under step; finiteness; continuity conditions at step; reflection coefficient; penetration under step; classical limit; penetration distances for dust particle and conduction electron

6-4 THE STEP POTENTIAL (ENERGY GREATER THAN STEP HEIGHT)
classical motion; absence of reflected wave in region over step; continuity conditions; reflection and transmission coefficients; classical limit; reflection of neutron entering nucleus

6-5 THE BARRIER POTENTIAL
classical motions; procedure for solution; barrier penetration probability density and transmission coefficient; tunneling; transmission coefficient for passage over barrier; electron-atom scattering, Ramsauer's effect and size resonances; comparison of barrier and step; frustrated total internal reflection and barrier penetration

6-6 EXAMPLES OF BARRIER PENETRATION BY PARTICLES
α particle-nucleus potential; α emission; Gamow-Condon-Gurney α-decay theory; ammonia molecule inversion and atomic clocks; tunnel diodes

6-7 THE SQUARE WELL POTENTIAL
classical motion; systems approximated by potential; procedure for solution; eigenvalues and eigenfunctions; classical limit; infinite square well limit

6-8 THE INFINITE SQUARE WELL POTENTIAL
systems approximated by potential; solution; eigenvalues; zero-point energy and relation to uncertainty principle; eigenfunctions; direct application of de Broglie relation; electron bound in nucleus; parity of eigenfunctions; classical limit

6-9 THE SIMPLE HARMONIC OSCILLATOR POTENTIAL
small vibrations; classical motion; procedure for solution; eigenvalues and zero-point energy; eigenfunctions and parity

6-10 SUMMARY
tabulated properties of potentials studied

QUESTIONS

PROBLEMS

6-1 INTRODUCTION

In this chapter we shall obtain many interesting predictions concerning quantum mechanical phenomena. We shall also discuss some of the experiments confirming the predictions, and some of the important practical applications of the phenomena. The predictions will be obtained by solving the time-independent Schroedinger equation for different forms of the potential energy function $V(x)$, to find the eigenfunctions, eigenvalues, and wave functions, and then using the procedures developed in the previous chapter to interpret the physical significance of these quantities.

Our approach will be very systematic. We shall start by treating the simplest possible form of the potential, namely $V(x) = 0$. Then we shall gradually add complexity to the potential. With each new potential treated, the student will obtain new insight into quantum mechanics and into the behavior of microscopic systems. In this process the student should begin to develop an intuition for quantum mechanics, just as he has developed an intuition for classical mechanics by repeated use of that theory.

The potentials considered in the first sections of this chapter are not able to bind a particle because there is no region in which they have a depression. Although discrete quantization of energy will not be found for these potentials, other fundamental phenomena will be found. In addition to the fact that they naturally fit in at the beginning of our systematic approach, another reason for treating nonbinding potentials first is that it emphasizes their importance. Probably half of the work currently being done in quantum mechanics concerns unbound particles.

It is true, however, that most of the applications of quantum mechanics that were made initially concerned bound particles. Most aspects of the structure of atoms, molecules, and solids are examples of bound particle problems, as are many aspects of nuclear structure. Since these are the topics we shall concentrate on in the following chapters of this book, some students (or instructors) may prefer to go directly to Section 6-7, which is the first to treat binding potentials, or to Section 6-8, which treats an important special case. Those sections are sufficiently self-contained to make such short cuts feasible without too much difficulty.

Throughout this chapter we deal only with time-independent potentials, since only for such potentials does the time-independent Schroedinger equation have significance. We further restrict ourselves to a single dimension because this simplifies the mathematics while still allowing us to demonstrate most of the interesting quantum phenomena. Obvious exceptions are phenomena involving angular momentum, since this quantity has no meaning in one dimension. Because angular momentum plays a dominant role in atomic structure, the following chapter begins by extending our development of quantum mechanics to three dimensions.

6-2 THE ZERO POTENTIAL

The simplest time-independent Schroedinger equation is the one for the case $V(x) = \text{const}$. A particle moving under the influence of such a potential is a free particle since the force acting on it is $F = -dV(x)/dx = 0$.
As this is true regardless of the value of the constant, we do not lose generality by choosing the arbitrary additive constant, that always arises in the definition of a potential energy, in such a way as to obtain

$$V(x) = 0 \tag{6-1}$$

We know that in classical mechanics a free particle may be either at rest or moving with constant momentum $p$. In either case its total energy $E$ is a constant.

To find the behavior predicted by quantum mechanics for a free particle, we solve the time-independent Schroedinger equation, (5-43), setting $V(x) = 0$. With this form for the potential, the equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\psi(x) \tag{6-2}$$

The solutions are the eigenfunctions $\psi(x)$, and the wave functions $\Psi(x,t)$ according to (5-44) are

$$\Psi(x,t) = \psi(x)e^{-iEt/\hbar} \tag{6-3}$$

The eigenvalues $E$ are equal to the total energy of the particle. From the qualitative discussion of Section 5-7, we know that an acceptable solution of the time-independent Schroedinger equation for this nonbinding potential should exist for any value of $E > 0$.

Of course, we already know a form of the free particle wave function from our plausibility argument leading to the Schroedinger equation. That wave function, (5-23), is

$$\Psi(x,t) = \cos(kx - \omega t) + i\sin(kx - \omega t)$$

Rewriting it as a complex exponential, we have

$$\Psi(x,t) = e^{i(kx - \omega t)} \tag{6-4a}$$

The wave number $k$ and angular frequency $\omega$ are

$$k = \frac{p}{\hbar} = \frac{\sqrt{2mE}}{\hbar} \quad\text{and}\quad \omega = \frac{E}{\hbar} \tag{6-4b}$$

We break the exponential into the product of two factors

$$\Psi(x,t) = e^{ikx}e^{-i\omega t} = e^{ikx}e^{-iEt/\hbar}$$

Then we compare with the general form of the wave function quoted in (6-3)

$$\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$$

This comparison makes it apparent that

$$\psi(x) = e^{ikx} \quad\text{where}\quad k = \frac{\sqrt{2mE}}{\hbar} \tag{6-5}$$

That is, the complex exponential of (6-5) gives the form of a free particle eigenfunction corresponding to the eigenvalue $E$. More specifically, it is a traveling wave free particle eigenfunction because the corresponding wave function, $\Psi(x,t) = e^{i(kx - \omega t)}$, represents a traveling wave.
This can be seen, for example, from the fact that the nodes of the real part of the oscillatory wave function are located at positions where $kx - \omega t = (n + 1/2)\pi$, with $n = 0, \pm 1, \pm 2, \ldots$. The reason is that the real part of $\Psi(x,t)$, which is $\cos(kx - \omega t)$, has the value zero wherever $kx - \omega t = (n + 1/2)\pi$. Thus the nodes occur wherever $x = (n + 1/2)\pi/k + \omega t/k$ and, since these values of $x$ increase with increasing $t$, the nodes travel in the direction of increasing $x$. The conclusion is illustrated in the top part of Figure 6-1, which shows plots of the real part of $\Psi(x,t)$ at successively later times. For this wave function, the probability density $\Psi^*(x,t)\Psi(x,t)$, illustrated in the bottom of Figure 6-1, conveys no sense of motion.

Figure 6-1 Top: The real part, $\cos(kx - \omega t)$, of a complex exponential traveling wave function, $\Psi = e^{i(kx - \omega t)}$, for a free particle. With increasing time the nodes move in the direction of increasing $x$. Bottom: For this wave function a sense of motion is not conveyed by plotting the probability density $\Psi^*\Psi = e^{-i(kx - \omega t)}e^{i(kx - \omega t)} = 1$ since it is constant for all $t$ (and all $x$). Of course, we cannot plot $\Psi$ itself, as it is complex.

Intuition suggests that, for the same value of $E$, there should also be a wave function representing a wave traveling in the direction of decreasing $x$. The preceding argument indicates that this wave function would be written with the sign of $kx$ reversed, that is

$$\Psi(x,t) = e^{i(-kx - \omega t)} \tag{6-6}$$

The corresponding eigenfunction would be

$$\psi(x) = e^{-ikx} \quad\text{where}\quad k = \frac{\sqrt{2mE}}{\hbar} \tag{6-7}$$

It is easy to see that this eigenfunction is also a solution to the time-independent Schroedinger equation for $V(x) = 0$. In fact, any arbitrary linear combination of the two eigenfunctions of (6-5) and (6-7), for the same value of the total energy $E$, is also a solution to the equation.
To prove these statements, we take the linear combination

$$\psi(x) = Ae^{ikx} + Be^{-ikx} \quad\text{where}\quad k = \frac{\sqrt{2mE}}{\hbar} \tag{6-8}$$

in which $A$ and $B$ are arbitrary constants, and substitute it into the time-independent Schroedinger equation, (6-2). Since

$$\frac{d^2\psi(x)}{dx^2} = i^2k^2Ae^{ikx} + i^2k^2Be^{-ikx} = -k^2\psi(x) = -\frac{2mE}{\hbar^2}\psi(x)$$

substitution into the equation yields

$$-\frac{\hbar^2}{2m}\left(-\frac{2mE}{\hbar^2}\right)\psi(x) = E\psi(x)$$

Since this is obviously satisfied, the linear combination is a valid solution to the time-independent Schroedinger equation.

The most general form of the solution to an ordinary (i.e., not partial) differential equation involving a second derivative contains two arbitrary constants. The reason is that obtaining the solution from such an equation basically amounts to performing two successive integrations to remove the second derivative, and each step yields a constant of integration. Examples familiar to the student are found in general solutions of Newton's equation of motion, which involve two arbitrary constants such as initial position and velocity. Since the linear combination of (6-8) is a solution containing two arbitrary constants to (6-2), it is its general solution. The general solution is useful because it allows us to describe any possible eigenfunction associated with the eigenvalue $E$. For instance, if we set $B = 0$, we obtain an eigenfunction for a wave traveling in the direction of increasing $x$. If we set $A = 0$, the wave is traveling in the direction of decreasing $x$. If we set $|A| = |B|$, there are two oppositely directed traveling waves that combine to form a standing wave. Standing wave eigenfunctions will be used in Section 6-3.

Let us consider now the question of giving physical interpretation to the free particle eigenfunctions and wave functions. Take first the case of a wave traveling in the direction of increasing $x$.
The eigenfunction and wave function for this case are

$$\psi(x) = Ae^{ikx} \quad\text{and}\quad \Psi(x,t) = Ae^{i(kx - \omega t)} \tag{6-9}$$

An obvious guess is that the particle whose motion is described by these functions is also traveling in the direction of increasing $x$. To verify this, let us calculate the expectation value of the momentum, $\bar{p}$, for the particle. According to the general expectation value formula, (5-34)

$$\bar{p} = \int_{-\infty}^{\infty} \Psi^* p_{op} \Psi \, dx$$

where the operator for momentum is

$$p_{op} = -i\hbar\frac{\partial}{\partial x}$$

Now, for the wave function in question, we have

$$p_{op}\Psi = -i\hbar\frac{\partial}{\partial x}Ae^{i(kx - \omega t)} = -i\hbar(ik)Ae^{i(kx - \omega t)} = +\hbar k\Psi = +\sqrt{2mE}\,\Psi$$

so

$$\bar{p} = +\int_{-\infty}^{\infty} \Psi^*\sqrt{2mE}\,\Psi \, dx = +\sqrt{2mE}\int_{-\infty}^{\infty} \Psi^*\Psi \, dx$$

The integral on the right is the probability density integrated over the entire range of the $x$ axis. This is just the probability that the particle will be found somewhere, which must equal one. Therefore, we obtain

$$\bar{p} = +\sqrt{2mE}$$

This is exactly the momentum that we would expect for a particle moving in the direction of increasing $x$ with total energy $E$ in a region of zero potential energy.

For the case of a wave traveling in the direction of decreasing $x$, the eigenfunction and wave function are

$$\psi(x) = Be^{-ikx} \quad\text{and}\quad \Psi(x,t) = Be^{i(-kx - \omega t)} \tag{6-10}$$

When we operate on $\Psi$ with $p_{op}$, the sign reversal of the $kx$ term in the former leads to a sign reversal in the result. This, in turn, leads to a momentum expectation value of

$$\bar{p} = -\sqrt{2mE}$$

Therefore, we interpret the eigenfunction, and wave function, as describing the motion of a particle which is moving in the direction of decreasing $x$ with negative momentum of the magnitude that would be expected in consideration of its energy.

The eigenfunctions and wave functions just considered represent the idealized situations of a particle moving, in one direction or the other, in a beam of infinite length. Its $x$ coordinate is completely unknown because the amplitudes of the waves are the same in all regions of the $x$ axis. That is, the probability densities, for instance

$$\Psi^*\Psi = A^*e^{-i(kx - \omega t)}Ae^{i(kx - \omega t)} = A^*A$$

are constants independent of $x$. Thus the particle is equally likely to be found anywhere, and the uncertainty in its position is $\Delta x = \infty$. The uncertainty principle states that in these situations we may know the value of the momentum $p$ of the particle with complete precision, since $\Delta p\,\Delta x \geq \hbar/2$ can be satisfied for an uncertainty in its momentum of $\Delta p = 0$, if $\Delta x = \infty$. Perfectly precise values of $p$ are also indicated by the de Broglie relation, $p = \hbar k$, because these wave functions contain only a single value of the wave number $k$. Since there is an infinite amount of time available to measure the energy of a particle traveling through a beam of infinite length, the energy-time uncertainty principle $\Delta E\,\Delta t \geq \hbar/2$ allows its energy to be known with complete precision. This agrees with the presence of a single value of the angular frequency $\omega$ in these wave functions, because the de Broglie-Einstein relation $E = \hbar\omega$ shows this means a single value of the energy $E$.

A physical example approximating the idealized situation represented by these wave functions would be a proton moving in a highly monoenergetic beam emerging from a cyclotron. Such beams are used to study the scattering of protons by targets of nuclei inserted in the beam. From the point of view of the target nucleus, and in terms of distances of the order of its nuclear radius $r'$, the $x$ position of a proton in the beam may be for all practical purposes completely unknown. That is, $\Delta x \gg r'$. Thus the free particle wave functions of (6-9) and (6-10) can give a good approximation to the description of the beam proton in the region of interest near the nucleus where the scattering takes place. In other words, near a nucleus the wave function of (6-9), $\Psi = Ae^{i(kx - \omega t)}$, can be used to describe a proton in a cyclotron beam directed towards increasing $x$, providing the beam is extremely long compared to the dimensions of the nucleus, a condition which is always satisfied in practice since nuclei are extremely small. The wave function describes a particle moving with momentum precisely $p = \hbar k$ and total energy precisely $E = \hbar\omega$, where these quantities are related by the equation $p = \sqrt{2mE}$ appropriate to a particle of mass $m$ moving in a region of zero potential energy.

There is a difficulty concerning the normalization of the wave functions of (6-9) and (6-10). In order to have, for instance

$$\int_{-\infty}^{\infty} \Psi^*\Psi \, dx = \int_{-\infty}^{\infty} A^*A \, dx = A^*A\int_{-\infty}^{\infty} dx = 1$$

the amplitude $A$ must be zero as $\int_{-\infty}^{\infty} dx$ has an infinite value. The difficulty arises from the unrealistic statement made by the wave function that the particle can be found with equal probability anywhere in a beam of infinite length. This is never really true since real beams are always of finite length. The proton beam is limited on one end by the cyclotron and on the other end by a laboratory wall. Although the uncertainty $\Delta x$ in location of a proton is very much larger than a nuclear radius $r'$, it is not larger than the distance $L$ from the cyclotron to the wall. That is, even though $\Delta x \gg r'$, it is also true that $\Delta x < L$. This suggests that normalization can be obtained by setting $\Psi = 0$ outside of the range $-L/2 < x < +L/2$, or else by restricting $x$ to be within that range. In either way we obtain a more realistic description of the actual physical situation, and we can also normalize the wave function with a nonvanishing amplitude $A$. The procedure is called box normalization. Despite the fact that the value of $A$ obtained depends on the length $L$ of the box, it always turns out that the final result of calculation of a measurable quantity is independent of the actual value of $L$ used. Furthermore, we shall see that it is usually not necessary to carry through box normalization in detail because quantities of physical interest can be expressed as ratios in which the value of $A$ cancels.
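The claim that a measurable quantity does not depend on the box length can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes natural units ($\hbar = 1$), takes the box-normalized amplitude to be real, $A = 1/\sqrt{L}$, and uses hypothetical helper names. It evaluates the normalization integral and the momentum expectation value of $Ae^{ikx}$ over $-L/2 < x < +L/2$; the time factor is omitted because it cancels in $\Psi^* p_{op} \Psi$.

```python
import cmath
import math

HBAR = 1.0  # assumed natural units; not taken from the text


def box_normalized_amplitude(L):
    """Real amplitude A for which A*A integrated over -L/2..L/2 equals one."""
    return 1.0 / math.sqrt(L)


def psi(x, k, L):
    """Box-normalized traveling-wave eigenfunction A e^{ikx} inside the box."""
    return box_normalized_amplitude(L) * cmath.exp(1j * k * x)


def norm_and_momentum(k, L, n=2000):
    """Midpoint-rule estimates of the norm and of the momentum expectation value."""
    dx = L / n
    h = 1e-5  # step for the central-difference derivative
    norm = 0.0
    p_bar = 0.0 + 0.0j
    for j in range(n):
        x = -L / 2 + (j + 0.5) * dx
        f = psi(x, k, L)
        dfdx = (psi(x + h, k, L) - psi(x - h, k, L)) / (2 * h)
        norm += (f.conjugate() * f).real * dx
        # p_op = -i hbar d/dx applied to psi, then weighted by psi*
        p_bar += f.conjugate() * (-1j * HBAR * dfdx) * dx
    return norm, p_bar.real


for L in (10.0, 1000.0):
    norm, p_bar = norm_and_momentum(k=2.0, L=L)
    print(f"L = {L:7.1f}  norm = {norm:.6f}  p_bar = {p_bar:.6f}")
```

Under these assumptions the amplitude changes with $L$, while the norm stays at one and the momentum expectation value stays at $\hbar k$; reversing the sign of $k$ reverses the sign of the result, as in the discussion of (6-10).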
This normalization difficulty is quite analogous to ones commonly encountered in classical physics. For instance, in solving a problem of electrostatics, a straight charged wire of infinite length is often used to approximate one of finite length in a system where "end effects" are not important. This idealization very much simplifies the geometry of the problem, but it leads to the difficulty that an infinite amount of energy is required to charge the infinitely long wire, unless its charge density is zero. It is usually possible, however, to get around this difficulty simply by expressing the quantities that arise in the problem in terms of ratios.

It is possible to obtain a much more realistic sense of motion than is seen in either part of Figure 6-1 by using a large number of wave functions of the form of (6-9) to generate a group of traveling waves. Figure 6-2 shows the probability density $\Psi^*\Psi$ for a particularly simple group, its motion in the direction of increasing $x$, and the ever increasing width of the group. At any instant the location of the group can be well characterized by the expectation value $\bar{x}$ calculated from the probability density. The constant velocity of the group, $d\bar{x}/dt$, equals the constant velocity of the free particle, $v = p/m = \sqrt{2mE}/m = \sqrt{2E/m}$, in agreement with the conclusions of Chapter 3. The spreading of the group is a characteristic property of waves that is intimately related to the uncertainty principle, as discussed in that chapter.

Of course the behavior of the group wave function is easier to interpret than the behavior of a purely sinusoidal wave function, such as that of (6-9), because the corresponding probability density is closer to the description of particle motion we are familiar with from classical mechanics. However, the mathematics required to describe the group, and treat its behavior analytically, is much more complicated.
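The motion and spreading of a group can be illustrated with a short numerical sketch. It is not the group plotted in Figure 6-2: the Gaussian weighting, the natural units ($\hbar = m = 1$), the central wave number `k0`, and the spread `sigma_k` are all assumptions chosen for illustration. The sketch sums traveling waves of the form of (6-9), each with $\omega = \hbar k^2/2m$, and reports the mean position and width of $\Psi^*\Psi$ at three times.

```python
import cmath
import math

HBAR = 1.0  # assumed natural units
MASS = 1.0  # assumed; any positive value works


def group_density(xs, t, k0=5.0, sigma_k=0.5, n_waves=241):
    """Psi* Psi on the grid xs for a Gaussian-weighted sum of traveling waves."""
    ks = [k0 - 6 * sigma_k + j * (12 * sigma_k) / (n_waves - 1) for j in range(n_waves)]
    weights = [math.exp(-((k - k0) ** 2) / (4 * sigma_k**2)) for k in ks]
    # omega = E / hbar with E = hbar^2 k^2 / 2m for a free particle
    omegas = [HBAR * k * k / (2 * MASS) for k in ks]
    density = []
    for x in xs:
        amp = sum(w * cmath.exp(1j * (k * x - om * t))
                  for w, k, om in zip(weights, ks, omegas))
        density.append(abs(amp) ** 2)
    return density


def mean_and_width(xs, density):
    """Expectation value of x and root-mean-square width of the density."""
    total = sum(density)
    mean = sum(x * d for x, d in zip(xs, density)) / total
    var = sum((x - mean) ** 2 * d for x, d in zip(xs, density)) / total
    return mean, math.sqrt(var)


xs = [-10.0 + 0.1 * j for j in range(351)]  # grid from -10 to 25
for t in (0.0, 1.0, 2.0):
    mean, width = mean_and_width(xs, group_density(xs, t))
    print(f"t = {t:.1f}  mean x = {mean:6.3f}  width = {width:.3f}")
```

With these assumed parameters the mean position advances at close to $\hbar k_0/m$ while the width grows with time, which is the qualitative behavior described for the group. Even this simple group needs a few hundred sinusoidal waves.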
The complication arises because a group must necessarily involve a distribution of wave numbers $k$, and therefore a distribution of energies $E = \hbar^2k^2/2m$. In order to compose even as simple a group as the one shown in the figure, a very large number of sinusoidal waves, with very small differences in wave numbers or energies, must be summed in the manner described in Chapter 3. These mathematical complications far outweigh any advantages involved in the ease of interpretation. Consequently, groups are rarely used in practical quantum mechanical calculations, and most such calculations are performed with wave functions involving a single wave number and energy.

Figure 6-2 The probability density $\Psi^*\Psi$ for a group traveling wave function of a free particle, shown at times $t = t_0$, $t = t_0 + \Delta t$, and $t = t_0 + 2\Delta t$. With increasing time the group moves in the direction of increasing $x$, and also spreads.

Our consideration of the motion of the group in Figure 6-2 leads us to discuss briefly a related case of great interest. If, instead of having the constant value zero, the potential function $V(x)$ changes so slowly that its value is almost constant over a distance of the order of the de Broglie wavelength of the particle, the group wave function will still propagate in a manner similar to that illustrated in the figure, but the velocity of the group will now also change slowly. Calculations, starting from the Schroedinger equation, lead to an expression relating the change in the velocity, $d\bar{x}/dt$, of the group to the change in the potential, $V(x)$. The expression is

$$\frac{d}{dt}\left(\frac{d\bar{x}}{dt}\right) = -\frac{1}{m}\,\overline{\frac{dV(x)}{dx}}$$

or

$$\frac{d^2\bar{x}}{dt^2} = -\frac{1}{m}\,\overline{\frac{dV(x)}{dx}} = \frac{\overline{F(x)}}{m}$$

where the bars denote expectation values and $F(x)$ is the force corresponding to the potential $V(x)$. It is unfortunate that the calculations are too complicated to reproduce here.
They are very significant because they show that the acceleration of the average location of the particle associated with the group wave function equals the average force acting on the particle, divided by its mass. That is, Schroedinger's equation leads to the result that Newton's law of motion is obeyed, on the average, by a particle of a microscopic system. The fluctuations from its average behavior reflect the uncertainty principle, and they are very important in the microscopic limit. But these fluctuations become negligible in the macroscopic limit where the uncertainty principle is of no consequence, and it is no longer necessary to speak of averages in talking about locations in that limit. Also, in the macroscopic limit any realistic potential changes by only a small amount in a distance as short as a de Broglie wavelength. So it is also not necessary, in that limit, to speak of averages when discussing potentials. Thus, in the macroscopic limit we can ignore the bars representing expectation values, or averages, in the equations just displayed. We then conclude that Newton's law of motion can be derived from the Schroedinger equation, in the classical limit of macroscopic systems. Newton's law of motion is a special case of Schroedinger's equation.

6-3 THE STEP POTENTIAL (ENERGY LESS THAN STEP HEIGHT)

In the next sections we shall study solutions to the time-independent Schroedinger equation for a particle whose potential energy can be represented by a function $V(x)$ which has a different constant value in each of several adjacent ranges of the $x$ axis. These potentials change in value abruptly in going from one range to the adjacent range. Of course potentials which change abruptly (i.e., are discontinuous functions of $x$) do not really exist in nature.
Nevertheless, these idealized potentials are used frequently in quantum mechanics to approximate real situations because, being constant in each range, they are easy to treat mathematically. The results we obtain for these potentials will allow us to illustrate a number of characteristic quantum mechanical phenomena.

An analogy that is surely familiar to the student is found in the procedure used in studying electromagnetism. This involves treating many idealized systems like the infinite wire, the capacitor without edges, etc. These systems are studied because they are relatively easy to handle, because they are excellent approximations to real ones, and because real systems are usually complicated to treat mathematically since they have complicated geometries. The idealized potentials we treat in this chapter are used in the same way and with the same justification.

The simplest case is the step potential, illustrated in Figure 6-3. If we choose the origin of the $x$ axis to be at the step, and the arbitrary additive constant that always occurs in the definition of a potential energy so that the potential energy of the particle is zero when it is to the left of the step, $V(x)$ can be written

$$V(x) = \begin{cases} V_0 & x > 0 \\ 0 & x < 0 \end{cases} \tag{6-11}$$

where $V_0$ is a constant. We may think of $V(x)$ as an approximate representation of the potential energy function for a charged particle moving along the axis of a system of two electrodes, separated by a very narrow gap, which are held at different voltages. The upper half of Figure 6-4 illustrates this system, and the lower half illustrates the corresponding potential energy function. As the gap decreases, the potential function approaches the idealization illustrated in Figure 6-3. In Example 6-2 we shall see that the potential energy for an electron moving near the surface of a metal is very much like a step potential since it rapidly increases at the surface from an essentially constant interior value to a higher constant exterior value.
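The way a narrow-gap potential approaches the step of (6-11) can be sketched in a few lines. The linear ramp used here is only an assumed stand-in for the real potential between the electrodes, and the function names are hypothetical; the text does not specify the shape of the potential inside the gap.

```python
def step_potential(x, v0):
    """Idealized step of (6-11): zero to the left of the origin, v0 to the right."""
    return v0 if x > 0 else 0.0


def ramp_potential(x, v0, gap):
    """Assumed model of the gap: the potential rises linearly across a width `gap`."""
    if x <= -gap / 2:
        return 0.0
    if x >= gap / 2:
        return v0
    return v0 * (x + gap / 2) / gap


def max_force(v0, gap):
    """Magnitude of F = -dV/dx inside the ramp; it grows as the gap shrinks."""
    return v0 / gap


v0 = 1.0
x = 0.05  # a fixed point just to the right of the origin
for gap in (1.0, 0.1, 0.001):
    print(gap, ramp_potential(x, v0, gap), step_potential(x, v0), max_force(v0, gap))
```

At any fixed point away from the origin the ramp value agrees with the step value once the gap is narrow enough, while the force inside the gap becomes larger and larger.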
Assume that a particle of mass $m$ and total energy $E$ is in the region $x < 0$, and that it is moving toward the point $x = 0$ at which the step potential $V(x)$ abruptly changes its value. According to classical mechanics, the particle will move freely in that region until it reaches $x = 0$, where it is subjected to an impulsive force $F = -dV(x)/dx$ acting in the direction of decreasing $x$. The idealized potential, (6-11), yields an impulsive force of infinite magnitude acting only at the point $x = 0$. However, as it acts on the particle only for an infinitesimal time, the quantity $\int F\,dt$ (the impulse), which determines the change in its momentum, is finite. In fact, the momentum change is not affected by the idealization.

Figure 6-3 A step potential, with $V(x) = 0$ to the left of the origin and $V(x) = V_0$ to the right.

Figure 6-4 Illustrating a physical system with a potential energy function that can be approximated by a step potential. A charged particle moves along the axis of two cylindrical electrodes held at different voltages. Its potential energy is constant when it is inside either electrode, but it changes very rapidly when passing from one to the other.

The motion of the particle subsequent to experiencing the force at $x = 0$ depends, in classical mechanics, on the relation between $E$ and $V_0$. This is also true in quantum mechanics. In the present section we treat the case where $E < V_0$, i.e., where the total energy is less than the height of the potential step as illustrated in Figure 6-5. (The case where $E > V_0$ is treated in the following section.) Since the total energy $E$ is a constant, classical mechanics says that the particle cannot enter the region $x > 0$. The reason is that in that region

$$E = \frac{p^2}{2m} + V(x) < V(x)$$

or

$$\frac{p^2}{2m} < 0$$

Thus the kinetic energy $p^2/2m$ would be negative in the region $x > 0$, which would lead to an imaginary value for the linear momentum $p$ in the region. Neither is allowed, or even makes physical sense, in classical mechanics.
According to classical mechanics, the impulsive force will change the momentum of the particle in such a way that it will exactly reverse its motion, traveling off in the direction of decreasing $x$ with momentum in the direction opposite to its initial momentum. The magnitude of the momentum $p$ will be the same before and after the reversal since the total energy $E = p^2/2m$ is constant.

Figure 6-5 The relation between total and potential energies for a particle incident upon a potential step with total energy less than the height of the step.

To determine the motion of the particle according to quantum mechanics, we must find the wave function which is a solution, for the total energy $E < V_0$, to the Schroedinger equation for the step potential of (6-11). Since this potential is independent of time, the actual problem is to solve the time-independent Schroedinger equation. From our qualitative discussion of the previous chapter, we know that an acceptable solution should exist for any value of $E > 0$, since the potential cannot bind the particle to a limited range of the $x$ axis.

For the step potential, the $x$ axis breaks up into two regions. In the region where $x < 0$ (left of the step), we have $V(x) = 0$, so the eigenfunction that will tell us about the behavior of the particle is a solution to the simple time-independent Schroedinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\psi(x) \qquad x < 0 \tag{6-12}$$

In the region where $x > 0$ (right of the step), we have $V(x) = V_0$, and the eigenfunction is a solution to a time-independent Schroedinger equation which is almost as simple

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V_0\psi(x) = E\psi(x) \qquad x > 0 \tag{6-13}$$

The two equations are solved separately.
Then an eigenfunction valid for the entire range of $x$ is constructed by joining the two solutions together at $x = 0$ in such a way as to satisfy the requirements, of Section 5-6, that the eigenfunction and its first derivative are everywhere finite, single valued, and continuous.

Consider the differential equation valid for the region in which $V(x) = 0$, (6-12). Since this is precisely the time-independent Schroedinger equation for a free particle, we take for its general solution the traveling wave eigenfunction of (6-8). We write that eigenfunction as

$$\psi(x) = Ae^{ik_1x} + Be^{-ik_1x} \quad\text{where}\quad k_1 = \frac{\sqrt{2mE}}{\hbar} \qquad x < 0 \tag{6-14}$$

Next consider the differential equation valid for the region in which $V(x) = V_0$, (6-13). From the qualitative considerations of Section 5-7, we do not expect an oscillatory function, such as in (6-14), to be a solution since the total energy $E$ is less than the potential energy $V_0$ in the region of interest. In fact, those considerations tell us that the solution will be a function which "gradually approaches the $x$ axis." The simplest function with this property is the decreasing real exponential, which can be written

$$\psi(x) = e^{-k_2x} \qquad x > 0 \tag{6-15}$$

Let us find out if this is a solution and, if so, also find the required value of $k_2$, by substituting it into (6-13), which it is supposed to satisfy. We first evaluate

$$\frac{d^2\psi(x)}{dx^2} = (-k_2)^2e^{-k_2x} = k_2^2\psi(x)$$

Then the substitution yields

$$-\frac{\hbar^2}{2m}k_2^2\psi(x) + V_0\psi(x) = E\psi(x)$$

This satisfies the equation, and therefore verifies the solution, providing

$$k_2 = \frac{\sqrt{2m(V_0 - E)}}{\hbar} \qquad E < V_0 \tag{6-16}$$

The solution we have just verified is not a general solution to the time-independent Schroedinger equation, (6-13). The reason is that the equation contains a second derivative, so the general solution must contain two arbitrary constants. However, if we can find a solution to the equation for the same value of $E$, which is different in form from the one we have just found, we can make an arbitrary linear combination of these two so-called particular solutions. The linear combination will also be a solution and, since it will contain two arbitrary constants, it will be a general solution. A clue to the form of another particular solution is found by noting that $k_2$ enters as a square in the equation preceding (6-16). Therefore, its sign is immaterial, and the increasing exponential

$$\psi(x) = e^{+k_2x} \quad\text{where}\quad k_2 = \frac{\sqrt{2m(V_0 - E)}}{\hbar} \qquad x > 0 \tag{6-17}$$

should also be a solution to the time-independent Schroedinger equation that we are dealing with. It is equally easy to verify this, by substitution into the equation. But let us instead verify that the arbitrary combination of the two particular solutions

$$\psi(x) = Ce^{k_2x} + De^{-k_2x} \quad\text{where}\quad k_2 = \frac{\sqrt{2m(V_0 - E)}}{\hbar} \qquad x > 0 \tag{6-18}$$

and where $C$ and $D$ are arbitrary constants, is a solution to (6-13). We calculate

$$\frac{d^2\psi(x)}{dx^2} = Ck_2^2e^{k_2x} + D(-k_2)^2e^{-k_2x} = k_2^2\psi(x) = \frac{2m}{\hbar^2}(V_0 - E)\psi(x)$$

and substitute the result into the equation. We obtain

$$-\frac{\hbar^2}{2m}\frac{2m}{\hbar^2}(V_0 - E)\psi(x) + V_0\psi(x) = E\psi(x)$$

Since this is obviously satisfied, we have verified that (6-18) is a solution. Since it contains two arbitrary constants, it is the general solution to the time-independent Schroedinger equation for the region of the step potential where $V(x) = V_0$, with $E < V_0$. Although the increasing exponential part will not actually be used in the present section, it will be used in a subsequent section.

The arbitrary constants $A$, $B$, $C$, and $D$ of (6-14) and (6-18) must be so chosen that the total eigenfunction satisfies the requirements concerning finiteness, single valuedness, and continuity, of $\psi(x)$ and $d\psi(x)/dx$. Consider first the behavior of $\psi(x)$ as $x \to +\infty$. In this region of the $x$ axis the general form of $\psi(x)$ is given by (6-18). Inspection shows that it will generally increase without limit as $x \to +\infty$, because of the presence of the first term, $Ce^{k_2x}$. In order to prevent this, and keep $\psi(x)$ finite, we must set the arbitrary coefficient $C$ of the first term equal to zero. Thus we find

$$C = 0 \tag{6-19}$$

Single valuedness is satisfied automatically by these functions. To study their continuity, we consider the point $x = 0$. At this point the two forms of $\psi(x)$, given by (6-14) and (6-18), must join in such a way that $\psi(x)$ and $d\psi(x)/dx$ are continuous. Continuity of $\psi(x)$ is obtained by satisfying the relation

$$D\left(e^{-k_2x}\right)_{x=0} = A\left(e^{ik_1x}\right)_{x=0} + B\left(e^{-ik_1x}\right)_{x=0}$$

which comes from equating the two forms at $x = 0$. This relation yields

$$D = A + B \tag{6-20}$$

Continuity of the derivative of the two forms

$$\frac{d\psi(x)}{dx} = -k_2De^{-k_2x} \qquad x > 0$$

and

$$\frac{d\psi(x)}{dx} = ik_1Ae^{ik_1x} - ik_1Be^{-ik_1x} \qquad x < 0$$

is obtained by equating these derivatives at $x = 0$. Thus we set

$$-k_2D\left(e^{-k_2x}\right)_{x=0} = ik_1A\left(e^{ik_1x}\right)_{x=0} - ik_1B\left(e^{-ik_1x}\right)_{x=0}$$

This yields

$$\frac{ik_2}{k_1}D = A - B \tag{6-21}$$

Adding (6-20) and (6-21) gives

$$A = \frac{D}{2}\left(1 + \frac{ik_2}{k_1}\right) \tag{6-22}$$

Subtracting gives

$$B = \frac{D}{2}\left(1 - \frac{ik_2}{k_1}\right) \tag{6-23}$$

We have now determined $A$, $B$, and $C$ in terms of $D$. Thus the eigenfunction for the step potential, and for the energy $E < V_0$, is

$$\psi(x) = \begin{cases} \dfrac{D}{2}\left(1 + ik_2/k_1\right)e^{ik_1x} + \dfrac{D}{2}\left(1 - ik_2/k_1\right)e^{-ik_1x} & x \leq 0 \\[1ex] De^{-k_2x} & x \geq 0 \end{cases} \tag{6-24}$$

The one remaining arbitrary constant, $D$, determines the amplitude of the eigenfunction, but it is not involved in any of its more important characteristics. The presence of this constant reflects the fact that the time-independent Schroedinger equation is linear in $\psi(x)$, and so solutions of any amplitude are allowed by the equation. We shall see that useful results can usually be obtained without bothering to carry through the normalization procedure that would specify $D$.
The reason is that the measurable quantities that we shall obtain as predictions of the theory contain $D$ in both the numerator and the denominator of a ratio, and so it cancels out.

The wave function corresponding to the eigenfunction is

$$\Psi(x,t) = \begin{cases} A e^{ik_1 x} e^{-iEt/\hbar} + B e^{-ik_1 x} e^{-iEt/\hbar} = A e^{i(k_1 x - Et/\hbar)} + B e^{i(-k_1 x - Et/\hbar)} & x \le 0 \\[1ex] D e^{-k_2 x} e^{-iEt/\hbar} & x \ge 0 \end{cases} \tag{6-25}$$

Consider the region $x < 0$. The first term in the wave function for this region is a traveling wave propagating in the direction of increasing $x$. This term describes a particle moving in the direction of increasing $x$. The second term in the wave function for $x < 0$ is a traveling wave propagating in the direction of decreasing $x$, and it describes a particle moving in that direction. This information, plus the classical predictions described earlier, suggests that we should associate the first term with the incidence of the particle on the potential step and the second term with the reflection of the particle from the step. Let us use this association to calculate the probability that the incident particle is reflected, which we call the reflection coefficient $R$. Obviously, $R$ depends on the ratio $B/A$, which specifies the amplitude of the reflected part of the wave function relative to the amplitude of the incident part. But in quantum mechanics probabilities depend on intensities, such as $B^*B$ and $A^*A$, not on amplitudes. Thus, we must evaluate $R$ from the formula

$$R = \frac{B^*B}{A^*A} \tag{6-26}$$

That is, the reflection coefficient is equal to the ratio of the intensity of the part of the wave that describes the reflected particle to the intensity of the part that describes the incident particle. We obtain

$$R = \frac{B^*B}{A^*A} = \frac{(1 - ik_2/k_1)^*(1 - ik_2/k_1)}{(1 + ik_2/k_1)^*(1 + ik_2/k_1)}$$

or

$$R = \frac{(1 + ik_2/k_1)(1 - ik_2/k_1)}{(1 - ik_2/k_1)(1 + ik_2/k_1)} = 1 \qquad E < V_0 \tag{6-27}$$

The fact that this ratio equals one means that a particle incident upon the potential step, with total energy less than the height of the step, has probability one of being reflected; it is always reflected. This is in agreement with the predictions of classical mechanics.

Consider now the eigenfunction of (6-24). Using the relation

$$e^{ik_1 x} = \cos k_1 x + i \sin k_1 x \tag{6-28}$$

it is easy to show that the eigenfunction can be expressed as

$$\psi(x) = \begin{cases} D \cos k_1 x - D\,\dfrac{k_2}{k_1} \sin k_1 x & x \le 0 \\[2ex] D e^{-k_2 x} & x \ge 0 \end{cases} \tag{6-29}$$

If we generate the wave function by multiplying $\psi(x)$ by $e^{-iEt/\hbar}$, we see immediately that we actually have a standing wave because the locations of the nodes do not change in time. In this problem the incident and reflected traveling waves for $x < 0$ combine to form a standing wave because they are of equal intensity. Figure 6-6 depicts this schematically.

Figure 6-6 Illustrating schematically the combination of an incident and a reflected wave of equal intensities to form a standing wave. The wave function is reflected from a potential step at $x = 0$. Note that the nodes of the traveling waves move to the right or left, but those of the standing wave are stationary.

In the top part of Figure 6-7 we illustrate the wave function by plotting the eigenfunction, (6-29), which is a real function of $x$ if we take $D$ real. The wave function can be thought of as oscillating in time according to $e^{-iEt/\hbar}$, with an amplitude whose space dependence is given by $\psi(x)$.

Here we find a feature which is in sharp contrast to the classical predictions. Although in the region $x > 0$ the probability density

$$\Psi^*\Psi = D^* e^{-k_2 x} e^{+iEt/\hbar}\, D e^{-k_2 x} e^{-iEt/\hbar} = D^*D\, e^{-2k_2 x} \tag{6-30}$$

illustrated in the bottom of Figure 6-7, decreases rapidly with increasing $x$, there is a finite probability of finding the particle in the region $x > 0$. In classical mechanics it would be absolutely impossible to find the particle in the region $x > 0$ because there the total energy is less than the potential energy, so the kinetic energy $p^2/2m$ is negative and the momentum $p$ is imaginary.
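As a quick numerical check of (6-22) through (6-29), the following Python sketch builds the eigenfunction for $x \le 0$ in both its complex-exponential and standing-wave forms and evaluates the reflection coefficient. The function names and the wave numbers `k1 = 2.0`, `k2 = 1.5` are illustrative choices made here, in arbitrary units; they do not come from the text.

```python
import cmath
import math


def step_coefficients(k1, k2, D=1.0):
    """Return (A, B) from (6-22) and (6-23) for a chosen amplitude D."""
    A = 0.5 * D * (1 + 1j * k2 / k1)
    B = 0.5 * D * (1 - 1j * k2 / k1)
    return A, B


def psi_left_exponential(x, k1, k2, D=1.0):
    """Eigenfunction for x <= 0 in the complex-exponential form of (6-24)."""
    A, B = step_coefficients(k1, k2, D)
    return A * cmath.exp(1j * k1 * x) + B * cmath.exp(-1j * k1 * x)


def psi_left_standing(x, k1, k2, D=1.0):
    """The same eigenfunction in the real standing-wave form of (6-29)."""
    return D * math.cos(k1 * x) - D * (k2 / k1) * math.sin(k1 * x)


def reflection_coefficient(k1, k2):
    """R = B*B / A*A, as defined in (6-26)."""
    A, B = step_coefficients(k1, k2)
    return abs(B) ** 2 / abs(A) ** 2


if __name__ == "__main__":
    k1, k2 = 2.0, 1.5  # illustrative wave numbers, arbitrary units
    print("R =", reflection_coefficient(k1, k2))
    for x in (-3.0, -1.0, -0.25, 0.0):
        # The two forms should agree, with a vanishing imaginary part.
        print(x, psi_left_exponential(x, k1, k2), psi_left_standing(x, k1, k2))
```

Because $A$ and $B$ are complex conjugates of one another (for real $D$), their intensities are equal for any choice of $k_1$ and $k_2$, which is why the sketch returns $R = 1$ regardless of the numbers supplied.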
The finite probability of finding the particle at $x > 0$, a phenomenon called penetration of the classically excluded region, is one of the more striking predictions of quantum mechanics. We shall discuss later certain experiments which confirm this prediction, but here we should like to make several points about it. One is that penetration does not mean that the particle is stored in the classically excluded region. Indeed, we have seen that the incident particle is definitely reflected from the step. Another point is that penetration of the excluded region, which obeys (6-30), is not in conflict with the experiments of classical mechanics. It is apparent from the equation that the probability of finding the particle with a coordinate $x > 0$ is only appreciable in a region starting at $x = 0$ and extending over a penetration distance $\Delta x$, which equals $1/k_2$. The reason is that $e^{-2k_2 x}$ goes very rapidly to zero when $x$ is very much larger than $1/k_2$. Since $k_2 = \sqrt{2m(V_0 - E)}/\hbar$, we have

$$\Delta x = \frac{\hbar}{\sqrt{2m(V_0 - E)}}$$

In the classical limit, the product of $m$ and $(V_0 - E)$ is so large, compared to $\hbar^2$, that $\Delta x$ is immeasurably small.

Figure 6-7 Top: The eigenfunction $\psi(x)$ for a particle incident upon a potential step at $x = 0$, with total energy less than the height of the step. Note the penetration of the eigenfunction into the classically excluded region $x > 0$. Bottom: The probability density $\Psi^*\Psi = \psi^*\psi = \psi^2$ corresponding to this eigenfunction. The spacing between the peaks of $\psi^2$ is twice as close as the spacing between the peaks of $\psi$.

Example 6-1. Estimate the penetration distance $\Delta x$ for a very small dust particle, of radius $r = 10^{-6}$ m and density $\rho = 10^4$ kg/m$^3$, moving at the very low velocity $v = 10^{-2}$ m/sec, if the particle impinges on a potential step of height equal to twice its kinetic energy in the region to the left of the step.

■ The mass of the particle is

$$m = \frac{4}{3}\pi r^3 \rho \simeq 4 \times 10^{-18}\ \text{m}^3 \times 10^4\ \text{kg/m}^3 = 4 \times 10^{-14}\ \text{kg}$$

Its kinetic energy before hitting the step is

$$\frac{1}{2}mv^2 \simeq \frac{1}{2} \times 4 \times 10^{-14}\ \text{kg} \times 10^{-4}\ \text{m}^2/\text{sec}^2 = 2 \times 10^{-18}\ \text{joule}$$

and this is also the value of $(V_0 - E)$. The penetration distance is

$$\Delta x = \frac{\hbar}{\sqrt{2m(V_0 - E)}} \simeq \frac{10^{-34}\ \text{joule-sec}}{\sqrt{2 \times 4 \times 10^{-14}\ \text{kg} \times 2 \times 10^{-18}\ \text{joule}}} \simeq 2 \times 10^{-19}\ \text{m}$$

Of course, this is many orders of magnitude smaller than could be detected in any possible measurement. For the more massive particles and higher energies typically considered in classical mechanics, $\Delta x$ is even smaller.

Furthermore, we should like to point out that the uncertainty principle shows the wavelike properties exhibited by an entity in penetrating the classically excluded region are really not in conflict with its particlelike properties. Consider an experiment capable of proving that the particle is located somewhere in the region $x > 0$. Since the probability density for $x > 0$ is appreciable only in a range of length $\Delta x$, the experiment amounts to localizing the particle within that range. In doing this, the experiment necessarily leads to an uncertainty $\Delta p$ in the momentum, which must be at least as large as

$$\Delta p \simeq \frac{\hbar}{\Delta x} \simeq \sqrt{2m(V_0 - E)}$$

Consequently, the energy of the particle is uncertain by an amount

$$\Delta E \simeq \frac{(\Delta p)^2}{2m} \simeq V_0 - E$$

and it is no longer possible to say that the total energy $E$ of the particle is definitely less than the potential energy $V_0$. This removes the conflict alluded to.

Penetration of the classically excluded region can lead to measurable consequences. We shall see this later for a potential that steps up to a height $V_0 > E$, but remains up only for a distance not much larger than the penetration distance $\Delta x$, and then steps down. In fact, the phenomenon has significant practical consequences. One example, which we shall refer to soon, is the tunnel diode used in modern electronics.

Example 6-2. A conduction electron moves through a block of Cu at total energy $E$ under the influence of a potential which, to a good approximation, has a constant value of zero in the interior of the block and abruptly steps up to the constant value $V_0 > E$ outside the block. The interior value of the potential is essentially constant, at a value that can be taken as zero, since a conduction electron inside the metal feels little net Coulomb force exerted by the approximately uniform charge distributions that surround it. The potential increases very rapidly at the surface of the metal, to its exterior value $V_0$, because there the electron feels a strong force exerted by the nonuniform charge distributions present in that region. This force tends to attract the electron back into the metal and is, of course, what causes the conduction electron to be bound to the metal. Because the electron is bound, $V_0$ must be greater than its total energy $E$. The exterior value of the potential is constant, if the metal has no total charge, since outside the metal the electron would feel no force at all. The mass of the electron is $m = 9 \times 10^{-31}$ kg. Measurements of the energy required to permanently remove it from the block, i.e., measurements of the work function, show that $V_0 - E = 4$ eV. From these data estimate the distance $\Delta x$ that the electron can penetrate into the classically excluded region outside the block.

■ In the mks system

$$V_0 - E = 4\ \text{eV} \times \frac{1.6 \times 10^{-19}\ \text{joule}}{1\ \text{eV}} \simeq 6 \times 10^{-19}\ \text{joule}$$

So

$$\Delta x = \frac{\hbar}{\sqrt{2m(V_0 - E)}} \simeq \frac{10^{-34}\ \text{joule-sec}}{\sqrt{2 \times 9 \times 10^{-31}\ \text{kg} \times 6 \times 10^{-19}\ \text{joule}}} \simeq 10^{-10}\ \text{m}$$

The penetration distance is of the order of atomic dimensions. Therefore, the effect can be of consequence in atomic systems. We shall find soon that, in certain circumstances, the effect is very important indeed.

Let us finally make the point that penetration of the classically excluded region is nonclassical in the sense that an entity that does it is not behaving like a classical particle.
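The estimates of Examples 6-1 and 6-2 can be reproduced with a few lines of Python. This is only a sketch: the function name `penetration_distance` and the variable names are choices made here, and the constants are the rounded values from the table at the front of the book, so the results should agree with the examples only to the one significant figure quoted there.

```python
import math

HBAR = 1.055e-34           # joule-sec
ELECTRON_MASS = 9.109e-31  # kg
JOULE_PER_EV = 1.602e-19   # joule per eV


def penetration_distance(mass, energy_deficit):
    """Return Delta x = hbar / sqrt(2 m (V0 - E)) in meters.

    mass           : particle mass in kg
    energy_deficit : V0 - E in joules (must be positive)
    """
    if energy_deficit <= 0:
        raise ValueError("V0 - E must be positive in the excluded region")
    return HBAR / math.sqrt(2.0 * mass * energy_deficit)


# Example 6-1: dust particle. The step height is twice the kinetic energy,
# so V0 - E equals the kinetic energy itself.
radius, density, speed = 1e-6, 1e4, 1e-2  # m, kg/m^3, m/sec
dust_mass = (4.0 / 3.0) * math.pi * radius**3 * density
dust_kinetic_energy = 0.5 * dust_mass * speed**2
dust_dx = penetration_distance(dust_mass, dust_kinetic_energy)

# Example 6-2: conduction electron in Cu with V0 - E = 4 eV.
electron_dx = penetration_distance(ELECTRON_MASS, 4.0 * JOULE_PER_EV)

print(f"dust particle: {dust_dx:.1e} m")
print(f"electron:      {electron_dx:.1e} m")
```

With these inputs the dust particle comes out at a few times $10^{-19}$ m and the electron at about $10^{-10}$ m, a difference of nine orders of magnitude that comes entirely from the product $m(V_0 - E)$ under the square root.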
Such an entity is, however, behaving like a classical wave since, as we shall see later, the phenomenon has been known to occur with light waves since the time of Newton. Penetration of the classically excluded region by material particles is just another manifestation of the wavelike nature of material particles.

Figure 6-8 shows the probability density for a wave function in the form of a group, for the problem of a particle incident in the direction of increasing $x$ upon a potential step with an average value of the total energy less than the step height. The wave function can be obtained by summing, over the total energy $E$, a very large number of wave functions of the form we have obtained in (6-25). It can also be obtained by a direct numerical solution of the Schroedinger equation. Either way involves a large amount of work on a high-speed computer, as can be guessed from the complications indicated in the figure. The results of the calculations certainly convey a realistic sense of the particle motion; but note that these results show, again, that the particle associated with the wave function is reflected from the step with probability one, and that there is some penetration of the classically excluded region. The fact that we have been able to learn these basic results from simple calculations, involving only the wave function of (6-25) which contains a single value of $E$, is an example of the fact that it is generally not necessary in quantum mechanics to use wave functions in the form of groups. Of course, we must be willing to learn how to interpret the simple wave functions.

Figure 6-8 A potential step, and the probability density $\Psi^*\Psi$ for a group wave function describing a particle incident on the step with total energy less than the step height. As time evolves, the group moves up to the step, penetrates slightly into the classically excluded region, and then is completely reflected from the step. The complications of the mathematical treatment using a group are indicated by the complications of its structure during reflection. (Panels: $t = 0$, $5\Delta t$, $6\Delta t$, $7\Delta t$, $9\Delta t$, $11\Delta t$, $12\Delta t$, $14\Delta t$, $20\Delta t$.)

6-4 THE STEP POTENTIAL (ENERGY GREATER THAN STEP HEIGHT)

In this section we consider the motion of a particle under the influence of a step potential, (6-11), when its total energy $E$ is greater than the height $V_0$ of the step. That is, we take $E > V_0$, as illustrated in Figure 6-9.

Figure 6-9 The relation between total and potential energies for a particle incident upon a potential step with total energy greater than the height of the step.

In classical mechanics, a particle of total energy $E$ traveling in the region $x < 0$, in the direction of increasing $x$, will suffer an impulsive retarding force $F = -dV(x)/dx$ at the point $x = 0$. But the impulse will only slow the particle, and it will enter the region $x > 0$, continuing its motion in the direction of increasing $x$. Its total energy $E$ remains constant; its momentum in the region $x < 0$ is $p_1$, where $p_1^2/2m = E$; its momentum in the region $x > 0$ is $p_2$, where $p_2^2/2m = E - V_0$.

We shall see that the predictions of quantum mechanics are not so simple. If $E$ is not too much larger than $V_0$, the theory predicts that the particle has an appreciable chance of being reflected at the step back into the region $x < 0$, even though it has enough energy to pass over the step into the region $x > 0$.

One example of this is found in the case of an electron in the cathode of a photoelectric cell, which has received energy from absorbing a photon, and which is trying to escape the surface of the metallic cathode. If its energy is not much higher than the height of the step in the potential that it feels at the surface of the metal, it may be reflected back and not succeed in escaping. This leads to a significant reduction in the efficiency of photocells for light of frequencies not far above the cutoff frequency.

A more important example of reflection occurring when a particle tries to pass over a potential step is found in the motion of a neutron in a nucleus. To a good approximation, the potential acting on the neutron near the nuclear surface is a step potential. The potential rises very rapidly at the nuclear surface because a nucleus tends to bind a neutron. If the neutron has received energy, in one way or another, and is trying to escape the nucleus, it will probably be reflected back into the nucleus at the surface if its energy is only a little greater than the step height. This has the effect of inhibiting the emission of lower energy neutrons from nuclei, and thereby considerably increases the stability of nuclei in low-lying excited states. The effect is a manifestation of the wavelike properties of neutrons that is very significant in the processes taking place in nuclear reactions, as we shall see near the end of this book.

In quantum mechanics, the motion of the particle under the influence of the step potential is described by the wave function $\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$, where the eigenfunction $\psi(x)$ satisfies the time-independent Schroedinger equation for the potential. This equation has different forms in the regions to the left and right of the potential step, namely

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\,\psi(x) \qquad x < 0 \tag{6-31}$$

and

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = (E - V_0)\,\psi(x) \qquad x > 0 \tag{6-32}$$

The eigenfunction $\psi(x)$ also satisfies the conditions requiring finiteness, single valuedness, and continuity, for it and its derivative, particularly at the joining point $x = 0$.

Equation (6-31) describes the motion of a free particle of momentum $p_1$.
Its general solution is

$$\psi(x) = A e^{ik_1 x} + B e^{-ik_1 x} \qquad x < 0 \tag{6-33}$$

where

$$k_1 = \frac{\sqrt{2mE}}{\hbar} = \frac{p_1}{\hbar}$$

Equation (6-32) describes the motion of a free particle of momentum $p_2$. Its general solution is

$$\psi(x) = C e^{ik_2 x} + D e^{-ik_2 x} \qquad x > 0 \tag{6-34}$$

where

$$k_2 = \frac{\sqrt{2m(E - V_0)}}{\hbar} = \frac{p_2}{\hbar} \qquad E > V_0$$

The wave function specified by these two forms consists of traveling waves of de Broglie wavelength $\lambda_1 = h/p_1 = 2\pi/k_1$ in the region $x < 0$, and of longer de Broglie wavelength $\lambda_2 = h/p_2 = 2\pi/k_2$ in the region $x > 0$. Note that the functions we deal with here already satisfy the requirements of finiteness and single valuedness; but we must explicitly consider their continuity, and we shall do so shortly.

A particle initially in the region $x < 0$, and moving towards $x = 0$ would, in classical mechanics, have probability one of passing the point $x = 0$ and entering the region $x > 0$. This is not true in quantum mechanics. Because of the wavelike properties of the particle, there is a certain probability that the particle will be reflected at the point $x = 0$, where there is a discontinuous change in the de Broglie wavelength. Thus we need to take both terms of the general solution of (6-33) to describe the incident and reflected traveling waves in the region $x < 0$. We do not, however, need to take the second term of the general solution of (6-34). This term describes a wave traveling in the direction of decreasing $x$ in the region $x > 0$. Since the particle is incident in the direction of increasing $x$, such a wave could arise only from a reflection at some point with a large positive $x$ coordinate (well beyond the discontinuity at $x = 0$). As there is nothing out there to cause a reflection, we know that there is only a transmitted traveling wave in the region $x > 0$, and so we take the arbitrary constant $D$ to have the value

$$D = 0 \tag{6-35}$$

The arbitrary constants $A$, $B$, and $C$ must be chosen to make $\psi(x)$ and $d\psi(x)/dx$ continuous at $x = 0$.
The first requirement, that the values of $\psi(x)$ expressed by (6-33) and (6-34) be the same at $x = 0$, is satisfied if

$$A\left(e^{ik_1 x}\right)_{x=0} + B\left(e^{-ik_1 x}\right)_{x=0} = C\left(e^{ik_2 x}\right)_{x=0}$$

or

$$A + B = C \tag{6-36}$$

The second requirement, that the values of the derivatives of the two expressions for $\psi(x)$ be the same at $x = 0$, is satisfied if

$$ik_1 A\left(e^{ik_1 x}\right)_{x=0} - ik_1 B\left(e^{-ik_1 x}\right)_{x=0} = ik_2 C\left(e^{ik_2 x}\right)_{x=0}$$

or

$$k_1(A - B) = k_2 C \tag{6-37}$$

From the last two numbered equations, we find

$$B = \frac{k_1 - k_2}{k_1 + k_2}\,A \qquad \text{and} \qquad C = \frac{2k_1}{k_1 + k_2}\,A \tag{6-38}$$

Thus the eigenfunction is

$$\psi(x) = \begin{cases} A e^{ik_1 x} + A\,\dfrac{k_1 - k_2}{k_1 + k_2}\,e^{-ik_1 x} & x \le 0 \\[2ex] A\,\dfrac{2k_1}{k_1 + k_2}\,e^{ik_2 x} & x \ge 0 \end{cases} \tag{6-39}$$

As before, it will not be necessary to evaluate the arbitrary constant $A$ that determines the amplitude of the eigenfunction.

It is clear that an eigenfunction satisfying the two continuity conditions could not have been found if we had initially set the coefficient $B$ of the reflected wave equal to zero. We would then have had only two arbitrary constants to satisfy the two continuity conditions, and we would not have had one left over to play the role, demanded by the linearity of the time-independent Schroedinger equation, of an arbitrary constant that determines the amplitude of the eigenfunction.

By analogy with our interpretation of the eigenfunction of (6-24), we recognize that the first term in the expression of (6-39) valid for $x < 0$ (left of the discontinuity) represents the incident traveling wave; the second term in the expression valid for $x < 0$ represents the reflected traveling wave; and the expression valid for $x > 0$ (right of the discontinuity) represents the transmitted traveling wave.

Figure 6-10 illustrates the probability density $\Psi^*(x,t)\Psi(x,t) = \psi^*(x)\psi(x)$ for the wave function $\Psi(x,t)$ corresponding to the eigenfunction $\psi(x)$ of (6-39) (in the representative case $k_1 = 2k_2$). We do not plot either the eigenfunction or wave function, as both are complex. In the region $x > 0$ the wave function is a pure traveling wave (of amplitude $4A/3$ in this case) traveling to the right, and so the probability density is constant as in the bottom part of Figure 6-1. In the region $x < 0$ the wave function is a combination of the incident traveling wave (of amplitude $A$) moving to the right, and a reflected traveling wave (of amplitude $A/3$) moving to the left. As the amplitude of the reflected wave is necessarily smaller than that of the incident wave, the two cannot combine to yield a pure standing wave. Their sum $\Psi(x,t)$ in that region is, instead, something between a standing wave and a traveling wave. This is seen in the behavior of $\Psi^*(x,t)\Psi(x,t)$ for $x < 0$, which looks like something between the pure standing wave probability density of Figure 6-7 and the pure traveling wave probability density of Figure 6-1 in that it oscillates but has minimum values greater than zero.

Figure 6-10 The probability density $\Psi^*\Psi$ for the eigenfunction of (6-39), when $k_1 = 2k_2$. (Vertical axis: $\Psi^*(x,t)\Psi(x,t)$, all $t$, with the levels $(16/9)A^*A$ and $(4/9)A^*A$ marked.)

The ratio of the intensity of the reflected wave to the intensity of the incident wave gives the probability that the particle will be reflected by the potential step back into the region $x < 0$. This probability is the reflection coefficient $R$. That is

$$R = \frac{B^*B}{A^*A} = \left(\frac{k_1 - k_2}{k_1 + k_2}\right)^{\!*}\left(\frac{k_1 - k_2}{k_1 + k_2}\right) = \left(\frac{k_1 - k_2}{k_1 + k_2}\right)^2 \qquad E > V_0 \tag{6-40}$$

We see from this result that $R < 1$ when $E > V_0$, i.e., when the total energy of the particle is greater than the height of the potential step. This is in contrast to the value $R = 1$ when $E < V_0$, that we obtained from the result of Section 6-3. Of course, the thing that is surprising about the present result is not that $R < 1$, but that $R > 0$.
It is surprising because a classical particle would definitely not be reflected if it had enough energy to pass the potential discontinuity. On the other hand, at a corresponding discontinuity a classical wave would be reflected, as we shall discuss shortly.

Also of interest is the transmission coefficient $T$, which specifies the probability that the particle will be transmitted past the potential step from the region $x < 0$ into the region $x > 0$. The evaluation of $T$ is slightly more complicated than the evaluation of $R$ because the velocity of the particle is different in the two regions. According to accepted convention, transmission and reflection coefficients are actually defined in terms of the ratios of probability fluxes. A probability flux is the probability per second that a particle will be found crossing some reference point traveling in a particular direction. The incident probability flux is the probability per second of finding a particle crossing a point at $x < 0$ in the direction of increasing $x$; the reflected probability flux is the probability per second of finding a particle crossing a point at $x < 0$ in the direction of decreasing $x$; and the transmitted probability flux is the probability per second of finding a particle crossing a point at $x > 0$ in the direction of increasing $x$. Since the probability per second that a particle will cross a given point is proportional to the distance it travels per second, the probability flux is proportional not only to the intensity of the appropriate wave but also to the appropriate velocity of the particle. (A more detailed discussion of this point is given in connection with Figure L-2 in Appendix L.) Thus, according to the strict definition, the reflection coefficient $R$ is

$$R = \frac{v_1 B^*B}{v_1 A^*A} = \frac{B^*B}{A^*A} \tag{6-41}$$

where $v_1$ is the velocity of the particle in the region $x < 0$. Since the velocities cancel, what remains is identical to the formula we have used previously for $R$.
For $T$, the velocities do not cancel, and we have

$$T = \frac{v_2 C^*C}{v_1 A^*A} = \frac{v_2}{v_1}\left(\frac{2k_1}{k_1 + k_2}\right)^2$$

where $v_2$ is the velocity of the particle in the region $x > 0$. Now

$$v_1 = \frac{p_1}{m} = \frac{\hbar k_1}{m} \qquad \text{and} \qquad v_2 = \frac{p_2}{m} = \frac{\hbar k_2}{m}$$

So the above expression gives

$$T = \frac{k_2}{k_1}\,\frac{(2k_1)^2}{(k_1 + k_2)^2} = \frac{4k_1 k_2}{(k_1 + k_2)^2} \qquad E > V_0 \tag{6-42}$$

It is easy to show by evaluating $R$ and $T$ from (6-40) and (6-42) that

$$R + T = 1 \tag{6-43}$$

This useful relation is the motivation for defining the reflection and transmission coefficients in terms of probability fluxes. The probability flux incident upon the potential step is split into a transmitted flux and a reflected flux. But (6-43) says their sum equals the incident flux; i.e., the probability that the particle is either transmitted or reflected is one. The particle does not vanish at the step; nor does the particle itself split at the step. In any particular trial the particle will go one way or the other. For a large number of trials, the average probability of going in the direction of decreasing $x$ is measured by $R$, and the average probability of going in the direction of increasing $x$ is measured by $T$.

Note that $R$ and $T$ are both unchanged in value if $k_1$ and $k_2$ are exchanged in (6-40) and (6-42). A moment's consideration should convince the student that this means the same values of $R$ and $T$ would be obtained if the particle were incident upon the potential step in the direction of decreasing $x$ from the region $x > 0$. The wave function describing the motion of the particle, and consequently the probability flux, is partially reflected simply because there is a discontinuous change in $V(x)$, and not because $V(x)$ becomes larger in the direction of the incidence of the particle. The behavior of $R$ and $T$ when $k_1$ and $k_2$ are exchanged involves a characteristic property of all waves that, in optics, is sometimes called the reciprocity property.
When light passes perpendicularly through a sharp interface between media with different indices of refraction, a fraction of the light is reflected because of the abrupt change in its wavelength, and the same fraction is reflected independent of whether it is incident from one side of the interface or from the other. Exactly the same thing happens when a microscopic particle experiences an abrupt change in its de Broglie wavelength. In fact, the equations governing the two phenomena are identical in form. We see, once again, that a microscopic particle moves in a wavelike manner.

In Figure 6-11 the reflection and transmission coefficients are plotted as functions of the convenient ratio $E/V_0$. By evaluating $k_1$ and $k_2$ in (6-40) and (6-42), we find that these expressions for the reflection and transmission coefficients can be written in terms of the ratio as

$$R = 1 - T = \left(\frac{1 - \sqrt{1 - V_0/E}}{1 + \sqrt{1 - V_0/E}}\right)^2 \qquad \frac{E}{V_0} > 1 \tag{6-44}$$

Figure 6-11 The reflection and transmission coefficients $R$ and $T$ for a particle incident upon a potential step. The abscissa $E/V_0$ is the ratio of the total energy of the particle to the increase in its potential energy at the step. The case $k_1 = 2k_2$, illustrated in Figure 6-10, corresponds to $E/V_0 = 1.33$.

The figure also plots the results

$$R = 1 - T = 1 \qquad \frac{E}{V_0} < 1$$

obtained in (6-27) of the preceding section for a step potential when $E/V_0 < 1$. As an example, for $E/V_0 = 1.33$ the transmission coefficient has the value $T = 8/9 \simeq 0.89$. This $E/V_0$ ratio corresponds to the case $k_2 = k_1/2$ whose probability density pattern is illustrated in Figure 6-10.
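The dependence of $R$ and $T$ on $E/V_0$ can be tabulated directly from (6-27), (6-42), and (6-44). The Python sketch below does so; the function names are choices made here, and $T$ is computed independently from (6-42), written in terms of the ratio $k_2/k_1 = \sqrt{1 - V_0/E}$, so that the relation $R + T = 1$ of (6-43) serves as a check rather than as an input.

```python
import math


def reflection_coefficient(energy_ratio):
    """R as a function of E/V0: (6-27) for E <= V0, (6-44) for E > V0."""
    if energy_ratio <= 0:
        raise ValueError("E/V0 must be positive")
    if energy_ratio <= 1.0:
        return 1.0
    s = math.sqrt(1.0 - 1.0 / energy_ratio)  # s = k2/k1
    return ((1.0 - s) / (1.0 + s)) ** 2


def transmission_coefficient(energy_ratio):
    """T from (6-42), expressed through s = k2/k1; zero for E <= V0."""
    if energy_ratio <= 0:
        raise ValueError("E/V0 must be positive")
    if energy_ratio <= 1.0:
        return 0.0
    s = math.sqrt(1.0 - 1.0 / energy_ratio)
    return 4.0 * s / (1.0 + s) ** 2


if __name__ == "__main__":
    for ratio in (0.5, 1.0, 1.1, 4.0 / 3.0, 2.0, 10.0):
        R = reflection_coefficient(ratio)
        T = transmission_coefficient(ratio)
        print(f"E/V0 = {ratio:6.3f}   R = {R:.4f}   T = {T:.4f}   R + T = {R + T:.4f}")
```

For $E/V_0 = 4/3$ the ratio $k_2/k_1$ is exactly $1/2$, giving $R = 1/9$ and $T = 8/9$. The table also shows how slowly $R$ falls off above the step: at $E/V_0 = 2$ roughly 3 percent of the incident flux is still reflected.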
Note from Figure 6-10 that the probability of finding the particle in a given length of the $x$ axis, which is long enough to average over the quantum mechanical fluctuations in the probability density, is nearly twice as large to the right of the potential step as it is to the left of the step. From a classical point of view, which is appropriate to discussing an average over quantum mechanical fluctuations, it can be said that the reasons for this are: (a) the probability that the particle will pass the step and proceed into the region to its right is almost equal to one, and (b) the particle's velocity is halved when it enters the region to the right of the step since $k = p/\hbar = mv/\hbar$ and $k_2 = k_1/2$, so it spends twice as much time in any given length of the axis in that region.

From Figure 6-11 we see that the energy of the particle must be appreciably higher than the height of the potential step before the probability of reflection becomes negligible. However, the case in which $E$ becomes very large is not necessarily the case of the classical limit for which we know there will be no reflection at all. The point is that (6-44) says $R$ depends only on the ratio $E/V_0$, so that it will keep the same value if $V_0$ increases as rapidly as $E$. This seems paradoxical until we realize that, in the limit of large energies, our basic assumption that the change in the value of the step potential $V(x)$ is perfectly sharp can no longer be even an approximation to a real physical situation. If the potential function changes only very gradually with $x$, then the de Broglie wavelength will change only very gradually. In this case the reflection will be negligible because the change in wavelength is gradual, and reflection arises from an abrupt change in the wavelength. Specifically, if the fractional change in $V(x)$ is very small when $x$ changes by one de Broglie wavelength, then the reflection coefficient will be very small.
This gives rise to the classical limit since in that limit the de Broglie wavelength is so short that any physically realistic potential $V(x)$ changes only by a negligible fraction in one wavelength. For particles in atomic or nuclear systems, the de Broglie wavelength can be long relative to the distance in which the potential experienced by the particle changes value significantly. Then the step potential is a very good approximation. For these microscopic particles, the probability of reflection can be large.

Example 6-3. When a neutron enters a nucleus, it experiences a potential energy which drops at the nuclear surface very rapidly from a constant external value $V = 0$ to a constant internal value of about $V = -50$ MeV. The decrease in the potential is what makes it possible for a neutron to be bound in a nucleus. Consider a neutron incident upon a nucleus with an external kinetic energy $K = 5$ MeV, which is typical for a neutron that has just been emitted from a nuclear fission. Estimate the probability that the neutron will be reflected at the nuclear surface, thereby failing to enter and have its chance at inducing another nuclear fission.

■ For an estimate, we may take the neutron-nucleus potential to be a one-dimensional step potential, as illustrated in Figure 6-12. Because of the reciprocity property of the reflection coefficient, we may evaluate it from (6-44), using $V_0 = 50$ MeV and $E = 55$ MeV for reasons that can be seen by inspection of the figure. We have

$$R = \left(\frac{1 - \sqrt{1 - 50/55}}{1 + \sqrt{1 - 50/55}}\right)^2 \simeq 0.29$$

This estimate gives a correct impression of the great importance of the reflection phenomenon when low-energy neutrons collide with nuclei. But the numerical value we have obtained for the reflection coefficient is not very accurate since the actual neutron-nucleus potential does not drop quite as rapidly at the nuclear surface, in comparison to the de Broglie wavelength, as a step potential.

Figure 6-12 A neutron of external kinetic energy $K$ incident upon a decreasing potential step of depth $V_0$, which approximates the potential it feels upon entering a nucleus. Its total energy, measured from the bottom of the step potential, is $E$.

6-5 THE BARRIER POTENTIAL

In this section we consider a barrier potential, illustrated in Figure 6-13. The potential can be written as follows

$$V(x) = \begin{cases} V_0 & 0 < x < a \\ 0 & x < 0 \ \text{or} \ x > a \end{cases} \tag{6-45}$$

According to classical mechanics, a particle of total energy $E$ in the region $x < 0$, which is incident upon the barrier in the direction of increasing $x$, will have probability one of being reflected if $E < V_0$, and probability one of being transmitted into the region $x > a$ if $E > V_0$.

Neither of these statements describes accurately the quantum mechanical results. If $E$ is not much larger than $V_0$, the theory predicts that there will be some reflection, except for certain values of $E$. If $E$ is not much smaller than $V_0$, quantum mechanics predicts that there is a certain probability that the particle will be transmitted through the barrier into the region $x > a$.

In "tunneling" through a barrier whose height exceeds its total energy, a material particle is behaving purely like a wave. But in the region beyond the barrier it can be detected as a localized particle, without introducing a significant uncertainty in the knowledge of its energy. Thus penetration of a classically excluded region of limited width by a particle can be observed, in the sense that the particle can be observed to be a particle, of total energy less than the potential energy in the excluded region, both before and after it penetrates the region. We shall discuss some consequences of this fascinating effect in the present section, as well as some consequences of the reflection of particles attempting to pass over a barrier.
The following section is devoted completely to examples of tunneling through barriers, and considers three of particular importance: (1) the emission of $\alpha$ particles from radioactive nuclei through the potential barrier they experience in the vicinity of the nuclei, (2) the inversion of the ammonia molecule which provides a frequency standard for atomic clocks, and (3) the tunnel diode used as a switching unit in fast electronic circuits.

Figure 6-13. A barrier potential.

For the barrier potential of (6-45), we know from the qualitative arguments of the last chapter that acceptable solutions to the time-independent Schroedinger equation should exist for all values of the total energy $E > 0$. We also know that the equation breaks up into three separate equations for the three regions: $x < 0$ (left of the barrier), $0 < x < a$ (within the barrier), and $x > a$ (right of the barrier). In the regions to the left and to the right of the barrier the equations are those for a free particle of total energy $E$. Their general solutions are

$$\psi(x) = A e^{ik_{I}x} + B e^{-ik_{I}x} \qquad x < 0$$
$$\psi(x) = C e^{ik_{I}x} + D e^{-ik_{I}x} \qquad x > a \tag{6-46}$$

where

$$k_{I} = \frac{\sqrt{2mE}}{\hbar}$$

In the region within the barrier, the form of the equation, and of its general solution, depends on whether $E < V_0$ or $E > V_0$. Both of these cases have been treated in the previous sections. In the first case, $E < V_0$, the general solution is

$$\psi(x) = F e^{-k_{II}x} + G e^{k_{II}x} \qquad 0 < x < a \tag{6-47}$$

where

$$k_{II} = \frac{\sqrt{2m(V_0 - E)}}{\hbar} \qquad E < V_0$$

In the second case, $E > V_0$, it is

$$\psi(x) = F e^{ik_{III}x} + G e^{-ik_{III}x} \qquad 0 < x < a \tag{6-48}$$

where

$$k_{III} = \frac{\sqrt{2m(E - V_0)}}{\hbar} \qquad E > V_0$$

Note that (6-47) involves real exponentials, whereas (6-46) and (6-48) involve complex exponentials.
Since we are considering the case of a particle incident on the barrier from the left, in the region to the right of the barrier there can be only a transmitted wave as there is nothing in that region to produce a reflection. Thus we can set

$$D = 0$$

In the present situation, however, we cannot set $G = 0$ in (6-47) since the value of $x$ is limited in the barrier region, $0 < x < a$, so $\psi(x)$ for $E < V_0$ cannot become infinitely large even if the increasing exponential is present. Nor can we set $G = 0$ in (6-48) since $\psi(x)$ for $E > V_0$ will have a reflected component in the barrier region that arises from the potential discontinuity at $x = a$.

We consider first the case in which the energy of the particle is less than the height of the barrier, i.e., the case $E < V_0$.

In matching $\psi(x)$ and $d\psi(x)/dx$ at the points $x = 0$ and $x = a$, four equations in the arbitrary constants $A$, $B$, $C$, $F$, and $G$ will be obtained. These equations can be used to evaluate $B$, $C$, $F$, and $G$ in terms of $A$. The value of $A$ determines the amplitude of the eigenfunction, and it can be left arbitrary. The form of the probability density corresponding to the eigenfunction obtained is indicated in Figure 6-14 for a typical situation. In the region $x > a$ the wave function is a pure traveling wave and so the probability density is constant, as for $x > 0$ in Figure 6-10. In the region $x < 0$ the wave function is principally a standing wave but has a small traveling wave component because the reflected traveling wave has an amplitude less than that of the incident wave. So the probability density in that region oscillates but has minimum values somewhat greater than zero, as for $x < 0$ in Figure 6-10.

Figure 6-14. The probability density function $\Psi^*\Psi$ for a typical barrier penetration situation.
In the region $0 < x < a$ the wave function has components of both types, but it is principally a standing wave of exponentially decreasing amplitude, and this behavior can be seen in the behavior of the probability density in the region.

The most interesting result of the calculation is the ratio $T$, of the probability flux transmitted through the barrier into the region $x > a$, to the probability flux incident upon the barrier. This transmission coefficient is found to be

$$T = \frac{v_1 C^* C}{v_1 A^* A} = \left[1 + \frac{\left(e^{k_{II}a} - e^{-k_{II}a}\right)^2}{16\,\dfrac{E}{V_0}\left(1 - \dfrac{E}{V_0}\right)}\right]^{-1} = \left[1 + \frac{\sinh^2 k_{II}a}{4\,\dfrac{E}{V_0}\left(1 - \dfrac{E}{V_0}\right)}\right]^{-1} \tag{6-49}$$

where

$$k_{II}a = \sqrt{\frac{2mV_0a^2}{\hbar^2}\left(1 - \frac{E}{V_0}\right)} \qquad E < V_0$$

If the exponents are very large, this formula reduces to

$$T \simeq 16\,\frac{E}{V_0}\left(1 - \frac{E}{V_0}\right) e^{-2k_{II}a} \qquad k_{II}a \gg 1 \tag{6-50}$$

as can be verified with ease. When (6-50) is a good approximation, $T$ is extremely small.

These equations make a prediction which is, from the point of view of classical mechanics, very remarkable. They say that a particle of mass $m$ and total energy $E$, incident on a potential barrier of height $V_0 > E$ and finite thickness $a$, actually has a certain probability $T$ of penetrating the barrier and appearing on the other side. This phenomenon is called barrier penetration, and the particle is said to tunnel through the barrier. Of course, $T$ is vanishingly small in the classical limit because in that limit the quantity $2mV_0a^2/\hbar^2$, which is a measure of the opacity of the barrier, is extremely large.

We shall discuss barrier penetration in detail shortly, but let us first finish describing the calculations by considering the case in which the energy of the particle is greater than the height of the barrier, i.e., the case $E > V_0$. In this case the eigenfunction is oscillatory in all three regions, but of longer wavelength in the barrier region, $0 < x < a$.
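The matching procedure described above, for either sign of $E - V_0$, can be sketched numerically. The following is only an illustration under stated assumptions: the function name `barrier_amplitudes`, the use of the barrier width as the unit of length, and the sample values $E/V_0 = 0.5$ and $2mV_0a^2/\hbar^2 = 9$ are choices made here, not part of the text. The sketch sets $C = 1$, works backward through the continuity conditions at $x = a$ and $x = 0$, and then normalizes by $A$.

```python
import cmath
import math


def barrier_amplitudes(e_over_v0, opacity):
    """Return (B/A, C/A) for a rectangular barrier by matching at x = 0 and x = a.

    e_over_v0 -- ratio E/V0 (positive, not exactly 1)
    opacity   -- dimensionless combination 2*m*V0*a**2/hbar**2
    Lengths are measured in units of a, so the barrier occupies 0 < x < 1.
    """
    k1 = math.sqrt(opacity * e_over_v0)  # k_I * a
    # Real for E < V0 (equal to k_II * a); purely imaginary for E > V0,
    # where it equals i * k_III * a and the exponentials become oscillatory.
    kappa = cmath.sqrt(opacity * (1.0 - e_over_v0))

    # Start from a transmitted wave of unit amplitude: C = 1, D = 0.
    psi_a = cmath.exp(1j * k1)   # psi at x = a
    dpsi_a = 1j * k1 * psi_a     # d(psi)/dx at x = a

    # Continuity at x = a fixes the coefficients of exp(-kappa*x) and exp(+kappa*x).
    f = 0.5 * (psi_a - dpsi_a / kappa) * cmath.exp(kappa)
    g = 0.5 * (psi_a + dpsi_a / kappa) * cmath.exp(-kappa)

    # Continuity at x = 0 then fixes the incident and reflected amplitudes.
    psi_0 = f + g
    dpsi_0 = kappa * (g - f)
    a_inc = 0.5 * (psi_0 + dpsi_0 / (1j * k1))
    b_ref = 0.5 * (psi_0 - dpsi_0 / (1j * k1))
    return b_ref / a_inc, 1.0 / a_inc


# Illustrative values (assumptions): E/V0 = 0.5, opacity = 9.
b, c = barrier_amplitudes(0.5, 9.0)
T = abs(c) ** 2   # same velocity on both sides, so T = |C/A|^2
R = abs(b) ** 2
closed_form = 1.0 / (1.0 + math.sinh(math.sqrt(9.0 * 0.5)) ** 2 / (4 * 0.5 * 0.5))
print(T, R, T + R, closed_form)
```

Under these assumptions the value of $T$ obtained from the matching conditions should agree with the closed form (6-49), and $R + T$ should equal one to rounding error.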
Evaluation of the constants $B$, $C$, $F$, and $G$ by application of the continuity conditions at $x = 0$ and $x = a$, leads to the following formula for the transmission coefficient

$$T = \frac{v_1 C^* C}{v_1 A^* A} = \left[1 - \frac{\left(e^{ik_{III}a} - e^{-ik_{III}a}\right)^2}{16\,\dfrac{E}{V_0}\left(\dfrac{E}{V_0} - 1\right)}\right]^{-1} = \left[1 + \frac{\sin^2 k_{III}a}{4\,\dfrac{E}{V_0}\left(\dfrac{E}{V_0} - 1\right)}\right]^{-1} \tag{6-51}$$

where

$$k_{III}a = \sqrt{\frac{2mV_0a^2}{\hbar^2}\left(\frac{E}{V_0} - 1\right)} \qquad E > V_0$$

Example 6-4. An electron is incident upon a rectangular barrier of height $V_0 = 10$ eV and thickness $a = 1.8 \times 10^{-10}$ m. This rectangular barrier is an idealization of the barrier encountered by an electron that is scattering from a negatively ionized gas atom in the "plasma" of a gas discharge tube. The actual barrier is not rectangular, of course, but it is about the height and thickness quoted. Evaluate the transmission coefficient $T$ and the reflection coefficient $R$, as a function of the total energy $E$ of the electron.

From Example 6-2 we can see that if $E$ is a reasonable fraction of $V_0$ the penetration length $\Delta x$ will be comparable to the barrier thickness $a$. Thus we can expect appreciable transmission through the barrier. To determine exactly how much, we use the numbers given to evaluate the combination of parameters

$$\frac{2mV_0a^2}{\hbar^2} \simeq \frac{2 \times 9 \times 10^{-31}\ \text{kg} \times 10\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV} \times (1.8)^2 \times 10^{-20}\ \text{m}^2}{10^{-68}\ \text{joule}^2\text{-sec}^2} \simeq 9$$

which enters (6-49). From this we can plot $T$, and also $R = 1 - T$, versus $E/V_0$, in the range $0 < E/V_0 < 1$. The plot is shown in Figure 6-15. We see that $T$ is very small when $E/V_0 \ll 1$. But, when $E/V_0$ is only somewhat smaller than one, so that $E$ is nearly as large as $V_0$, $T$ is not at all negligible. For instance, when $E$ is half as large as $V_0$ so that $E/V_0 = 0.5$, the transmission coefficient has the appreciable value $T \simeq 0.05$. It is apparent that electrons can penetrate this barrier with relative ease.

For $E/V_0 > 1$, we evaluate $T$, and $R = 1 - T$, from (6-51), using the same combination of parameters as before. The results are also shown in Figure 6-15.
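The evaluation described in Example 6-4 can be reproduced with a few lines of code. This is a minimal sketch, assuming the rounded value $2mV_0a^2/\hbar^2 = 9$ used in the example; the function name `transmission` and the handling of the point $E = V_0$ by its limiting value are choices made here for illustration.

```python
import math


def transmission(e_over_v0, opacity=9.0):
    """Transmission coefficient T of a rectangular barrier.

    Uses (6-49) for E < V0 and (6-51) for E > V0.
    opacity is the dimensionless combination 2*m*V0*a**2/hbar**2.
    """
    x = e_over_v0
    if x <= 0:
        raise ValueError("E/V0 must be positive")
    if x == 1.0:
        # Common limit of (6-49) and (6-51) as E -> V0.
        return 1.0 / (1.0 + opacity / 4.0)
    if x < 1.0:
        k2a = math.sqrt(opacity * (1.0 - x))   # k_II * a
        return 1.0 / (1.0 + math.sinh(k2a) ** 2 / (4.0 * x * (1.0 - x)))
    k3a = math.sqrt(opacity * (x - 1.0))       # k_III * a
    return 1.0 / (1.0 + math.sin(k3a) ** 2 / (4.0 * x * (x - 1.0)))


# Tabulate T and R = 1 - T over part of the range plotted in Figure 6-15.
for ratio in (0.1, 0.25, 0.5, 0.75, 0.9, 1.5, 2.0, 3.0, 5.0):
    t = transmission(ratio)
    print(f"E/V0 = {ratio:4.2f}   T = {t:.4f}   R = {1.0 - t:.4f}")
```

With these assumptions the value at $E/V_0 = 0.5$ comes out at about 0.056, consistent with the $T \simeq 0.05$ quoted in the example.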
For $E/V_0 > 1$, the transmission coefficient $T$ is in general somewhat less than one, owing to reflection at the discontinuities in the potential. However, from (6-51) it can be seen that $T = 1$ whenever $k_{III}a = \pi, 2\pi, 3\pi, \ldots$. This is simply the condition that the length of the barrier region, $a$, is equal to an integral or half-integral number of de Broglie wavelengths $\lambda_{III} = 2\pi/k_{III}$ in that region. For this particular barrier, electrons of energy $E \simeq 21$ eV, 53 eV, etc., satisfy the condition $k_{III}a = \pi, 2\pi$, etc., and so pass into the region $x > a$ without any reflection. The effect is a result of destructive interference between reflections at $x = 0$ and $x = a$. It is closely related to the Ramsauer effect observed in the scattering of low-energy electrons by noble gas atoms, in which electrons of certain energies in the range of a few electron volts pass through these atoms as if they were not there, and so have transmission coefficients equal to one. Essentially the same effect is seen in scattering of neutrons, with energies of a few MeV, from all nuclei. The nuclear effect, called size resonance, will be discussed later in the book.

Figure 6-15. The reflection and transmission coefficients $R$ and $T$ for a particle incident upon a potential barrier of height $V_0$ and thickness $a$, such that $2mV_0a^2/\hbar^2 = 9$. The abscissa $E/V_0$ is the ratio of the total energy of the particle to the height of the potential barrier.

We can bring together the results of the last three sections by comparing the plot of the energy dependence of the reflection coefficient $R$ for a barrier potential, in Figure 6-15, with the plot of the same thing for a step potential, in Figure 6-11. The comparison shows that for both potentials $R \to 1$ as $E/V_0 \to 0$, and $R \to 0$ as $E/V_0 \to \infty$, with the decrease in $R$ occurring around $E/V_0 = 1$. But for the barrier potential the reflection coefficient approaches one gradually, at small energies, since the finite thickness of the classically excluded region allows some transmission. Also, the barrier potential reflection coefficient oscillates, at large energies, because of interferences in the reflections from its two discontinuities. As the step potential can be considered to be a limiting case of a barrier of very great width, we can see from our comparison the behavior of the barrier potential reflection coefficient in this limit.

Now we shall discuss in some detail the origins of these results. They all involve phenomena which arise from the wavelike behavior of the motion of microscopic particles, and each phenomenon is also observed in other types of wave motion. As we remarked in Chapter 5, the time-independent differential equation governing classical wave motion is of the same form as the time-independent Schroedinger equation. For instance, electromagnetic radiation of frequency $\nu$ propagating through a medium with index of refraction $\mu$ obeys the equation

$$\frac{d^2\psi(x)}{dx^2} + \left(\frac{2\pi\nu}{c}\,\mu\right)^2 \psi(x) = 0 \tag{6-52}$$

where the function $\psi(x)$ specifies the magnitude of the electric or magnetic field. When we compare this with the time-independent Schroedinger equation, written in the form

$$\frac{d^2\psi(x)}{dx^2} + \frac{2m}{\hbar^2}\left[E - V(x)\right]\psi(x) = 0$$

we see that they are identical if the index of refraction in the former is connected with the potential energy function in the latter by the relation

$$\mu(x) = \frac{c}{2\pi\nu}\sqrt{\frac{2m}{\hbar^2}\left[E - V(x)\right]} \tag{6-53}$$

Thus the behavior of an optical system with index of refraction $\mu(x)$ should be identical to the behavior of a mechanical system with potential energy $V(x)$, providing the two functions are related as in (6-53). Indeed, there are optical phenomena which are exactly analogous to each of the quantum mechanical phenomena that arise in considering the motion of an unbound particle. An optical phenomenon, completely analogous to the total transmission of particles over barriers of length equal to an integral or half-integral number of wavelengths, is used in the coating of lenses to obtain very high light transmissions and in thin film optical filters.

An optical analogue to the penetration of barriers by particles is found in the imaginary indices of refraction that arise in total internal reflection. Consider a ray of light incident upon a glass-to-air interface at an angle greater than the critical angle $\theta_c$. The resulting behavior of the light ray is called total internal reflection, and it is illustrated in the top of Figure 6-16. A detailed treatment of the process in terms of electromagnetic theory shows that the index of refraction, measured along the line ABC, is real in the region AB but imaginary in the region BC. Note that an imaginary $\mu(x)$ is suggested by (6-53) for a region analogous to one in which $E < V(x)$. Furthermore, electromagnetic theory shows that there are electromagnetic vibrations in the region BC of exactly the same form as the decreasing exponential standing wave of (6-29) for the region where $E < V(x)$. The flux of energy (the Poynting vector) is zero in this electromagnetic standing wave, just as the flux of probability is zero in the quantum mechanical standing wave, so the light ray is totally reflected.

Figure 6-16. Top: Illustrating total internal reflection of a light ray. The angle of incidence is greater than the critical angle. Bottom: Illustrating frustrated total internal reflection. Some of the light ray is transmitted through the air gap if the gap is sufficiently narrow.

Figure 6-17. The total internal reflection of water waves. A long vibrating plunger on the left produces a set of waves in a region of shallow water, the waves being illuminated so as to make their crests easily visible. The waves are totally internally reflected at the diagonal boundary of a region where the layer of water abruptly becomes deeper, this reflection occurring because the velocity of water waves depends on the depth of the water. Note that the intensity of the waves decreases rapidly when they try to penetrate into the region of deeper water, but there is some penetration of that region. (Courtesy Film Studio, Education Development Center)

However, if a second block of glass is placed near enough to the first block to be in the region in which the electromagnetic vibrations are still appreciable, these vibrations are picked up and propagate through the second block. Furthermore, the electromagnetic vibrations in the air gap now carry a flux of energy through to the second block. This phenomenon, called frustrated total internal reflection, is illustrated in the bottom of Figure 6-16. Essentially the same thing happens in the quantum mechanical case when the region in which $E < V(x)$ is reduced from infinite thickness (step potential) to finite thickness (barrier potential). The transmission of light through an air gap, at an angle of incidence greater than the critical angle, was first observed by Newton around 1700. The equation relating the intensity of the transmitted beam to the thickness of the air gap, and other parameters, is identical in form to (6-49), and it has been verified experimentally. It is particularly easy to observe frustrated total internal reflection of electromagnetic waves, using the microwave region of the spectrum and two blocks of paraffin separated by an air gap.
Furthermore, careful inspection of the "ripple tank" photographs in Figures 6-17 and 6-18 will show that the phenomenon can even be observed with water waves. Frustrated total internal reflection, or its quantum mechanical equivalent barrier penetration, arises from properties common to all forms of classical or quantum mechanical wave motion.

Figure 6-18. Frustrated total internal reflection of water waves. When the region of deeper water becomes a sufficiently narrow gap, the waves that have penetrated into the deeper water are picked up and transmitted into a second region of shallow water. (Courtesy Film Studio, Education Development Center)

6-6 EXAMPLES OF BARRIER PENETRATION BY PARTICLES

There are a number of interesting, and important, examples of barrier penetration by microscopic particles. A widespread, but not widely recognized, example occurs in aluminum household wiring. The usual way for an electrician to join two wires is to twist them together. Often there is a layer of aluminum oxide between the two wires, and this material is quite an effective insulator. Fortunately, the layer is extremely thin so the electrons flowing through the wire are able to tunnel through the layer by barrier penetration.

Historically, the first application of the quantum mechanical theory of barrier penetration by particles was to explain a long standing paradox concerning the emission of $\alpha$ particles in the decay of radioactive nuclei. As a typical example, consider the $\mathrm{U}^{238}$ nucleus. The potential energy $V(r)$ of an $\alpha$ particle at a distance $r$ from the center of the nucleus had been investigated around 1910 by Rutherford, and others, who performed scattering experiments.
Using as a probe the 8.8 MeV $\alpha$ particles emitted from the radioactive nuclei of $\mathrm{Po}^{212}$, it was observed that their probability of scattering at various angles from $\mathrm{U}^{238}$ nuclei agreed with the predictions of Rutherford's scattering formula (see Chapter 4). The student will recall that the formula was based on the assumption that the interaction between the $\alpha$ particle and the nucleus strictly followed the Coulomb law repulsion that would be expected to operate between the two positively charged spherical objects. Thus Rutherford was able to conclude that, for the $\mathrm{U}^{238}$ nucleus, the potential function $V(r)$ felt by a neighboring $\alpha$ particle followed Coulomb's law, $V(r) = 2Ze^2/4\pi\epsilon_0 r$, where $2e$ is the $\alpha$-particle charge and $Ze$ is the nuclear charge, at least for distances greater than $r'' = 3 \times 10^{-14}$ m where $V(r'') = 8.8$ MeV, the probe $\alpha$-particle energy. It was also known by scattering $\alpha$ particles from nuclei of light atoms that $V(r)$ eventually departs from a $1/r$ law when $r < r'$, the nuclear radius, although the exact value of $r'$ was not known for the nuclei of heavy atoms at that time. Furthermore, since $\alpha$ particles are occasionally emitted by $\mathrm{U}^{238}$ nuclei, it was assumed that they exist inside such nuclei, to which they are normally bound by the potential $V(r)$. From these arguments it was concluded that the form of $V(r)$ in the region $r < r''$ must be qualitatively as depicted in Figure 6-19. This conclusion has been verified by modern experiments involving the scattering of $\alpha$ particles produced by cyclotrons at energies high enough to allow the investigation of the potential over the entire range of $r$.

The paradox was connected with the fact that it was also known that the kinetic energy of $\alpha$ particles emitted in radioactive decay by $\mathrm{U}^{238}$ was 4.2 MeV. The kinetic energy was, of course, measured at a very large distance from the nucleus where $V(r) = 0$ and the kinetic energy equals the total energy $E$.
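The distance $r'' = 3 \times 10^{-14}$ m quoted above can be checked against Coulomb's law. This is a small sketch, assuming $Z = 92$ for uranium (the atomic number is not stated in the passage) and using the rounded constants from the table of useful constants; the function names are illustrative.

```python
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
COULOMB_K = 8.988e9    # 1/(4*pi*eps0), nt-m^2/coul^2
Z_URANIUM = 92         # assumption: atomic number of uranium


def coulomb_potential_mev(z, r_m):
    """V(r) = 2*Z*e^2/(4*pi*eps0*r) for an alpha particle, in MeV."""
    joule = 2.0 * z * COULOMB_K * E_CHARGE ** 2 / r_m
    return joule / E_CHARGE / 1.0e6


def closest_approach_m(z, energy_mev):
    """Distance at which V(r) equals the alpha-particle energy, in meters."""
    joule = energy_mev * 1.0e6 * E_CHARGE
    return 2.0 * z * COULOMB_K * E_CHARGE ** 2 / joule


print(coulomb_potential_mev(Z_URANIUM, 3.0e-14))   # potential at r'' in MeV
print(closest_approach_m(Z_URANIUM, 8.8))          # closest approach of an 8.8 MeV alpha
```

With these inputs the potential at $3 \times 10^{-14}$ m comes out close to 8.8 MeV, about twice the 4.2 MeV decay energy.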
This value of the constant total energy of the decay $\alpha$ particles emitted by $\mathrm{U}^{238}$ is also shown in Figure 6-19. From the point of view of classical mechanics, the situation was certainly paradoxical. An $\alpha$ particle of total energy $E$ is initially in the region $r < r'$. This region is separated from the rest of space by a potential barrier of a height which was known to be at least twice $E$. Yet it was observed that on occasion the $\alpha$ particle penetrates the barrier and moves off to large values of $r$. To put it another way, according to classical mechanics an $\alpha$ particle emitted from a region where the potential energy function has the form shown in Figure 6-19 must, necessarily, have a much higher kinetic energy than was actually observed when it is far from the region. The reason is simply that in classical mechanics the total energy must be greater than the maximum value of the potential energy, if the particle is to escape the barrier. Consider the following analogy. You are walking beneath the span of a tall bridge, not looking up. Suddenly a brick hits you on the head, but gently, with a light tap. There is no place for the brick to come from, other than the bridge, but a brick falling from such a height would have developed enough kinetic energy to kill you!

Figure 6-19. The potential energy $V$ acting on an $\alpha$ particle at a distance $r$ from the center of a $\mathrm{U}^{238}$ nucleus, and the total energy $E$ of an $\alpha$ particle emitted from that radioactive nucleus. The solid part of the potential curve was known from scattering measurements to follow Coulomb's law in to the distance of closest approach $r''$ of an 8.8 MeV $\alpha$ particle. The dashed part of the curve shows that the potential was assumed to continue to follow Coulomb's law in to the nuclear radius $r'$, where it must drop very rapidly to form a binding region. A 4.2 MeV $\alpha$ particle emitted from the radioactive nucleus must penetrate the potential barrier from the nuclear radius $r'$ to the point at distance $r'''$ from the center where its potential energy $V$ becomes less than its total energy $E$.

In 1928 Gamow, Condon, and Gurney treated $\alpha$-particle emission as a quantum mechanical barrier penetration problem. They assumed that $V(r) = 2Ze^2/4\pi\epsilon_0 r$ for $r > r'$, where $2e$ is the $\alpha$-particle charge and $Ze$ is the charge of the nucleus remaining after the $\alpha$ particle is emitted. They also assumed that $V(r) < E$ for $r < r'$, as shown in Figure 6-19. Equation (6-50) was used to evaluate the transmission coefficient $T$ since the exponent $k_{II}a$, which determines $T$, has a value large compared to one. In fact, the exponent is so large that the exponential completely dominates the behavior of $T$, and it was sufficient to take

$$T \simeq e^{-2k_{II}a} = e^{-2\sqrt{(2m/\hbar^2)(V_0 - E)}\,a} \tag{6-54}$$

This expression was derived for a rectangular barrier of height $V_0$ and width $a$, but when the expression is valid it can be applied to the barrier $V(r)$ by considering it to be a set of adjacent rectangular barriers of height $V(r_i)$ and very small width $\Delta r_i$. This reasoning leads, in the limit, to the expression

$$T \simeq \exp\left[-2\int_{r'}^{r'''} \sqrt{(2m/\hbar^2)\left[V(r) - E\right]}\;dr\right] \tag{6-55}$$

where the integration is taken from the nuclear radius $r'$, where $V(r)$ rises above $E$, to the radius $r'''$, where $V(r)$ drops below $E$. The use of (6-54), which was derived for a one-dimensional case, in (6-55) that concerns a three-dimensional problem, was justified because the $\alpha$ particles are almost always emitted with zero angular momentum. That is, they move out along essentially linear paths emanating from the nuclear center, obeying equations which are essentially one-dimensional.

The quantity $T$ gives the probability that in one trial an $\alpha$ particle will penetrate the barrier. The number of trials per second could be estimated to be

$$N \simeq \frac{v}{2r'} \tag{6-56}$$

if it were assumed that an $\alpha$ particle is bouncing back and forth with velocity $v$ inside the nucleus of diameter $2r'$. Then the probability per second that the nucleus will decay by emitting an $\alpha$ particle, called the decay rate $R$, would be

$$R \simeq \frac{v}{2r'}\exp\left[-2\int_{r'}^{r'''} \sqrt{(2m/\hbar^2)\left(\frac{2Ze^2}{4\pi\epsilon_0 r} - E\right)}\;dr\right] \tag{6-57}$$

Today we know that (6-56) is not a very accurate estimate, but this function, or its more correct form, varies so slowly compared to the rapid variation in the exponential that the result expressed by (6-57) is an accurate estimate.

In applying (6-57) to a particular radioactive nucleus, Gamow, Condon, and Gurney took all the quantities in the expression as known, except $v$ and $r'$ ($r'''$ can be evaluated from $Z$ and $E$). Assuming $v$ to be comparable to the velocity of the $\alpha$ particle after emission (i.e., $mv^2/2 = E$), the decay rate $R$ is then a function only of the nuclear radius $r'$. Using $r' = 9 \times 10^{-15}$ m, which was certainly in line with the values obtained from Rutherford's analysis of $\alpha$-particle scattering from light nuclei, they obtained values of $R$ which were in good agreement with those measured experimentally, although the decay rate varies over a tremendously large range. As an example, for $\mathrm{U}^{238}$, the decay rate is $R = 5 \times 10^{-18}\ \text{sec}^{-1}$. An example at the other extreme is $\mathrm{Po}^{212}$, for which $R = 2 \times 10^{6}\ \text{sec}^{-1}$. This variation in $R$ is due primarily to the variation, from one radioactive nucleus to the next, of the energy $E$ of the emitted $\alpha$ particles. The height of the barrier and the nuclear radius do not change significantly for nuclei in the limited range of the periodic table in which $\alpha$-emitting nuclei are found. A comparison between experiment and theory is shown in Figure 6-20. The successful application of Schroedinger quantum mechanics to the $\alpha$-particle emission paradox provided one of its earliest, and most convincing, verifications.

Figure 6-20. The probability per second $R$ that a radioactive nucleus will emit an $\alpha$ particle of energy $E$. The points are experimental measurements and the solid curve is the prediction of (6-57), a result of barrier penetration theory. (Abscissa: $E^{-1/2}$ in $\text{MeV}^{-1/2}$.)

Barrier penetration of atoms takes place in the periodic inversion of the ammonia molecule, $\mathrm{NH_3}$. Figure 6-21 illustrates schematically the structure of this molecule. It consists of three H atoms arranged in a plane, and equidistant from the N atom. There are two completely equivalent equilibrium positions for the N atom, one on either side of the plane containing the H atoms. Figure 6-22 indicates the potential energy acting on the N atom, as a function of its distance $x$ from that plane. The potential function $V(x)$ has two minima, corresponding to the two equilibrium positions, which are symmetrically disposed about a low maximum located at $x = 0$. This maximum, which constitutes a barrier separating the two binding regions, arises from the repulsive Coulomb forces that act on the N atom if it penetrates the plane of the H atoms. The forces are strong enough that in classical mechanics the N atom is not able to cross the barrier, if the molecule is in one of its low-lying energy states; that is, the lower allowed energies of this binding potential are below the top of the barrier, as indicated in the figure. But penetration of the classically excluded region allows the N atom to tunnel through the barrier. If it is initially on one side, it will tunnel through and eventually appear on the other side. Then it will do it again in the opposite direction. The position of the N atom with respect to the plane containing the H atoms actually oscillates slowly back and forth across the plane. (Since the molecule's center of mass remains fixed in an inertial reference frame, in such a reference frame the H atoms must always move in the direction opposite to the direction of motion of the N atom. And since the H atoms have relatively small mass, their motion must be relatively large.) The oscillation frequency is $\nu = 2.3786 \times 10^{10}$ Hz, when the molecule is in its ground state. This frequency is much lower than those found in molecular vibrations not involving barrier penetration, or in other atomic or molecular phenomena. Due to the resulting technical simplifications, the frequency was used as a standard in the first atomic clocks which measure time with maximum precision.

Figure 6-21. A schematic illustration of the $\mathrm{NH_3}$ molecule. The light spheres represent the three H atoms arranged in a plane. The dark spheres represent two equivalent equilibrium positions of the single N atom.

Figure 6-22. The potential energy of the N atom in the $\mathrm{NH_3}$ molecule, as a function of its distance from the plane containing the three H atoms, which lies at $x = 0$. In its lower energy states, the total energy of the molecule lies below the top of the barrier separating the two minima, as indicated by the eigenvalues of the potential shown in the figure.

A recent, and very useful, example of barrier penetration of electrons is found in the tunnel diode. This is a semiconductor device, like a transistor, which is used in fast electronic circuits since its high frequency response is much better than that of any transistor. The operation of a tunnel diode will be explained in Chapter 13, in the context of a discussion of semiconductors. So here we shall say only that the device employs controllable barrier penetration to switch currents on or off so rapidly that it can be used to make an oscillator that can operate at frequencies about $10^{11}$ Hz.

6-7 THE SQUARE WELL POTENTIAL

In the preceding sections we have treated the motion of particles in potentials which are not capable of binding them to limited regions of space. Although a number of interesting quantum phenomena showed up, energy quantization did not. Of course we know, from the qualitative discussion of the last chapter, that energy quantization can be expected only for potentials which are capable of binding a particle. In this section we shall discuss one of the simplest potentials having this property, the square well potential. The potential can be written

$$V(x) = \begin{cases} V_0 & x < -a/2 \ \text{or} \ x > +a/2 \\ 0 & -a/2 < x < +a/2 \end{cases} \tag{6-58}$$

The illustration in Figure 6-23 indicates the origin of its name. If the particle has total energy $E < V_0$, then in classical mechanics it can be only in the region $-a/2 < x < +a/2$ (within the well). The particle is bound to that region and bounces back and forth between the ends of the region with momentum of constant magnitude but alternating direction. Furthermore, any value $E > 0$ of the total energy is possible. But in quantum mechanics only certain discretely separated values of the total energy are possible.

Figure 6-23. A square well potential.

The square well potential is often used in quantum mechanics to represent a situation in which a particle moves in a restricted region of space under the influence of forces which hold it in that region. Although this simplified potential loses some details of the motion, it retains the essential feature of binding the particle by forces of a certain strength to a region of a certain size. From the discussion in Example 6-2 it is apparent that it is a good approximation to represent the potential acting on a conduction electron in a block of metal by a square well. The depth of the square well is around 10 eV, and its width equals the width of the block.
Figure 6-24 indicates, from a point of view different from that used in Example 6-2, how something like a square well can be obtained by superimposing the potentials produced by the closely spaced positive ions in the metal. In Example 6-3, we indicated that the motion of a neutron in a nucleus can be approximated by assuming that the particle is in a square well potential with a depth of about 50 MeV. The linear dimensions of the potential equal the nuclear diameter, which is about 10 -14 m. We begin our treatment by considering, qualitatively, the form of the eigenfunctions which are solutions to the time-independent Schroedinger equation for the square well potential of (6-58). As in the preceding sections, the problem decomposes itself into three regions: x < — a/2 (left of the well), — a/2 < x < + a/2 (within the well), One ion Ix Three ions in line Many closely spaced ions in line (AAMMAAAAAMAA A qualitative indication of how an approximation to a square well potential results from superimposing the potentials acting on a conduction electron in a metal. The potentials are due to the closely spaced positive ions in the metal. Figure 6-24 and x > + a/2 (right of the well). The so-called general solution to the equation for the region within the well is where k I = NI2mE —a12 < x < +a/2 (6-59) The first term describes waves traveling in the direction of increasing x, and the second term describes waves traveling in the direction of decreasing x. (This solution was derived in Section 6-2. If the student has not studied that section, he can easily show that it is a solution to the time-independent Schroedinger equation, for any values of the arbitrary constants A and B, by substituting it into (6-2).) Now, the classical description of the particle bouncing back and forth within the well suggests that the eigenfunction in that region should correspond to an equal mixture of waves traveling in both directions. 
The two oppositely directed traveling waves of equal amplitude will combine to form a standing wave. We can obtain such behavior by setting the arbitrary constants equal to one another, so that A = B. This yields w (x) = which we write as B(eikIx + / eikix + e -ikIx iP(x)=B 2 where B' is a new arbitrary constant defined by the relation B' = 2B. But this combination of complex exponentials gives us simply ifi(x) = B' cos kIx where k1 = E h (6-60) This eigenfunction describes a standing wave since inspection of the associated wave function `I'(x,t) = fi(x)e - jEtm shows that it has nodes in the fixed locations where cos kIx = 0. We can also obtain a standing wave by setting — A = B. This gives A(eikix _ e - ikIx) I (x) = which we write as e ikIx — e -ikIx IŸ(x) = A' 2i where A' is a new arbitrary constant defined by A' = 2iA. But this is just 111(x) = A' sin kIx where kI = V2 E (6-61) Since both (6-60) and (6-61) specify solutions to the time-independent Schroedinger equation for the same value of E, and since that differential equation is linear in 0(x), their sum ,/2mE — a/2 < x < + a/2 (6-62) where kI = 111(x) = A' sin kIx + B' cos kix is also a solution, as can be verified by direct substitution. In fact, this is a general solution to the differential equation for the region within the well because it contains' two arbitrary constants—it is just as general as the solution (6-59). Mathematically, the two are completely equivalent. However, (6-62) is more convenient to use in problems involving the motion of bound particles. Physically, (6-62) can be thought of as describing a situation in which a particle is moving in such a way that the magnitude of its momentum is known to be precisely p = hk1 = -,,/2mE, but the direction of the momentum could either be in the direction of increasing or decreasing x. 
Now consider the solutions to the time-independent Schroedinger equation in the two regions outside the potential well, $x < -a/2$ and $x > +a/2$, where the interior form $\psi(x) = Ae^{ik_I x} + Be^{-ik_I x}$ of (6-59) no longer applies. In these regions the general solutions have the forms

$$\psi(x) = C e^{k_{II} x} + D e^{-k_{II} x} \quad \text{where } k_{II} = \frac{\sqrt{2m(V_0 - E)}}{\hbar}, \qquad x < -a/2 \tag{6-63}$$

and

$$\psi(x) = F e^{k_{II} x} + G e^{-k_{II} x} \quad \text{where } k_{II} = \frac{\sqrt{2m(V_0 - E)}}{\hbar}, \qquad x > +a/2 \tag{6-64}$$

The two forms of $\psi(x)$ describe standing waves in the region outside the well, since in the associated wave function $\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$ the x and t dependences occur as separate factors. These standing waves have no nodes, but they will be joined onto the standing waves inside the well which do have nodes. (The general solutions were derived in Section 6-3. Students who skipped that section can easily verify their validity, for any values of the arbitrary constants C, D, F, and G, by substitution in (6-13).)

Eigenfunctions valid for all x can be constructed by joining the forms assumed, in each of the three regions of x, by the general solutions to the time-independent Schroedinger equation. These three forms involve six arbitrary constants: $A'$, $B'$, C, D, F, and G. Now since an acceptable eigenfunction must everywhere remain finite, we can immediately see that we must set $D = 0$ and $F = 0$. If this were not done the second exponential in (6-63) would make $\psi(x) \to \infty$ as $x \to -\infty$, and the first exponential in (6-64) would make $\psi(x) \to \infty$ as $x \to +\infty$. Four more equations involving the remaining arbitrary constants can be obtained by demanding that $\psi(x)$ and $d\psi(x)/dx$ be continuous at the two boundaries between the regions, $x = -a/2$ and $x = +a/2$, as is required for acceptable eigenfunctions. (They are already single valued.) But we cannot allow all four of the remaining arbitrary constants to be specified by these four equations.
One of them must remain unspecified so that the amplitude of the eigenfunction can be arbitrary. Arbitrary amplitude is required because the differential equation is linear in the eigenfunction $\psi(x)$. Thus there seems to be a discrepancy between the number of equations to be satisfied and the number of constants that can be adjusted. But it is resolved by treating the total energy E as an additional constant that can be adjusted, as needed. We shall find that this procedure works, but only for certain values of E. That is, there will emerge a certain set of possible values of the total energy E, and so the energy will be quantized to a set of eigenvalues. Only for these values of the total energy does the Schroedinger equation have acceptable solutions.

It is not difficult to carry through this procedure, as we shall see shortly in treating a special case. But the general case leads to a solution involving a complicated transcendental equation (an equation in which the unknown is contained in the argument of a function such as a sinusoidal), which precludes expressing the solution mathematically in a concise way. Therefore, we relegate the details of the general solution to Appendix H, and here continue for a while with our qualitative discussion.

Figure 6-25  A square well potential and its three bound eigenvalues. Not shown is a continuum of eigenvalues of energy $E > V_0$.

Figure 6-26  The three bound eigenfunctions for the square well of Figure 6-25.

Figures 6-25 and 6-26 show, respectively, the eigenvalues and eigenfunctions for the three bound states of a particle in a particular square well potential. Not shown are a continuum of eigenvalues which extend from the top of the well on up, since any value of total energy E that is greater than the height of the potential walls $V_0$ is allowed. Also not shown are the continuum eigenfunctions.
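The statement that acceptable solutions exist only for certain values of E can be made concrete with a short numerical sketch. It assumes the standard form of the matching conditions for a symmetric well, quoted here without derivation and not necessarily in the notation of Appendix H: with $\xi = k_I a/2$ and $\eta = k_{II} a/2$, the even (cosine-like) states satisfy $\xi \tan \xi = \eta$, the odd (sine-like) states satisfy $-\xi \cot \xi = \eta$, and $\xi^2 + \eta^2 = R^2 = mV_0a^2/2\hbar^2$. The symbols $\xi$, $\eta$, and R, the function name, and the choice $R = 4$ (which happens to give three bound states, as in Figure 6-25) are all assumptions made for illustration.

```python
import math

def finite_well_levels(R):
    """Return E/V0 for every bound state of a finite square well.

    R is the dimensionless well strength sqrt(m*V0*a**2 / (2*hbar**2)).
    """
    levels = []
    j = 0                                    # j = 0, 1, 2, ... labels states n = j + 1
    while j * math.pi / 2 < R:
        lo = j * math.pi / 2 + 1e-9          # just inside the lower edge of the branch
        hi = min((j + 1) * math.pi / 2, R) - 1e-9
        if hi <= lo:
            break                            # well is marginally too shallow for this state

        def mismatch(xi, even=(j % 2 == 0)):
            eta = math.sqrt(max(R * R - xi * xi, 0.0))   # k_II * a / 2
            if even:
                return xi * math.tan(xi) - eta           # cosine-like states
            return -xi / math.tan(xi) - eta              # sine-like states

        # Bisection: mismatch is negative at lo and positive at hi
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if mismatch(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        xi = 0.5 * (lo + hi)
        levels.append((xi / R) ** 2)         # E/V0 = (xi/R)**2
        j += 1
    return levels

levels = finite_well_levels(4.0)
for n, ratio in enumerate(levels, start=1):
    print(f"n = {n}:  E/V0 = {ratio:.4f}")
```

Between the roots found by the bisection no choice of the remaining constants can satisfy all four continuity conditions, which is the numerical counterpart of the quantization argument given above.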
Focusing attention first on the region of x within the well, we note that the curvature of the sinusoidal part of the eigenfunction increases as the energy of the corresponding eigenvalue increases. As a consequence, the higher the energy of the eigenvalue the more numerous are the oscillations of the corresponding eigenfunction and the higher is its wave number. These results reflect the fact that the wave number $k_I$, in the solution of (6-62) for the region inside the well, is proportional to $E^{1/2}$. The square well potential depicted in the figure does not have a fourth bound eigenvalue because the associated value of $k_I$, and therefore of $E^{1/2}$, would be too large to satisfy the binding condition $E < V_0$.

Now consider the parts of the eigenfunctions that extend into the regions outside the well. In classical mechanics a particle could never be found in these regions since its kinetic energy is $p^2/2m = E - V(x)$, which is negative where $E < V(x)$. Note that the eigenfunctions go to zero in these classically excluded regions more rapidly the lower the energy of the corresponding eigenvalue. This agrees with the fact that the exponential parameter $k_{II}$, in the solutions of (6-63) and (6-64) for the region outside the well, is proportional to $(V_0 - E)^{1/2}$. It also agrees with the idea that the more serious the violation of the classical restriction, that the total energy E must be at least as large as the potential energy V(x), the more reluctant the eigenfunctions are to penetrate the classically excluded regions.

It is instructive to consider the effect on the eigenfunctions of letting the walls of the square well become very high, i.e., letting $V_0 \to \infty$. Shown in Figure 6-27 is the first eigenfunction for a square well potential. As $V_0 \to \infty$, $E_1$ will increase, but it will do so very slowly compared to the increase in $V_0$. This is true because $E_1$ is determined essentially by the requirement that approximately half an oscillation of the eigenfunction must fit into the length of the well. Therefore, the exponential parameter $k_{II} = \sqrt{2m(V_0 - E)}/\hbar$, which determines the behavior of the eigenfunction in the regions outside of the well, will become very large as $V_0$ becomes very large, and the eigenfunction will go to zero very rapidly outside the well. In the limit, $\psi_1(x)$ must be zero for all $x < -a/2$ and for all $x > +a/2$. For a square well with infinitely high walls, $\psi_1(x)$ has the form shown in Figure 6-28.

Figure 6-27  The first eigenfunction for a square well with walls of moderate height. (Outside the well the eigenfunction falls off in proportion to $e^{-[\sqrt{2m(V_0 - E_1)}/\hbar]x}$.)

Figure 6-28  The first eigenfunction of a square well with walls of infinite height.

It is apparent that this argument holds for all the eigenfunctions of such a potential. That is, for all values of n, in an infinite square well potential

$$\psi_n(x) = 0 \qquad x \le -a/2 \ \text{ or } \ x \ge +a/2 \tag{6-65}$$

This condition for infinite square well eigenfunctions can only be satisfied by violating at $x = \pm a/2$ the requirement of Section 5-6 that the derivative $d\psi_n(x)/dx$ of an eigenfunction be continuous everywhere. But a student who reviews the argument which was presented to justify the requirement will find that the derivative must be continuous only when the potential is finite.

6-8 THE INFINITE SQUARE WELL POTENTIAL

The infinite square well potential is written as

$$V(x) = \begin{cases} \infty & x < -a/2 \ \text{ or } \ x > +a/2 \\ 0 & -a/2 < x < +a/2 \end{cases} \tag{6-66}$$

and is illustrated in Figure 6-29. It has the feature that it will bind a particle with any finite total energy $E > 0$. In classical mechanics, any of these energies are possible, but in quantum mechanics only certain discrete eigenvalues $E_n$ are allowed.
We shall see that it is very easy to find simple and concise expressions for the eigenvalues and eigenfunctions of this potential because the transcendental equation that arises in the solution of its time-independent Schroedinger equation happens to have simple solutions. For values of the quantum number n which are not too large, these eigenvalues and eigenfunctions can often be used to approximate the corresponding (same n) eigenvalues and eigenfunctions of a square well potential with large but finite $V_0$. For instance, we mentioned before that it is a very good approximation to take the potential for a conduction electron in a block of metal to be a finite square well. In Example 6-2 we showed that for the typical metal Cu the eigenfunctions penetrate into the classically excluded regions exterior to the well by a distance of about $10^{-10}$ m. This distance is so small compared to the width of the square well, which is the width of the Cu block, that for many purposes it is an equally good approximation to use the corresponding eigenfunctions and eigenvalues for an infinite square well, and we shall do so later. We shall also use infinite square well potentials to discuss the quantum mechanical properties of a system of gas molecules, and other particles, which are strictly confined within a box of certain dimensions. A particle moving under the influence of an infinite square well potential is often called a particle in a box.

Figure 6-29  An infinite square well potential.

In the region within the well the general solution to the time-independent Schroedinger equation for the infinite square well potential can be written as the standing wave of (6-62), which we simplify, by dropping the primes, into the form

$$\psi(x) = A \sin kx + B \cos kx \quad \text{where } k = \frac{\sqrt{2mE}}{\hbar}, \qquad -a/2 < x < +a/2 \tag{6-67}$$

(Students who have skipped the preceding sections can see that this $\psi(x)$ represents a standing wave by noting that the associated wave function $\Psi(x,t) = \psi(x)e^{-iEt/\hbar}$ has fixed nodes. They can verify that the $\psi(x)$ is actually a solution to the applicable time-independent Schroedinger equation by substituting it into (6-2).) According to the condition of (6-65), $\psi(x)$ has the value zero in the regions outside the well. Of course, this must be true so that the probability density will be zero in these regions, since the particle is strictly confined within the well by its infinitely high potential walls. In particular, at the boundaries of the well

$$\psi(x) = 0 \qquad x = \pm a/2 \tag{6-68}$$

That is, the standing wave has nodes at the walls of the box.

Now we develop relations which are satisfied by the arbitrary constants A and B, and by the parameter k. Applying the boundary conditions of (6-68) at $x = +a/2$, we obtain

$$A \sin \frac{ka}{2} + B \cos \frac{ka}{2} = 0 \tag{6-69}$$

At $x = -a/2$, (6-68) yields

$$A \sin\left(-\frac{ka}{2}\right) + B \cos\left(-\frac{ka}{2}\right) = 0$$

or

$$-A \sin \frac{ka}{2} + B \cos \frac{ka}{2} = 0 \tag{6-70}$$

Addition of the last two numbered equations gives

$$2B \cos \frac{ka}{2} = 0 \tag{6-71}$$

Subtraction gives

$$2A \sin \frac{ka}{2} = 0 \tag{6-72}$$

Both (6-71) and (6-72) must be satisfied. When this is done, $\psi(x)$ and $d\psi(x)/dx$ will be everywhere finite and single valued, and $\psi(x)$ will be everywhere continuous. As discussed at the end of the preceding section, $d\psi(x)/dx$ will be discontinuous at $x = \pm a/2$.

There is no value of the parameter k for which both $\cos(ka/2)$ and $\sin(ka/2)$ are simultaneously zero. And we certainly do not want to satisfy the two equations by setting both A and B equal to zero, for then $\psi(x) = 0$ everywhere and the eigenfunction would be of no interest because the associated particle would not be in the box! However, we can satisfy these equations either by choosing k so that $\cos(ka/2)$ is zero and also setting A equal to zero, or by choosing k so that $\sin(ka/2)$ is zero and also setting B equal to zero. That is, we take either

$$A = 0 \quad \text{and} \quad \cos \frac{ka}{2} = 0 \tag{6-73}$$

or

$$B = 0 \quad \text{and} \quad \sin \frac{ka}{2} = 0 \tag{6-74}$$

Thus there are two classes of solutions. For the first class

$$\psi(x) = B \cos kx \quad \text{where } \cos \frac{ka}{2} = 0 \tag{6-75}$$

For the second class

$$\psi(x) = A \sin kx \quad \text{where } \sin \frac{ka}{2} = 0 \tag{6-76}$$

The conditions on the wave number k, expressed in (6-75) and (6-76), are in the form of transcendental equations since the unknown, k, occurs in the arguments of the sinusoidals; but these transcendental equations happen to be so simple that their solutions can be written in concise form immediately. The allowed values of k for the first class, (6-75), are

$$\frac{ka}{2} = \frac{\pi}{2}, \frac{3\pi}{2}, \frac{5\pi}{2}, \ldots$$

since $\cos(\pi/2) = \cos(3\pi/2) = \cos(5\pi/2) = \cdots = 0$. It is convenient to express this as

$$k_n = \frac{n\pi}{a} \qquad n = 1, 3, 5, \ldots \tag{6-77}$$

The allowed values of k for the second class, (6-76), are

$$\frac{ka}{2} = \pi, 2\pi, 3\pi, \ldots$$

since $\sin \pi = \sin 2\pi = \sin 3\pi = \cdots = 0$. This can also be expressed as

$$k_n = \frac{n\pi}{a} \qquad n = 2, 4, 6, \ldots \tag{6-78}$$

Knowing the allowed values of k, we can then obtain the solutions to the time-independent Schroedinger equation for the infinite square well from (6-75) and (6-76). We find

$$\psi_n(x) = B_n \cos k_n x \quad \text{where } k_n = \frac{n\pi}{a} \qquad n = 1, 3, 5, \ldots \tag{6-79}$$

and

$$\psi_n(x) = A_n \sin k_n x \quad \text{where } k_n = \frac{n\pi}{a} \qquad n = 2, 4, 6, \ldots \tag{6-80}$$

The solution corresponding to $n = 0$ is $\psi_0(x) = A \sin 0 = 0$; it is ruled out because it does not describe a particle in a box. The quantum number n has been used to label the different solutions of the transcendental equations, and the corresponding eigenfunctions. If it is necessary to apply the normalization condition, the constants $A_n$ and $B_n$, which specify the amplitudes of the eigenfunctions, will thereby be determined (see Example 5-10); but it is not usually necessary to do this.

The quantum number n is also used to label the corresponding eigenvalues. Using the relation $k = \sqrt{2mE}/\hbar$ of (6-67), and the expression $k_n = n\pi/a$ in (6-79) and (6-80) for the allowed values of k, we find

$$E_n = \frac{\hbar^2 k_n^2}{2m} = \frac{\pi^2 \hbar^2 n^2}{2ma^2} \qquad n = 1, 2, 3, 4, 5, \ldots \tag{6-81}$$

Thus we conclude that only certain values of the total energy E are allowed. The total energy of the particle in the box is quantized.

The quantitative treatment of the finite square well, discussed in the preceding section and carried out in Appendix H, is essentially the same as what we have just gone through. But the penetration of the eigenfunction into the regions outside the well, which varies with the energy of the associated eigenvalue, leads to more complicated transcendental equations for k that must be solved graphically or numerically.

Figure 6-30  The first few eigenvalues of an infinite square well potential.

Figure 6-30 illustrates the infinite square well potential and its first few eigenvalues specified by (6-81). Of course, all the eigenvalues are discretely separated for an infinite square well potential since the particle is bound for any finite eigenvalue. Note that the pattern formed by the first three eigenvalues of the infinite square well is quite similar to that formed by the three bound eigenvalues of the finite square well shown in Figure 6-25. In this regard, the infinite square well results provide an approximation to the finite square well results. However, in detail each potential energy function V(x) has its own characteristic set of bound eigenvalues $E_n$.

Of particular interest is the energy of the first eigenvalue. For the infinite square well it is

$$E_1 = \frac{\pi^2 \hbar^2}{2ma^2} \tag{6-82}$$

This is called the zero-point energy. It is the lowest possible total energy the particle can have if it is bound by the infinite square well potential to the region $-a/2 < x < +a/2$. The particle cannot have zero total energy. The phenomenon is basically a result of the uncertainty principle. To see this, consider the fact that if the particle is bound by the potential, then we know its x coordinate to within an uncertainty of about $\Delta x \simeq a$. Consequently, the uncertainty in its x momentum must be at least $\Delta p \simeq \hbar/2\Delta x \simeq \hbar/2a$. The uncertainty principle cannot allow the particle to be bound by the potential with zero total energy since that would mean the uncertainty in the momentum would be zero. For the particular case of eigenvalue $E_1$, the magnitude of the momentum is $p_1 = \sqrt{2mE_1} = \pi\hbar/a$. Since the particle is in a state of motion described by a standing wave eigenfunction, it can be moving in either direction and the actual value of the momentum is uncertain by an amount which is about $\Delta p \simeq 2p_1 \simeq 2\pi\hbar/a$. The uncertainty product $\Delta x \Delta p \simeq a \cdot 2\pi\hbar/a \simeq 2\pi\hbar$ is consistent with the lower limit $\hbar/2$ set by the uncertainty principle. (Compare with the accurate calculation of Example 5-10.)

We conclude that there must be a zero-point energy because there must be a zero-point motion. This is in sharp contrast to the idea, of classical physics, that all motion ceases when a system has its minimum energy content at the temperature of absolute zero. The zero-point energy is responsible for several interesting quantum phenomena that are seen in the behavior of matter at very low temperatures. A striking example is the fact that helium will not solidify even at the lowest attainable temperature (about 0.001°K), unless a very high pressure is applied.

The first few eigenfunctions of the infinite square well potential are shown in Figure 6-31. Note that the number of half wavelengths of each eigenfunction is equal to its quantum number n, and that therefore the number of nodes, counting those at the two walls, is $n + 1$.
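To give a feeling for the magnitudes implied by (6-81) and (6-82), the sketch below evaluates the first few eigenvalues for an electron in an infinite square well. The well width of 1 angstrom is an arbitrary assumption made for illustration (roughly an atomic dimension); the constants are those quoted in the table at the front of the book. The last lines repeat the rough uncertainty estimate made above for $n = 1$.

```python
import math

HBAR = 1.055e-34      # joule-sec
M_E = 9.109e-31       # electron rest mass, kg
EV = 1.602e-19        # joules per eV

def infinite_well_energy(n, mass, width):
    """Eigenvalue E_n of (6-81), in joules."""
    return (math.pi * HBAR * n) ** 2 / (2.0 * mass * width ** 2)

a = 1.0e-10           # assumed well width: 1 angstrom
energies_eV = [infinite_well_energy(n, M_E, a) / EV for n in range(1, 5)]
for n, E in enumerate(energies_eV, start=1):
    print(f"E_{n} = {E:7.1f} eV")

# Rough uncertainty estimate for n = 1, following the text:
# delta_x ~ a and delta_p ~ 2 * p_1, with p_1 = sqrt(2 m E_1) = pi * hbar / a
p1 = math.sqrt(2.0 * M_E * infinite_well_energy(1, M_E, a))
product_in_hbar = a * (2.0 * p1) / HBAR
print(f"delta_x * delta_p is about {product_in_hbar:.2f} hbar")
```

The zero-point energy comes out at a few tens of eV for this width, and the eigenvalues grow as $n^2$; the uncertainty product is $2\pi\hbar$ regardless of the mass and width chosen.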
By comparing these eigenfunctions with the corresponding eigenfunctions of the finite square well shown in Figure 6-26, the student can see again how the results obtained for the simple potential can be used to approximate those of the more complicated potential (most accurately for eigenfunctions of lowest n value).

Figure 6-31  The first few eigenfunctions of the infinite square well potential.

Students familiar with stringed musical instruments may notice that the eigenfunctions for a particle strictly confined between two points at the ends of the box look like the functions describing the possible shapes assumed by a vibrating string fixed at two points at the ends of the string. The reason is that both systems obey time-independent differential equations of analogous form, and they satisfy analogous conditions at the two points. Here is yet another example of the relation between quantum mechanics and classical wave motion. Musically inclined students may also notice that the frequencies, $\nu_n = E_n/h$, of the time-dependent factor in the wave functions for the confined particle satisfy the relation $\nu_n \propto n^2$ (since $E_n = \pi^2\hbar^2 n^2/2ma^2$), whereas the frequencies of the vibrating string satisfy the "harmonic progression" $\nu_n \propto n$. This difference arises because the two systems obey time-dependent differential equations which are not at all analogous.

Example 6-5. Derive the infinite square well energy quantization law, (6-81), directly from the de Broglie relation $p = h/\lambda$, by fitting an integral number of half de Broglie wavelengths $\lambda/2$ into the width a of the well.

It is clear from Figure 6-31 that the infinite square well eigenfunctions satisfy the following relation between the de Broglie wavelengths and the length of the well

$$n\,\frac{\lambda}{2} = a \qquad n = 1, 2, 3, \ldots$$

That is, an integral number of half-wavelengths fits into the well. This means

$$\lambda = \frac{2a}{n} \qquad n = 1, 2, 3, \ldots$$
So according to de Broglie, the corresponding values of the momentum of the particle are

$$p = \frac{h}{\lambda} = \frac{hn}{2a} \qquad n = 1, 2, 3, \ldots$$

As the potential energy of the particle is zero within the well, its total energy equals its kinetic energy. Thus

$$E = \frac{p^2}{2m} = \frac{h^2 n^2}{2m \cdot 4a^2} = \frac{\pi^2 \hbar^2 n^2}{2ma^2} \qquad n = 1, 2, 3, \ldots$$

in agreement with (6-81). This trivial calculation can be used only for the simplest case of a bound particle, the case of an infinite square well potential. It cannot be applied to find the eigenvalues or eigenfunctions of a more complicated potential such as a finite square well. (See also the discussion, in connection with (4-25), of the application of the Wilson-Sommerfeld quantization rule to the infinite square well.)

Example 6-6. Before the discovery of the neutron, it was thought that a nucleus of atomic number Z and atomic weight A was composed of A protons and (A - Z) electrons, but there was a serious problem concerning the magnitude of the zero-point energy for a particle as light as an electron confined to a region as small as a nucleus. Estimate the zero-point energy E.

Setting the electron mass m equal to $10^{-30}$ kg and the width of the well equal to $10^{-14}$ m (a typical nuclear dimension), from (6-82) we obtain

$$E = \frac{\pi^2 \hbar^2}{2ma^2} \simeq \frac{10 \times 10^{-68}\ \text{joule}^2\text{-sec}^2}{2 \times 10^{-30}\ \text{kg} \times 10^{-28}\ \text{m}^2} \simeq \frac{10^{-9}}{2}\ \text{joule} = \frac{10^{-9}}{2}\ \text{joule} \times \frac{1\ \text{eV}}{1.6 \times 10^{-19}\ \text{joule}} \simeq 10^{9}\ \text{eV} = 10^{3}\ \text{MeV}$$

For estimating the zero-point energy, we are certainly justified in treating the electron as if it were confined to an infinite square well. We are also justified in ignoring the three-dimensional character of the actual system. But we would not be justified in quoting the value of E just obtained because it is extremely large compared to the electron rest mass energy $m_0c^2 \simeq 0.5$ MeV. A relativistically valid analogue of (6-82) must be used in this particular problem. The required formula can be obtained from the technique used in Example 6-5. Both of the equations $\lambda = 2a/n$ and $p = h/\lambda$ retain their validity in the extreme relativistic range. So, if we replace $E = p^2/2m$ by $E = cp$ (the energy-momentum relation $E^2 = c^2p^2 + m_0^2c^4$ in the limit $E \gg m_0c^2$), we immediately obtain for $n = 1$

$$E = cp = \frac{ch}{\lambda} = \frac{chn}{2a} = \frac{\pi\hbar c}{a} \simeq \frac{3 \times 3 \times 10^{8}\ \text{m/sec} \times 10^{-34}\ \text{joule-sec}}{10^{-14}\ \text{m}} \times \frac{1\ \text{eV}}{1.6 \times 10^{-19}\ \text{joule}} \simeq 10^{8}\ \text{eV} = 10^{2}\ \text{MeV}$$

An electron could be found in a nucleus with this zero-point energy, if the magnitude of the depth of the binding potential were greater than the magnitude of the zero-point energy. There is a binding potential acting on the electron due to the Coulomb attraction of the positive charge of the nucleus, but the magnitude of the potential is not great enough. We may estimate this magnitude by setting $r = 10^{-14}$ m, and $Q_1 = Ae$, $Q_2 = -e$, where e is the magnitude of the electron charge, in the Coulomb potential formula. We obtain, for a typical value of $A = 100$

$$\frac{Q_1 Q_2}{4\pi\epsilon_0 r} = -\frac{Ae^2}{4\pi\epsilon_0 r} \simeq -\frac{10^{2} \times (1.6 \times 10^{-19}\ \text{coul})^2}{10^{-10}\ \text{coul}^2/\text{nt-m}^2 \times 10^{-14}\ \text{m}} \times \frac{1\ \text{eV}}{1.6 \times 10^{-19}\ \text{joule}} \simeq -10^{7}\ \text{eV} = -10\ \text{MeV}$$

This is ten times smaller than the required binding energy. So an electron could not be bound in a nucleus because of the zero-point energy required by the uncertainty principle.

In 1932 Chadwick, motivated by a suggestion of Rutherford, discovered the neutron. We now know that a nucleus is composed of Z protons and (A - Z) neutrons. Because neutrons are heavy particles, like protons, their zero-point energy in a nucleus is relatively low so they can be bound without difficulty. Indeed, we shall see in Chapter 15 that some of the most important properties of nuclei can be explained in terms of the quantum states of neutrons, and protons, moving in a (finite) square well potential.

Figure 6-31 makes quite apparent the essential difference between the two classes of standing wave eigenfunctions specified by (6-79) and (6-80).
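Returning briefly to the order-of-magnitude estimates of Example 6-6, the sketch below repeats the three calculations with the constants quoted in the table at the front of the book instead of rounded powers of ten. The variable names are illustrative, and the inputs ($a = r = 10^{-14}$ m, $A = 100$) are the same assumptions used in the example. The more careful arithmetic shifts the numbers somewhat, but the conclusion is unchanged: the Coulomb potential is several times too shallow to hold the electron.

```python
import math

HBAR = 1.055e-34        # joule-sec
C_LIGHT = 2.998e8       # m/sec
M_E = 9.109e-31         # electron rest mass, kg
E_CHARGE = 1.602e-19    # coul (also joules per eV)
COULOMB_K = 8.988e9     # 1/(4 pi epsilon_0), nt-m^2/coul^2
MEV = 1.0e6 * E_CHARGE  # joules per MeV

a = 1.0e-14             # width of the well, m (typical nuclear dimension)
A = 100                 # typical atomic weight used in the example

# Nonrelativistic zero-point energy, (6-82)
E_nonrel = (math.pi * HBAR) ** 2 / (2.0 * M_E * a ** 2) / MEV

# Extreme relativistic analogue, E = cp with p = h / (2a) for n = 1
E_rel = math.pi * HBAR * C_LIGHT / a / MEV

# Coulomb potential energy of charge -e at distance a from charge +Ae
V_coulomb = -A * COULOMB_K * E_CHARGE ** 2 / a / MEV

print(f"nonrelativistic estimate : {E_nonrel:8.0f} MeV")
print(f"relativistic estimate    : {E_rel:8.1f} MeV")
print(f"Coulomb potential energy : {V_coulomb:8.1f} MeV")
```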
The eigenfunctions of the first class, $\psi_1(x), \psi_3(x), \psi_5(x), \ldots$, are even functions of x; that is

$$\psi(-x) = +\psi(x) \tag{6-83}$$

In quantum mechanics, these functions are said to be of even parity. The eigenfunctions of the second class, $\psi_2(x), \psi_4(x), \psi_6(x), \ldots$, are odd functions of x; that is

$$\psi(-x) = -\psi(x) \tag{6-84}$$

and are said to be of odd parity. The eigenfunctions have a definite parity, either even or odd, because we have chosen the origin of the x axis so that the symmetrical square well potential V(x) is an even function of x. Note that if we redefine the origin of the x axis in Figure 6-31 to be at, say, the point $x = -a/2$, the eigenfunctions will no longer have a definite parity.

These results are obtained for the square well potential, and for any other symmetrical potential, since measurable quantities describing the motion of a particle in bound states of such potentials must also be symmetrical about the point of symmetry of the potential. If the origin of the x axis is chosen to be at that symmetry point, then the function describing the measurable quantity must be an even function. As an example, this is true for the probability density function P(x,t), for both even and odd parity eigenfunctions, since

$$P(-x,t) = \psi^*(-x)\psi(-x) = [\pm\psi^*(x)][\pm\psi(x)] = \psi^*(x)\psi(x) = P(x,t) \tag{6-85}$$

This is not true for the wave function itself in the case of an odd parity eigenfunction; such a wave function is an odd function of x, but this is not a contradiction because the wave function itself is not measurable. Eigenfunctions for unbound states of potentials that are even functions of x do not necessarily have definite parities since they do not necessarily describe symmetrical motions of the particle.

In one dimension, the fact that standing wave eigenfunctions have definite parities, if $V(-x) = V(x)$, is of importance largely because it simplifies certain calculations.
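The parity statements above are easy to test numerically. The following sketch evaluates the infinite square well eigenfunctions of (6-79) and (6-80), with the amplitudes $A_n$ and $B_n$ set to 1 and the width a set to 1 (both arbitrary assumptions), and classifies each one as even, odd, or neither. The helper names `psi` and `parity_of` are hypothetical. The last line repeats the test for $\psi_1$ after moving the origin to the left wall.

```python
import math

def psi(n, x, a=1.0):
    """Unnormalized eigenfunction of (6-79)/(6-80); zero outside the well."""
    if abs(x) > a / 2:
        return 0.0
    k = n * math.pi / a
    return math.cos(k * x) if n % 2 == 1 else math.sin(k * x)

def parity_of(f, half_range, samples=50):
    """Return +1 (even), -1 (odd), or 0 (no definite parity)."""
    xs = [half_range * (i + 0.5) / samples for i in range(samples)]
    if all(abs(f(-x) - f(x)) < 1e-12 for x in xs):
        return +1
    if all(abs(f(-x) + f(x)) < 1e-12 for x in xs):
        return -1
    return 0

# Origin at the center of the well: definite parity for every n
parities = {n: parity_of(lambda x, n=n: psi(n, x), 0.5) for n in range(1, 7)}
print(parities)

# Origin moved to x = -a/2: the same eigenfunction, written in the new coordinate
shifted = parity_of(lambda x: psi(1, x - 0.5), 1.0)
print("parity of psi_1 with origin at the left wall:", shifted)
```

The probability density $\psi^*\psi$ is even in the first case for every n, in line with (6-85), since squaring removes the sign.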
In three dimensions, the property has a deeper significance that will be seen first in Chapter 8 in connection with the emission of radiation by an atom making a transition from an excited state to its ground state.

The probability density functions, corresponding to the first few eigenfunctions of the infinite square well, are plotted in Figure 6-32. Also illustrated in the figure is the probability density that would be predicted by classical mechanics for a bound particle bouncing back and forth between $-a/2$ and $+a/2$. Since the classical particle would spend an equal amount of time in any element of the x axis in that region, it would be equally likely found in any such element. The quantum mechanical probability density oscillates more and more as n increases. In the limit that n approaches infinity, that is for eigenvalues of very high energy, the oscillations are so compressed that no experiment could possibly have the resolution to observe anything other than the average behavior of the probability density predicted by quantum mechanics. Furthermore, the fractional separation of the eigenvalues approaches zero as n approaches infinity, so in that limit their discreteness cannot be resolved. Thus we see that the quantum mechanical predictions approach the predictions of classical mechanics in the large quantum number, or high-energy, limit. This is what would be expected from the correspondence principle of the old quantum theory.

Figure 6-32  The first few probability density functions ($\psi_1^*\psi_1$, $\psi_2^*\psi_2$, $\psi_3^*\psi_3$) for an infinite square well potential. The dashed curves are the predictions of classical mechanics.

6-9 THE SIMPLE HARMONIC OSCILLATOR POTENTIAL

We have discussed several potentials which are discontinuous functions of position with constant values in adjacent regions. Now we turn to the more realistic cases of potentials which are continuous functions of position. It turns out that there are only a limited number of such potentials for which it is possible to obtain solutions to the Schroedinger equation by analytical techniques. But, fortunately, these potentials include some of the most important cases, such as the Coulomb potential, $V(r) \propto r^{-1}$, discussed in the following chapter, and the simple harmonic oscillator potential, $V(x) \propto x^2$, discussed in this section. (In this connection, we should remind the student that solutions to the Schroedinger equations for potentials of any form can always be obtained by the numerical techniques developed in Appendix G.)

The simple harmonic oscillator is of tremendous importance in physics, and all fields based on physics, because it is the prototype for any system involving oscillations. For instance, it is used in the study of: the vibration of atoms in diatomic molecules, the acoustic and thermal properties of solids which arise from atomic vibrations, magnetic properties of solids that involve vibrations in the orientation of nuclei, and the electrodynamics of quantum systems in which electromagnetic waves are vibrating. Generally speaking, the simple harmonic oscillator can be used to describe almost any system in which an entity is executing small vibrations about a point of stable equilibrium.

At a position of stable equilibrium, the potential function V(x) must have a minimum. Since any realistic potential function is continuous, the function in the region near its minimum can almost always be well approximated by a parabola, as illustrated in Figure 6-33. But for small vibrations the only thing that counts is what V(x) does near its minimum. If we choose the origins of the x axis and the energy axis to be at the minimum, we can write the equation for this parabolic potential function as

$$V(x) = \frac{C}{2}x^2 \tag{6-86}$$

where C is a constant. Such a potential is illustrated in Figure 6-34. A particle moving under its influence experiences a linear (or Hooke's law) restoring force $F(x) = -dV(x)/dx = -Cx$, with C being the force constant.

Figure 6-33  Illustrating the fact that any continuous potential with a minimum (solid curve) can be approximated near the minimum very well by a parabolic potential (dashed curve).

Figure 6-34  The simple harmonic oscillator potential.

Classical mechanics predicts that a particle under the influence of the linear restoring force exerted by the potential of (6-86), which is displaced by an amount $x_0$ from the equilibrium position and then released, will oscillate in simple harmonic motion about the equilibrium position with frequency

$$\nu = \frac{1}{2\pi}\sqrt{\frac{C}{m}} \tag{6-87}$$

where m is its mass. According to that theory, the total energy E of the particle is proportional to $x_0^2$, and can have any value since $x_0$ is arbitrary.

Quantum mechanics predicts that the total energy E can assume only a discrete set of values because the particle is bound by the potential to a region of finite extent. Even in the old quantum theory this was known. The student will recall that Planck's postulate predicts that the energy of a particle executing simple harmonic oscillations can assume only one of the values

$$E_n = nh\nu \qquad n = 0, 1, 2, 3, \ldots \tag{6-88}$$

What are the allowed energy values predicted by Schroedinger quantum mechanics for this very important potential? To find out, the time-independent Schroedinger equation for the simple harmonic oscillator potential must be solved. The mathematics used in the analytical solution to the equation is not difficult to follow, and it is quite interesting; but since the solution is very lengthy it has been placed in Appendix I.
Other than verifying by substitution a typical eigenfunction and eigenvalue obtained from the solution, here we shall concentrate on describing the results of the solution and discussing their physical significance.

It is found that the eigenvalues for the simple harmonic oscillator potential are given by the formula

$$E_n = (n + 1/2)h\nu \qquad n = 0, 1, 2, 3, \ldots \tag{6-89}$$

where $\nu$ is the classical oscillation frequency of the particle in the potential. All the eigenvalues are discrete since the particle is bound for any of them. The potential, and the eigenvalues, are shown in Figure 6-35.

Figure 6-35  The first few eigenvalues of the simple harmonic oscillator potential. Note that the classically allowed regions (between the intersections of V(x) and $E_n$) expand with increasing values of $E_n$.

If we compare the Schroedinger results with the Planck postulate, we see that in quantum mechanics all the eigenvalues are shifted up by an amount $h\nu/2$. As a consequence, the minimum possible total energy for a particle bound to the potential is $E_0 = h\nu/2$. This is the zero-point energy for the potential, the existence of which is required by the uncertainty principle. Therefore, Planck's postulated energy quantization of the simple harmonic oscillator, in the form described in Chapter 1, was actually in error by the additive constant $h\nu/2$. (In fairness to Planck, it should be pointed out that in 1914 he published a speculation, based upon entropy considerations, which reads very much like Schroedinger's conclusion concerning $h\nu/2$.) This constant cancels out in most applications of Planck's postulate because they involve only differences between two energy values. As an example, consider the electromagnetic radiation emitted by the vibrating charge distribution of a diatomic molecule whose interatomic spacing is executing simple harmonic oscillations.
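The comparison between Planck's levels (6-88) and the Schroedinger levels (6-89) can be tabulated in a few lines. In the sketch below energies are expressed in units of $h\nu$ and kept as exact fractions; the function names are illustrative only. It shows that every level is raised by $h\nu/2$ while the spacing between adjacent levels, which is what fixes the photon frequencies in the molecular example, is the same in both schemes.

```python
from fractions import Fraction

def planck_level(n):
    """E_n / (h nu) according to Planck's postulate, (6-88)."""
    return Fraction(n)

def schroedinger_level(n):
    """E_n / (h nu) according to (6-89)."""
    return Fraction(2 * n + 1, 2)

print(" n   Planck   Schroedinger")
for n in range(4):
    print(f"{n:2d}   {str(planck_level(n)):>6}   {str(schroedinger_level(n)):>12}")

# Shift of each level, and spacing between adjacent levels, in units of h nu
shifts = [schroedinger_level(n) - planck_level(n) for n in range(6)]
planck_gaps = [planck_level(n + 1) - planck_level(n) for n in range(5)]
schroedinger_gaps = [schroedinger_level(n + 1) - schroedinger_level(n) for n in range(5)]
print("shifts:", [str(s) for s in shifts])
print("gaps identical:", planck_gaps == schroedinger_gaps)
```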
Since the frequencies of the emitted photons depend only on the differences in the allowed energies of the molecule, the additive constant has no effect on the frequencies of the photons.

But there are observable quantities that show Planck's original postulate is in error because it does not contain the zero-point energy. The most important example is also connected with the emission of radiation by a vibrating molecule, or atom. When we study this subject in a subsequent chapter, we shall see that the rate of emission of the photons would not agree with experiment unless simple harmonic oscillators have zero-point energies. In fact, we shall find the only reason why the molecule emits any radiation is that its vibrations have been stimulated by a surrounding electromagnetic field whose field strengths are executing simple harmonic oscillations because of the zero-point energy of the field.

In addition to providing completely correct eigenvalues, quantum mechanics also provides the eigenfunctions for the simple harmonic oscillator. The eigenfunctions $\psi_n$, corresponding to the first few eigenvalues $E_n$, are listed in Table 6-1 and plotted in Figure 6-36.

Table 6-1 Some Eigenfunctions $\psi(u)$ for the Simple Harmonic Oscillator Potential, where u is Related to the Coordinate x by the Equation $u = [(Cm)^{1/4}/\hbar^{1/2}]x$

Quantum Number    Eigenfunction
0                 $\psi_0 = A_0 e^{-u^2/2}$
1                 $\psi_1 = A_1 u e^{-u^2/2}$
2                 $\psi_2 = A_2 (1 - 2u^2) e^{-u^2/2}$
3                 $\psi_3 = A_3 (3u - 2u^3) e^{-u^2/2}$
4                 $\psi_4 = A_4 (3 - 12u^2 + 4u^4) e^{-u^2/2}$
5                 $\psi_5 = A_5 (15u - 20u^3 + 4u^5) e^{-u^2/2}$

Figure 6-36 The first few eigenfunctions of the simple harmonic oscillator potential. The vertical ticks on the x axes indicate the limits of classical motion shown in Figure 6-35.
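The entries of Table 6-1 can be checked mechanically. The sketch below rests on one step that is not written out in the text and should be read as an assumption of this example: substituting $\psi_n = A_n P_n(u) e^{-u^2/2}$ into the time-independent Schroedinger equation with $E_n = (n + 1/2)h\nu$ reduces it to the condition $P_n'' - 2uP_n' + 2nP_n = 0$ on the polynomial factor alone. The names `TABLE_6_1`, `derivative`, `residual`, and `parity` are hypothetical helpers introduced only for illustration.

```python
# Polynomial factors P_n(u) of Table 6-1, stored as coefficient lists
# [c0, c1, c2, ...] meaning P(u) = c0 + c1*u + c2*u**2 + ...
TABLE_6_1 = {
    0: [1],
    1: [0, 1],
    2: [1, 0, -2],
    3: [0, 3, 0, -2],
    4: [3, 0, -12, 0, 4],
    5: [0, 15, 0, -20, 0, 4],
}


def derivative(p):
    """Coefficient list of dP/du."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def residual(p, n):
    """Coefficients of P'' - 2u P' + 2n P.

    Assumption of this sketch: this combination must vanish identically
    if P(u) * exp(-u**2 / 2) is an eigenfunction with E = (n + 1/2) h nu.
    """
    d1 = derivative(p)
    d2 = derivative(d1)
    out = [0] * (len(p) + 1)
    for k, c in enumerate(d2):
        out[k] += c              # P'' term
    for k, c in enumerate(d1):
        out[k + 1] -= 2 * c      # -2u P' term (multiplying by u shifts the power)
    for k, c in enumerate(p):
        out[k] += 2 * n * c      # 2n P term
    return out


def parity(p):
    """Return 'even', 'odd', or 'mixed' from the powers of u that appear."""
    powers = {k % 2 for k, c in enumerate(p) if c != 0}
    if powers == {0}:
        return "even"
    if powers == {1}:
        return "odd"
    return "mixed"


if __name__ == "__main__":
    for n, p in TABLE_6_1.items():
        satisfied = all(c == 0 for c in residual(p, n))
        print(n, satisfied, parity(p))
```

Because the Gaussian factor $e^{-u^2/2}$ is even in u, the parity of each eigenfunction is the parity of its polynomial, which is what `parity` reports.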
The eigenfunctions are expressed in terms of the dimensionless variable $u = [(Cm)^{1/4}/\hbar^{1/2}]x$, which differs from x only by a proportionality constant that depends on the properties of the oscillator. For all values of x, the eigenfunction is given by the product of an exponential, whose exponent is proportional to $-x^2$, times a simple polynomial of order $x^n$. The polynomial is responsible for the oscillatory behavior of $\psi_n$ in the classically allowed region where $E_n > V(x)$. The number of oscillations increases with increasing n because there are n values of x for which a polynomial of the order $x^n$ has the value zero. These values of x are the locations of the nodes of $\psi_n$.

The classically allowed regions lie within the vertical marks shown in Figure 6-36. These regions become wider with increasing n because of the shape of the simple harmonic oscillator potential $V(x)$, as can be seen by inspecting Figure 6-35, which also indicates the classically allowed regions for each $E_n$. Outside these regions, the eigenfunctions decrease very rapidly because their behavior is dominated by the decreasing exponential.

Since the relation $V(-x) = V(x)$ is satisfied by the potential, we expect that its eigenfunctions should have definite parities. Inspection of Table 6-1 shows this is true, and that the parity is even for even n and odd for odd n. Thus the eigenfunction for the lowest allowed energy is of even parity, as in the case of a square well potential. The multiplicative constants $A_n$ determine the amplitudes of the eigenfunctions. If necessary, the normalization procedure can be used to fix their values, as in Example 5-7; but this is usually not necessary.

The simple harmonic oscillator eigenfunctions contain a wealth of information about the behavior of the system. Some of this information was extracted in Chapter 5.
For instance, Figures 5-3 and 5-18 gave accurate representations of the probability density functions for the n = 0 and n = 12 quantum states of the oscillator. In Chapter 8 we shall show how the eigenfunctions can be used to calculate the rate of emission of radiation by a charged simple harmonic oscillator, and derive the $n_i - n_f = \pm 1$ selection rule that had to be introduced in the old quantum theory by arguments based on the rather unreliable correspondence principle.

Example 6-7. Because the simple harmonic oscillator eigenfunctions for small n have fairly simple mathematical forms, it is not too difficult to verify by direct substitution that they satisfy the time-independent Schroedinger equation, for the potential of (6-86), and for the eigenvalues of (6-89). Make such a verification for n = 1. (For n = 0 the wave function was verified by direct substitution in the Schroedinger equation in Example 5-3.)

The time-independent Schroedinger equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{C}{2}x^2\psi = E\psi$$

To verify that the eigenvalue

$$E_1 = \frac{3}{2}h\nu = \frac{3}{2}\frac{h}{2\pi}\left(\frac{C}{m}\right)^{1/2} = \frac{3}{2}\hbar\left(\frac{C}{m}\right)^{1/2}$$

and the eigenfunction

$$\psi_1 = A_1 u e^{-u^2/2} \qquad \text{where} \qquad u = \frac{(Cm)^{1/4}}{\hbar^{1/2}}x$$

satisfy the equation, we evaluate the derivatives

$$\frac{d\psi_1}{dx} = \frac{du}{dx}\frac{d\psi_1}{du} = \frac{(Cm)^{1/4}}{\hbar^{1/2}}\left[A_1 e^{-u^2/2} + A_1 u(-u)e^{-u^2/2}\right] = \frac{(Cm)^{1/4}}{\hbar^{1/2}}A_1 e^{-u^2/2}\left[1 - u^2\right]$$

and

$$\frac{d^2\psi_1}{dx^2} = \frac{du}{dx}\frac{d}{du}\left(\frac{d\psi_1}{dx}\right) = \frac{(Cm)^{1/4}}{\hbar^{1/2}}\frac{d}{du}\left\{\frac{(Cm)^{1/4}}{\hbar^{1/2}}A_1 e^{-u^2/2}\left[1 - u^2\right]\right\}$$

$$= \frac{(Cm)^{1/2}}{\hbar}A_1\left\{-ue^{-u^2/2}\left[1 - u^2\right] + e^{-u^2/2}\left[-2u\right]\right\} = \frac{(Cm)^{1/2}}{\hbar}A_1 ue^{-u^2/2}\left\{u^2 - 3\right\}$$

$$= \frac{(Cm)^{1/2}}{\hbar}\left\{u^2 - 3\right\}\psi_1 = \frac{(Cm)^{1/2}}{\hbar}\left\{\frac{(Cm)^{1/2}}{\hbar}x^2 - 3\right\}\psi_1$$

Substitution of $d^2\psi_1/dx^2$ and $E_1$ into the equation they are supposed to satisfy yields

$$-\frac{\hbar^2}{2m}\frac{(Cm)^{1/2}}{\hbar}\left\{\frac{(Cm)^{1/2}}{\hbar}x^2 - 3\right\}\psi_1 + \frac{C}{2}x^2\psi_1 = \frac{3}{2}\hbar\left(\frac{C}{m}\right)^{1/2}\psi_1$$

Since inspection shows this is satisfied, the verification is completed.

6-10 SUMMARY

In Table 6-2 we summarize some of the properties of the systems studied in this chapter.
The table gives an abbreviated name for each idealized system, and an example of a physical system whose potential and total energies are approximated by the idealization. It also gives sketches of the forms of the potential and total energies, and corresponding probability density functions, for each system. If the particle is not bound, it is incident from the left. We have chosen one significant feature of each system to list in the table, but there are many other significant features that we have discussed, which are not listed. In fact, in this chapter we have obtained most of the important predictions of quantum mechanics for systems involving one particle moving in a one-dimensional potential. In the following chapters we shall obtain predictions from the theory for systems involving three dimensions and several particles.

A powerful approximation procedure which extends the techniques used in the later sections of this chapter to solve the time-independent Schroedinger equation for bound particles is given in Appendix J. Appendix K modifies the procedure of Appendix J so that it can be applied directly to Schroedinger equations in cases where time-independent equations cannot be obtained from them by separating variables. And Appendix L uses the results of Appendix K to develop a procedure for extending to three dimensions the treatment of unbound particles given in the earlier sections of this chapter. A student willing to read out of context a few short passages from following chapters will find it quite feasible to study these appendices at this point. But many may prefer to wait until all material prerequisite to the appendices and, more importantly, the motivation to study them, has been developed. For such it is recommended that Appendices J and K be read after Chapter 10 and Appendix L after Chapter 15.

Table 6-2.
A Summary of the Systems Studied in Chapter 6

Name of System                          Physical Example                                    Significant Feature
Zero potential                          Proton in beam from cyclotron                       Results used for other systems
Step potential (energy below top)       Conduction electron near surface of metal           Penetration of excluded region
Step potential (energy above top)       Neutron trying to escape nucleus                    Partial reflection at potential discontinuity
Barrier potential (energy below top)    α particle trying to escape Coulomb barrier         Tunneling
Barrier potential (energy above top)    Electron scattering from negatively ionized atom    No reflection at certain energies
Finite square well potential            Neutron bound in nucleus                            Energy quantization
Infinite square well potential          Molecule strictly confined to box                   Approximation to finite square well
Simple harmonic oscillator potential    Atom of vibrating diatomic molecule                 Zero-point energy

(The sketches of the potential and total energies, and of the probability density $\Psi^*\Psi$, for each system are not reproduced here.)

QUESTIONS

1. Can there be solutions with E < 0 to the time-independent Schroedinger equation for the zero potential?
2. Why is it never possible in classical mechanics to have E < V(x)? Why is it possible in quantum mechanics, providing there is some region in which E > V(x)?
3. Explain why the general solution to a one-dimensional time-independent Schroedinger equation contains two different functions, while the general solution to the corresponding Schroedinger equation contains many different functions.
4. Consider a particle in a long beam of very accurately known momentum. Does a wave function in the form of a group provide a more or a less realistic description of the particle than a single complex exponential wave function like (6-9)?
5. Under what circumstances is a discontinuous potential function a reasonable approximation to an actual system?
6.
If a potential function has a discontinuity at a certain point, do its eigenfunctions have discontinuities at that point? If not, why not?
7. By combining oppositely directed traveling waves of equal amplitudes, we obtain a standing wave. What kind of a wave do we get if the amplitudes are not equal?
8. Just what is a probability flux, and why is it useful?
9. How can it be that a probability flux is split at a potential discontinuity, although the associated particle is not split?
10. Is there an analogy between the splitting of a probability flux that characterizes the behavior of an unbound particle in a one-dimensional system, and the alternative paths that can be followed by an unbound particle moving in two dimensions through a diffraction apparatus? Why?
11. Exactly what is meant by the statement that the reflection coefficient is one for a particle incident on a potential step with total energy less than the step height? What is meant by the statement that the reflection coefficient is less than one if the total energy is greater than the step height? Can the reflection coefficient ever be greater than one?
12. Since a real exponential is a nonoscillatory function, why is a complex exponential an oscillatory function?
13. What do you think causes the rapid oscillations in the group wave function of Figure 6-8 as it reflects from the potential step?
14. What is the fallacy in the following statement? "Since a particle cannot be detected while tunneling through a barrier, it is senseless to say that the process actually happens."
15. A particle is incident on a potential barrier, with total energy less than the barrier height, and it is reflected. Does the reflection involve only the potential discontinuity facing its direction of incidence? If the other discontinuity were removed, so that the barrier were changed into a step, is the reflection coefficient changed?
16.
In the sun, two nuclei of low mass in violent thermal motion can collide by penetrating the Coulomb barrier which separates them. The mass of the single nucleus formed is less than the sum of the masses of the two nuclei, so energy is liberated. This fusion process is responsible for the heat output of the sun. What would be the consequences to life on earth if it could not happen because barriers were impenetrable?
17. Are there any measurable consequences of the penetration of a classically excluded region which is of infinite length? Consider a bound particle in a finite square well potential.
18. Show from a qualitative argument that a one-dimensional finite square well potential always has one bound eigenvalue, no matter how shallow the binding region. What would the eigenfunction look like if the binding region were very shallow?
19. Why do finite square wells have only a finite number of bound eigenvalues? What are the characteristics of the unbound eigenvalues?
20. What would a standing wave eigenfunction for an unbound eigenvalue of a finite square well look like?
21. Why do the lowest eigenvalues and eigenfunctions of an infinite square well provide the best approximation to the corresponding eigenvalues and eigenfunctions of a finite square well?
22. In the n = 3 state, the probability density function for a particle in a box is zero at two positions between the walls of the box. How then can the particle ever move across these positions?
23. Explain in simplest terms the relation between the zero-point energy and the uncertainty principle.
24. Would you expect the zero-point energy to have much effect on the heat capacity of matter at very low temperatures? Justify your answer.
25. If the eigenfunctions of a potential have definite parities, the one of lowest energy always has even parity. Explain why.
26.
Are there analogies in classical physics to the quantum mechanical concept of parity?
27. Are there unbound states for a simple harmonic oscillator potential? How many bound states are there? How realistic is the potential?
28. Explain all aspects of the behavior of all the probability densities of Table 6-2; in particular explain the probability density for the barrier potential with energy above the top.
29. What are the other significant features of the systems of Table 6-2?
30. Considering separately each system treated in this chapter, state which of its properties agree, and disagree, with classical mechanics in the microscopic limit. Which agree, and disagree, with classical wave motion in that limit? Make the same classifications for the properties of the systems in the macroscopic limit.
31. The eigenvalues in Figure 6-35 are equally spaced, but the lowest eigenvalues in Figure 6-22 come in closely spaced pairs. By considering the effect of a large bump in a potential well on the eigenvalues for symmetric versus antisymmetric eigenfunctions, explain the tendency for the eigenvalues to come in pairs in Figure 6-22.

PROBLEMS

1. Show that the step potential eigenfunction, for $E < V_0$, can be converted in form from the sum of two traveling waves, as in (6-24), to a standing wave, as in (6-29).
2. Repeat the step potential calculation of Section 6-4, but with the particle initially in the region x > 0 where $V(x) = V_0$, and traveling in the direction of decreasing x towards the point x = 0 where the potential steps down to its value V(x) = 0 in the region x < 0. Show that the transmission and reflection coefficients are the same as those obtained in Section 6-4.
3. Prove (6-43) stating that the sum of the reflection and transmission coefficients equals one, for the case of a step potential with $E > V_0$.
4. Prove (6-44) which expresses the reflection and transmission coefficients in terms of the ratio $E/V_0$.
5.
Consider a particle tunneling through a rectangular potential barrier. Write the general solutions presented in Section 6-5, which give the form of $\psi$ in the different regions of the potential. (a) Then find four relations between the five arbitrary constants by matching $\psi$ and $d\psi/dx$ at the boundaries between these regions. (b) Use these relations to evaluate the transmission coefficient T, thereby verifying (6-49). (Hint: First eliminate F and G, leaving relations between A, B, and C. Then eliminate B.)
6. Show that the expression of (6-49), for the transmission coefficient in tunneling through a rectangular potential barrier, reduces to the form quoted in (6-50) if the exponents are very large.
7. Consider a particle passing over a rectangular potential barrier. Write the general solutions, presented in Section 6-5, which give the form of $\psi$ in the different regions of the potential. (a) Then find four relations between the five arbitrary constants by matching $\psi$ and $d\psi/dx$ at the boundaries between these regions. (b) Use these relations to evaluate the transmission coefficient T, thereby verifying (6-51). (Hint: Note that the four relations become exactly the same as those found in the first part of Problem 5, if $k_{II}$ is replaced by $ik_{III}$. Make this substitution in (6-49) to obtain directly (6-51).)
8. (a) Evaluate the transmission coefficient for an electron of total energy 2 eV incident upon a rectangular potential barrier of height 4 eV and thickness $10^{-10}$ m, using (6-49) and then using (6-50). Repeat the evaluation for a barrier thickness of (b) $9 \times 10^{-9}$ m and (c) $10^{-9}$ m.
9. A proton and a deuteron (a particle with the same charge as a proton, but twice the mass) attempt to penetrate a rectangular potential barrier of height 10 MeV and thickness $10^{-14}$ m. Both particles have total energies of 3 MeV. (a) Use qualitative arguments to predict which particle has the highest probability of succeeding.
(b) Evaluate quantitatively the probability of success for both particles.
10. A fusion reaction important in solar energy production (see Question 16) involves capture of a proton by a carbon nucleus, which has six times the charge of a proton and a radius of $r' \simeq 2 \times 10^{-15}$ m. (a) Estimate the Coulomb potential V experienced by the proton if it is at the nuclear surface. (b) The proton is incident upon the nucleus because of its thermal motion. Its total energy cannot realistically be assumed to be much higher than 10 kT, where k is Boltzmann's constant (see Chapter 1) and where T is the internal temperature of the sun of about $10^7$ °K. Estimate this total energy, and compare it with the height of the Coulomb barrier. (c) Calculate the probability that the proton can penetrate a rectangular barrier potential of height V extending from r' to 2r', the point at which the Coulomb barrier potential drops to V/2. (d) Is the penetration through the actual Coulomb barrier potential greater or less than through the rectangular barrier potential of part (c)?
11. Verify by substitution that the standing wave general solution, (6-62), satisfies the time-independent Schroedinger equation, (6-2), for the finite square well potential in the region inside the well.
12. Verify by substitution that the exponential general solutions, (6-63) and (6-64), satisfy the time-independent Schroedinger equation (6-13) for the finite square well potential in the regions outside the well.
13. (a) From qualitative arguments, make a sketch of the form of a typical unbound standing wave eigenfunction for a finite square well potential. (b) Is the amplitude of the oscillation the same in all regions? (c) What does the behavior of the amplitude predict about the probabilities of finding the particle in a unit length of the x axis in various regions? (d) Does the prediction agree with what would be expected from classical mechanics?
14. Use the qualitative arguments of Problem 13 to develop a condition on the total energy of the particle, in an unbound state of a finite square well potential, which makes the probability of finding it in a unit length of the x axis the same inside the well as outside the well. (Hint: What counts is the relation between the de Broglie wavelength inside the well and the width of the well.)
15. (a) Make a quantitative calculation of the transmission coefficient for an unbound particle moving over a finite square well potential. (Hint: Use a trick similar to the one indicated in Problem 7.) (b) Find a condition on the total energy of the particle which makes the transmission coefficient equal to one. (c) Compare with the condition found in Problem 14, and explain why they are the same. (d) Give an example of an optical analogue to this system.
16. (a) Consider a one-dimensional square well potential of finite depth $V_0$ and width a. What combination of these parameters determines the "strength" of the well, i.e., the number of energy levels the well is capable of binding? In the limit that the strength of the well becomes small, will the number of bound levels become 1 or 0? Give convincing justification for your answers.
17. An atom of the noble gas krypton exerts an attractive potential on an unbound electron, which has a very abrupt onset. Because of this it is a reasonable approximation to describe the potential as an attractive square well, of radius equal to the $4 \times 10^{-10}$ m radius of the atom. Experiments show that an electron of kinetic energy 0.7 eV, in regions outside the atom, can travel through the atom with essentially no reflection. The phenomenon is called the Ramsauer effect. Use this information in the conditions of Problem 14 or 15 to determine the depth of the square well potential. (Hint: One de Broglie wavelength just fits into the width of the well. Why not one-half a de Broglie wavelength?)
18. A particle of total energy $9V_0$ is incident from the $-x$ axis on a potential given by

$$V = \begin{cases} 8V_0 & x < 0 \\ 0 & 0 < x < a \\ 5V_0 & x > a \end{cases}$$

Find the probability that the particle will be transmitted on through to the positive side of the x axis, x > a.
19. Verify by substitution that the standing wave general solution, (6-67), satisfies the time-independent Schroedinger equation, (6-2), for the infinite square well potential in the region inside the well.
20. Two possible eigenfunctions for a particle moving freely in a region of length a, but strictly confined to that region, are shown in Figure 6-37. When the particle is in the state corresponding to the eigenfunction $\psi_I$, its total energy is 4 eV. (a) What is its total energy in the state corresponding to $\psi_{II}$? (b) What is the lowest possible total energy for the particle in this system?

Figure 6-37 Two eigenfunctions considered in Problem 20.

21. (a) Estimate the zero-point energy for a neutron in a nucleus, by treating it as if it were in an infinite square well of width equal to a nuclear diameter of $10^{-14}$ m. (b) Compare your answer with the electron zero-point energy of Example 6-6.
22. (a) Solve the classical wave equation governing the vibrations of a stretched string, for a string fixed at both its ends. Thereby show that the functions describing the possible shapes assumed by the string are essentially the same as the eigenfunctions for an infinite square well potential. (b) Also show that the possible frequencies of vibration of the string are essentially different from the frequencies of the wave functions for the potential.
23. (a) For a particle in a box, show that the fractional difference in the energy between adjacent eigenvalues is

$$\frac{\Delta E_n}{E_n} = \frac{2n + 1}{n^2}$$

(b) Use this formula to discuss the classical limit of the system.
24. Apply the normalization condition to show that the value of the multiplicative constant for the n = 3 eigenfunction of the infinite square well potential, (6-79), is $B_3 = \sqrt{2/a}$.
25. Use the eigenfunction of Problem 24 to calculate the following expectation values, and comment on each result: (a) $\bar{x}$, (b) $\bar{p}$, (c) $\overline{x^2}$, (d) $\overline{p^2}$.
26. (a) Use the results of Problem 25 to evaluate the product of the uncertainty in position times the uncertainty in momentum, for a particle in the n = 3 state of an infinite square well potential. (b) Compare with the results of Example 5-10 and Problem 13 of Chapter 5, and comment on the relative size of the uncertainty products for the n = 1, n = 2, and n = 3 states. (c) Find the limits of $\Delta x$ and $\Delta p$ as n approaches infinity.
27. Form the product of the eigenfunction for the n = 1 state of an infinite square well potential times the eigenfunction for the n = 3 state of that potential. Then integrate it over all x, and show that the result is equal to zero. In other words, prove that

$$\int_{-\infty}^{\infty} \psi_1(x)\psi_3(x)\,dx = 0$$

(Hint: Use the relation: $\cos u \cos v = [\cos(u + v) + \cos(u - v)]/2$.) Students who have worked Problem 36 of Chapter 5 have already proved that the integral over all x of the n = 1 eigenfunction times the n = 2 eigenfunction also equals zero. It can be proved that the integral over all x of any two different eigenfunctions of the potential equals zero. Furthermore, this is true for any two different eigenfunctions of any other potential. (If the eigenfunctions are complex, the complex conjugate of one is taken in the integrand.) This property is called orthogonality.
28. Apply the results of Problem 20 of Chapter 5 to the case of a particle in a three-dimensional box. That is, solve the time-independent Schroedinger equation for a particle moving in a three-dimensional potential that is zero inside a cubical region of edge length a, and becomes infinitely large outside that region.
Determine the eigenvalues and eigenfunctions for the system.
29. Airline passengers frequently observe the wingtips of their planes oscillating up and down with periods of the order of 1 sec and amplitudes of about 0.1 m. (a) Prove that this is definitely not due to the zero-point motion of the wings by comparing the zero-point energy with the energy obtained from the quoted values plus an estimated mass for the wings. (b) Calculate the order of magnitude of the quantum number n of the observed oscillation.
30. The restoring force constant C for the vibrations of the interatomic spacing of a typical diatomic molecule is about $10^3$ joules/m$^2$. Use this value to estimate the zero-point energy of the molecular vibrations. The mass of the molecule is $4.1 \times 10^{-26}$ kg.
31. (a) Estimate the difference in energy between the ground state and first excited state of the vibrating molecule considered in Problem 30. (b) From this estimate determine the energy of the photon emitted by the vibrations in the charge distribution when the system makes a transition between the first excited state and the ground state. (c) Determine also the frequency of the photon, and compare it with the classical oscillation frequency of the system. (d) In what range of the electromagnetic spectrum is it?
32. A pendulum, consisting of a weight of 1 kg at the end of a light 1 m rod, is oscillating with an amplitude of 0.1 m. Evaluate the following quantities: (a) frequency of oscillation, (b) energy of oscillation, (c) approximate value of quantum number for oscillation, (d) separation in energy between adjacent allowed energies, (e) separation in distance between adjacent bumps in the probability density function near the equilibrium point.
33. Devise a simple argument verifying that the exponent in the decreasing exponential, which governs the behavior of simple harmonic oscillator eigenfunctions in the classically excluded region, is proportional to $x^2$. (Hint: Take the finite square well eigenfunctions of (6-63) and (6-64), and treat the quantity $(V_0 - E)$ as if it increased with increasing x in proportion to $x^2$.)
34. Verify the eigenfunction and eigenvalue for the n = 2 state of a simple harmonic oscillator by direct substitution into the time-independent Schroedinger equation, as in Example 6-7.

7 ONE-ELECTRON ATOMS

7-1 INTRODUCTION
importance of one-electron atom; reduced mass

7-2 DEVELOPMENT OF THE SCHROEDINGER EQUATION
three-dimensional Schroedinger equation; time-independent equation

7-3 SEPARATION OF THE TIME-INDEPENDENT EQUATION
spherical polar coordinates; equations in r, θ, and φ

7-4 SOLUTION OF THE EQUATIONS
solution of φ equation; single valuedness and quantum number $m_l$; procedure for solution of θ equation and quantum number l; procedure for solution of r equation and quantum number n

7-5 EIGENVALUES, QUANTUM NUMBERS, AND DEGENERACY
eigenvalues; comparison with other binding potentials; conditions satisfied by quantum numbers; degeneracy of eigenfunctions; comparison with classical degeneracy

7-6 EIGENFUNCTIONS
comparison of Bohr and Schroedinger treatments; verification of typical eigenfunction and eigenvalue

7-7 PROBABILITY DENSITIES
radial probability density; shells; comparison with Bohr atom; uncertainty principle argument for ground state radius; l dependence of probability density near nucleus; angular dependence of probability density; nodal surfaces; significance of z axis; interpretation of angular dependence in terms of orbital angular momentum

7-8 ORBITAL ANGULAR MOMENTUM
role in quantum physics; classical definition; associated operators; expectation values of z component and magnitude; geometrical description of behavior

7-9 EIGENVALUE EQUATIONS
expectation values of a fluctuating quantity; absence of fluctuations in z component and magnitude of orbital angular momentum; general eigenvalue equations; Hamiltonian operator

QUESTIONS

PROBLEMS

7-1 INTRODUCTION

In this chapter we begin our quantum mechanical study of atoms by treating the simplest case, the one-electron atom. This is also the most important case. For instance, the one-electron atom hydrogen is of historical importance because it was the first system which Schroedinger treated with his theory of quantum mechanics. We shall see that the eigenvalues which the theory predicts for the hydrogen atom agree with those predicted by the Bohr model and observed by experiment. This provided the first verification of the Schroedinger theory.

There is much more to the Schroedinger theory of the one-electron atom than its prediction of the eigenvalues, because it also predicts the eigenfunctions. Using the eigenfunctions, we shall learn about the following properties of the atom: (1) the probability density functions, which give us detailed pictures of the structure of the atom that do not violate the uncertainty principle as do the precise orbits of the Bohr model, (2) the orbital angular momenta of the atom, which were incorrectly predicted by the Bohr model, (3) the electron spin and other effects of relativity on the atom, which were also incorrectly predicted by the Bohr model, and (4) the rates at which the atom makes transitions from its excited states to its ground state, measurable quantities that were not predictable at all by the Bohr model.

Above and beyond its historical and intrinsic importance, the Schroedinger theory of the one-electron atom is of great practical importance because it forms the foundation of the quantum mechanical treatment of all multielectron atoms, as well as of molecules and nuclei. In later chapters this will become very apparent.
The one-electron atom is the simplest bound system that occurs in nature. But it is more complicated than the systems we have dealt with in the preceding chapters because it contains two particles, and because it is three dimensional. The system consists of a positively charged nucleus and a negatively charged electron, moving under the influence of their mutual Coulomb attraction and bound together by that attraction. The three-dimensional character of the system allows it to have angular momentum. We shall see that interesting new quantum mechanical phenomena arise as a consequence. Quantum mechanical phenomena involving angular momentum could not arise in our earlier considerations, which dealt only with one-dimensional systems.

The three-dimensional character of the atom causes difficulty because it complicates the mathematical procedures that must be used in its treatment. However, the procedures are straightforward extensions of the simpler ones we have used on one-dimensional systems, so no conceptual problems should arise. We shall avoid practical problems by relegating to appendices the solution of the more difficult equations, as well as other details of interest to some but not all students. We shall present in this chapter enough of the mathematics to make it apparent how it is related to that used in the preceding chapters. But here we shall emphasize the physical considerations underlying the mathematics, the results which it yields, and the interpretation of the results.

The fact that the one-electron atom contains two particles causes no difficulty at all, if use is made of the reduced mass technique. This technique, discussed in Section 4-7, models the actual atom by an atom in which the nucleus is infinitely massive and the electron has the reduced mass $\mu$ given by

$$\mu = \left(\frac{M}{m + M}\right)m \qquad (7\text{-}1)$$

where m is the true mass of the electron and M is the true mass of the nucleus.
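To get a feeling for the size of the reduced mass correction, the relation (7-1) can be evaluated for hydrogen. The following is a minimal sketch: the masses are the rounded values from the table of useful constants at the front of the book, and the function name `reduced_mass` is a hypothetical helper chosen for this illustration.

```python
# Rounded rest masses in kg, as quoted in the book's table of constants.
M_E = 9.109e-31  # electron rest mass
M_P = 1.672e-27  # proton rest mass (the nucleus of ordinary hydrogen)


def reduced_mass(m, M):
    """Reduced mass of (7-1): mu = [M / (m + M)] * m."""
    return M / (m + M) * m


if __name__ == "__main__":
    mu = reduced_mass(M_E, M_P)
    print("mu     =", mu, "kg")
    print("mu / m =", mu / M_E)
```

With these inputs the ratio $\mu/m$ differs from one by only a few parts in ten thousand, which is why the infinitely massive nucleus is already a good first approximation for hydrogen; the correction becomes even smaller for heavier nuclei.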
The reduced mass electron moves about the infinitely massive nucleus with the same electron-nucleus separation as in the actual atom. Since the infinitely massive nucleus must be completely stationary, it is necessary to treat only the motion of the reduced mass electron in the model atom, and the problem is therefore simplified from one involving a pair of moving particles to one involving only a single moving particle. In classical mechanics, the motion of the reduced mass electron about the stationary nucleus in the model atom exactly duplicates the motion of the electron relative to the nucleus in the actual atom. Furthermore, the total energy of the model atom, which is just the total energy of its reduced mass electron, equals the total energy of the actual atom in a frame of reference in which its center of mass is at rest. The student may have seen a proof of these results of classical mechanics in connection with the motion of a planet about the sun, or some other system involving the motion of two particles. It is not difficult to prove that the same results are obtained in quantum mechanics, but we shall not bother to do so here. Figure 7-1 indicates the behavior of the electron and the nucleus in the actual atom and in the model atom. In both cases the center of mass of the atom is at rest.

Figure 7-1 Left: In an actual one-electron atom, an electron of mass m and nucleus of mass M move about their fixed center of mass. Right: In the equivalent model atom, a particle of reduced mass $\mu$ moves about a stationary nucleus of infinite mass.

7-2 DEVELOPMENT OF THE SCHROEDINGER EQUATION

We consider, therefore, an electron of reduced mass $\mu$ which is moving under the influence of the Coulomb potential

$$V = V(x,y,z) = \frac{-Ze^2}{4\pi\epsilon_0\sqrt{x^2 + y^2 + z^2}} \tag{7-2}$$

where x, y, z are the rectangular coordinates of the electron of charge $-e$ relative to the nucleus, which is fixed at the origin.
The square root in the denominator is just the electron-nucleus separation distance r. The nuclear charge is $+Ze$ (Z = 1 for neutral hydrogen, Z = 2 for singly ionized helium, etc.).

As a first step, we must develop the Schroedinger equation for this three-dimensional system. We do this by using the procedure indicated in Section 5-4. We first write the classical expression for the total energy E of the system

$$\frac{1}{2\mu}\left(p_x^2 + p_y^2 + p_z^2\right) + V(x,y,z) = E \tag{7-3}$$

The quantities $p_x$, $p_y$, $p_z$ are the x, y, z components of the linear momentum of the electron. Thus the first term on the left is the kinetic energy of the system, while the second term is its potential energy. Now we replace the dynamical quantities $p_x$, $p_y$, $p_z$, and E by their associated differential operators, using an obvious three-dimensional extension of the scheme in (5-32). This gives us the operator equation

$$-\frac{\hbar^2}{2\mu}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) + V(x,y,z) = i\hbar\frac{\partial}{\partial t} \tag{7-4}$$

Operating with each term on the wave function $\Psi = \Psi(x,y,z,t)$ we obtain the Schroedinger equation for the system

$$-\frac{\hbar^2}{2\mu}\left[\frac{\partial^2\Psi(x,y,z,t)}{\partial x^2} + \frac{\partial^2\Psi(x,y,z,t)}{\partial y^2} + \frac{\partial^2\Psi(x,y,z,t)}{\partial z^2}\right] + V(x,y,z)\Psi(x,y,z,t) = i\hbar\frac{\partial\Psi(x,y,z,t)}{\partial t} \tag{7-6}$$

It is often convenient to write this as

$$-\frac{\hbar^2}{2\mu}\nabla^2\Psi + V\Psi = i\hbar\frac{\partial\Psi}{\partial t} \tag{7-7}$$

where we use the symbol

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{7-8}$$

which is called the Laplacian operator, or "del squared," in rectangular coordinates.

Many of the properties of the three-dimensional Schroedinger equation, and of the wave functions which are its solutions, can be obtained by obvious extensions of the properties developed in the preceding chapters.
For instance, it is easy to show by the technique of separation of variables, used in Section 5-5, that since the potential function $V(x,y,z)$ does not depend on time there are solutions to the Schroedinger equation which have the form

$$\Psi(x,y,z,t) = \psi(x,y,z)e^{-iEt/\hbar} \tag{7-9}$$

where the eigenfunction $\psi(x,y,z)$ is a solution to the time-independent Schroedinger equation

$$-\frac{\hbar^2}{2\mu}\nabla^2\psi(x,y,z) + V(x,y,z)\psi(x,y,z) = E\psi(x,y,z) \tag{7-10}$$

Note that in three dimensions this equation is a partial differential equation because it contains three independent variables, the space coordinates x, y, z.

7-3 SEPARATION OF THE TIME-INDEPENDENT EQUATION

The time-independent Schroedinger equation for the Coulomb potential can be solved by making repeated applications of the technique of separation of variables to split the partial differential equation into a set of three ordinary differential equations, each involving only one coordinate, and then using standard procedures to solve these equations. However, separation of variables cannot be carried out when rectangular coordinates are employed because the Coulomb potential energy is a function $V(x,y,z) = -Ze^2/4\pi\epsilon_0\sqrt{x^2 + y^2 + z^2}$ of all three of these coordinates. Separation of variables will not work in rectangular coordinates because the potential itself cannot be split into terms, each of which involves only one such coordinate.

The difficulty is removed by changing to spherical polar coordinates. These are the coordinates $r$, $\theta$, $\varphi$, illustrated in Figure 7-2. The length of the straight line connecting the electron with the origin (the nucleus) is r, and $\theta$ and $\varphi$ are the polar and azimuthal angles specifying the orientation of that line. Now the distance between the electron and the nucleus is just r.
So in spherical polar coordinates the Coulomb potential can be expressed as a function of a single coordinate $r = \sqrt{x^2 + y^2 + z^2}$, as follows

$$V = V(r) = \frac{-Ze^2}{4\pi\epsilon_0 r} \tag{7-11}$$

Because of this great simplification in the form of the potential, it then becomes possible to carry out the separation of variables on the time-independent Schroedinger equation, as we shall soon see.

Figure 7-2 The spherical coordinates $r$, $\theta$, $\varphi$ of a point P, and its rectangular coordinates x, y, z.

The space derivatives in the time-independent Schroedinger equation also change form when the coordinates are changed from rectangular to spherical. A straightforward, but tedious, application of the rules of differential calculus shows that the time-independent Schroedinger equation can be written as

$$-\frac{\hbar^2}{2\mu}\nabla^2\psi(r,\theta,\varphi) + V(r)\psi(r,\theta,\varphi) = E\psi(r,\theta,\varphi) \tag{7-12}$$

where

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2} \tag{7-13}$$

is the Laplacian operator in the spherical polar coordinates $r$, $\theta$, $\varphi$. For the details of the coordinate transformation leading to (7-12) and (7-13), the student should consult Appendix M. A comparison of the forms of the Laplacian operator in rectangular and spherical polar coordinates, (7-8) and (7-13), shows that we have simplified the expression of the potential energy function at the expense of considerably complicating the expression of the Laplacian operator in the time-independent Schroedinger equation that must be solved. Nevertheless, the change of coordinates is worthwhile because it will allow us to find solutions to the time-independent Schroedinger equation of the form

$$\psi(r,\theta,\varphi) = R(r)\Theta(\theta)\Phi(\varphi) \tag{7-14}$$

That is, we shall show that there are solutions $\psi(r,\theta,\varphi)$ to (7-12) that split into products of three functions, $R(r)$, $\Theta(\theta)$, and $\Phi(\varphi)$, each of which depends on only one of the coordinates.
The advantage lies in the fact that these three functions can be found by solving ordinary differential equations. We show this by substituting the product form, $\psi(r,\theta,\varphi) = R(r)\Theta(\theta)\Phi(\varphi)$, into the time-independent Schroedinger equation obtained by evaluating the Laplacian operator in (7-12) from (7-13). This yields

$$-\frac{\hbar^2}{2\mu}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial R\Theta\Phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial R\Theta\Phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 R\Theta\Phi}{\partial\varphi^2}\right] + V(r)R\Theta\Phi = ER\Theta\Phi$$

Carrying out the partial differentiations, we have

$$-\frac{\hbar^2}{2\mu}\left[\frac{\Theta\Phi}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{R\Phi}{r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{R\Theta}{r^2\sin^2\theta}\frac{d^2\Phi}{d\varphi^2}\right] + V(r)R\Theta\Phi = ER\Theta\Phi$$

In this equation we have written the partial derivative $\partial R/\partial r$ as the total derivative $dR/dr$ since the two are equivalent because R is a function of r alone. The same comment applies to the other derivatives. If we now multiply through by $-2\mu r^2\sin^2\theta/R\Theta\Phi\hbar^2$, and transpose, we obtain

$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = -\frac{\sin^2\theta}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{2\mu}{\hbar^2}r^2\sin^2\theta\,[E - V(r)]$$

As the left side of this equation does not depend on $r$ or $\theta$, whereas the right side does not depend on $\varphi$, their common value cannot depend on any of these variables. The common value must therefore be a constant, which we shall find it convenient to designate as $-m_l^2$. Thus we obtain two equations by setting each side equal to this constant

$$\frac{d^2\Phi}{d\varphi^2} = -m_l^2\Phi \tag{7-15}$$

and

$$-\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{2\mu}{\hbar^2}r^2[E - V(r)] = -\frac{m_l^2}{\sin^2\theta}$$

By transposing, we can write the second equation as

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2\mu r^2}{\hbar^2}[E - V(r)] = \frac{m_l^2}{\sin^2\theta} - \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)$$

Since we have here an equation whose left side does not depend on one of the variables and whose right side does not depend on the other, we conclude again that both sides must equal a constant. It is convenient to designate this constant as $l(l + 1)$. Thus we obtain, by setting each side equal to $l(l + 1)$, two more equations

$$-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m_l^2\Theta}{\sin^2\theta} = l(l + 1)\Theta \tag{7-16}$$

and

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}[E - V(r)]R = l(l + 1)\frac{R}{r^2} \tag{7-17}$$

We see that the assumed product form of the solution, $\psi(r,\theta,\varphi) = R(r)\Theta(\theta)\Phi(\varphi)$, is valid because it works! We also see that the problem has been reduced to that of solving the ordinary differential equations, (7-15), (7-16), and (7-17), for $\Phi(\varphi)$, $\Theta(\theta)$, and $R(r)$.

In solving these equations, we shall find that the equation for $\Phi(\varphi)$ has acceptable solutions only for certain values of $m_l$. Using these values of $m_l$ in the equation for $\Theta(\theta)$, it turns out that this equation has acceptable solutions only for certain values of $l$. With these values of $l$ in the equation for $R(r)$, this equation is found to have acceptable solutions only for certain values of the total energy E; that is, the energy of the atom is quantized.

7-4 SOLUTION OF THE EQUATIONS

Consider (7-15) for $\Phi(\varphi)$. By differentiation and substitution, the student may easily verify that it has a particular solution

$$\Phi(\varphi) = e^{im_l\varphi}$$

(The discussion following Example 7-5 explains why this particular solution is used.) Here we must, for the first time, explicitly consider the requirement of Section 5-6 that the eigenfunctions be single valued. This demands that the function $\Phi(\varphi)$ be single valued, and the demand must be considered explicitly because the azimuthal angles $\varphi = 0$ and $\varphi = 2\pi$ are actually the same angle. Thus, we must require that $\Phi(\varphi)$ has the same value at $\varphi = 0$ as it does at $\varphi = 2\pi$, that is

$$\Phi(0) = \Phi(2\pi)$$

Evaluating the exponential in the particular solution $\Phi(\varphi)$, we obtain

$$e^{im_l 0} = e^{im_l 2\pi}$$

or

$$1 = \cos m_l 2\pi + i\sin m_l 2\pi$$

The requirement is satisfied only if the absolute value of $m_l$ has one of the values

$$|m_l| = 0, 1, 2, 3, \ldots \tag{7-18}$$

In other words, $m_l$ can be only zero or a positive or negative integer. Thus the set of functions which are acceptable solutions to (7-15) are

$$\Phi_{m_l}(\varphi) = e^{im_l\varphi} \tag{7-19}$$

where $m_l$ has one of the integral values specified by (7-18).
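The single-valuedness requirement leading to (7-18) can be checked numerically. The sketch below is illustrative only; the helper names `phi` and `is_single_valued` are introduced here and are not part of the text. It compares $\Phi(0)$ with $\Phi(2\pi)$ for a few integer and half-integer trial values of $m_l$.

```python
import cmath
import math


def phi(m_l, angle):
    """Particular solution of (7-15): Phi(angle) = exp(i * m_l * angle)."""
    return cmath.exp(1j * m_l * angle)


def is_single_valued(m_l, tol=1e-9):
    """True if Phi has the same value at angle 0 and at angle 2*pi."""
    return abs(phi(m_l, 0.0) - phi(m_l, 2.0 * math.pi)) < tol


# Integer values satisfy the requirement; half-integer values do not.
for m_l in (-2, -1, 0, 1, 2, 0.5, 1.5):
    print(m_l, is_single_valued(m_l))
```

Only the integer trial values pass the check, in agreement with (7-18); the half-integer values return a function that changes sign after one full turn.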
The quantum number $m_l$ is used as a subscript to identify the specific form of an acceptable solution.

In solving (7-16) for the functions $\Theta(\theta)$, the procedure is similar to that used in Appendix I to obtain analytical solutions of the time-independent Schroedinger equation for the simple harmonic oscillator potential. Interested students are referred to Appendix N, which goes through this quite lengthy procedure. Here we shall only quote the results. It is found that solutions to (7-16) which are acceptable (remain finite) are obtained only if the constant $l$ is equal to one of the integers

$$l = |m_l|,\ |m_l| + 1,\ |m_l| + 2,\ |m_l| + 3, \ldots \tag{7-20}$$

The acceptable solutions can be written

$$\Theta_{lm_l}(\theta) = \sin^{|m_l|}\theta\; F_{l|m_l|}(\cos\theta) \tag{7-21}$$

The $F_{l|m_l|}(\cos\theta)$ are polynomials in $\cos\theta$, which have forms that depend on the value of the quantum number $l$ and on the absolute value of the quantum number $m_l$. Thus it is necessary to use both of these quantum numbers to identify the functions $\Theta_{lm_l}(\theta)$ that are acceptable solutions to the equation. Examples of these functions will be presented in Section 7-6.

The procedure used in the solution of (7-17) for the functions $R(r)$, which is also similar to that used for the simple harmonic oscillator potential, is also carried out in Appendix N. It is found that there are bound-state solutions which are acceptable (remain finite) only if the constant E (the total energy) has one of the values $E_n$, where

$$E_n = -\frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2\, 2\hbar^2 n^2} \tag{7-22}$$

In this expression the quantum number n is one of the integers

$$n = l + 1,\ l + 2,\ l + 3, \ldots \tag{7-23}$$

The acceptable solutions are most conveniently written as

$$R_{nl}(r) = e^{-Zr/na_0}\left(\frac{Zr}{a_0}\right)^l G_{nl}\!\left(\frac{Zr}{a_0}\right) \tag{7-24}$$

where the parameter $a_0$ is

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{\mu e^2} \tag{7-25}$$

The $G_{nl}(Zr/a_0)$ are polynomials in $Zr/a_0$, with different forms for different values of n and $l$. Thus both of these quantum numbers are required to identify the different functions $R_{nl}(r)$ that are acceptable solutions to the equation.
But the allowed values $E_n$ of the total energy carry only the quantum number n as a label since they depend only on the value of that quantum number. Examples of the functions $R_{nl}(r)$ will be presented in Section 7-6.

7-5 EIGENVALUES, QUANTUM NUMBERS, AND DEGENERACY

One of the important results of the Schroedinger theory of the one-electron atom is the prediction of (7-22) for the allowed values of total energy of the bound states of the atom. Comparing this prediction for the eigenvalues

$$E_n = -\frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2\, 2\hbar^2 n^2} = -\frac{13.6\ \text{eV}}{n^2}$$

(where the numerical form applies to hydrogen, Z = 1) with the predictions of the Bohr model (see (4-18)), we find that identical allowed energies are predicted by these treatments. Both predictions are in excellent agreement with experiment. Schroedinger's derivation of (7-22) provided the first convincing verification of his theory of quantum mechanics.

Figure 7-3 illustrates the Coulomb potential $V(r)$ for the one-electron atom, and its eigenvalues $E_n$. What is the relation between the Coulomb potential and its eigenvalues, and the potentials studied in Chapter 6 and their eigenvalues? One obvious difference is that the quantum mechanical calculations leading to the eigenvalues of the Coulomb potential are appreciably more complicated. But the Coulomb potential is an exact description of a real three-dimensional system. The potentials previously treated are approximate descriptions of idealized one-dimensional systems, which are designed to simplify the calculations. Part of the complication for the Coulomb potential is also due to its spherical symmetry, which forces the use of spherical polar coordinates instead of rectangular coordinates.

The similarities are much more fundamental than the differences. For the Coulomb potential, as for any other binding potential, the allowed total energies of a particle bound to the potential are discretely quantized. Figure 7-4 makes a comparison between the allowed energies for a Coulomb potential and for several one-dimensional binding potentials. In this figure the Coulomb potential is represented on a crosscut along a diameter through the one-electron atom. Note that all the binding potentials have a zero-point energy. That is, in all cases the lowest allowed value of total energy lies above the minimum value of the potential energy. Associated with its zero-point energy, the one-electron atom has a zero-point motion like other systems described by binding potentials. In the following section we shall see that this phenomenon can give us a basic explanation of the stability of the ground state of the atom.

Figure 7-3 The Coulomb potential $V(r)$ and its eigenvalues $E_n$. For large values of n the eigenvalues become very closely spaced in energy since $E_n$ approaches zero as n approaches infinity. Note that the intersection of $V(r)$ and $E_n$, which defines the location of one end of the classically allowed region, moves out as n increases. Not shown in this figure is the continuum of eigenvalues at positive energies corresponding to unbound states.

Figure 7-4 A comparison between the allowed energies of several binding potentials (panels: simple harmonic oscillator, finite square well, Coulomb). The three-dimensional Coulomb potential is shown in a cross-sectional view along a diameter; the other potentials are one-dimensional.

Although the eigenvalues of the one-electron atom depend on only the quantum number n, the eigenfunctions depend on all three quantum numbers $n$, $l$, $m_l$ since they are products of the three functions $R_{nl}(r)$, $\Theta_{lm_l}(\theta)$, and $\Phi_{m_l}(\varphi)$. The fact that three quantum numbers arise is a consequence of the fact that the time-independent Schroedinger equation contains three independent variables, one for each space coordinate. Gathering together the conditions which the quantum numbers satisfy, we have

$$\begin{aligned}
|m_l| &= 0, 1, 2, 3, \ldots \\
l &= |m_l|,\ |m_l| + 1,\ |m_l| + 2,\ |m_l| + 3, \ldots \\
n &= l + 1,\ l + 2,\ l + 3, \ldots
\end{aligned} \tag{7-26}$$

These conditions are more conveniently expressed as

$$\begin{aligned}
n &= 1, 2, 3, \ldots \\
l &= 0, 1, 2, \ldots, n - 1 \\
m_l &= -l,\ -l + 1, \ldots, 0, \ldots, +l - 1,\ +l
\end{aligned} \tag{7-27}$$

Example 7-1. Show that the conditions of (7-27) are equivalent to those of (7-26).

According to (7-26) the minimum value of $l$ is equal to $|m_l|$ and the minimum value of $|m_l|$ is 0. Thus the minimum value of $l$ is 0 and the minimum value of n, which is equal to $l + 1$, is 0 + 1 = 1. Since n increases by integers without limit, the possible values of n are n = 1, 2, 3, .... For a given n, the maximum value of $l$ is the one satisfying the relation $n = l + 1$, that is, $l = n - 1$. Consequently the possible values of $l$ are $l = 0, 1, 2, \ldots, n - 1$. Finally, for a given $l$, the largest value which $|m_l|$ can assume is $|m_l| = l$. Thus the maximum value of $m_l$ is $+l$ and the minimum value is $-l$, and it can assume only the values $m_l = -l, -l + 1, \ldots, 0, \ldots, +l - 1, +l$.

Because of its role in specifying the total energy of the atom, n is sometimes called the principal quantum number. Because the azimuthal, or orbital, angular momentum of the atom depends on $l$, as we shall soon see, $l$ is sometimes called the azimuthal quantum number. We shall also see that if the atom is in an external magnetic field there is a dependence of its energy on $m_l$. Consequently, $m_l$ is sometimes called the magnetic quantum number.

The conditions of (7-27) make it apparent that for a given value of n there are generally several different possible values of $l$ and $m_l$. Since the form of the eigenfunctions depends on all three quantum numbers, it is apparent that there will be situations in which two or more completely different eigenfunctions correspond to exactly the same eigenvalue $E_n$. As the eigenfunctions describe the behavior of the atom, we see that it has states with completely different behavior that nevertheless have the same total energy.
In physics the word used to characterize this phenomenon is degeneracy, and eigenfunctions corresponding to the same eigenvalue are said to be degenerate. There is little relation to the common usage of the word; degenerate eigenfunctions are not at all reprehensible!

Table 7-1 Possible Values of $l$ and $m_l$ for n = 1, 2, 3

n                                                  1     2                 3
l                                                  0     0    1            0    1            2
m_l                                                0     0    -1, 0, +1    0    -1, 0, +1    -2, -1, 0, +1, +2
Number of degenerate eigenfunctions for each l     1     1    3            1    3            5
Number of degenerate eigenfunctions for each n     1     4                 9

Degeneracy also occurs in classical mechanics and in the related old quantum theory. In the discussion of elliptical orbits of the Bohr-Sommerfeld atom in Section 4-10, we indicated that the total energy of the atom is independent of the semiminor axis of the ellipse. Thus the atom has states with very different behavior, that is, with the electron traveling in very different orbits, which nevertheless have the same total energy. Exactly the same phenomenon occurs in planetary motion. This classical degeneracy is comparable to the $l$ degeneracy that arises in the quantum mechanical one-electron atom. The energy of a Bohr-Sommerfeld atom, or of a planetary system, is also independent of the orientation in space of the plane of the orbit. This is comparable to the $m_l$ degeneracy of the quantum mechanical atom.

In either classical or quantum mechanics, degeneracy is a result of certain properties of the potential energy function that describes the system. In the quantum mechanical one-electron atom, the degeneracy with respect to $m_l$ arises because the potential depends only on the coordinate r, so the potential is spherically symmetrical and the total energy of the atom is independent of its orientation in space. The $l$ degeneracy is a consequence of the particular form of the r dependence of the Coulomb potential.
If an external magnetic field is applied to the atom, then its total energy will depend on its orientation in space because of an interaction between currents in the atom and the applied field. We shall study this later, and we shall find that the orientation in space is determined by the quantum number $m_l$. Thus in an external magnetic field the degeneracy with respect to $m_l$ is removed and the atom has different energy levels for different $m_l$ values. If the external magnetic field is gradually reduced in intensity, the dependence of the total energy of the atom on $m_l$ is reduced in proportion. When the field is reduced to zero the energy levels that correspond to different values of $m_l$ degenerate into a single energy level, and the corresponding eigenfunctions become degenerate.

Many properties of alkali atoms can be discussed in terms of the motion of a single "valence" electron in a potential which is spherically symmetrical, but which does not have the 1/r behavior of the Coulomb potential. The energy of this electron does depend on $l$. Thus the degeneracy with respect to $l$ is removed if the form of the r dependence of the potential is changed. We shall study this phenomenon on a number of occasions later in this book, and in the process more insight into the origin of the $l$ degeneracy of the Coulomb potential will be obtained.

From (7-27) it is easy to see how many degenerate eigenfunctions there are, for an isolated one-electron atom, which correspond to a particular eigenvalue $E_n$. The possible values of the quantum numbers for n = 1, 2, and 3 are shown in Table 7-1. Inspection of this table makes it apparent that:

1. For each value of n, there are n possible values of $l$.
2. For each value of $l$, there are $(2l + 1)$ possible values of $m_l$.
3. For each value of n, there are a total of $n^2$ degenerate eigenfunctions.
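The counting rules above can be reproduced by simply enumerating the quantum numbers allowed by (7-27). The following sketch uses hypothetical helper names (`allowed_states`, `energy_ev`) and the rounded hydrogen value of $-13.6$ eV. Scaling the energy by $Z^2$ follows (7-22) but ignores the small change in reduced mass from one nucleus to another, which is an assumption of the sketch.

```python
def allowed_states(n):
    """List the (l, m_l) pairs permitted by (7-27) for a given n."""
    return [(l, m_l) for l in range(n) for m_l in range(-l, l + 1)]


def energy_ev(n, Z=1, ground_state_ev=-13.6):
    """Bound-state energy in eV from (7-22), using the rounded value -13.6 eV."""
    return ground_state_ev * Z**2 / n**2


for n in (1, 2, 3):
    states = allowed_states(n)
    print(f"n = {n}: E_n = {energy_ev(n):.2f} eV, "
          f"{len(states)} degenerate eigenfunctions")
```

For n = 1, 2, 3 the enumeration gives 1, 4, and 9 states, matching the last row of Table 7-1 and the $n^2$ rule, while the energy depends on n alone.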
7-6 EIGENFUNCTIONS

The mathematical techniques used in quantum mechanics to obtain (7-22) for the eigenvalues of the one-electron atom are, admittedly, quite complicated compared to those used in the Bohr model to obtain the same equation. Putting aside questions concerning the logical consistency of the postulates of the Bohr model, it is still reasonable to question whether all the extra work involved in the quantum mechanical treatment of the one-electron atom is justified by the results obtained. The answer is, overwhelmingly, yes! We can now find out much more about the one-electron atom than we possibly could from the Bohr model, because we have the eigenfunctions as well as the eigenvalues. The eigenfunctions contain a wealth of additional information about the properties of the atom. The remainder of this chapter, and the following chapter, will be devoted largely to studying the eigenfunctions and extracting this information from them.

We know that the eigenfunctions are formed by taking the product

$$\psi_{nlm_l}(r,\theta,\varphi) = R_{nl}(r)\Theta_{lm_l}(\theta)\Phi_{m_l}(\varphi)$$

We also know, from (7-19), (7-21), and (7-24) that for any bound state

$$\Phi_{m_l}(\varphi) = e^{im_l\varphi}$$

$$\Theta_{lm_l}(\theta) = \sin^{|m_l|}\theta \times (\text{polynomial in } \cos\theta)$$

and

$$R_{nl}(r) = e^{-(\text{constant})r/n}\, r^l \times (\text{polynomial in } r)$$

All the eigenfunctions have basically the same mathematical structure, except that with increasing values of n and $l$ the polynomials in r and $\cos\theta$ become increasingly more complicated.

Table 7-2 lists the one-electron atom eigenfunctions for the first three values of n. They are expressed in terms of the parameter

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{\mu e^2} = 0.529 \times 10^{-10}\ \text{m} = 0.529\ \text{Å}$$

which is the radius (or, from Section 4-7, the electron-nucleus separation) of the smallest orbit of a Bohr hydrogen atom. The multiplicative constant in front of each eigenfunction has been adjusted so that it is normalized.
In other words, the integral over all space of the corresponding probability density functions equals one, so that in each quantum state there is probability one of finding the atomic electron somewhere.

Table 7-2 Some Eigenfunctions for the One-Electron Atom

Quantum numbers
n  l  m_l   Eigenfunction
1  0  0     $\psi_{100} = \frac{1}{\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2} e^{-Zr/a_0}$
2  0  0     $\psi_{200} = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\left(2 - \frac{Zr}{a_0}\right) e^{-Zr/2a_0}$
2  1  0     $\psi_{210} = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Zr}{a_0}\, e^{-Zr/2a_0}\cos\theta$
2  1  ±1    $\psi_{21\pm1} = \frac{1}{8\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Zr}{a_0}\, e^{-Zr/2a_0}\sin\theta\, e^{\pm i\varphi}$
3  0  0     $\psi_{300} = \frac{1}{81\sqrt{3\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\left(27 - 18\frac{Zr}{a_0} + 2\frac{Z^2r^2}{a_0^2}\right) e^{-Zr/3a_0}$
3  1  0     $\psi_{310} = \frac{\sqrt{2}}{81\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\left(6 - \frac{Zr}{a_0}\right)\frac{Zr}{a_0}\, e^{-Zr/3a_0}\cos\theta$
3  1  ±1    $\psi_{31\pm1} = \frac{1}{81\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\left(6 - \frac{Zr}{a_0}\right)\frac{Zr}{a_0}\, e^{-Zr/3a_0}\sin\theta\, e^{\pm i\varphi}$
3  2  0     $\psi_{320} = \frac{1}{81\sqrt{6\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Z^2r^2}{a_0^2}\, e^{-Zr/3a_0}\left(3\cos^2\theta - 1\right)$
3  2  ±1    $\psi_{32\pm1} = \frac{1}{81\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Z^2r^2}{a_0^2}\, e^{-Zr/3a_0}\sin\theta\cos\theta\, e^{\pm i\varphi}$
3  2  ±2    $\psi_{32\pm2} = \frac{1}{162\sqrt{\pi}}\left(\frac{Z}{a_0}\right)^{3/2}\frac{Z^2r^2}{a_0^2}\, e^{-Zr/3a_0}\sin^2\theta\, e^{\pm 2i\varphi}$

Example 7-2. Verify that the eigenfunction $\psi_{211}$, and the associated eigenvalue $E_2$, satisfy the time-independent Schroedinger equation, (7-12), for the one-electron atom with Z = 1.

Since the differential equation is linear in $\psi$, for the purposes of this verification we can ignore completely the multiplicative constant $1/8\pi^{1/2}a_0^{5/2}$, and write the eigenfunction as

$$\psi = re^{-r/2a_0}\sin\theta\, e^{i\varphi}$$

This is the simplest case with a nontrivial dependence on all three coordinates. Nevertheless, the verification of this case should give the student some confidence in the validity of all the eigenfunctions quoted in Table 7-2. Before beginning, let us introduce the convenient notation

$$\psi = f(r,\varphi)\sin\theta = f\sin\theta$$

and

$$\psi = g(\theta,\varphi)\,re^{-r/2a_0} = g\,re^{-r/2a_0}$$

This notation will be useful in evaluating the derivatives that enter in (7-12), which is

$$-\frac{\hbar^2}{2\mu}\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] + V\psi = E\psi$$

First we calculate

$$\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) = \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\, f\cos\theta\right) = \frac{f}{r^2\sin\theta}\left(\cos^2\theta - \sin^2\theta\right)$$

Next we calculate

$$\frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} = \frac{1}{r^2\sin^2\theta}(i)^2\psi = -\frac{\psi}{r^2\sin^2\theta} = -\frac{f\sin\theta}{r^2\sin^2\theta} = -\frac{f}{r^2\sin\theta}$$

Adding these two results, we obtain

$$\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2} = \frac{f}{r^2\sin\theta}\left(\cos^2\theta - \sin^2\theta - 1\right) = -\frac{2f\sin^2\theta}{r^2\sin\theta} = -\frac{2f\sin\theta}{r^2} = -\frac{2\psi}{r^2}$$

Then we calculate

$$\frac{\partial\psi}{\partial r} = g\left(e^{-r/2a_0} - \frac{r}{2a_0}e^{-r/2a_0}\right)$$

$$r^2\frac{\partial\psi}{\partial r} = g\left(r^2e^{-r/2a_0} - \frac{r^3}{2a_0}e^{-r/2a_0}\right)$$

$$\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) = g\left(2re^{-r/2a_0} - \frac{r^2}{2a_0}e^{-r/2a_0} - \frac{3r^2}{2a_0}e^{-r/2a_0} + \frac{r^3}{4a_0^2}e^{-r/2a_0}\right) = 2g\,re^{-r/2a_0}\left(1 - \frac{r}{a_0} + \frac{r^2}{8a_0^2}\right)$$

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) = \frac{2}{r^2}\left(1 - \frac{r}{a_0} + \frac{r^2}{8a_0^2}\right)\psi$$

Substituting this term, and the term coming from the $\theta$ and $\varphi$ derivatives, into the differential equation that is supposed to be satisfied, we obtain

$$-\frac{\hbar^2}{2\mu}\left[\frac{2}{r^2}\left(1 - \frac{r}{a_0} + \frac{r^2}{8a_0^2}\right) - \frac{2}{r^2}\right]\psi + V\psi = E\psi$$

or

$$\frac{\hbar^2}{\mu}\left(\frac{1}{ra_0} - \frac{1}{8a_0^2}\right) + V = E$$

Now

$$E = E_2 = -\frac{\mu e^4}{8(4\pi\epsilon_0)^2\hbar^2}$$

Also

$$V = -\frac{e^2}{4\pi\epsilon_0 r}$$

and

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{\mu e^2}$$

So we have

$$\frac{\hbar^2}{\mu}\left(\frac{\mu e^2}{4\pi\epsilon_0\hbar^2 r} - \frac{\mu^2e^4}{8(4\pi\epsilon_0)^2\hbar^4}\right) - \frac{e^2}{4\pi\epsilon_0 r} = -\frac{\mu e^4}{8(4\pi\epsilon_0)^2\hbar^2}$$

Since inspection demonstrates that this equation is satisfied identically, we have completed the verification.

7-7 PROBABILITY DENSITIES

We begin to extract information from the one-electron atom eigenfunctions by studying the forms of the corresponding probability density functions

$$\Psi^*\Psi = \psi^*_{nlm_l}e^{iE_nt/\hbar}\,\psi_{nlm_l}e^{-iE_nt/\hbar} = \psi^*_{nlm_l}\psi_{nlm_l} = R^*_{nl}\Theta^*_{lm_l}\Phi^*_{m_l}R_{nl}\Theta_{lm_l}\Phi_{m_l}$$

As these are functions of three coordinates, we cannot directly plot them in two dimensions. Nevertheless, we can study their three-dimensional behavior by considering separately their dependence on each coordinate. We treat first the r dependence in terms of the radial probability density $P(r)$, defined so that $P(r)\,dr$ is the probability of finding the electron at any location with radial coordinate between r and r + dr. By integrating the probability density $\Psi^*\Psi$, which is a probability per unit volume, over the volume enclosed between spheres of radii r and r + dr, it is easy to show that

$$P_{nl}(r)\,dr = R^*_{nl}(r)R_{nl}(r)\,4\pi r^2\,dr \tag{7-28}$$

The factor of $4\pi r^2$ is present on the right side because the volume enclosed between the spheres is $4\pi r^2\,dr$. The use of the quantum numbers n and $l$ as labels to specify the form of a particular radial probability density function is obviously appropriate, but the form of these functions does not depend on the quantum number $m_l$. Figure 7-5 plots several $P_{nl}(r)$, using dimensionless quantities for each axis.

Figure 7-5 The radial probability density for the electron in a one-electron atom for n = 1, 2, 3 and the values of $l$ shown. (Abscissa: r in units of $a_0/Z$.) The triangle on each abscissa indicates the value of $\bar{r}_{nl}$ as given by (7-29). For n = 2 the plots are redrawn with abscissa and ordinate scales expanded by a factor of 10 to show the behavior of $P_{nl}(r)$ near the origin. Note that in the three cases for which $l = l_{\max} = n - 1$ the maximum of $P_{nl}(r)$ occurs at $r_{\text{Bohr}} = n^2a_0/Z$, which is indicated by the location of the dashed line.

Inspection of the figure shows that the radial probability densities, for each set of the pertinent quantum numbers, have appreciable values only in reasonably restricted ranges of the radial coordinate. Thus, when the atom is in one of its quantum states, specified by a particular set of its quantum numbers, there is a high probability that the radial coordinate of the electron will be found within a reasonably restricted range. The electron would quite probably be found within a certain so-called shell contained within two concentric spheres centered on the nucleus. A study of the figure will demonstrate that the characteristic radii of these shells are determined primarily by the quantum number n, although there is a small $l$ dependence.
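The statement that the electron is found in a restricted range of r can be made concrete for the ground state by integrating (7-28) numerically. This is a sketch under stated assumptions: Z = 1, lengths measured in units of $a_0$, the r-dependent factor of $\psi_{100}$ from Table 7-2 used as $R_{10}$, and a simple trapezoidal rule chosen for the integration. The function names are introduced here for illustration only.

```python
import math


def psi_100(r):
    """Ground-state eigenfunction from Table 7-2 with Z = 1, r in units of a0."""
    return math.exp(-r) / math.sqrt(math.pi)


def radial_density(r):
    """P_10(r) from (7-28): |psi_100|^2 times the shell factor 4*pi*r^2."""
    return psi_100(r) ** 2 * 4.0 * math.pi * r**2


def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


# The upper limit 40 a0 stands in for infinity; the density is negligible there.
total_probability = integrate(radial_density, 0.0, 40.0)
inside_3a0 = integrate(radial_density, 0.0, 3.0)

print(f"total probability    = {total_probability:.4f}")
print(f"probability r < 3 a0 = {inside_3a0:.4f}")
```

The total comes out equal to one within the accuracy of the integration, confirming the normalization, and roughly 94 percent of the probability lies inside a sphere of radius $3a_0$.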
This property can be seen in a more quantitative way by using the expectation value of the radial coordinate of the electron to characterize the radius of the shell. An obvious extension of the arguments of Section 5-4 to three dimensions shows that the expectation value is given by the expression

$$\bar{r}_{nl} = \int_0^\infty rP_{nl}(r)\,dr$$

If the integral is evaluated, this yields

$$\bar{r}_{nl} = \frac{n^2a_0}{Z}\left\{1 + \frac{1}{2}\left[1 - \frac{l(l + 1)}{n^2}\right]\right\} \tag{7-29}$$

The values of $\bar{r}_{nl}$ are indicated in Figure 7-5 with small triangles. It is apparent that $\bar{r}_{nl}$ depends primarily on n, since the $l$ dependence is suppressed by the factor of 1/2 and the factor of $1/n^2$ in (7-29). An interesting comparison can be made between (7-29) and (4-16)

$$r_{\text{Bohr}} = \frac{n^2a_0}{Z}$$

which gives the radii of the circular orbits of a Bohr atom (more precisely, it gives the electron-nucleus separation; see Section 4-7). Quantum mechanics shows that the radii of the shells are of approximately the same size as the radii of the circular Bohr orbits. These radii increase rapidly with increasing n. The basic reason is that the total energy $E_n$ of the atom becomes more positive with increasing n, so the region of the coordinate r for which $E_n$ is greater than $V(r)$ expands with increasing n, as can be seen in Figure 7-3. That is, the shells expand with increasing n because the classically allowed regions expand.

Example 7-3. (a) Calculate the location at which the radial probability density is a maximum for the ground state of the hydrogen atom. (b) Next calculate the expectation value for the radial coordinate in this state. (c) Then interpret these results in terms of the results of measurements of the location of the electron in the atom.
(a) The radial probability density for the $n = 1$, $l = 0$ ground state is

$$P_{10}(r) = R^*_{10}(r)R_{10}(r)\,4\pi r^2$$

We take $R_{10}(r)$ from the $r$-dependent factor of the first eigenfunction listed in Table 7-2, with $Z = 1$, and obtain

$$P_{10}(r) = e^{-r/a_0}e^{-r/a_0}r^2 = e^{-2r/a_0}r^2$$

We have ignored normalization (i.e., for simplicity taken the multiplicative constant equal to one) since it has no effect on what we are about to do. This is to find the maximum in $P_{10}(r)$ by evaluating its derivative with respect to $r$ and setting the result equal to zero. That is

$$\frac{dP_{10}(r)}{dr} = -\frac{2}{a_0}e^{-2r/a_0}r^2 + e^{-2r/a_0}\,2r = \left(-\frac{r}{a_0} + 1\right)e^{-2r/a_0}\,2r = 0$$

The solution to the equation we have obtained is

$$-\frac{r}{a_0} + 1 = 0 \qquad r = a_0$$

This is the location of the maximum in the radial probability density. (The other root, $r = 0$, is the minimum of $P_{10}(r)$ at the origin.)

(b) To calculate the expectation value of the radial coordinate $r$, we evaluate (7-29), with $n = 1$, $l = 0$, and $Z = 1$. We obtain

$$\overline{r_{10}} = a_0\{1 + (1/2)[1]\} = 1.5a_0$$

(c) We have found that the expectation value of $r$ is somewhat larger than the value of $r$ at which the radial probability density is a maximum. The reason is that the radial probability density is asymmetrical about its maximum in such a way that there is a small, but not negligible, probability of finding fairly large values of $r$ in measurements of the location of the electron in the atom. So, although the most likely location of the electron is at $r = a_0$ (i.e., at the ground state Bohr electron-nucleus separation), the average value obtained in measurements of the location is $r = 1.5a_0$. All these features can be seen by inspecting the top curve of Figure 7-5.

Example 7-4. In its ground state, the size of the hydrogen atom can be taken to be the radius of the $n = 1$ shell for $Z = 1$, which is essentially $a_0 = 4\pi\epsilon_0\hbar^2/\mu e^2 \simeq 0.5$ Å. Show that this fundamental atomic dimension can be obtained directly from consideration of the uncertainty principle.
The form of the potential function

$$V(r) = -\frac{e^2}{4\pi\epsilon_0 r}$$

tends to cause the atom to collapse since the smaller the distance from the electron to the nucleus the more negative is the potential energy. This tendency is opposed by the effect of the uncertainty principle, as follows. If the electron is located within a region of size $R$, then any component of its linear momentum must have an uncertainty of approximately

$$\Delta p \simeq \frac{\hbar}{R}$$

This uncertainty reflects the fact that the linear momentum of magnitude $p$ can be in any direction, so the components can have values ranging from $-p$ to $+p$. Thus the uncertainty in any component of the linear momentum also satisfies approximately the relation

$$\Delta p \simeq p$$

Therefore, the electron must have a kinetic energy approximately equal to

$$K = \frac{p^2}{2\mu} \simeq \frac{(\Delta p)^2}{2\mu} \simeq \frac{\hbar^2}{2\mu R^2}$$

We see that the kinetic energy becomes more positive with decreasing $R$, which opposes the effect of the potential energy to cause collapse. If the size of the atom is $R$, its potential energy is approximately

$$V \simeq -\frac{e^2}{4\pi\epsilon_0 R}$$

Then the total energy of the atom is approximately

$$E = K + V \simeq \frac{\hbar^2}{2\mu R^2} - \frac{e^2}{4\pi\epsilon_0 R}$$

Obeying the common tendency of all physical systems to be as stable as possible, the atom will adjust its size so as to minimize its total energy. The existence of an optimum size can be seen qualitatively by inspecting Figure 7-6, which plots $K$, $V$, and $E$ as functions of $R$. (Note that $R$ is not the radial coordinate; it is the size of the atom, which we are treating as a variable in order to determine its optimum value.) We can find the most energetically favorable size quantitatively by differentiating $E$ with respect to $R$, and setting the derivative equal to zero. That is

$$\frac{dE}{dR} = -\frac{2\hbar^2}{2\mu R^3} + \frac{e^2}{4\pi\epsilon_0 R^2} = 0$$

Solving this equation for $R$, we find

$$R = \frac{4\pi\epsilon_0\hbar^2}{\mu e^2} = a_0$$

the size which gives minimum total energy, and therefore the most stable atom. The uncertainty principle governs the minimum size of the atom because it governs its minimum energy. This is the zero-point energy of the ground state, which has a size that arises from its zero-point motion. These simple ideas provide a very satisfactory answer to the question of the stability of the ground state of the atom. And this is particularly so if we also consider the discussion following Example 5-13, which shows that in its ground state the atom does not radiate.

Figure 7-6 The qualitative behavior of the kinetic energy $K$, potential energy $V$, and total energy $E = K + V$ of a hydrogen atom, as functions of the size $R$ of the atom. For small $R$, $K$ increases more rapidly than $V$ decreases because $K \propto 1/R^2$ while $V \propto -1/R$. For large $R$, $K$ becomes negligible compared to $V$. As a result, $E$ has a minimum at a certain value of $R$ (indicated by the mark on the $R$ axis), and at this size the atom is most stable.

Figure 7-5 shows that the details of the structure of the radial probability density functions do depend on the value of the quantum number $l$. For a given $n$, the function has a single strong maximum when $l$ takes on its largest possible value; but additional weaker maxima develop inside the strong one when $l$ takes on smaller values. Generally, these weaker maxima are not so important. However, there is a related property that can be very important. Inspection of the figure, particularly the expanded plots for $n = 2$, $l = 0$, and $n = 2$, $l = 1$, will demonstrate that the radial probability density functions have appreciable values near the origin at $r = 0$ only for $l = 0$. This means that only for $l = 0$ will there be an appreciable probability of finding the electron near the nucleus. Another way of seeing this property is to consider the probability density, $\psi^*\psi$, itself.
Inspection of the eigenfunctions listed in Table 7-2 will show that for values of $r$ which are small compared to $a_0/Z$, where the exponential term is slowly varying, the radial dependence of all the eigenfunctions has the behavior

$$\psi \propto r^l \qquad r \to 0 \tag{7-30}$$

This behavior can easily be verified by direct substitution into (7-17), the equation that determines the radial dependence of the $\psi$. As a consequence, the radial dependence of the probability densities for small $r$ is

$$\psi^*\psi \propto r^{2l} \qquad r \to 0 \tag{7-31}$$

From this it follows that the value of $\psi^*\psi$ in a small volume near $r = 0$ is relatively large only for $l = 0$, and decreases very rapidly with increasing $l$. The reason is that $r^0 \gg r^2 \gg r^4 \gg \cdots$, for $r \to 0$. We see that there is some probability that the electron will be near the nucleus if $l = 0$, but very much less probability that this will happen if $l = 1$, and even less if $l = 2$, etc. This can have important effects in certain circumstances because the potential energy of the atom becomes very large in magnitude if the electron is near the nucleus. We shall see later that this is particularly true for the case of multielectron atoms, which have essentially the same property. In fact the $r^l$ behavior of the eigenfunctions for small $r$ is of predominant importance in the structure of multielectron atoms. We shall also see later that the $r^l$ behavior is due physically to the angular momentum of the atom, which depends on $l$.

Now let us proceed to the study of the angular dependence of the probability density functions

$$\psi^*_{nlm_l}\psi_{nlm_l} = R^*_{nl}R_{nl}\,\Theta^*_{lm_l}\Theta_{lm_l}\,\Phi^*_{m_l}\Phi_{m_l}$$

From (7-19) we have

$$\Phi^*_{m_l}(\varphi)\Phi_{m_l}(\varphi) = e^{-im_l\varphi}e^{im_l\varphi} = 1$$

Thus the probability density does not depend on the coordinate $\varphi$. The three-dimensional behavior of $\psi^*_{nlm_l}\psi_{nlm_l}$ is therefore completely specified by the product of the quantity $R^*_{nl}(r)R_{nl}(r) = P_{nl}(r)/4\pi r^2$ and the quantity $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$, which plays the role of a directionally dependent modulation factor. The form of the factor $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$ is conveniently presented in terms of polar diagrams, of which one is shown in Figure 7-7. The origin of the diagram is at the point $r = 0$ (the nucleus), and the $z$ axis is taken along the direction from which the angle $\theta$ is measured. The distance from the origin to the curve, measured at the angle $\theta$, is equal to the value of $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$ for that angle. Such a diagram can also be thought of as representing the complete directional dependence of $\psi^*_{nlm_l}\psi_{nlm_l}$ by visualizing the three-dimensional surface obtained by rotating the diagram about the $z$ axis through the 360° range of the angle $\varphi$. The distance, measured in the direction specified by the angles $\theta$ and $\varphi$, from the origin to a point on the surface, is equal to $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)\Phi^*_{m_l}(\varphi)\Phi_{m_l}(\varphi)$ for those values of $\theta$ and $\varphi$.

Figure 7-7 A polar diagram of the factor which determines the directional dependence of the one-electron atom probability density.

Figure 7-8 Polar diagrams of the directional dependence of the one-electron atom probability densities for $l = 3$; $m_l = 0, \pm 1, \pm 2, \pm 3$.

In Figure 7-8 we illustrate an example of the dependence of the form of $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$ on the quantum number $m_l$, by a set of polar diagrams for $l = 3$, and the seven possible values of $m_l$ for this value of $l$, i.e., for $m_l = -3, -2, -1, 0, 1, 2, 3$. Note the way in which the region of concentration of $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$, and therefore of $\psi^*_{nlm_l}\psi_{nlm_l}$, shifts from the $z$ axis to the plane perpendicular to the $z$ axis as the absolute value of $m_l$ increases. Some features of the dependence of $\Theta^*_{lm_l}(\theta)\Theta_{lm_l}(\theta)$ on the quantum number $l$ are indicated in Figure 7-9 in terms of a set of polar diagrams for $m_l = \pm l$ and $l = 0, 1, 2, 3, 4$.
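The trend that the polar diagrams for $m_l = \pm l$ display can also be estimated numerically. The sketch below is an illustration, not part of the original text. It assumes the standard result that for $m_l = \pm l$ the factor $\Theta^*_{lm_l}\Theta_{lm_l}$ is proportional to $\sin^{2l}\theta$, and it computes the fraction of the angular probability lying within 15° of the plane perpendicular to the $z$ axis. The function names and the 15° window are arbitrary choices made for the illustration.

```python
import math

def polar_factor(l, theta):
    """Theta* Theta for m_l = +l or -l, up to normalization: sin^(2l)(theta)."""
    return math.sin(theta) ** (2 * l)

def fraction_near_plane(l, half_width_deg=15.0, steps=20000):
    """Fraction of the angular probability within half_width_deg of theta = 90 degrees."""
    lo = math.radians(90.0 - half_width_deg)
    hi = math.radians(90.0 + half_width_deg)
    d_theta = math.pi / steps
    total = 0.0
    near = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta  # midpoint rule
        # sin(theta) d_theta is the solid-angle weight after integrating over phi
        weight = polar_factor(l, theta) * math.sin(theta) * d_theta
        total += weight
        if lo <= theta <= hi:
            near += weight
    return near / total

for l in range(5):
    print(l, round(fraction_near_plane(l), 3))
```

For $l = 0$ the fraction is just the share of the sphere's solid angle inside the window. The printed values grow with $l$, which is the concentration toward the equatorial plane that Figure 7-9 shows.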
In the case $n = 1$, $l = m_l = 0$, which is the ground state of the atom, $\psi^*_{nlm_l}\psi_{nlm_l}$ depends on neither $\theta$ nor $\varphi$ and the probability density is spherically symmetrical. For the other states, the concentration of probability density in the plane perpendicular to the $z$ axis, when $m_l = \pm l$, becomes more and more pronounced with increasing $l$. Figure 7-10 is an attempt to overcome the limitations of the two-dimensional printed page using shading to represent the three-dimensional appearance of the probability density functions for various states of the one-electron atom.

Figure 7-9 Polar diagrams of the directional dependence of the one-electron probability densities for $l = 0, 1, 2, 3, 4$; $m_l = \pm l$.

Figure 7-10 An artist's conception of the three-dimensional appearance of several one-electron atom probability density functions. For each of the drawings a line represents the $z$ axis. If all the probability densities for a given $n$ and $l$ are combined, the result is spherically symmetrical.

The probability density functions displayed in these figures generally have a set of spherical and conical surfaces, defined by certain values of $r$ and $\theta$, on which they equal zero. These nodal surfaces are analogous to the nodal points at which the probability density for a particle bound in a one-dimensional potential equals zero (see, for example, Figure 6-32). They are a consequence of the fact that the wave functions for a bound particle must be standing waves with fixed nodes. However, if a collection of hydrogen atoms has been completely isolated from its environment, it is not possible to then make measurements on the locations of the electron in each atom, knowing that they are all in a quantum state with a particular set of quantum numbers $n$, $l$, $m_l$, and thereby locate the nodal surfaces for that state.
If it could be done it would certainly be remarkable, because it would allow the determination of the direction of the $z$ axis. And this would amount to finding for each atom a preferred direction in a space which should be spherically symmetrical, because the Coulomb potential of the atom $V = -Ze^2/4\pi\epsilon_0 r$ is spherically symmetrical. In fact, it cannot be done because it is generally not possible to observe any of the probability density patterns of Figure 7-10 in actual measurements on free atoms (i.e., atoms in the complete absence of external magnetic or electric fields). The only exception is the spherically symmetrical state for $n = 1$, $l = m_l = 0$. The reason is that, with the exception of the state just mentioned, every state is degenerate with several other states of the same $n$ value. Because the energies of atoms in degenerate states are identical, it is not possible experimentally to separate them from each other with techniques that leave the probability density unchanged. Thus, all that can be measured is the average probability density of the atoms for the entire set of states which are degenerate with each other. It turns out that the probability density functions, when averaged together in this manner, always yield a spherically symmetrical function.

Example 7-5. Evaluate the average of the probability density functions for the set of degenerate states corresponding to the energy $E_2$.

We have

$$\frac{1}{4}\left[\psi^*_{200}\psi_{200} + \psi^*_{21-1}\psi_{21-1} + \psi^*_{210}\psi_{210} + \psi^*_{211}\psi_{211}\right] = \frac{1}{128\pi}\left(\frac{Z}{a_0}\right)^3 e^{-Zr/a_0}\left[\left(2 - \frac{Zr}{a_0}\right)^2 + \left(\frac{Zr}{a_0}\right)^2\left(\frac{1}{2}\sin^2\theta + \frac{1}{2}\sin^2\theta + \cos^2\theta\right)\right]$$

$$= \frac{1}{128\pi}\left(\frac{Z}{a_0}\right)^3 e^{-Zr/a_0}\left[\left(2 - \frac{Zr}{a_0}\right)^2 + \left(\frac{Zr}{a_0}\right)^2\right] \tag{7-32}$$

This spherically symmetrical distribution would be the result of a sequence of measurements on the locations of the electrons in one-electron atoms of total energy $E_2$.
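The cancellation of the angular dependence in (7-32) can be confirmed numerically. The following sketch is an illustration, not part of the original text. It assumes the standard hydrogen-like $n = 2$ eigenfunctions, which this section cites from Table 7-2 but does not reproduce, so the numerical coefficients below are an assumption of the sketch. Lengths are in units of $a_0$, and the function names are hypothetical.

```python
import math

def density_n2(l, m, r, theta, Z=1.0):
    """psi* psi for the n = 2 states, with r in units of a_0 (coefficients assumed)."""
    rho = Z * r
    common = Z ** 3 * math.exp(-rho)  # (Z/a_0)^3 exp(-Z r / a_0)
    if (l, m) == (0, 0):
        return common * (2.0 - rho) ** 2 / (32.0 * math.pi)
    if (l, m) == (1, 0):
        return common * rho ** 2 * math.cos(theta) ** 2 / (32.0 * math.pi)
    if l == 1 and abs(m) == 1:
        return common * rho ** 2 * math.sin(theta) ** 2 / (64.0 * math.pi)
    raise ValueError("not an n = 2 state")

def average_density_n2(r, theta, Z=1.0):
    """Average over the four degenerate n = 2 states, the left side of (7-32)."""
    states = [(0, 0), (1, -1), (1, 0), (1, 1)]
    return sum(density_n2(l, m, r, theta, Z) for l, m in states) / 4.0

def closed_form(r, Z=1.0):
    """The theta-independent right side of (7-32)."""
    rho = Z * r
    return Z ** 3 * math.exp(-rho) * ((2.0 - rho) ** 2 + rho ** 2) / (128.0 * math.pi)

# The average should be the same at every polar angle for a fixed radius.
for theta in (0.0, 0.7, math.pi / 2, 2.5):
    print(theta, average_density_n2(1.3, theta), closed_form(1.3))
```

The density does not depend on $\varphi$, so only $r$ and $\theta$ appear as arguments. The individual $l = 1$ terms vary strongly with $\theta$, but their sum does not.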
Of course, it cannot be used to determine the direction of the $z$ axis, and so there is no contradiction with the fact that this direction was initially chosen in a completely arbitrary way.

Note that even for each subset of states including all possible values of $m_l$ for a given $n$ and $l$ (a "subshell") the sum of the probability densities is spherically symmetrical. That is, $\psi^*_{200}\psi_{200}$ is spherically symmetrical, and also $\psi^*_{21-1}\psi_{21-1} + \psi^*_{210}\psi_{210} + \psi^*_{211}\psi_{211}$ is spherically symmetrical. This important property is illustrated in Figure 7-10. It will be used later in arguments concerning multielectron atoms, and nuclei.

On the other hand, consider a situation in which the orientation of the $z$ axis is not arbitrary because there is a preferred direction defined, for instance, by an external magnetic or electric field applied in that direction to the collection of hydrogen atoms. In such a field the quantum states are not degenerate, as we shall see later, and measurements of the probability density of atoms in a particular state can be performed. In fact, such measurements can be used to determine the direction of the external field.

To help the student understand the ideas just discussed, let us restate them as follows:

1. If the behavior of an atom is governed by a potential which has spherical symmetry, like the Coulomb potential which depends only on the distance from the electron to the nucleus, none of the properties of the atom should single out any particular direction in space because all directions are equivalent.

2. If the atom is placed in an external electric or magnetic field, the spherical symmetry is destroyed and the direction defined by the external field becomes unique.

3. When one direction is unique, we choose one axis of our coordinate system to be in that preferred direction because it simplifies the description of the physical situation. We can choose other directions, but this unnecessarily complicates the mathematical description. (In electromagnetism, as an example, when treating a cylindrical wire it is very advantageous to take one axis of the coordinate system along the axis of the cylinder.)

4. By convention, we call the preferred axis the $z$ axis. (The convention probably comes from cylindrical coordinates, in which the axis about which the angular coordinate varies is called the $z$ axis.) But we could have called the preferred axis the $x$ or $y$ axis, just as well.

5. Even if there is no preferred direction, because no external field is applied to the atom, we still must choose some arbitrary direction in space for the $z$ axis of our coordinate system. But in this case the $z$ axis is not unique physically; it is merely a mathematical construct. Therefore, its choice should have no measurable consequences.

We should also point out that a uniform applied field can serve to define for the atom only a single preferred direction. As we have indicated, such a field will generally remove part of the degeneracy of the eigenfunctions, and probability densities that depend on the angle $\theta$ can be measured. But the probability densities remain independent of the angle $\varphi$, since $\psi^*\psi \propto \Phi^*_{m_l}(\varphi)\Phi_{m_l}(\varphi) = e^{-im_l\varphi}e^{im_l\varphi} = 1$ for every eigenfunction. That is, the probability densities retain their axial rotation symmetry about the direction of the applied field, as certainly must be the case. A nonuniform applied field can serve to define additional preferred directions. It is not surprising that such fields can destroy the axial rotation symmetry of the probability density of an atom under their influence. Although we have not allowed for this possibility in our development, because we shall not need to, it is easy to do if necessary by taking particular solutions to (7-15) in the form $\Phi_{m_l}(\varphi) = \cos m_l\varphi$ or $\Phi_{m_l}(\varphi) = \sin m_l\varphi$, instead of in the form we have taken. With no applied field, or with a uniform applied field, the eigenfunction associated with $\cos m_l\varphi$ is degenerate with the eigenfunction associated with $\sin m_l\varphi$, so measurement of the probability density will always yield a $\varphi$-independent combination $\propto \cos^2 m_l\varphi + \sin^2 m_l\varphi = 1$, just as with the eigenfunctions that we use. In the nonuniform applied field the degeneracy can be removed, however, and probability densities that do not have axial rotation symmetry can be observed. The solutions $\Phi_{m_l}(\varphi) = \cos m_l\varphi$ and $\Phi_{m_l}(\varphi) = \sin m_l\varphi$ are frequently used in chemistry since one atom in a molecule is acted on by a highly nonuniform field produced by the other atoms.

In the next section we shall show that the quantum numbers $l$ and $m_l$ are related to the magnitude $L$ of the orbital angular momentum of the electron, and to its $z$ component $L_z$, by the relations

$$L = \sqrt{l(l+1)}\,\hbar$$

$$L_z = m_l\hbar$$

We mention this now because it is an important clue to the interpretation of the dependence of $\psi^*_{nlm_l}\psi_{nlm_l}$ on $l$ and $m_l$. Consider the case $m_l = l$. Then $L_z = l\hbar$, which is almost equal to $L = \sqrt{l(l+1)}\,\hbar$. In this case the angular momentum vector must point nearly in the direction of the $z$ axis. For a Bohr atom this would mean that the orbit of the electron would lie nearly in the plane perpendicular to the $z$ axis, as illustrated in Figure 7-11. With increasing values of $l$, the value of $l\hbar$ approaches the value of $\sqrt{l(l+1)}\,\hbar$, so that $L_z$ approaches $L$. This means the angle between the angular momentum vector and the $z$ axis decreases. In terms of the Bohr picture, this demands that the orbit lie more nearly in the plane perpendicular to the $z$ axis. An inspection of the polar diagrams of Figure 7-9 will demonstrate the correspondence between these features of $\psi^*_{nlm_l}\psi_{nlm_l}$ and the picture of a Bohr orbit. For $m_l = 0$ we have $L_z = 0$, and the angular momentum vector must be perpendicular to the $z$ axis. In a Bohr atom this would mean that the plane of the orbit contained the $z$ axis. Some indication of this behavior can be seen in the polar diagram for $l = 3$, $m_l = 0$ of Figure 7-8.

Figure 7-11 A Bohr orbit lying in a plane nearly perpendicular to the $z$ axis.

Although there are many points at which the quantum mechanical theory of the one-electron atom corresponds quite closely to the Bohr model, there are certain striking differences. In both treatments the ground state corresponds to the quantum number $n = 1$, and it has the same value of total energy. But in the Bohr model the orbital angular momentum for this state is $L = n\hbar = \hbar$, whereas in quantum mechanics it is $L = \sqrt{l(l+1)}\,\hbar = 0$, since $l = 0$ when $n = 1$. There is an overwhelming amount of evidence, from measurements of atomic spectra and elsewhere, that shows the quantum mechanical prediction for zero orbital angular momentum in the ground state to be the correct one. This prediction is also in agreement with one obtained by using the techniques we developed earlier to calculate the expectation values of the total kinetic energy of the electron in the ground state and of the kinetic energy associated only with radial motion. The two values are found to be equal, implying that the motion is entirely radial in that state. If the Bohr model were modified in a way that would allow for zero angular momentum states, the orbit for such a state would be a radial oscillation in which the electron passes directly through the nucleus, and the oscillation could take place along any direction in space. This would correspond, in a sense, to a spherically symmetrical probability density or charge distribution, similar to that which is predicted by quantum mechanics and is observed experimentally.
Nevertheless, it is difficult to visualize the motion of an electron in the ground state of the quantum mechanical atom. That is, it is difficult to make an analogy to a classical picture, such as the Bohr picture. But this situation is not unique; it is equally difficult to visualize the motion of an electron traveling through a two-slit diffraction apparatus.

7-8 ORBITAL ANGULAR MOMENTUM

We shall now proceed to justify the relations

$$L_z = m_l\hbar \tag{7-33}$$

$$L = \sqrt{l(l+1)}\,\hbar \tag{7-34}$$

between the quantum numbers $m_l$ and $l$, and the $z$ component $L_z$ and magnitude $L$ of the angular momentum of an electron in its "orbital" motion about the center of an atom. The justification will take a little effort, but it will be well worth it. We have just seen that these relations are very useful in interpreting the angular dependence of the probability density functions for a one-electron atom. As we continue our study of quantum physics, we shall see that the angular momentum relations are extremely important in the study of all atoms (and nuclei). The basic reason is that in most circumstances the $z$ component and magnitude of the angular momenta of the particles in microscopic systems remain constant. From a classical point of view, this happens because in most systems the particles move in spherically symmetrical potentials that cannot exert torques on them. We shall find that, of all the quantities that can be used to describe atoms (and nuclei), angular momentum and total energy are about the only ones that do remain constant. A consequence is that most experiments on such systems involve measuring angular momentum and total energy. Therefore, quantum mechanics must be able to make predictions about angular momentum, as well as total energy.

Another parallel between these two is that both are quantized. In other words, the relations of (7-33) and (7-34), stating that $L_z$ and $L$ have the precise values $m_l\hbar$ and $\sqrt{l(l+1)}\,\hbar$, are quantization relations just like the energy quantization relation stating that the total energy $E$ of a one-electron atom has the precise values $-\mu Z^2e^4/(4\pi\epsilon_0)^2 2\hbar^2 n^2$. Angular momentum quantization is certainly as important as energy quantization. The only reason that it has not appeared before in our treatment of Schroedinger quantum mechanics is that the treatment was restricted to one-dimensional systems. Of course, angular momentum is the dynamical quantity that sets real three-dimensional systems apart from one-dimensional idealizations in which it has no meaning.

The angular momentum of a particle, relative to the origin of a certain coordinate system, is the vector quantity $\mathbf{L}$ defined by the equation

$$\mathbf{L} = \mathbf{r} \times \mathbf{p} \tag{7-35a}$$

where $\mathbf{r}$ is the position vector of the particle relative to the origin, and $\mathbf{p}$ is the linear momentum vector for the particle. By evaluating the components in rectangular coordinates of the vector, or cross, product, it is easy to show that the three rectangular components of $\mathbf{L}$ are

$$L_x = yp_z - zp_y$$

$$L_y = zp_x - xp_z \tag{7-35b}$$

$$L_z = xp_y - yp_x$$

where $x$, $y$, $z$ are the components of $\mathbf{r}$, and $p_x$, $p_y$, $p_z$ are the components of $\mathbf{p}$. In order to study the dynamical quantity angular momentum in quantum mechanics, we construct the associated operators. This is done by replacing $p_x$, $p_y$, $p_z$ by their quantum mechanical equivalents $-i\hbar\,\partial/\partial x$, $-i\hbar\,\partial/\partial y$, $-i\hbar\,\partial/\partial z$, according to an obvious three-dimensional extension of (5-32). Thus the operators for the three components of angular momentum are

$$L_{x\,op} = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)$$

$$L_{y\,op} = -i\hbar\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) \tag{7-36}$$

$$L_{z\,op} = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$$

Because we must use spherical polar coordinates, these expressions must be transformed into these coordinates. Appendix M shows how this can be done. The results are

$$L_{x\,op} = i\hbar\left(\sin\varphi\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\frac{\partial}{\partial\varphi}\right)$$

$$L_{y\,op} = i\hbar\left(-\cos\varphi\frac{\partial}{\partial\theta} + \cot\theta\sin\varphi\frac{\partial}{\partial\varphi}\right) \tag{7-37}$$

$$L_{z\,op} = -i\hbar\frac{\partial}{\partial\varphi}$$

We shall also be interested in the square of the magnitude of the angular momentum vector $\mathbf{L}$, which is

$$L^2 = L_x^2 + L_y^2 + L_z^2$$

As is indicated in Appendix M, in spherical polar coordinates the associated operator is

$$L^2_{op} = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \tag{7-38}$$

The first step in deriving the angular momentum quantization equations involves using the operators to calculate the expectation values of the $z$ component of $\mathbf{L}$, and of the square of its magnitude, for an electron in the $n$, $l$, $m_l$ quantum state of a one-electron atom. According to the three-dimensional extension of the prescription of (5-34), the expectation value $\overline{L_z}$ is

$$\overline{L_z} = \int_0^\infty\int_0^\pi\int_0^{2\pi}\Psi^* L_{z\,op}\Psi\,r^2\sin\theta\,dr\,d\theta\,d\varphi$$

The quantity $r^2\sin\theta\,dr\,d\theta\,d\varphi$ is the element of volume in spherical polar coordinates, and the integrations are taken over the complete ranges of all three coordinates. Because it will simplify the notation, without causing confusion, we shall write this expression as

$$\overline{L_z} = \int\Psi^* L_{z\,op}\Psi\,d\tau$$

Here $d\tau$ stands for the three-dimensional volume element $r^2\sin\theta\,dr\,d\theta\,d\varphi$, and $\int$ stands for the three definite integrals $\int_0^\infty\int_0^\pi\int_0^{2\pi}$. The same shorthand notation will be used in the remainder of this chapter, and in the following chapters. Continuing our calculation of $\overline{L_z}$, by expressing the wave function as a product of the eigenfunction and the exponential time factor we obtain

$$\overline{L_z} = \int e^{iE_nt/\hbar}\psi^*_{nlm_l}L_{z\,op}\,e^{-iE_nt/\hbar}\psi_{nlm_l}\,d\tau$$

or

$$\overline{L_z} = \int\psi^*_{nlm_l}L_{z\,op}\psi_{nlm_l}\,d\tau \tag{7-39}$$

Similarly, the expectation value of $L^2$ is

$$\overline{L^2} = \int\psi^*_{nlm_l}L^2_{op}\psi_{nlm_l}\,d\tau \tag{7-40}$$

To evaluate the integrals in the two numbered equations above, we must first evaluate $L_{z\,op}\psi_{nlm_l}$ and $L^2_{op}\psi_{nlm_l}$.

Example 7-6. Evaluate $L_{z\,op}\psi_{nlm_l}$, where $L_{z\,op} = -i\hbar\,\partial/\partial\varphi$, and where $\psi_{nlm_l}$ is a one-electron atom eigenfunction.
With $L_{z\,op} = -i\hbar\,\partial/\partial\varphi$ and $\psi_{nlm_l}$ a one-electron atom eigenfunction, we have

$$L_{z\,op}\psi_{nlm_l} = -i\hbar\frac{\partial\psi_{nlm_l}}{\partial\varphi}$$

Since

$$\psi_{nlm_l} = R_{nl}(r)\Theta_{lm_l}(\theta)\Phi_{m_l}(\varphi)$$

we obtain

$$-i\hbar\frac{\partial\psi_{nlm_l}}{\partial\varphi} = R_{nl}(r)\Theta_{lm_l}(\theta)\left[-i\hbar\frac{d\Phi_{m_l}(\varphi)}{d\varphi}\right]$$

According to (7-19)

$$\Phi_{m_l}(\varphi) = e^{im_l\varphi}$$

so

$$\frac{d\Phi_{m_l}(\varphi)}{d\varphi} = im_le^{im_l\varphi} = im_l\Phi_{m_l}(\varphi)$$

Thus

$$-i\hbar\frac{\partial\psi_{nlm_l}}{\partial\varphi} = R_{nl}(r)\Theta_{lm_l}(\theta)\left[-i\hbar\,im_l\Phi_{m_l}(\varphi)\right] = m_l\hbar R_{nl}(r)\Theta_{lm_l}(\theta)\Phi_{m_l}(\varphi)$$

and we obtain the answer

$$L_{z\,op}\psi_{nlm_l} = m_l\hbar\,\psi_{nlm_l} \tag{7-41}$$

Although we do not have a concise expression for the functions $\Theta_{lm_l}(\theta)$, which must be differentiated to evaluate $L^2_{op}\psi_{nlm_l}$, we know that these functions satisfy the differential equation (7-16). Using this fact, it is not difficult to show that

$$L^2_{op}\psi_{nlm_l} = l(l+1)\hbar^2\,\psi_{nlm_l} \tag{7-42}$$

Using (7-41) from Example 7-6 in (7-39), which is

$$\overline{L_z} = \int\psi^*_{nlm_l}L_{z\,op}\psi_{nlm_l}\,d\tau$$

it is trivial to evaluate $\overline{L_z}$. We have

$$\overline{L_z} = m_l\hbar\int\psi^*_{nlm_l}\psi_{nlm_l}\,d\tau$$

But we know that this integral has the value one because it is equal to the probability density integrated over all space, i.e., the probability of finding the electron somewhere. Thus we obtain

$$\overline{L_z} = m_l\hbar \tag{7-43}$$

In a similar fashion we use (7-42) in (7-40), which is

$$\overline{L^2} = \int\psi^*_{nlm_l}L^2_{op}\psi_{nlm_l}\,d\tau$$

to obtain

$$\overline{L^2} = l(l+1)\hbar^2\int\psi^*_{nlm_l}\psi_{nlm_l}\,d\tau$$

$$\overline{L^2} = l(l+1)\hbar^2 \tag{7-44}$$

Let us compare the results of our expectation value calculations, (7-43) and (7-44), with the quantization relations we are trying to verify, that can be written

$$L_z = m_l\hbar \tag{7-45}$$

$$L^2 = l(l+1)\hbar^2 \tag{7-46}$$

The former are certainly consistent with the latter, but they are not proofs of the latter. The quantization relations make stronger statements about the values of $L_z$ and $L^2$. These relations say that any measurement of the angular momentum of an electron in the $n$, $l$, $m_l$ state of the atom will always yield $L_z = m_l\hbar$ and $L^2 = l(l+1)\hbar^2$ since, in that state, these quantities have precisely the values quoted.
But the expectation value relations say only that the values quoted will be obtained on the average, that is, when the results of a large number of measurements of $L_z$ and $L^2$ are averaged.

To complete the proof of the quantization relations is a matter of continuing along the line we have been following. For example, by calculating the expectation value of some power of $L_z$, say the square $L_z^2$, it is found that $\overline{L_z^2} = (m_l\hbar)^2$. This immediately leads to the conclusion that not only must $L_z$ equal $m_l\hbar$ on the average, i.e., $\overline{L_z} = m_l\hbar$, but that $L_z$ must equal $m_l\hbar$ always, i.e., $L_z = m_l\hbar$. The point is that if $L_z$ fluctuated about its average $m_l\hbar$ it would not be possible to obtain $\overline{L_z^2} = (m_l\hbar)^2$ because when averaging a power of $L_z$ higher than the first more weight is given to fluctuations above the average than to fluctuations below the average. In order to proceed with our interpretation of the angular momentum of one-electron atoms, we defer the details of this proof to the following section. There we shall also obtain the interesting conclusion that $L_x$ and $L_y$, the $x$ and $y$ components of the orbital angular momentum, do not obey quantization relations.

The fact that $\psi_{nlm_l}$ does not describe a state with a definite $x$ and $y$ component of orbital angular momentum, because these quantities are not quantized, is mysterious from the point of view of classical mechanics. According to the angular momentum conservation law of classical mechanics, the orbital angular momentum vector of an electron moving under the influence of a spherically symmetrical potential $V(r)$ of a one-electron atom in free space would be completely fixed in direction and magnitude, and all three components of the vector would have definite values. The reason is that there would be no torques acting on the electron.
The fact that this result is not obtained in the quantum mechanical theory is a consequence of the fact that there is an uncertainty principle relation which states that no two components of an angular momentum can be known simultaneously with complete precision. Because the $z$ component of orbital angular momentum has the precise value $m_l\hbar$, the relation requires that the values of the $x$ and $y$ components be indefinite. But one thing can be said about the values of these components: Upon evaluating $\overline{L_x}$ and $\overline{L_y}$, their average values, it is found that both equal zero. So although the particular value of $L_x$ that would be obtained in any particular measurement cannot be predicted, it can be predicted that the average value that would be obtained in a set of measurements of $L_x$ is zero. And similarly for $L_y$.

Many of the properties of the orbital angular momentum can be conveniently represented by a vector model. Consider the set of states having a common value of the quantum number $l$. For each of these states the length of the orbital angular momentum vector, in units of $\hbar$, is $L/\hbar = \sqrt{l(l+1)}$. In the same units, the $z$ component of this vector is $L_z/\hbar = m_l$. The $z$ component can assume any integral value from $L_z/\hbar = -l$ to $L_z/\hbar = +l$, depending on the value of $m_l$. The case of $l = 2$ is illustrated in Figure 7-12. The figure depicts the angular momentum vectors for each of the five states corresponding to the five possible values of $m_l$ for this value of $l$. In any one of these states the angular momentum vector is equally likely to be found anywhere on a cone symmetrical about the $z$ axis, and therefore has a definite $z$ component as well as a definite magnitude. The vector does not have a definite $x$ or $y$ component, but the value of either of these quantities is as likely to be positive as it is to be negative. The actual orientation in space of the angular momentum vector is known with the greatest precision for the states with $m_l = \pm l$. But even for these states there is some uncertainty since the vector can be anywhere on a cone of half-angle $\cos^{-1}[l/\sqrt{l(l+1)}]$. In the classical limit $l \to \infty$, and this angle becomes vanishingly small. Thus, in the classical limit the angular momentum vector for the states $m_l = \pm l$ is constrained to lie almost along the $z$ axis and is therefore essentially fixed in space. This agrees with the behavior predicted by the classical theory, i.e., with the classical orbital angular momentum conservation law.

Figure 7-12 Representing the angular momentum vectors (measured in units of $\hbar$) for the possible states with $l = 2$. In each state the vector is equally likely to be found anywhere on a cone symmetrical about the $z$ axis. It has a definite magnitude and $z$ component but does not have a definite $x$ or $y$ component.

The quantum number $m_l$ determines the space orientation of the orbital angular momentum vector of the one-electron atom. Therefore, in a sense it determines the orientation in space of the atom itself. As the spherically symmetrical Coulomb potential implies that there is no preferred direction in the space in which the atom is situated, we can understand why the theory predicts that the total energy of the atom does not depend on $m_l$, which determines this orientation. Thus we can understand why the eigenfunctions are degenerate with respect to the quantum number $m_l$. The energy of the atom simply does not depend on its orientation in empty space.

7-9 EIGENVALUE EQUATIONS

Here we shall complete the derivation, started in the previous section, of the orbital angular momentum quantization conditions. Then we shall generalize the results of the derivation to point out an interesting feature of Schroedinger's theory of quantum mechanics.

To study the quantization of the orbital angular momentum, we focus attention first on its $z$ component, $L_z$. Now, if the $z$ component quantization condition of (7-45) is valid, then any measurement of $L_z$ will always yield the same precise value specified by that quantization condition

$$L_z = m_l\hbar \tag{7-47}$$

Furthermore, measurements of some higher power of $L_z$, say the square $L_z^2$, will always yield the same value $L_z^2 = (m_l\hbar)^2$. As a consequence, the expectation value of the square of $L_z$ will be just $\overline{L_z^2} = (m_l\hbar)^2$. Note that, since we also have $\overline{L_z} = m_l\hbar$, this means

$$\overline{L_z^2} = \overline{L_z}^2 \tag{7-48}$$

That is, the expectation value of the square of $L_z$ equals the square of the expectation value of $L_z$, if the quantization condition of (7-47) is valid.

On the other hand, if (7-47) is not valid then measurements of $L_z$ can lead to various values, subject, however, to the constraint that the values average out to yield $m_l\hbar$ because we have proven in (7-43) that $\overline{L_z} = m_l\hbar$ in any case. If the measured values of $L_z$ fluctuate about the average value $m_l\hbar$, then the expectation value of the square of $L_z$ will no longer equal the square of $m_l\hbar$. The reason is that when averaging a higher power of $L_z$, like its square $L_z^2$, we give much more weight to the cases in which $L_z$ is larger than $\overline{L_z}$, and much less weight to the equally numerous cases in which $L_z$ is smaller than $\overline{L_z}$. In this situation $\overline{L_z^2} > (m_l\hbar)^2$, so $\overline{L_z^2} > \overline{L_z}^2$.

An example is shown in Table 7-3, which applies the ideas just discussed to calculating the square of the average, and the average of the squares, of the ages of a group of children whose individual ages are 1, 2, and 3 years. Inspection of the table shows that when the ages are first squared, and then averaged, a larger result is obtained than when the ages are first averaged, and then squared. This will be true in any case in which a power of the ages higher than the first is averaged, and in which the ages fluctuate. But if all the children in the group have ages precisely equal to each other, and therefore to the average age, then it makes no difference in which order the operations are carried out and the average of the squares equals the square of the averages. An example of that situation is shown in Table 7-4.

Table 7-3 The Square of the Average, and the Average of the Squares, of a Set of Fluctuating Numbers

$A = 1, 2, 3$

$\overline{A} = (1 + 2 + 3)/3 = 6/3 = 2$, so $\overline{A}^2 = 4$

$A^2 = 1, 4, 9$

$\overline{A^2} = (1 + 4 + 9)/3 = 14/3 = 4.67$

$\Delta A = \sqrt{\overline{A^2} - \overline{A}^2} = \sqrt{4.67 - 4} = \sqrt{0.67} = 0.82$

For another illustration of these ideas, consider the quantity $\Delta x = \sqrt{\overline{x^2} - \overline{x}^2}$. As mentioned in Example 5-10, this quantity is used as a measure of the fluctuations that would be observed in measurements of the $x$ coordinate of a particle. If there were no fluctuations, then $\overline{x^2} = \overline{x}^2$. But the uncertainty principle demands that there be fluctuations in $x$ (which are larger the smaller the fluctuations in the linear momentum $p$). As a result $\overline{x^2} > \overline{x}^2$, and the difference between $\overline{x^2}$ and $\overline{x}^2$ increases as the fluctuations in $x$ increase so $\sqrt{\overline{x^2} - \overline{x}^2}$ is a measure of these fluctuations.

Now, it is easy to prove the validity of the relation expressed by (7-48), $\overline{L_z^2} = \overline{L_z}^2$, and therefore also the validity of the quantization condition $L_z = m_l\hbar$ of (7-47). To do this we twice use (7-41), $L_{z\,op}\psi_{nlm_l} = m_l\hbar\,\psi_{nlm_l}$, to calculate $\overline{L_z^2}$. According to the three-dimensional extension of the prescription for calculating expectation values, we have

$$\overline{L_z^2} = \int\Psi^* L^2_{z\,op}\Psi\,d\tau$$

This immediately gives

$$\overline{L_z^2} = \int\psi^*_{nlm_l}L^2_{z\,op}\psi_{nlm_l}\,d\tau$$

The dynamical quantity $L_z^2$ is the product of two factors of the form $L_z$

$$L_z^2 = L_z \cdot L_z$$

According to the expectation value prescription, the operator $L^2_{z\,op}$ obtained from that dynamical quantity is thus the product of two operators of the form $L_{z\,op}$.
Therefore t1'' L o4'n pim i = L z op nim, zop The Square of the Average, and the Average of the Squares, of a Set of Nonfluctuating Numbers Table 7-4 A = 2, 2, 2 ^ - 2 +2+2 — 6 =2 3 A2 = 4 A 2 =4,4,4 A2 AA - 12 - 4 3 — A 2 = —4=0 4+4+4 3 3 = L operates twice on i/inimi. But according to (7-41) Lz op l nim, = mlhY nlmi Thus each operation of Lz.p on Otani , yields the same function Y'nim,, multiplied by a constant factor mih. Therefore, the result of two operations is simply to multiply `i'nlm, by two factors of m ih. That is / ,` Lz o pinim i = (mlh) 2 Y'nim, Knowing this, we immediately obtain Lz = J ' //,, 4'n m,(mih) 2 V'nim t dZ = (mih)2 J 'Zm1'fl1m1 dT = (mlh)2 L 2 where we have made use of the fact that the integral over all space of tiromiOnim, equals one because of the normalization condition. Since we have verified (7-48), we have completed our verification of the quantization condition Lz = mih. The proof of the validity of the quantization condition L 2 = 1(1 + 1)h2 is carried through in a completely parallel manner. Note that these proofs depend on (7-41) and (7-42), Lzoptitnim, = mlh n l m i and 4,20„1,n , = 1(1 + 1)h2llinimi. The equations state the surprising facts that the result of operating on the one-electron atom eigenfunction Y'nimi with the differential operator Lzop is simply to multiply that eigenfunction by the constant mih, while the result of operating on it with the differential operator Lop is simply to multiply it by the constant 1(1 + 1)h2. These results are certainly not typical of what happens when a differential operator operates on a function. For instance, if we operate on a function, say f(x) = x2, with the differential operator d/dx, we obtain a very different function f'(x) = 2x. 
As another example, it is not difficult to show that the results of operating on cam , with the operators Lxop or Lyop is to produce new functions of r, 8, 9 in which these variables enter quite differently from the way they enter in the function Y'nim,• That is (7-49) Lxop`Nnim, # (const)ili n/m, (const)ilinim i (7-50) Lyop^ nimt The ideas that we have developed, in the process of verifying the angular momentum quantization conditions, can be extended to provide a deeper insight into the theory of Schroedinger quantum mechanics. They can also be used to lead into the more sophisticated theories, such as Heisenberg's matrix mechanics. We must leave these matters for more advanced books. Here we shall say only that the properties associated with (7-41) and (7-42) are perfectly general. That is, whenever the dynamical quantity f has the precise value F in the = quantum state described by the function ifi, then that function satisfies the relation (7-51) fop ll/ = Flk where fop is the operator corresponding to f. We shall also show that the time-independent Schroedinger equation can be written in the form of (7-51). To do this, consider the time-independent Schroedinger equation in rectangular coordinates h 2 020 a20. + Vi/r = Etli 2µ ax 2 + ay2 + az2 a2,k1 Rewrite it as C ^a 2 + ^2J 2,u ^ 2 + +Vl i^=Ei%i By comparing (7-3) with (7-4), we see that the square bracket is just the operator eop for the total energy. Thus we have eop ifi = Elfr —L co cb SNOI1`df1O3 3MIdAN3 00 In other words, L opt/ nim, means that N CO ONE- ELE CTR O N ATO M S N Here E is one of the precise allowed values of the total energy of the system described by the potential V. The system is also described by the total energy operator eop . The general relation of (7-51) is called an eigenvalue equation, i' is said to be an eigenfunction of the operator fop , and F is said to be the corresponding eigenvalue. 
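The link between an eigenvalue equation and the absence of fluctuations can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes units in which $\hbar = 1$, takes the z component operator in its angular form $-i\,d/d\varphi$, and uses only the $\varphi$ dependence of the state. The grid size `N`, the helper names, and the choice $m = 2$ are illustrative assumptions. The derivatives are finite-difference approximations, so the results agree with the exact values only to within the discretization error.

```python
import cmath
import math

N = 2000                 # number of grid points on [0, 2*pi); an arbitrary choice
H = 2 * math.pi / N      # grid spacing in phi


def expectation_lz(psi, power):
    """Approximate <L_z**power> (power = 1 or 2), in units of hbar,
    for a normalized, 2*pi-periodic function psi(phi)."""
    values = [complex(psi(j * H)) for j in range(N)]
    total = 0j
    for j in range(N):
        # Periodic neighbors: index -1 wraps to the last grid point.
        ahead, behind = values[(j + 1) % N], values[j - 1]
        if power == 1:
            applied = -1j * (ahead - behind) / (2 * H)          # -i d/dphi
        else:
            applied = -(ahead - 2 * values[j] + behind) / H**2  # -d^2/dphi^2
        total += values[j].conjugate() * applied * H
    return total.real


def spread_lz(psi):
    """sqrt(<L_z^2> - <L_z>^2), the analog of Delta A in Tables 7-3 and 7-4."""
    mean = expectation_lz(psi, 1)
    mean_square = expectation_lz(psi, 2)
    return math.sqrt(max(mean_square - mean**2, 0.0))


M = 2  # illustrative value of the quantum number


def eigen_state(phi):
    """exp(i M phi), normalized: an eigenfunction of the z component operator."""
    return cmath.exp(1j * M * phi) / math.sqrt(2 * math.pi)


def cosine_state(phi):
    """cos(M phi), normalized: not an eigenfunction of the z component operator."""
    return math.cos(M * phi) / math.sqrt(math.pi)


for name, state in (("exp(i m phi)", eigen_state), ("cos(m phi)", cosine_state)):
    print(name, "mean:", round(expectation_lz(state, 1), 3),
          "spread:", round(spread_lz(state), 3))
```

For the exponential state the mean is close to $m$ and the spread is close to zero, as in Table 7-4. For the cosine state the mean is zero but the spread is close to $m$, as in Table 7-3, because that function mixes the two values $+m$ and $-m$.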
This is the same terminology as is used in the particular case of the eigenvalue equation for the total energy operator, that is, in the case of the time-independent Schroedinger equation. The total energy operator $e_{op}$ is sometimes called the Hamiltonian.

These considerations lead to the important conclusion that, since (7-49) and (7-50) show $\psi_{nlm_l}$ is not an eigenfunction of the operators $L_{x\,op}$ or $L_{y\,op}$, the corresponding dynamical quantities $L_x$ and $L_y$ do not have precise values in the one-electron atom. That is, $L_x$ and $L_y$ do not obey quantization conditions.

QUESTIONS

1. If a hydrogen atom were not at rest, but moving freely through space, how would the quantum mechanical description of the atom be modified?
2. Since it is well known that the Coulomb potential has a much simpler form in spherical polar coordinates, why did we begin our treatment of the one-electron atom in rectangular coordinates?
3. In what important equations of classical physics does the Laplacian operator enter?
4. Would the results of the calculations be affected if we took different forms for the separation constants that arise in the splitting of the time-independent Schroedinger equation, for the one-electron atom, into three ordinary differential equations?
5. Why must $\Phi(\varphi)$ be single valued? How does this lead to the restriction that $m_l$ must be an integer?
6. What would happen if we took $e^{-im_l\varphi}$ as the particular solution to the $\Phi(\varphi)$ equation? What about $\cos m_l\varphi$ or $\sin m_l\varphi$?
7. Why do three quantum numbers arise in the treatment of the (spinless) one-electron atom?
8. Can you say what the functions $\Theta(\theta)$ and $\Phi(\varphi)$ would be like if $V$ were a function of $r$, but not proportional to $-1/r$? (This is the case for the valence electron of an alkali atom.)
9. Just what is degeneracy?
10. What is the relation between the size of a Bohr atom and the size of a Schroedinger atom?
11. What is the fundamental reason why the size of the hydrogen atom in its ground state has the value it does?
12. For a one-electron atom in free space, what would be the mathematical consequences of changing the choice of direction of the z axis? The physical consequences? What if the atom is in an external electric or magnetic field?
13. Why does a uniform electric or magnetic field define only one unique direction in space?
14. How do the predictions of the Bohr and Schroedinger treatments of the hydrogen atom (ignoring spin and other relativistic effects) compare with regard to the location of the electron, its total energy, and its orbital angular momentum?
15. Devise an explanation for the obvious relation between the last two terms of the Laplacian operator, in spherical polar coordinates, and the operator for the square of the magnitude of the orbital angular momentum.
16. Using the connection between $L$ and $l$, explain physically why $\psi^*\psi$ is very small near $r = 0$, unless $l = 0$.
17. Exactly why do we say that for a hydrogen atom in free space the orbital angular momentum vector can be located with equal probability anywhere on a cone symmetrical about the z axis?
18. Is every eigenfunction of angular momentum magnitude necessarily also an eigenfunction of total energy? Is the reciprocal statement true?
19. Are examples of eigenvalue equations found in classical physics? If so, what are they?

PROBLEMS

1. Using the technique of separation of variables, show that there are solutions to the three-dimensional Schroedinger equation for a time-independent potential, which can be written
$$\Psi(x,y,z,t) = \psi(x,y,z)\,e^{-iEt/\hbar}$$
where $\psi(x,y,z)$ is a solution to the time-independent Schroedinger equation.
2. Verify that $\Phi(\varphi) = e^{im_l\varphi}$ is the solution to the equation for $\Phi(\varphi)$, (7-15).
3. Hydrogen, deuterium, and singly ionized helium are all examples of one-electron atoms. The deuterium nucleus has the same charge as the hydrogen nucleus, and almost exactly twice the mass. The helium nucleus has twice the charge of the hydrogen nucleus, and almost exactly four times the mass.
Make an accurate prediction of the ratios of the ground state energies of these atoms. (Hint: Remember the variation in the reduced mass.)
4. (a) Evaluate, in electron volts, the energies of the three levels of the hydrogen atom in the states for $n = 1, 2, 3$. (b) Then calculate the frequencies in hertz, and the wavelengths in angstroms, of all the photons that can be emitted by the atom in transitions between these levels. (c) In what range of the electromagnetic spectrum are these photons?
5. Verify by substitution that the ground state eigenfunction $\psi_{100}$, and the ground state eigenvalue $E_1$, satisfy the time-independent Schroedinger equation for the hydrogen atom.
6. (a) Extend Example 7-4 to obtain from the uncertainty principle a prediction of the total energy of the ground state of the hydrogen atom. (b) Compare with the energy predicted by (7-22).
7. (a) Calculate the location at which the radial probability density is a maximum for the $n = 2$, $l = 1$ state of the hydrogen atom. (b) Then calculate the expectation value of the radial coordinate in this state. (c) Explain the physical significance of the difference in the answers to (a) and (b). (Hint: See Figure 7-5.)
8. (a) Calculate the expectation value $\overline{V}$ for the potential energy in the ground state of the hydrogen atom. (b) Show that in the ground state $E = \overline{V}/2$, where $E$ is the total energy. (c) Use the relation $E = \overline{K} + \overline{V}$ to calculate the expectation value $\overline{K}$ of the kinetic energy in the ground state, and show that $\overline{K} = -\overline{V}/2$. These relations are obtained for any state of motion of any quantum mechanical (or classical) system with a potential in the form $V(r) \propto -1/r$. They are sometimes called the virial theorem.
9. (a) Calculate the expectation value $\overline{V}$ of the potential energy in the $n = 2$, $l = 1$ state of the hydrogen atom. (b) Do the same for the $n = 2$, $l = 0$ state. (c) Discuss the results of (a) and (b), in connection with the virial theorem of Problem 8, and explain how they bear on the origin of the $l$ degeneracy.
10. By substituting into the equation for $R(r)$, (7-17), the form $R(r) \propto r^l$, show that it is a solution for $r \to 0$. (Hint: Ignore terms that become negligible relative to others as $r \to 0$.)
11. Consider the probability of finding the electron in the hydrogen atom somewhere inside a cone of semiangle 23.5° about the +z axis ("arctic polar region"). (a) If the electron were equally likely to be found anywhere in space, what would be the probability of finding the electron in the arctic polar region? (b) Suppose the atom is in the state $n = 2$, $l = 1$, $m_l = 0$; recalculate the probability of finding the electron in the arctic polar region.
12. (a) Sketch a polar diagram of the directional dependence of the one-electron atom probability density for $l = 2$, $m_l = 0$. (b) At what angle $\theta$ does the angular probability density have its minimum value? (c) Where does the angular probability density have a value one-fourth its maximum value?
13. Consider the hydrogen atom eigenfunction $\psi_{432}$. What are (a) the total energy in eV; (b) the expectation value of the radial coordinate in Å; (c) the total angular momentum; (d) the z component of the angular momentum; (e) the uncertainty in the angular momentum; (f) the uncertainty in the z component of the angular momentum?
14. Show that the sum of hydrogen atom probability densities for the $n = 3$ quantum states, analogous to the sum in Example 7-5, is spherically symmetrical.
15. Show that $\Phi(\varphi) = \cos m_l\varphi$, and $\Phi(\varphi) = \sin m_l\varphi$, are particular solutions to the equation for $\Phi(\varphi)$, (7-15).
16. (a) Evaluate $L_{x\,op}\psi_{21-1}$ for the hydrogen atom. (b) Why does the result indicate that $\psi_{21-1}$ is not an eigenfunction of $L_{x\,op}$?
17. Prove that $L^2_{op}\psi_{nlm_l} = l(l+1)\hbar^2\,\psi_{nlm_l}$. (Hint: Use the differential equation satisfied by $\Theta_{lm_l}(\theta)$, (7-16).)
18. We know that $\psi = e^{ikx}$ is an eigenfunction of the total energy operator $e_{op}$ for the one-dimensional problem of the zero potential. (a) Show that it is also an eigenfunction of the linear momentum operator $p_{op}$, and determine the associated momentum eigenvalue. (b) Repeat for $\psi = e^{-ikx}$. (c) Interpret what the results of (a) and (b) mean concerning measurements of the linear momentum. (d) We also know that $\psi = \cos kx$ and $\psi = \sin kx$ are eigenfunctions of the zero potential $e_{op}$. Are they eigenfunctions of $p_{op}$? (e) Interpret the results of (d).
19. All four of the functions $e^{im_l\varphi}$, $e^{-im_l\varphi}$, $\cos m_l\varphi$, and $\sin m_l\varphi$ are particular solutions to the equation for $\Phi(\varphi)$, (7-15) (see Problem 15). (a) Find which are also eigenfunctions of the operator for the z component of angular momentum $L_{z\,op}$. (b) Interpret your results.
20. A particle of mass $\mu$ is fixed at one end of a rigid rod of negligible mass and length $R$. The other end of the rod rotates in the x-y plane about a bearing located at the origin, whose axis is in the z direction. This two-dimensional "rigid rotator" is illustrated in Figure 7-13. (a) Write an expression for the total energy of the system in terms of its angular momentum $L$. (Hint: Set the constant potential energy equal to zero, and then express the kinetic energy in terms of $L$.) (b) By introducing the appropriate operators into the energy equation, convert it into the Schroedinger equation
$$-\frac{\hbar^2}{2I}\frac{\partial^2\Psi(\varphi,t)}{\partial\varphi^2} = i\hbar\frac{\partial\Psi(\varphi,t)}{\partial t}$$
where $I = \mu R^2$ is the rotational inertia, or moment of inertia, and $\Psi(\varphi,t)$ is the wave function written in terms of the angular coordinate $\varphi$ and the time $t$. (Hint: Since the angular momentum is entirely in the z direction, $L = L_z$ and the corresponding operator is $L_{z\,op} = -i\hbar\,\partial/\partial\varphi$.)

Figure 7-13 The rigid rotator moving in the x-y plane considered in Problem 20.

21. By applying the technique of separation of variables, split the rigid rotator Schroedinger equation of Problem 20 to obtain: (a) the time-independent Schroedinger equation
$$-\frac{\hbar^2}{2I}\frac{d^2\Phi(\varphi)}{d\varphi^2} = E\,\Phi(\varphi)$$
and (b) the equation for the time dependence of the wave function
$$\frac{dT(t)}{dt} = -\frac{iE}{\hbar}T(t)$$
In these equations $E$ = the separation constant, and $\Phi(\varphi)T(t) = \Psi(\varphi,t)$, the wave function.
22. (a) Solve the equation for the time dependence of the wave function obtained in Problem 21. (b) Then show that the separation constant $E$ is the total energy.
23. Show that a particular solution to the time-independent Schroedinger equation for the rigid rotator of Problem 21 is $\Phi(\varphi) = e^{im\varphi}$ where $m = \sqrt{2IE}/\hbar$.
24. (a) Apply the condition of single valuedness to the particular solution of Problem 23. (b) Then show that the allowed values of the total energy $E$ for the two-dimensional quantum mechanical rigid rotator are
$$E = \frac{\hbar^2 m^2}{2I} \qquad |m| = 0, 1, 2, 3, \ldots$$
(c) Compare the results of quantum mechanics with those of the old quantum theory obtained in Problem 42 of Chapter 4. (d) Explain why the two-dimensional quantum mechanical rigid rotator has no zero-point energy. Also explain why it is not a completely realistic model for a microscopic system.
25. Normalize the functions $\Phi(\varphi) = e^{im\varphi}$ found in Problem 24.
26. (a) Calculate the expectation value of the angular momentum, $\overline{L}$, for a two-dimensional rigid rotator in a typical quantum state, using the eigenfunctions found in Problem 25. (b) Then calculate $\overline{L}^{\,2}$ and $\overline{L^2}$, and interpret what your results have to say about the values of $L$ that would be obtained in a series of measurements on the system.

8 MAGNETIC DIPOLE MOMENTS, SPIN, AND TRANSITION RATES

8-1 INTRODUCTION
relation between magnetic dipole moment and angular momentum; justification of using partly classical procedures

8-2 ORBITAL MAGNETIC DIPOLE MOMENTS
magnetic dipole moment and angular momentum of orbiting electron; Bohr magneton; orbital g factor; Larmor precession; magnetic dipole in uniform magnetic field; effects of nonuniform magnetic field

8-3 THE STERN-GERLACH EXPERIMENT AND ELECTRON SPIN
apparatus; space quantization; qualitative agreement, and quantitative disagreement, with Schroedinger predictions; Phipps-Taylor experiment; spin; quantum numbers $s$ and $m_s$; spin angular momentum, magnetic dipole moment and g factor; Zeeman effect; spin and fine structure; nonclassical character of spin; Dirac's relativistic theory

8-4 THE SPIN-ORBIT INTERACTION
internal magnetic field in one-electron atom; spin-magnetic dipole moment orientational energy; Thomas precession; spin-orbit interaction energy

8-5 TOTAL ANGULAR MOMENTUM
coupling between orbital and spin angular momenta; behavior of total angular momentum; conditions satisfied by quantum numbers $j$ and $m_j$

8-6 SPIN-ORBIT INTERACTION ENERGY AND THE HYDROGEN ENERGY LEVELS
convenient expression for spin-orbit interaction energy; application to hydrogen atom; other relativistic effects; fine-structure constant; comparison of Dirac, Sommerfeld, and Bohr results; Lamb shift; hyperfine structure

8-7 TRANSITION RATES AND SELECTION RULES
one-electron atom selection rules; failure of old quantum theory to explain transition rates; relation between transition rates and selection rules; oscillating electric dipole moment in a mixed quantum state; radiation by an oscillating electric dipole; evaluation of transition rate; electric dipole matrix element; quantum electrodynamics picture of stimulated and spontaneous emission; relation of selection rules to matrix elements; evaluation of $m_l$ selection rule; selection rules and physical, or mathematical, symmetries; $l$ dependence of parity of eigenfunctions for spherically symmetrical potentials; selection rule violations; metastable states

8-8 A COMPARISON OF THE MODERN AND OLD QUANTUM THEORIES
superiorities of the modern theories

QUESTIONS

PROBLEMS

8-1 INTRODUCTION

In this chapter we continue our study of the one-electron atom. First we shall discuss experiments which measure the orbital angular momentum $\mathbf{L}$ of an atomic electron. These experiments do not actually measure $\mathbf{L}$ directly. Instead they measure a related quantity $\boldsymbol{\mu}_l$, the orbital magnetic dipole moment, by measuring its interaction with a magnetic field applied to the atom. We shall develop the relation between $\boldsymbol{\mu}_l$ and $\mathbf{L}$ that forms the basis of the measurements. We shall also remind the student of some of the properties of the interaction between a magnetic dipole and a magnetic field used in the measurements, and in others frequently carried out in atomic, solid state, and nuclear physics.

When considering the results of measurements of atomic magnetic dipole moments, we shall discover the very important fact that electrons have an intrinsic angular momentum called spin, and an associated spin magnetic dipole moment. The effect that electron spin has on the energy levels of a one-electron atom will then be explored. Finally, we shall develop a procedure for calculating the rate at which excited one-electron atoms make transitions to lower-lying states by emitting the photons that form their line spectrum.

Our treatments in this chapter will employ a combination of simple electromagnetic theory, partly classical physics such as the Bohr model, and quantum mechanics. Completely quantum mechanical treatments will not be presented because they require a more advanced knowledge of electromagnetic theory than has been assumed in this book. This procedure is justified by the fact that the results agree with those of completely quantum mechanical treatments.
Of course, the justification is available to us only because someone has taken the trouble to work out the completely quantum mechanical treatments.

8-2 ORBITAL MAGNETIC DIPOLE MOMENTS

Consider an electron of mass $m$ and charge $-e$ moving with velocity of magnitude $v$ in a circular Bohr orbit of radius $r$, as illustrated in Figure 8-1. (Since it is conventional to use $\mu$ for magnetic dipole moment, here we do not use it for the reduced electron mass. No confusion will arise because the inherent accuracy of the experiments, and calculations, generally does not warrant making a distinction between the reduced electron mass and the electron mass $m$.) The charge circulating in a loop constitutes a current of magnitude

$$i = \frac{e}{T} = \frac{ev}{2\pi r} \tag{8-1}$$

where $T$ is the orbital period of the electron whose charge has magnitude $e$. In elementary electromagnetic theory, it is shown that such a current loop produces a magnetic field which is the same at large distances from the loop as that of a magnetic dipole located at the center of the loop and oriented perpendicular to its plane. For a current $i$ in a loop of area $A$, the magnitude of the orbital magnetic dipole moment $\mu_l$ of the equivalent dipole is

$$\mu_l = iA \tag{8-2}$$

and the direction of the magnetic dipole moment is perpendicular to the plane of the orbit, in the sense indicated in Figure 8-1. The figure shows the magnetic field produced by the current loop. It also indicates the two fictitious poles of a dipole that would produce a magnetic field which becomes identical to the actual field far from the loop.

Figure 8-1 The orbital angular momentum $\mathbf{L}$ and the orbital magnetic dipole moment $\boldsymbol{\mu}_l$ of an electron $-e$ moving in a Bohr orbit. The magnetic field $\mathbf{B}$ produced by the circulating charge is indicated by the curved lines. The fictitious magnetic dipole that would produce an identical field far from the loop is indicated by its poles N, S.
The quantity $\mu_l$ specifies the strength of this magnetic dipole; it equals the product of the poles' strength times their separation. Because the electron has a negative charge, its magnetic dipole moment $\boldsymbol{\mu}_l$ is antiparallel to its orbital angular momentum $\mathbf{L}$, whose magnitude is given by

$$L = mvr \tag{8-3}$$

and whose direction is illustrated in Figure 8-1. Evaluating $i$ from (8-1), and $A$ for a circular Bohr orbit, (8-2) yields

$$\mu_l = iA = \frac{ev}{2\pi r}\,\pi r^2 = \frac{evr}{2} \tag{8-4}$$

Dividing by (8-3), we obtain

$$\frac{\mu_l}{L} = \frac{evr}{2mvr} = \frac{e}{2m} \tag{8-5}$$

We see that the ratio of the magnitude $\mu_l$ of the orbital magnetic dipole moment to the magnitude $L$ of the orbital angular momentum for the electron is a combination of universal constants. It is usual to write this ratio as

$$\frac{\mu_l}{L} = \frac{g_l\mu_b}{\hbar} \tag{8-6}$$

where

$$\mu_b = \frac{e\hbar}{2m} = 0.927 \times 10^{-23}\ \text{amp-m}^2 \tag{8-7}$$

and

$$g_l = 1 \tag{8-8}$$

The quantity $\mu_b$ forms a natural unit for the measurement of atomic magnetic dipole moments, and is called the Bohr magneton. The quantity $g_l$ is called the orbital g factor. It is introduced, even though it appears here to be redundant, to preserve symmetry with equations we shall develop later in treating cases involving g factors which are not equal to one. In terms of these quantities, we may rewrite (8-5) as a vector equation specifying both the magnitude of $\boldsymbol{\mu}_l$ and its orientation relative to $\mathbf{L}$. That is

$$\boldsymbol{\mu}_l = -\frac{g_l\mu_b}{\hbar}\,\mathbf{L} \tag{8-9}$$

The ratio of $\mu_l$ to $L$ does not depend on the size of the orbit or on the orbital frequency. By making a calculation similar to the one above for an elliptical orbit, it can be shown that $\mu_l/L$ is independent of the shape of the orbit. That this ratio is completely independent of the details of the orbit suggests its value might not depend on the details of the mechanical theory used to evaluate it, and this is actually the case. Upon evaluation of $\mu_l$ quantum mechanically (which cannot be done here because the electromagnetic theory required is too sophisticated), and dividing by the quantum mechanical expression $L = \sqrt{l(l+1)}\,\hbar$, the ratio of $\mu_l$ to $L$ is found to have the same value that we have obtained. Granting this, the student will accept that the correct quantum mechanical expressions for the magnitude and z component of the orbital magnetic dipole moment are

$$\mu_l = \frac{g_l\mu_b}{\hbar}L = \frac{g_l\mu_b}{\hbar}\sqrt{l(l+1)}\,\hbar = g_l\mu_b\sqrt{l(l+1)} \tag{8-10}$$

and

$$\mu_{l_z} = -\frac{g_l\mu_b}{\hbar}L_z = -\frac{g_l\mu_b}{\hbar}m_l\hbar = -g_l\mu_b m_l \tag{8-11}$$

The minus sign in the last equation reflects the fact that the vector $\boldsymbol{\mu}_l$ is antiparallel to the vector $\mathbf{L}$.

Now we shall remind the student of the behavior of a magnetic dipole of moment $\boldsymbol{\mu}_l$ when it is placed in an applied magnetic field $\mathbf{B}$. In elementary electromagnetic theory it is shown that the dipole will experience a torque

$$\boldsymbol{\tau} = \boldsymbol{\mu}_l \times \mathbf{B} \tag{8-12}$$

tending to align the dipole with the field, and that, associated with this torque, there is a potential energy of orientation

$$\Delta E = -\boldsymbol{\mu}_l \cdot \mathbf{B} \tag{8-13}$$

Example 8-1. Assume that a magnetic dipole, whose moment has magnitude $\mu_l$, is aligned parallel to an external magnetic field, whose strength has magnitude $B$. Take $\mu_l = 1$ Bohr magneton (typical of the magnetic dipole moment of an atom), and $B = 1$ tesla (typical of the field produced by a fairly powerful electromagnet). Calculate the energy required to turn the magnetic dipole so that it is aligned antiparallel to the field.

According to (8-13), the orientational potential energy when the dipole is parallel to the field is $-\mu_l B$, and it is $+\mu_l B$ when the dipole is antiparallel to the field. So the energy that must be supplied to turn the dipole is

$$2\mu_l B = 2 \times 0.927 \times 10^{-23}\ \text{amp-m}^2 \times 1\ \text{joule/amp-m}^2 = 1.85 \times 10^{-23}\ \text{joule} = 1.16 \times 10^{-4}\ \text{eV}$$

Although this energy is very small, even by atomic standards, the dipole cannot turn unless it is supplied the energy.
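The arithmetic of Example 8-1 can be checked with a few lines of code. This is only an illustrative sketch: the constants are the rounded values from the book's table of useful constants, and the function names are invented here for the illustration.

```python
# Rounded constants from the table of useful constants (SI units).
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
HBAR = 1.055e-34       # Planck's constant divided by 2*pi, joule-sec
M_ELECTRON = 9.109e-31 # electron rest mass, kg
JOULE_PER_EV = 1.602e-19


def bohr_magneton():
    """mu_b = e*hbar/(2m), as in (8-7), in amp-m^2 (equivalently joule/tesla)."""
    return E_CHARGE * HBAR / (2 * M_ELECTRON)


def flip_energy_joule(mu, b_field):
    """Energy needed to turn a dipole from parallel to antiparallel to B.

    From (8-13) the orientational energy goes from -mu*B to +mu*B,
    so the difference is 2*mu*B.
    """
    return 2 * mu * b_field


mu_b = bohr_magneton()
energy = flip_energy_joule(mu_b, 1.0)   # B = 1 tesla, as in Example 8-1
print(f"mu_b   = {mu_b:.3e} amp-m^2")
print(f"energy = {energy:.3e} joule = {energy / JOULE_PER_EV:.3e} eV")
```

Because the inputs are rounded to four figures, the last digit of each result may differ slightly from the values quoted in the example.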
Conversely, if the dipole is originally aligned antiparallel to the field, it cannot turn to align itself parallel to the field unless it can get rid of the same amount of energy.

If there is no way for a system, consisting of a magnetic dipole moment $\boldsymbol{\mu}_l$ in a magnetic field $\mathbf{B}$, to dissipate energy, the orientational potential energy $\Delta E$ of the system must remain constant. In these circumstances, $\boldsymbol{\mu}_l$ cannot align itself with $\mathbf{B}$. Instead $\boldsymbol{\mu}_l$ will precess around $\mathbf{B}$ in such a way that the angle between these two vectors remains constant, and that the magnitudes of both vectors remain constant. The precessional motion is a consequence of the fact that, according to (8-9) and (8-12), the torque acting on the dipole is always perpendicular to its angular momentum, in complete analogy to the case of a spinning top. The precession, and its explanation, are illustrated in Figure 8-2. It is easy to show (see the figure caption) that the magnitude of the angular frequency of precession of $\boldsymbol{\mu}_l$ about $\mathbf{B}$ is given by

$$\omega = \frac{g_l\mu_b}{\hbar}B \tag{8-14}$$

This equation also indicates that the sense of the precession is in the direction of $\mathbf{B}$. The phenomenon is known as the Larmor precession, and $\omega$ is called the Larmor frequency.

Figure 8-2 A torque $\boldsymbol{\tau} = \boldsymbol{\mu}_l \times \mathbf{B} = -(g_l\mu_b/\hbar)\,\mathbf{L} \times \mathbf{B}$ arises as the atom's magnetic dipole moment $\boldsymbol{\mu}_l$ interacts with the applied field $\mathbf{B}$. This torque gives rise to a change $d\mathbf{L}$ in the angular momentum during time $dt$, according to a form of Newton's law, $d\mathbf{L}/dt = \boldsymbol{\tau}$. The change $d\mathbf{L}$ causes $\mathbf{L}$ to precess through an angle $\omega\,dt$, where $\omega$ is the precessional angular velocity. From the diagram, we see that $dL = (L\sin\theta)\,\omega\,dt$, or $L\omega\sin\theta = dL/dt = \tau = (g_l\mu_b/\hbar)LB\sin\theta$. So $\omega = g_l\mu_b B/\hbar$, as in (8-14).

Equation (8-14) is obtained from a classical treatment. But a quantum mechanical treatment leads to the same result, in the sense that the expectation values of the components perpendicular to the magnetic field of a quantum mechanical magnetic dipole moment change cyclically in time in the same way as do the actual components perpendicular to the magnetic field of a classical magnetic dipole moment. To simplify the discussion in subsequent sections, we shall frequently speak of the precession of a quantum mechanical magnetic dipole moment in a magnetic field, although to be strictly correct we should speak of the cyclic change in the expectation values of its perpendicular components.

If the applied magnetic field is uniform in space, there will be no net translational force acting on the magnetic dipole (although there is certainly a torque). But if the field is nonuniform, there will be such a translational force (in addition to the torque). What really happens is illustrated in Figure 8-3. This figure shows that an electron moving with velocity $\mathbf{v}$ through a circular orbit, in a region in which the $\mathbf{B}$ field is converging, feels a force proportional to $-\mathbf{v} \times \mathbf{B}$ that always has a component in the direction in which the field becomes more intense. The effect can also be seen via the analogy between a fictitious magnetic dipole in a nonuniform magnetic field, and an electric dipole in a nonuniform electric field, as illustrated in Figure 8-4.

Figure 8-4 Illustrating the forces $\mathbf{F}_N$ and $\mathbf{F}_S$ acting on the poles of a fictitious magnetic dipole, equivalent to the circulating electron of Figure 8-3, located in a region where the applied field $\mathbf{B}$ is converging. Since $\mathbf{F}_N$ is greater in magnitude than $\mathbf{F}_S$, the net force on the dipole is in the direction in which $\mathbf{B}$ becomes more intense. This situation may be familiar to the student in the case in which the fields and dipole moment are electric instead of magnetic.
Using this analogy, it is easy to show that the average force acting on the magnetic dipole is

$$F_z = \frac{\partial B_z}{\partial z}\,\mu_{l_z} \qquad \text{(8-15)}$$

where $z$ is the coordinate axis in the direction of increase of the field strength, and $\partial B_z/\partial z$ is the rate at which it increases. We conclude that a magnetic dipole in a nonuniform magnetic field experiences a torque, which will cause precession, and a force, which will cause displacement.

Figure 8-3 In a region where an applied field $\mathbf{B}$ is converging, an electron moves in a Bohr orbit with velocity $\mathbf{v}$, the field exerting force $\mathbf{F}$ on the electron. Because the electron charge is negative, $\mathbf{F} \propto -\mathbf{v} \times \mathbf{B}$. Regardless of the position of the electron in the orbit, this force has a component that is radially outward and a component in the direction towards which $\mathbf{B}$ becomes more intense. Averaged over the orbit, the radial component cancels, and the average force is in the latter direction (upward).

Figure 8-5 The Stern-Gerlach apparatus. The field between the two magnet pole pieces is indicated by the field lines drawn at the near end of the magnet. The field intensity increases most rapidly in the positive $z$ direction (upward).

8-3 THE STERN-GERLACH EXPERIMENT AND ELECTRON SPIN

In 1922 Stern and Gerlach measured the possible values of the magnetic dipole moment for silver atoms by sending a beam of these atoms through a nonuniform magnetic field. A drawing of their apparatus is shown in Figure 8-5. A beam of neutral atoms is formed by evaporating silver from an oven. The beam is collimated by a diaphragm, and it enters a magnet. The cross-sectional view of the magnet shows that it produces a field that increases in intensity in the $z$ direction defined in the figure, which is also the direction of the magnetic field itself in the region of the beam. As the atoms are neutral overall, the only net force acting on them is the force $F_z$ of (8-15), which is proportional to $\mu_{l_z}$.
Since the force acting on each atom of the beam is proportional to its value of $\mu_{l_z}$, each atom is deflected in passing through the magnetic field by an amount which is proportional to $\mu_{l_z}$. Thus the beam is analyzed into components according to the various values of $\mu_{l_z}$. The deflected atoms strike a metallic plate, upon which they condense and leave a visible trace.

If the orbital magnetic moment vector of the atom has a magnitude $\mu_l$, then in classical physics the $z$ component $\mu_{l_z}$ of this quantity can have any value from $-\mu_l$ to $+\mu_l$. The reason is that classically the atom can have any orientation relative to the $z$ axis, and so this will also be true of its orbital angular momentum and its magnetic dipole moment. The predictions of quantum mechanics, as summarized by (8-11), are that $\mu_{l_z}$ can have only the discretely quantized values

$$\mu_{l_z} = -g_l \mu_b m_l \qquad \text{(8-16a)}$$

where $m_l$ is one of the integers

$$m_l = -l,\ -l+1,\ \ldots,\ 0,\ \ldots,\ +l-1,\ +l \qquad \text{(8-16b)}$$

Thus the classical prediction is that the deflected beam would be spread into a continuous band, corresponding to a continuous distribution of values of $\mu_{l_z}$ from one atom to the next. The quantum mechanical prediction is that the deflected beam would be split into several discrete components. Furthermore, quantum mechanics predicts that this should happen for all orientations of the analyzing magnet. That is, the magnet is essentially acting as a measuring device which investigates the quantization of the component of the magnetic dipole moment along a $z$ axis, which it defines as the direction in which its field increases in intensity most rapidly. Since, according to quantum mechanics, $\mu_{l_z}$ should be quantized for any choice of the $z$ direction because $L_z$ is quantized for any choice of that direction, the same results should be obtained for all positions of the analyzing magnet.
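The quantum mechanical prediction of (8-16a) and (8-16b) can be made concrete with a small sketch that lists the allowed values of $\mu_{l_z}$ for a few values of $l$. The function name is hypothetical, and $g_l = 1$ is assumed for the orbital g factor.

```python
MU_B = 9.27e-24  # Bohr magneton, amp-m^2

def allowed_mu_lz(l, g_l=1.0):
    """Discrete values mu_lz = -g_l * mu_b * m_l for m_l = -l, ..., +l (8-16a,b)."""
    return [-g_l * MU_B * m_l for m_l in range(-l, l + 1)]

for l in (0, 1, 2):
    values = allowed_mu_lz(l)
    # The number of beam components predicted for orbital moments is 2l + 1.
    print(f"l = {l}: {len(values)} components:",
          ", ".join(f"{v:+.2e}" for v in values))
```

For every integer $l$ the list has $2l + 1$ entries and contains the value zero, so the orbital theory by itself always predicts an odd number of components, one of them undeflected.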
Stern and Gerlach found that the beam of silver atoms is split into two discrete components, one component being bent in the positive $z$ direction and the other bent in the negative $z$ direction. Figure 8-6 shows the type of pattern observed on the detecting plate. They also found that these results were obtained independent of the choice of the $z$ direction. The experiment was repeated using several other species of atoms, and in each case investigated it was found that the deflected beam is split into two, or more, discrete components. The results are, qualitatively, very direct experimental proof of the quantization of the $z$ component of the magnetic dipole moments of atoms and, therefore, of their angular momenta. In other words, the experiments showed that the orientation in space of atoms is quantized. The phenomenon is called space quantization.

But the results of the Stern-Gerlach experiment are not quantitatively in agreement with (8-16a) and (8-16b), the equations summarizing the predictions of the theory we have developed. According to these equations, the number of possible values of $\mu_{l_z}$ is equal to the number of possible values of $m_l$, which is $2l + 1$. Since $l$ is an integer, this is always an odd number. Also, for any value of $l$ one of the possible values of $m_l$ is zero. Thus the fact that the beam of silver atoms is split into only two components, both of which are deflected, indicates either that something is wrong with the Schroedinger theory of the atom, or that the theory is incomplete.

The theory is not wrong (we shall see later that atoms do have orbital angular momenta and magnetic dipole moments with the predicted properties); but, as it stands, the Schroedinger theory of the atom is incomplete. This is shown most clearly by an experiment performed in 1927 by Phipps and Taylor, who used the Stern-Gerlach technique on a beam of hydrogen atoms.
The experiment is particularly significant because the atoms contain a single electron, so the theory we have developed makes unambiguous predictions. Since the atoms in the beam are in their ground state because of the relatively low temperature of the oven, the theory predicts that the quantum number $l$ has the value $l = 0$. Then there is only one possible value of $m_l$, namely $m_l = 0$, and we expect that the beam will be unaffected by the magnetic field since $\mu_{l_z}$ will be equal to zero. However, Phipps and Taylor found that the beam is split into two symmetrically deflected components. Thus there is certainly some magnetic dipole moment in the atom which we have not hitherto considered. One possibility is a magnetic dipole moment associated with motion of charges in the nucleus. The magnitude of such a magnetic dipole moment would be of the order of $e\hbar/2M$, where $M$ is the mass of a proton. But the magnetic dipole moment measured experimentally from the size of the splitting is of the order of $\mu_b = e\hbar/2m$, where $m$ is the mass of an electron, which is about 2000 times larger. Therefore, the nucleus cannot be responsible for the observed magnetic dipole moment. Its source must be the electron.

Figure 8-6 The deflection pattern recorded on the detecting plate in a Stern-Gerlach measurement of the $z$ component of the magnetic dipole moment of silver atoms (panels: "Classically predicted" and "Observed"). Maximum deflection occurs at the center of the beam because the atoms there pass through the region of maximum field gradient, $\partial B_z/\partial z$. The observed pattern consists of two discrete components due to space quantization. According to the classical prediction a continuous band would be expected.

This leads us to some reasonable assumptions, which are also supported by other evidence to be discussed shortly. We assume that an electron has an intrinsic (built-in) magnetic dipole moment $\boldsymbol{\mu}_s$, due to the fact that it has an intrinsic angular momentum $\mathbf{S}$ called its spin. From a classical point of view, we can think, at least crudely, of the electron producing the external magnetic field of a magnetic dipole because of the current loops associated with its spinning charge. We also assume that the magnitude $S$ and the $z$ component $S_z$ of the spin angular momentum are related to two quantum numbers, $s$ and $m_s$, by quantization relations which are identical to those for orbital angular momentum. That is

$$S = \sqrt{s(s+1)}\,\hbar \qquad \text{(8-17)}$$

$$S_z = m_s \hbar \qquad \text{(8-18)}$$

(Note that $S_x$ and $S_y$ are not quantized, as is also the case for $L_x$ and $L_y$.) We further assume that the relation between the spin magnetic dipole moment and the spin angular momentum is of the same form as the relation for the orbital case. That is

$$\boldsymbol{\mu}_s = -\frac{g_s \mu_b}{\hbar}\,\mathbf{S} \qquad \text{(8-19)}$$

$$\mu_{s_z} = -g_s \mu_b m_s \qquad \text{(8-20)}$$

The quantity $g_s$ is called the spin g factor.

From the experimental observation that the beam of hydrogen atoms is split into two symmetrically deflected components, it is apparent that $\mu_{s_z}$ can assume just two values, which are equal in magnitude but opposite in sign. If we make the final assumption that the possible values of $m_s$ differ by one and range from $-s$ to $+s$, as is true of the quantum numbers $m_l$ and $l$ for orbital angular momentum, then we can conclude that the two possible values of $m_s$ are

$$m_s = -1/2,\ +1/2 \qquad \text{(8-21)}$$

and that $s$ has the single value

$$s = 1/2 \qquad \text{(8-22)}$$

By measuring the splitting of the beam of hydrogen atoms, it is possible to evaluate the net force $F_z$ they feel while traversing the magnetic field. From analogy to (8-15), and from (8-20), this is $F_z = -(\partial B_z/\partial z)\,\mu_b g_s m_s$. Since $\mu_b$ is known and $\partial B_z/\partial z$ can be measured, the experiments determine the value of the quantity $g_s m_s$. Within their accuracy, it was found that $g_s m_s = \pm 1$. Since we have concluded that $m_s = \pm 1/2$, this implies

$$g_s = 2 \qquad \text{(8-23)}$$

These conclusions are confirmed by many different experiments.
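The spin relations (8-17), (8-18), and (8-20) are easy to evaluate for the electron. The following sketch, with a hypothetical helper name, tabulates $S$, $S_z$, and $\mu_{s_z}$ for $s = 1/2$ and $g_s = 2$. The angle between $\mathbf{S}$ and the $z$ axis is an added illustration computed from $\cos\theta = S_z/S$; it is not a quantity quoted in the text.

```python
import math

HBAR = 1.055e-34  # joule-sec
MU_B = 9.27e-24   # Bohr magneton, amp-m^2

def spin_states(s=0.5, g_s=2.0):
    """Return |S| and a list of (m_s, S_z, mu_sz) tuples from (8-17), (8-18), (8-20)."""
    magnitude = math.sqrt(s * (s + 1)) * HBAR
    states = []
    m_s = -s
    while m_s <= s:                      # m_s runs from -s to +s in steps of one
        states.append((m_s, m_s * HBAR, -g_s * MU_B * m_s))
        m_s += 1.0
    return magnitude, states

magnitude, states = spin_states()
print(f"S = {magnitude / HBAR:.4f} hbar")
for m_s, s_z, mu_sz in states:
    angle = math.degrees(math.acos(s_z / magnitude))  # added illustration only
    print(f"m_s = {m_s:+.1f}  S_z = {s_z / HBAR:+.1f} hbar  "
          f"mu_sz = {mu_sz / MU_B:+.1f} mu_b  angle to z = {angle:.1f} deg")
```

With $g_s = 2$ and $m_s = \pm 1/2$ the two values of $\mu_{s_z}$ are one Bohr magneton in magnitude and opposite in sign, which is the content of $g_s m_s = \pm 1$.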
For instance, in the Zeeman effect a uniform external magnetic field is applied to a collection of atoms, and measurements are made of the potential energies of orientation in the field of the magnetic dipole moments of the atoms. As we shall discuss in detail in Chapter 10, this is done by measuring the splitting of the spectral line emitted when the atoms decay from some higher energy level to their ground state energy level. The splitting of the line occurs because the levels themselves are split according to the different values assumed by the orientational potential energy of the atoms. A simple example is the Zeeman effect for hydrogen atoms. In their ground state these atoms have no orbital angular momentum, and therefore no orbital magnetic dipole moments. But the measurements show that their ground state energy level is split by the applied magnetic field into two components, symmetrically disposed about the energy of the ground state in the absence of a field. This splitting reflects the two possible values of the orientational potential energy

$$\Delta E = -\boldsymbol{\mu}_s \cdot \mathbf{B} = -\mu_{s_z} B = g_s \mu_b m_s B = \pm g_s \mu_b B/2$$

where the $z$ axis is taken in the direction of the applied field. The fact that the level is symmetrically split into two components confirms the conclusion that $m_s = \pm 1/2$, and the measured magnitude of the splitting confirms the conclusion that $g_s = 2$. Recent spectroscopic measurements of Lamb, using a technique of extreme accuracy, actually have shown that $g_s = 2.00232$. However, in almost all situations it is quite adequate to say simply that the spin g factor for an electron is twice as large as its orbital g factor; i.e., that the spin magnetic dipole moment is twice as large, compared to the spin angular momentum, as the orbital magnetic dipole moment is compared to the orbital angular momentum. On the other hand, $\boldsymbol{\mu}_s$ and $\mathbf{S}$ are antiparallel, just like $\boldsymbol{\mu}_l$ and $\mathbf{L}$, because the relative orientation of either pair of vectors depends only on the fact that the electron has a negative charge.

Example 8-2. A beam of hydrogen atoms, emitted from an oven running at a temperature $T = 400$°K, is sent through a Stern-Gerlach magnet of length $X = 1$ m. The atoms experience a magnetic field with a gradient of 10 tesla/m. Calculate the transverse deflection of a typical atom in each component of the beam, due to the force exerted on its spin magnetic dipole moment, at the point where the beam leaves the magnet.

At this temperature, the atoms are in their ground state and have no orbital angular momentum or orbital magnetic dipole moment. They typically have kinetic energy $2kT$, where $k$ is Boltzmann's constant. (The kinetic theory shows that while the atoms in the oven typically have kinetic energy $(3/2)kT$, the atoms emitted in the beam typically have kinetic energy $2kT$. The reason is that the more energetic atoms hit the walls of the oven more frequently and thus have a higher probability of impinging on the hole in the wall through which the beam is emitted.) From (8-15) and (8-20), they experience a transverse force

$$F_z = -\frac{\partial B_z}{\partial z}\,\mu_b g_s m_s$$

Since $g_s m_s = \pm 1$, this is

$$F_z = \pm\frac{\partial B_z}{\partial z}\,\mu_b$$

The typical longitudinal velocity $v_x$ of an atom of mass $M$ in traveling through the magnet can be evaluated by setting

$$\tfrac{1}{2} M v_x^2 = 2kT$$

So

$$v_x = \sqrt{\frac{4kT}{M}}$$

Thus the time $t$ the atom experiences the transverse force in traveling through the magnet of length $X$ is

$$t = \frac{X}{v_x} = X\sqrt{\frac{M}{4kT}}$$

Because of the force they have a transverse acceleration $a_z = F_z/M$, and so suffer a transverse deflection

$$Z = \frac{1}{2} a_z t^2 = \frac{F_z}{2M}\,\frac{X^2 M}{4kT} = \pm\frac{(\partial B_z/\partial z)\,\mu_b X^2}{8kT}$$

$$= \pm\frac{10\ \text{tesla/m} \times 0.927 \times 10^{-23}\ \text{amp-m}^2 \times (1\ \text{m})^2}{8 \times 1.38 \times 10^{-23}\ \text{joule/°K} \times 400\text{°K}} = \pm 2.1 \times 10^{-3}\ \text{m}$$

The separation of the two components is about half a centimeter, which is quite easy to observe.
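The arithmetic of Example 8-2 can be checked with a short sketch. It evaluates $Z = (\partial B_z/\partial z)\,\mu_b X^2/8kT$ using the values given in the example; the function and parameter names are hypothetical.

```python
K_BOLTZMANN = 1.381e-23  # joule/°K
MU_B = 9.27e-24          # Bohr magneton, amp-m^2

def transverse_deflection(gradient, length, temperature, gs_ms=1.0):
    """Deflection Z = gs_ms * (dBz/dz) * mu_b * X^2 / (8 k T).

    The atomic mass cancels: a heavier atom accelerates less but, at the
    same kinetic energy 2kT, spends longer in the magnet.
    """
    return gs_ms * gradient * MU_B * length**2 / (8 * K_BOLTZMANN * temperature)

# Values from Example 8-2: 10 tesla/m gradient, 1 m magnet, 400 °K oven.
Z = transverse_deflection(gradient=10.0, length=1.0, temperature=400.0)
print(f"deflection of each component: {Z * 1e3:.2f} mm")
print(f"separation of the components: {2 * Z * 1e3:.2f} mm")
```

Since the deflection scales as $X^2$ and as $1/T$, a longer magnet or a cooler oven is the most direct way to increase the splitting for a given field gradient.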
The idea of electron spin was introduced some time before the work of Phipps and Taylor. In the final sentence of a research paper on the scattering of x rays by atoms, published in 1921, Compton had written, "May I then conclude that the electron itself, spinning like a tiny gyroscope, is probably the ultimate magnetic particle." This was really more of a speculation than a conclusion, and Compton apparently never followed it further.

Credit for the introduction of electron spin is generally given to Goudsmit and Uhlenbeck. In 1925, as graduate students, they were trying to understand why certain lines of the optical spectra of hydrogen and the alkali atoms are composed of a closely spaced pair of lines. This is the fine structure, which had been treated by Sommerfeld in terms of the Bohr model as due to a splitting of the atomic energy levels because of a small (about one part in $10^4$) contribution to the total energy resulting from the relativistic variation of electron mass with velocity (see Section 4-10). The results of Sommerfeld were in good numerical agreement with the observed fine structure of hydrogen. But the situation was not so satisfactory for the alkalis. In these atoms the electron responsible for the optical spectrum would be expected to move in a Bohr-like orbit of large radius at low velocity, so the relativistic variation of mass would be expected to be small. However, the fine structure splitting was observed to be very much larger than in hydrogen. Consequently, doubt arose concerning the validity of Sommerfeld's explanation of the origin of fine structure. In considering other possibilities, Goudsmit and Uhlenbeck proposed that an electron has an intrinsic angular momentum and magnetic dipole moment, whose $z$ components are specified by a fourth quantum number $m_s$, which can assume either of two values, $-1/2$ and $+1/2$.
The splitting of the atomic energy levels could then be understood as due to a potential energy of orientation of the magnetic dipole moment of the electron in the magnetic field that is present in the atom because it contains moving charged particles. The energy of orientation would be either positive or negative depending on the sign of $m_s$, i.e., depending on whether the spin is "up" or "down" relative to the direction of the internal magnetic field of the atom. (This should not be confused with the previously mentioned Zeeman effect, which involves the splitting of energy levels of an atom due to the orientational potential energy of its magnetic dipole moment in an external magnetic field applied to the atom.)

Uhlenbeck has described the circumstances as follows:

"Goudsmit and myself hit upon this idea by studying a paper of Pauli, in which the famous exclusion principle (to be treated in Chapter 9) was formulated and in which, for the first time, four quantum numbers were ascribed to the electron. This was done rather formally; no concrete picture was connected with it. To us this was a mystery. We were so conversant with the proposition that every quantum number corresponds to a degree of freedom (an independent coordinate), and on the other hand with the idea of a point electron, which obviously had three degrees of freedom only, that we could not place the fourth quantum number. We could understand it only if the electron was assumed to be a small sphere that could rotate... . Somewhat later we found in a paper of Abraham, to which Ehrenfest drew our attention, that for a rotating sphere with surface charge the necessary factor two in the magnetic moment ($g_s = 2$) could be understood classically. This encouraged us, but our enthusiasm was considerably reduced when we saw that the rotational velocity at the surface of the electron had to be many times the velocity of light!
I remember that most of these thoughts came to us on an afternoon at the end of September 1925. We were excited, but we had not the slightest intention of publishing anything. It seemed so speculative and bold, that something ought to be wrong with it, especially since Bohr, Heisenberg, and Pauli, our great authorities, had never proposed anything of the kind. But of course we told Ehrenfest. He was impressed at once, mainly, I feel, because of the visual character of our hypothesis, which was very much in his line. He called our attention to several points, e.g., to the fact that in 1921 A. H. Compton already had suggested the idea of a spinning electron as a possible explanation of the natural unit of magnetism, and finally said that it was either highly important or nonsense, and that we should write a short note for Naturwissenschaften (a physics research journal) and give it to him. He ended with the words 'and then we will ask Lorentz.' This was done. Lorentz received us with his well known great kindness, and he was very much interested, although, I feel, somewhat skeptical too. He promised to think it over. And in fact, already next week he gave us a manuscript, written in his beautiful handwriting, containing long calculations on the electromagnetic properties of rotating electrons. We could not fully understand it, but it was quite clear that the picture of the rotating electron, if taken seriously, would give rise to serious difficulties. For one thing, the magnetic energy would be so large that by the equivalence of mass and energy the electron would have a larger mass than the proton, or, if one sticks to the known mass, the electron would be bigger than the whole atom! In any case, it seemed to be nonsense. Goudsmit and myself both felt that it might be better for the present not to publish anything; but when we said this to Ehrenfest, he answered: 'I have already sent your letter in long ago; you are both young enough to allow yourselves some foolishness!' " (from The Conceptual Development of Quantum Mechanics by Max Jammer, McGraw-Hill, 1966)

The most recent experimental evidence indicates that the electron is a point particle, and certainly not "bigger than the whole atom." One set of experiments studies the scattering of electrons by electrons at very high kinetic energies. If these objects had appreciable extent in space, in collisions which were so close that they overlap, the force acting between them would be modified just as in the close collision of an $\alpha$ particle and a nucleus. It was found that the electrons always act like two point objects, with charge $-e$ and magnetic dipole moment $\boldsymbol{\mu}_s$, even in the closest collisions investigated. Thus electrons have an extent less than this collision distance, which is about $10^{-16}$ m. In comparison to the dimensions of an atom ($10^{-10}$ m), or even the dimensions of a nucleus ($10^{-14}$ m), electrons have negligible dimensions.

Although the electron seems to be a point particle, four quantum numbers are required to specify its quantum states. The first three arise because three independent coordinates are required to describe its location in three-dimensional space. The fourth arises because it is also necessary to describe the orientation in space of its spin, which can be either "up" or "down" relative to some $z$ axis. For a classical point particle, there is room only for the first three quantum numbers. But the electron is not a classical particle.

Schroedinger quantum mechanics is completely compatible with the existence of electron spin; but it does not predict it, so spin must be introduced as a separate postulate. The reason for this is that the theory is an approximation which ignores relativistic effects. The student will recall that the theory is based on the nonrelativistic energy equation, $E = p^2/2m + V$. The student may also recall reading in Chapter 5 brief mention of the fact that Dirac developed a relativistic theory of quantum mechanics in 1928. Using the same postulates as the Schroedinger theory, but replacing the energy equation by its relativistic form $E = (c^2p^2 + m_0^2c^4)^{1/2} + V$, Dirac showed that an electron must have an intrinsic $s = 1/2$ angular momentum, an intrinsic magnetic dipole moment with a g factor of 2, and all the other properties we have stated previously. This was a great triumph for relativity theory; it put electron spin on a firm theoretical foundation and showed that electron spin is intimately connected with relativity. A quantitative treatment of the Dirac theory would, unfortunately, be beyond the level of this book, but we shall from time to time describe qualitatively its results.

Another aspect of the nonclassical character of spin can be seen by noting that the quantum number $s$, which specifies the magnitude of the spin angular momentum $S$, has the fixed value 1/2. Therefore, we cannot take $S$ to the classical limit by letting $s \to \infty$, as we did in Section 7-8 for the magnitude of the orbital angular momentum $L$ by letting its quantum number $l \to \infty$. An equivalent statement is that in the classical limit the magnitude of $S$ is completely negligible because $\hbar$ is so small, so spin is essentially nonclassical. This being the case, it is sometimes more harmful than helpful to think of spin in terms of a classical model like a small spinning sphere; but it must be admitted that it is difficult to avoid thinking in such terms.

8-4 THE SPIN-ORBIT INTERACTION

Although spin itself is subtle, there is nothing subtle about many of the effects it produces. Perhaps the most important is that it doubles the number of electrons which the "exclusion principle" allows to populate the quantum states of multielectron atoms.
When we study this effect in Chapter 10, we shall see that the ground states of atoms would be very much altered if electrons did not have spin. This would have profound consequences on the periodic properties of atoms, and therefore on all of chemistry and solid state physics.

In the present section we shall study the interaction between an electron's spin magnetic dipole moment and the internal magnetic field of a one-electron atom. Since the internal magnetic field is related to the electron's orbital angular momentum, this is called the spin-orbit interaction. It is a relatively weak interaction which is responsible, in part, for the fine structure of the excited states of one-electron atoms. The spin-orbit interaction also occurs in multielectron atoms, but in such atoms it is reasonably strong because the internal magnetic fields are very strong. Furthermore, an effect completely analogous to the spin-orbit interaction occurs in nuclei. The nuclear spin-orbit interaction is so strong that it governs the periodic properties of nuclei.

The origin of the internal magnetic field experienced by an electron moving in a one-electron atom is easy to understand if we consider the motion of the nucleus from the point of view of the electron. In a frame of reference fixed on the electron, the charged nucleus moves around the electron and the electron is, in effect, located inside a current loop which produces the magnetic field. The argument is illustrated qualitatively in Figure 8-7.

Figure 8-7 Left: An electron moves in a circular Bohr orbit, the motion as seen by the nucleus. Right: The same motion, but as seen by the electron. From the point of view of the electron, the nucleus moves around it. The magnetic field $\mathbf{B}$ experienced by the electron is in the direction out of the page at the electron's location.

To make the argument quantitative, we note that the charged nucleus moving with velocity $-\mathbf{v}$ constitutes a current element $\mathbf{j}$, where

$$\mathbf{j} = -Ze\mathbf{v}$$

According to Ampere's law, this produces a magnetic field $\mathbf{B}$ which, at the position of the electron, is

$$\mathbf{B} = \frac{\mu_0}{4\pi}\,\frac{\mathbf{j} \times \mathbf{r}}{r^3} = -\frac{Ze\mu_0}{4\pi}\,\frac{\mathbf{v} \times \mathbf{r}}{r^3}$$

It is convenient to express this in terms of the electric field $\mathbf{E}$ acting on the electron. According to Coulomb's law

$$\mathbf{E} = \frac{Ze}{4\pi\epsilon_0}\,\frac{\mathbf{r}}{r^3}$$

From the last two equations, we have

$$\mathbf{B} = -\epsilon_0\mu_0\,\mathbf{v} \times \mathbf{E}$$

or

$$\mathbf{B} = -\frac{1}{c^2}\,\mathbf{v} \times \mathbf{E} \qquad \text{(8-24)}$$

since $c = 1/\sqrt{\epsilon_0\mu_0}$. The quantity $\mathbf{B}$ is the magnetic field strength experienced by the electron when it is moving with velocity $\mathbf{v}$ relative to the nucleus, and therefore through the electric field of strength $\mathbf{E}$ which the nucleus exerts on it. Equation (8-24) is actually of very general validity, and it can be derived from relativistic considerations.

The electron and its spin magnetic dipole moment can assume different orientations in the internal magnetic field of the atom, and its potential energy is different for each of these orientations. If we evaluate the orientational potential energy of the magnetic dipole moment in this magnetic field, from an equation analogous to (8-13), we have

$$\Delta E = -\boldsymbol{\mu}_s \cdot \mathbf{B}$$

Using (8-19), this can be written in terms of the electron's spin angular momentum $\mathbf{S}$ as

$$\Delta E = \frac{g_s\mu_b}{\hbar}\,\mathbf{S} \cdot \mathbf{B}$$

But this energy has been evaluated in a frame of reference in which the electron is at rest, whereas we are interested in the energy as measured in the normal frame of reference in which the nucleus is at rest. Because of an effect of the relativistic transformation of velocities, called the Thomas precession, the transformation back to the nuclear rest frame results in a reduction of the orientational potential energy by a factor of 2. Thus, the spin-orbit interaction energy is

$$\Delta E = \frac{1}{2}\,\frac{g_s\mu_b}{\hbar}\,\mathbf{S} \cdot \mathbf{B} \qquad \text{(8-25)}$$

The transformation leading to the factor of 2 is interesting, but rather complicated, so we shall not carry it out here. (It is carried out in Appendix O.)

We shall find it convenient to express (8-25) in terms of $\mathbf{S} \cdot \mathbf{L}$, the scalar, or dot, product of the spin and orbital angular momentum vectors. To this end, we use, in (8-24), the relation

$$-e\mathbf{E} = \mathbf{F}$$

between the electric field $\mathbf{E}$ and the force $\mathbf{F}$ acting on the electron of charge $-e$. We also use the relation

$$\mathbf{F} = -\frac{dV(r)}{dr}\,\frac{\mathbf{r}}{r}$$

between the force and the potential. (The term $\mathbf{r}/r$ is a unit vector in the radial direction which gives $\mathbf{F}$ its proper direction.) With these relations, (8-24) becomes

$$\mathbf{B} = -\frac{1}{ec^2}\,\frac{1}{r}\,\frac{dV(r)}{dr}\,\mathbf{v} \times \mathbf{r}$$

Multiplying and dividing by the electron mass $m$ allows us to write this in terms of the orbital angular momentum, $\mathbf{L} = \mathbf{r} \times m\mathbf{v} = -m\mathbf{v} \times \mathbf{r}$, as

$$\mathbf{B} = \frac{1}{emc^2}\,\frac{1}{r}\,\frac{dV(r)}{dr}\,\mathbf{L} \qquad \text{(8-26)}$$

Note that the strength of the magnetic field $\mathbf{B}$, experienced by the electron because it is moving about the nucleus with orbital angular momentum $\mathbf{L}$, is proportional to the magnitude of $\mathbf{L}$, and also that the magnetic field vector is in the same direction as the angular momentum vector. With this result, we can express the spin-orbit interaction energy, (8-25), as

$$\Delta E = \frac{g_s\mu_b}{2emc^2\hbar}\,\frac{1}{r}\,\frac{dV(r)}{dr}\,\mathbf{S} \cdot \mathbf{L}$$

Evaluating $g_s$ and $\mu_b$, we obtain

$$\Delta E = \frac{1}{2m^2c^2}\,\frac{1}{r}\,\frac{dV(r)}{dr}\,\mathbf{S} \cdot \mathbf{L} \qquad \text{(8-27)}$$

This equation was first derived in 1926 by Thomas, using as we have a combination of the Bohr model, Schroedinger quantum mechanics, and relativistic kinematics. However, it is in complete agreement with the results of the relativistic quantum mechanics of Dirac. It is important in the theory of multielectron atoms as well as of one-electron atoms. Furthermore, a similar equation is central to the understanding of the theory of the structure of nuclei, as we shall see later in the book.

Example 8-3. Estimate the magnitude of the orientational potential energy $\Delta E$ for the $n = 2$, $l = 1$ state of the hydrogen atom, to check whether it is of the same order of magnitude as the observed fine-structure splitting of the corresponding energy level.
(There is no spin-orbit energy in the $n = 1$ state, since for $n = 1$ the only possible value for $l$ is $l = 0$, which means $L = 0$.)

The potential is

$$V(r) = -\frac{e^2}{4\pi\epsilon_0}\,\frac{1}{r}$$

So

$$\frac{dV(r)}{dr} = \frac{e^2}{4\pi\epsilon_0 r^2}$$

and

$$\Delta E = \frac{e^2}{4\pi\epsilon_0\,2m^2c^2}\,\frac{1}{r^3}\,\mathbf{S} \cdot \mathbf{L}$$

The magnitude of $\mathbf{S} \cdot \mathbf{L}$ is approximately $\hbar^2$ since each of these angular momentum vectors has a magnitude of approximately $\hbar$. The expectation value of $1/r^3$ for the $n = 2$ state is approximately $1/(3a_0)^3$. Thus

$$|\Delta E| \simeq \frac{e^2}{4\pi\epsilon_0\,2m^2c^2}\,\frac{\hbar^2}{3^3a_0^3} = \frac{e^2\hbar^2}{4\pi\epsilon_0\,2m^2c^2\,3^3}\,\frac{m^3e^6}{(4\pi\epsilon_0)^3\hbar^6} = \frac{me^8}{54 \times (4\pi\epsilon_0)^4c^2\hbar^4}$$

$$\simeq \frac{(9 \times 10^9\ \text{nt-m}^2/\text{coul}^2)^4 \times 9 \times 10^{-31}\ \text{kg} \times (1.6 \times 10^{-19}\ \text{coul})^8}{54 \times (3 \times 10^8\ \text{m/sec})^2 \times (1.1 \times 10^{-34}\ \text{joule-sec})^4} \sim 10^{-23}\ \text{joule} \sim 10^{-4}\ \text{eV}$$

Since $\mathbf{S} \cdot \mathbf{L}$ can be either positive or negative, depending on the relative orientation of the two vectors, the energy level is split by roughly $2 \times 10^{-4}$ eV. Comparing this with the energy of the $n = 2$, $l = 1$ level of hydrogen, $E_2 = -3.4$ eV, we see that the ratio of the predicted energy splitting to the energy itself, $|\Delta E/E|$, is about one part in $10^4$. This is in reasonable agreement with the splitting required to explain the fine structure of the lines of the hydrogen spectrum associated with this level, as discussed in Section 4-10, and therefore it provides some confirmation of the theory we have developed. A more detailed comparison of the theory with experiment will be made shortly.

8-5 TOTAL ANGULAR MOMENTUM

If there were no spin-orbit interaction, the orbital and spin angular momenta $\mathbf{L}$ and $\mathbf{S}$ of an atomic electron would be independent of each other. That is, when an atom without spin-orbit interaction is in free space there would be no torques acting on either $\mathbf{L}$ or $\mathbf{S}$, so both of these vectors are equally likely to be found anywhere on cones surrounding the $z$ axis, with the orientation of one vector unrelated to the orientation of the other. (The vector $\mathbf{S}$ is found with equal likelihood anywhere on such a cone, just as is true of the vector $\mathbf{L}$, because $\overline{S_x} = \overline{S_y} = 0$, just as $\overline{L_x} = \overline{L_y} = 0$.)
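Returning to the estimate of Example 8-3, the following sketch evaluates the same expression numerically, once step by step from (8-27) and once from the closed form $me^8/[54(4\pi\epsilon_0)^4c^2\hbar^4]$. The approximations $|\mathbf{S} \cdot \mathbf{L}| \approx \hbar^2$ and $\overline{1/r^3} \approx 1/(3a_0)^3$ are those of the example; the function names are hypothetical. With unrounded constants the result comes out somewhat below $10^{-23}$ joule, which agrees with the quoted figure only as an order-of-magnitude estimate.

```python
COULOMB_K = 8.988e9   # 1/(4*pi*epsilon_0), nt-m^2/coul^2
M_E = 9.109e-31       # electron rest mass, kg
E_CHARGE = 1.602e-19  # electron charge magnitude, coul
C_LIGHT = 2.998e8     # speed of light, m/sec
HBAR = 1.055e-34      # joule-sec
JOULE_PER_EV = 1.602e-19

def spin_orbit_estimate():
    """|Delta E| from (8-27) with |S.L| ~ hbar^2 and <1/r^3> ~ 1/(3 a0)^3."""
    a0 = HBAR**2 / (M_E * COULOMB_K * E_CHARGE**2)  # Bohr radius
    inv_r3 = 1.0 / (3.0 * a0) ** 3
    # (1/r) dV/dr = k e^2 / r^3 for the Coulomb potential V = -k e^2 / r
    return COULOMB_K * E_CHARGE**2 / (2 * M_E**2 * C_LIGHT**2) * inv_r3 * HBAR**2

def spin_orbit_closed_form():
    """Same estimate written as m e^8 k^4 / (54 c^2 hbar^4)."""
    return M_E * E_CHARGE**8 * COULOMB_K**4 / (54 * C_LIGHT**2 * HBAR**4)

dE = spin_orbit_estimate()
print(f"|Delta E| ~ {dE:.2e} joule = {dE / JOULE_PER_EV:.2e} eV")
print(f"ratio to |E_2| = 3.4 eV: {dE / JOULE_PER_EV / 3.4:.1e}")
```

The two functions should agree to rounding error, which is a useful check that the algebra leading to the closed form has been carried through correctly.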
The vectors do, however, have the fixed magnitudes and $z$ components $L$, $L_z$, $S$, $S_z$. These fixed values are the ones specified by the quantum numbers $l$, $m_l$, $s$, $m_s$.

However, there is a spin-orbit interaction. That is, a strong internal magnetic field is acting on the atomic electron, the orientation of which is determined by $\mathbf{L}$, and produces a torque on its spin magnetic dipole moment, the orientation of which is determined by $\mathbf{S}$. As in the case of the Larmor precession of Section 8-2, the torque will not change the magnitude of $\mathbf{S}$. Nor will the reaction torque acting on $\mathbf{L}$ change its magnitude. But the torque does enforce a coupling between $\mathbf{L}$ and $\mathbf{S}$ which makes them undergo a precessional motion with the orientation of each dependent on the orientation of the other. They precess around their sum, instead of lying in cones symmetrical about the $z$ axis. Since these vectors are not constrained to be found in cones that have $z$-axis symmetry, their $z$ components, $L_z$ and $S_z$, do not have fixed values when there is a spin-orbit interaction.

The situation is illustrated in Figure 8-8, which shows $\mathbf{L}$ and $\mathbf{S}$ precessing due to the spin-orbit interaction coupling. Their motion is involved, but not as involved as it might be because they must move in such a way that their sum, the total angular momentum $\mathbf{J}$, has a simple behavior. That is, if the atom is in free space so that no external torques act on it, its total angular momentum

$$\mathbf{J} = \mathbf{L} + \mathbf{S} \qquad \text{(8-28)}$$

maintains a fixed magnitude $J$ and a fixed $z$ component $J_z$. The vectors $\mathbf{L}$ and $\mathbf{S}$ precess around their sum $\mathbf{J}$, and their components in the direction of $\mathbf{J}$ remain fixed so that its magnitude $J$ is fixed. Also, $\mathbf{J}$ has a fixed component $J_z$ since it can be found with equal probability anywhere on a cone symmetrical about the $z$ axis. As we continue our studies of atoms, we shall find the total angular momentum to be quite useful because of the simple behavior of its magnitude and $z$ component.
This is particularly so in the case of multielectron atoms, where the many orbital and spin angular momenta that compose the total angular momentum have very complicated behaviors.

Example 8-4. Estimate the magnitude of the magnetic field B acting on the spin magnetic dipole moment of the electron in Example 8-3.

From an equation analogous to (8-13), we have

$$\Delta E = -\boldsymbol{\mu}_s\cdot\mathbf{B}$$

So

$$|\Delta E| \sim \mu_s B$$

where

$$\mu_s \sim \mu_b \sim 10^{-23}\ \text{amp-m}^2$$

Therefore

$$B \sim \frac{10^{-23}\ \text{joule}}{10^{-23}\ \text{amp-m}^2} \sim 1\ \text{tesla}$$

This is about equal to the field produced by an electromagnet operating at the limit at which its iron core saturates. We see that the electron's spin magnetic dipole moment feels a strong magnetic field because it is moving at a high velocity through the strong electric field surrounding the nucleus.

Figure 8-8 The angular momentum vectors L, S, and J for a typical case of a state with $l = 2$, $j = 5/2$, $m_j = 3/2$. The vectors L and S precess uniformly about their sum J, and J can be found anywhere on the cone symmetrical about the z axis.

By using techniques closely related to those we used in Section 7-8 to study the properties of the orbital angular momentum, it can be shown that the magnitude and z component of the total angular momentum J are specified by two quantum numbers $j$ and $m_j$, according to the usual quantization conditions

$$J = \sqrt{j(j+1)}\,\hbar \tag{8-29}$$

and

$$J_z = m_j\hbar \tag{8-30}$$

The possible values of the quantum number $m_j$ are, as would be expected,

$$m_j = -j,\ -j+1,\ \ldots,\ +j-1,\ +j \tag{8-31}$$

We may determine the possible values of the quantum number $j$ by taking the z component of (8-28), which defines J. This gives

$$J_z = L_z + S_z$$

Now, in the absence of the spin-orbit interaction, $L_z$ and $S_z$ would satisfy the quantization conditions $L_z = m_l\hbar$ and $S_z = m_s\hbar$. And in such a situation it would still be possible to define $\mathbf{J} = \mathbf{L} + \mathbf{S}$, and its z component would still satisfy the quantization condition $J_z = m_j\hbar$.
So if there were no spin-orbit interaction we could write

$$m_j\hbar = m_l\hbar + m_s\hbar$$

or

$$m_j = m_l + m_s$$

Since the maximum possible value of $m_l$ is $l$, and the maximum possible value of $m_s$ is $s = 1/2$, the maximum possible value of $m_j$ is

$$(m_j)_{\max} = l + 1/2 \tag{8-32}$$

Even though there actually is a spin-orbit interaction, (8-32) is valid. The reason is that angular momentum conservation prevents any interaction internal to the isolated atom from changing the z component of its total angular momentum. Hence the spin-orbit interaction cannot change the restriction on that quantity imposed by (8-32). According to (8-31), the maximum possible value of $m_j$ is also the maximum possible value of $j$. In common with the other angular momentum quantum numbers, the possible values of $j$ differ by integers. Therefore these values must be members of the decreasing series

$$j = l + 1/2,\ l - 1/2,\ l - 3/2,\ l - 5/2,\ \ldots$$

Figure 8-9 Vector diagrams which show that for any two vectors L and S the magnitude $|\mathbf{L} + \mathbf{S}|$ of their sum is always at least as large as the magnitude of the difference in their magnitudes, $\big||\mathbf{L}| - |\mathbf{S}|\big|$. The case for which $|\mathbf{L}| > |\mathbf{S}|$ is shown; the student can show in his own diagram that the conclusion is unaltered if $|\mathbf{L}| < |\mathbf{S}|$.

To determine where the series terminates, we may use the vector inequality

$$|\mathbf{L} + \mathbf{S}| \ge \big||\mathbf{L}| - |\mathbf{S}|\big|$$

whose validity the student may easily demonstrate by inspecting Figure 8-9. Writing $\mathbf{L} + \mathbf{S}$ as $\mathbf{J}$, we have from the above inequality

$$|\mathbf{J}| \ge \big||\mathbf{L}| - |\mathbf{S}|\big|$$

or

$$\sqrt{j(j+1)}\,\hbar \ge \left|\sqrt{l(l+1)}\,\hbar - \sqrt{s(s+1)}\,\hbar\right|$$

From this it can be shown with no difficulty that since $s = 1/2$ there are generally two members of the series which satisfy the inequality.
These are

$$j = l + 1/2,\ l - 1/2 \tag{8-33a}$$

It is even more apparent that if $l = 0$ there is only one possible value of $j$, namely

$$j = 1/2 \qquad \text{if } l = 0 \tag{8-33b}$$

The content of the equations stating the possible values of the quantum numbers $m_j$ and $j$ can be represented in terms of the rules of vector addition, by constructing a set of vectors whose lengths are proportional to the values of the quantum numbers $l$, $s$, and $j$. This is illustrated in the following example.

Example 8-5. Enumerate the possible values of the quantum numbers $j$ and $m_j$, for states in which $l = 2$ and, of course, $s = 1/2$.

According to (8-33a), the two possible values of $j$ are 5/2 and 3/2. According to (8-31), for $j = 5/2$ the possible values of $m_j$ are $-5/2, -3/2, -1/2, 1/2, 3/2, 5/2$. The same equation states that for $j = 3/2$ the possible values of $m_j$ are $-3/2, -1/2, 1/2, 3/2$. Vector diagrams for this case are shown in Figure 8-10. Inspection should make their interpretation obvious.

Vector diagrams of the type shown in Figure 8-10 represent only the rules for adding the quantum numbers $l$ and $s$ to obtain the possible values of the quantum numbers $j$ and $m_j$. If the relation between the magnitude of an angular momentum vector, such as L, and its associated quantum number were $L = l\hbar$, instead of $L = \sqrt{l(l+1)}\,\hbar$, these diagrams would also represent the addition of the angular momenta L and S to obtain the angular momentum J and its z component $J_z$. Since this relation is approximately valid, such diagrams are sometimes used in discussions of atomic structure as a simplified description of the addition of the angular momentum vectors themselves. The description is another form of the vector model. The description is useful, but it must be remembered that it is only approximate. An accurate description of the behavior of the angular momenta would have an appearance similar to that previously shown in Figure 8-8, which illustrates the angular momentum vectors for the case $l = 2$, $j = 5/2$, $m_j = 3/2$.
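The counting rules of (8-31), (8-33a), and (8-33b) can be checked with a short script. What follows is only a minimal sketch: the function names are illustrative and do not come from the text, and the loop from $l + s$ down to $|l - s|$ is the general addition rule, assumed here, which reduces to (8-33a) and (8-33b) when $s = 1/2$. The last function tests the inequality $\sqrt{j(j+1)} \ge |\sqrt{l(l+1)} - \sqrt{s(s+1)}|$ that terminates the series.

```python
from fractions import Fraction
from math import sqrt


def allowed_j(l, s=Fraction(1, 2)):
    """Possible values of j for given l and s, stepping down by integers."""
    l, s = Fraction(l), Fraction(s)
    j = l + s                # maximum value, as in (8-32)
    j_min = abs(l - s)       # series terminates here
    values = []
    while j >= j_min:
        values.append(j)
        j -= 1
    return values


def allowed_mj(j):
    """Possible values m_j = -j, -j+1, ..., +j, as in (8-31)."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def magnitude_over_hbar(q):
    """Magnitude sqrt(q(q+1)) of an angular momentum, in units of hbar."""
    q = float(q)
    return sqrt(q * (q + 1.0))


def satisfies_inequality(j, l, s=Fraction(1, 2)):
    """True if |J| >= ||L| - |S|| for the quantum numbers j, l, s."""
    return magnitude_over_hbar(j) >= abs(
        magnitude_over_hbar(l) - magnitude_over_hbar(s)
    )


if __name__ == "__main__":
    for j in allowed_j(2):
        print("j =", j, " m_j =", [str(m) for m in allowed_mj(j)])
```

For $l = 2$ the script lists the six values of $m_j$ for $j = 5/2$ and the four values for $j = 3/2$ found in Example 8-5, and `satisfies_inequality` rejects $j = 1/2$ for $l = 2$, which is why the series stops at $j = l - 1/2$.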
Figure 8-10 Vector diagrams representing the rules for adding the quantum numbers $l = 2$ and $s = 1/2$ to obtain the possible values for the quantum numbers $j$ and $m_j$. Left: The maximum possible value of $j$ is obtained when a vector of magnitude $l$ is added to a parallel vector of magnitude $s$, yielding $j = l + s = 2 + 1/2 = 5/2$. The maximum possible z component of this vector gives the maximum possible value of the quantum number $m_j$, and the minimum possible z component gives the minimum possible value of $m_j$. The intermediate values of $m_j$ differ by integers. Thus the possible values are $m_j = -5/2, -3/2, -1/2, 1/2, 3/2, 5/2$. Right: A vector of magnitude $l = 2$ is added to an antiparallel vector of magnitude $s = 1/2$ to yield a vector of magnitude $j = l - s = 2 - 1/2 = 3/2$, which represents the minimum possible value of the quantum number $j$. The possible z components of the vector of magnitude $j = 3/2$, which differ in value by integers, correspond to the possible values $m_j = -3/2, -1/2, 1/2, 3/2$. There are no values of $j$ intermediate between 5/2 and 3/2 since its possible values also may differ only by integers. Note that these diagrams do not accurately represent the addition of the angular momenta associated with the quantum numbers.

8-6 SPIN-ORBIT INTERACTION ENERGY AND THE HYDROGEN ENERGY LEVELS

In the first part of this section we shall obtain an expression for the spin-orbit interaction energy in terms of the potential function $V(r)$ and the quantum numbers $l$, $s$, and $j$. In the second part we shall explain how the expression is used to predict the detailed structure of the energy levels of the hydrogen atom.
The expression for the spin-orbit interaction energy will also enter, on several occasions, into our subsequent discussion of multielectron atoms, and it will enter into our discussion of nuclei, since they have very strong spin-orbit interactions.

According to (8-27), the spin-orbit interaction energy is

$$\Delta E = \frac{1}{2m^2c^2}\,\frac{1}{r}\,\frac{dV(r)}{dr}\,\mathbf{S}\cdot\mathbf{L}$$

To express this in terms of $l$, $s$, and $j$, we first write

$$\mathbf{J} = \mathbf{L} + \mathbf{S}$$

Taking the dot product of this equality times itself, and employing the fact that $\mathbf{L}\cdot\mathbf{S} = \mathbf{S}\cdot\mathbf{L}$, we have

$$\mathbf{J}\cdot\mathbf{J} = \mathbf{L}\cdot\mathbf{L} + \mathbf{S}\cdot\mathbf{S} + 2\,\mathbf{S}\cdot\mathbf{L}$$

So

$$\mathbf{S}\cdot\mathbf{L} = (\mathbf{J}\cdot\mathbf{J} - \mathbf{L}\cdot\mathbf{L} - \mathbf{S}\cdot\mathbf{S})/2$$

or

$$\mathbf{S}\cdot\mathbf{L} = (J^2 - L^2 - S^2)/2 \tag{8-34}$$

In a quantum state associated with the quantum numbers $l$, $s$, and $j$, each term on the right has a fixed value, and $\mathbf{S}\cdot\mathbf{L}$ has the fixed value

$$\mathbf{S}\cdot\mathbf{L} = \frac{\hbar^2}{2}\,[\,j(j+1) - l(l+1) - s(s+1)\,]$$

Thus

$$\Delta E = \frac{\hbar^2}{4m^2c^2}\,[\,j(j+1) - l(l+1) - s(s+1)\,]\,\frac{1}{r}\,\frac{dV(r)}{dr}$$

It should be evident that the spin-orbit energy for the state is the expectation value of this quantity. (See Appendix J for a detailed justification.) That is, the energy arising from the spin-orbit interaction is

$$\Delta E = \frac{\hbar^2}{4m^2c^2}\,[\,j(j+1) - l(l+1) - s(s+1)\,]\;\overline{\frac{1}{r}\,\frac{dV(r)}{dr}} \tag{8-35}$$

where the expectation value $\overline{(1/r)\,dV(r)/dr}$ is calculated using the potential function $V(r)$ for the system and the probability density (actually the radial probability density $4\pi r^2 R_{nl}^* R_{nl}$) for the state of interest. As was indicated earlier, (8-35) gives a convenient expression of an important result.

Now we consider the energy levels of the hydrogen atom. In Section 7-5 we obtained the predictions of quantum mechanics for the energy levels of a hydrogen atom in which the spin-orbit interaction is not considered, and found that they are simply the predictions of the Bohr model. In Example 8-3 we estimated the change in the energy of a typical one of these levels due to the presence of the spin-orbit interaction.
We found that the energy is shifted up by about one part in $10^4$ if L is approximately parallel to S (if $j = l + 1/2$), and that it is shifted down about that amount if L is approximately antiparallel to S (if $j = l - 1/2$). We also saw that there is obviously no spin-orbit energy shift if $L = 0$ (if $j = 1/2$). To obtain quantitative predictions of the hydrogen atom spin-orbit interaction energy-level shifts from the general expression of (8-35), the potential function is equated to the Coulomb potential $V(r) = -e^2/4\pi\epsilon_0 r$, and then the expectation value $\overline{(1/r)\,dV(r)/dr}$ is calculated using the hydrogen atom eigenfunctions.

However, before these predictions can be compared with experiments, other effects, of comparable importance in the hydrogen atom, must be taken into account. In discussing Sommerfeld's relativistic modification of the Bohr model in Section 4-10, we estimated that the shift in a typical hydrogen atom energy level, due to the relativistic dependence of mass on velocity, is about one part in $10^4$. So this relativistic effect produces energy shifts in the hydrogen atom comparable to those produced by the spin-orbit interaction, which is really also a relativistic effect but a different one. A complete treatment of all the effects of relativity on the energy levels of the hydrogen atom can be given only in terms of the Dirac theory. But results which are almost complete (i.e., complete except for $l = 0$ states) can be obtained from the Schroedinger theory by adding to the simple hydrogen energy-level formula both the expectation value of the correction to the energy due to the spin-orbit interaction and the expectation value of the correction to the energy due to the dependence of mass on velocity. We shall not do this for two reasons: (1) it would get us into some fairly lengthy calculations, and (2) relativistic effects, other than the spin-orbit interaction, are significant only for hydrogen and a few more atoms of very small atomic number $Z$.
For typical atoms of medium and large values of $Z$, and the levels involved in their optical spectra, the energy associated with these relativistic effects remains of the order of $10^{-4}$ times the energy of a level. But we shall see later that the spin-orbit interaction energy increases very rapidly with increasing $Z$. The spin-orbit interaction is the only effect we have considered that is generally important in a typical atom, and we have already said enough about it here. Therefore, we do no more than present the results of Dirac's completely relativistic treatment of the hydrogen atom energy levels, which predicts that the energies are

$$E = -\frac{\mu e^4}{(4\pi\epsilon_0)^2\,2\hbar^2 n^2}\left[1 + \frac{\alpha^2}{n}\left(\frac{1}{j + 1/2} - \frac{3}{4n}\right)\right] \tag{8-36}$$

In this equation $\mu$ stands for the reduced electron mass, $\mu = mM/(m + M)$, and $\alpha$ is the fine-structure constant, $\alpha = e^2/4\pi\epsilon_0\hbar c \simeq 1/137$.

If the student will compare these results of the Dirac theory with the results of the Sommerfeld model expressed in (4-27a) and (4-27b), he will see that they are essentially the same. (Both $j + 1/2$ and $n_\theta$ are integers ranging from 1 to $n$.) Since the Sommerfeld model is based on the Bohr model, it is only a very rough approximation to physical reality. In contrast, the Dirac theory represents an extremely refined expression of our understanding of physical reality. That these two theories lead to essentially the same results for the hydrogen atom is a coincidence that caused much confusion in the 1920s, when the modern quantum theories were being developed. The coincidence occurs because the errors made by the Sommerfeld model, in ignoring the spin-orbit interaction and in using classical mechanics to evaluate the average energy shift due to the relativistic dependence of mass on velocity, happen to cancel for the case of the hydrogen atom.
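To get a feeling for the size of the correction term in (8-36), the formula can be evaluated numerically. The sketch below is illustrative only: it assumes the rounded values 13.6 eV for the magnitude of the Bohr ground-state energy and 1/137.036 for $\alpha$, and the function name is not from the text.

```python
# Assumed rounded constants (not taken from a precise table).
ALPHA = 1.0 / 137.036   # fine-structure constant
E1_EV = 13.6            # magnitude of the Bohr ground-state energy, in eV


def dirac_energy_ev(n, j, e1_ev=E1_EV, alpha=ALPHA):
    """Energy of the level (n, j) from (8-36), in eV."""
    bohr = -e1_ev / n**2
    correction = (alpha**2 / n) * (1.0 / (j + 0.5) - 3.0 / (4.0 * n))
    return bohr * (1.0 + correction)


def fine_structure_splitting_ev(n, j_upper, j_lower):
    """Energy difference between two levels of the same n, in eV."""
    return dirac_energy_ev(n, j_upper) - dirac_energy_ev(n, j_lower)


if __name__ == "__main__":
    split = fine_structure_splitting_ev(2, 1.5, 0.5)
    print(f"n = 2 splitting between j = 3/2 and j = 1/2: {split:.3e} eV")
```

Under these assumptions the $n = 2$ splitting between $j = 3/2$ and $j = 1/2$ comes out near $4.5\times10^{-5}$ eV, with the $j = 3/2$ level lying higher, and all levels are shifted slightly downward from the Bohr values.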
Figure 8-11 The energy levels of the hydrogen atom for $n = 1, 2, 3$ according to Bohr, Sommerfeld, and Dirac. The displacements of the Sommerfeld and Dirac levels from those given by Bohr have been exaggerated by a factor of $(1/\alpha)^2 \simeq (137)^2 \simeq 1.88\times10^4$. [Energy-level diagram. The Sommerfeld levels are labeled by $n_\theta$. The Dirac levels are labeled, for $n = 3$: $j = 5/2$, $l = 2$; $j = 3/2$, $l = 1, 2$; $j = 1/2$, $l = 0, 1$; for $n = 2$: $j = 3/2$, $l = 1$; $j = 1/2$, $l = 0, 1$; for $n = 1$: $j = 1/2$, $l = 0$.]

Figure 8-12 The apparatus of Lamb and Retherford. Molecular hydrogen (H$_2$) entering oven O is largely dissociated into atomic hydrogen which leaves the oven, passing through slits S, S. The arrangement K, A is essentially a vacuum diode, electrons being emitted from heated cathode K and accelerated toward anode A. As the hydrogen passes through this region, some atoms collide with the electrons and are excited into the $n = 2$, $l = 0$ state described in the text. This state is called a metastable state because decay from it to the ground state ($n = 1$, $l = 0$) is highly inhibited by the $\Delta l$ selection rule and because all other states lie above it except the $n = 2$, $l = 1$, $j = 1/2$ state which, according to the Dirac theory, has exactly the same energy as the metastable state. The experiment showed, however, that the $l = 1$ state was in fact about 4.4 µeV below the metastable state. These levels are shown below the apparatus. The metastable atoms pass out of the collision region K, A and are detected by detector D. Any mechanism which causes these atoms to undergo a transition to the $l = 1$ state (transitions to the ground state are forbidden) will result in a decreased signal from D, which is sensitive only to metastable atoms. Such transitions can be induced by passing the atoms through a region where there is an alternating electric field whose frequency $\nu$ is such that $h\nu \simeq 4.4$ µeV, or $\nu \simeq 1060$ MHz. Such an alternating field is provided by a waveguide W, W, through whose walls the beam is passed. To measure exactly the energy difference (Lamb shift) between the metastable ($l = 0$) and $l = 1$ states (both $n = 2$, $j = 1/2$), we could in principle merely vary the frequency $\nu$, searching for a value that maximized transitions from the former to the latter state, thereby minimizing the signal from D. In practice, the frequency is not easily adjusted and the levels themselves are adjusted instead by a known amount by means of a magnet M, M, this shifting being due to the Zeeman effect. [Level diagram below the apparatus: metastable state $n = 2$, $j = 1/2$, $l = 0$; the $l = 1$ state 4.4 µeV below it; ground state $n = 1$, $j = 1/2$, $l = 0$, 10.2 eV below.]

The energy levels of the hydrogen atom, as predicted by Bohr, Sommerfeld, and Dirac, are shown in Figure 8-11. In order to make visible the energy-level splittings, called the fine structure, the shifts of the Sommerfeld and Dirac energy levels from those given by Bohr have been exaggerated by a factor of $(137)^2 = 1.88\times10^4$. Thus the diagrams would be completely to scale if the value of the fine-structure constant $\alpha$ were 1 instead of $\simeq 1/137$. Not shown on the Dirac energy-level diagram are the values of the quantum number $m_j$, which specify the orientation in space of the atom, since its energy is independent of the orientation if there are no external fields. There is a similar space orientation quantum number in the Sommerfeld model, whose values are not shown on the Sommerfeld energy levels, since the quantum number is of no consequence unless an external field is applied to the atom. Also not shown are the energy levels of hydrogen measured by optical spectroscopy. They are in very good agreement with the levels of both Sommerfeld and Dirac.
The only difference between the results of these two treatments is that Dirac, but not Sommerfeld, predicts that for most levels there is a degeneracy (in addition to the trivial degeneracy with respect to space orientation just mentioned) because the energy depends on the quantum numbers $n$ and $j$ but not on the quantum number $l$. Since there are generally two values of $l$ corresponding to the same value of $j$, the Dirac theory predicts that most levels are really double. This prediction was verified experimentally in 1947 by Lamb, who showed that for $n = 2$ and $j = 1/2$ there are two levels, which actually do not quite coincide. The $l = 0$ level lies above the $l = 1$ level by about one-tenth the separation between that level and the $n = 2$, $j = 3/2$, $l = 1$ level. The experiments involved measuring the frequency of photons absorbed in transitions between the two levels, using the apparatus shown in Figure 8-12. The energy separation between these levels is so small that the frequency is in the microwave radio range. Since measurements of radio frequencies can be made very accurately, it is possible to obtain the energy separation to five significant figures. These very accurate measurements of the so-called Lamb shift can be explained with precision in terms of the theory of quantum electrodynamics, as can the slight departure of the spin g factor from 2 mentioned in Section 8-3. We cannot develop this quite sophisticated theory here, but we shall discuss it in the following section in connection with radiation by excited atoms, and in Chapter 17 in connection with the properties of the elementary particles.

Even with its exaggerated scale, Figure 8-11 cannot show the hyperfine splitting of the energy levels, which in hydrogen is due to an interaction between the internal magnetic field produced by the motion of the electron and a spin magnetic dipole moment of the nucleus.
As nuclear magnetic dipole moments are smaller than electronic magnetic dipole moments by a factor of about $10^{-3}$, the hyperfine splitting is smaller than the spin-orbit splitting by the same factor. Nevertheless, we shall see later that this effect can be understood quantitatively in terms of Schroedinger quantum mechanics, and that it can be used to measure nuclear spins and magnetic moments. In fact, every aspect of the behavior of a hydrogen atom can be explained in detail by the theories of quantum physics!

8-7 TRANSITION RATES AND SELECTION RULES

If hydrogen atoms are excited to their higher energy levels, e.g., in collisions with energetic electrons in a gas discharge tube, the atoms will in due course spontaneously make transitions to successively lower energy levels. In each transition between a pair of levels, a photon is emitted of frequency equal to the difference in their energies divided by Planck's constant. The discrete frequencies emitted in all the transitions that take place constitute the "lines" of the spectrum, but measurements show that not all conceivable transitions do take place. Photons are observed only with frequencies corresponding to transitions between energy levels whose quantum numbers satisfy the selection rules:

$$\Delta l = \pm 1 \tag{8-37}$$

$$\Delta j = 0,\ \pm 1 \tag{8-38}$$

That is, transitions take place only between levels whose $l$ quantum numbers differ by one and whose $j$ quantum numbers differ by zero or one. Measurements of the spectra of other one-electron atoms show that these selection rules apply to transitions in all such atoms.

We have already used elementary quantum mechanics, in Example 5-13 and the discussion following, to develop much of the physical picture that the theory provides for the emission of photons by excited atoms.
According to that example, if the wave function describing an atom is the wave function associated with a single quantum state, then the probability density function for the atom will be constant in time. But if the wave function is a mixture of the wave functions associated with two quantum states, corresponding to the two energy levels $E_2$ and $E_1$, then the probability density contains terms which oscillate in time at frequency $\nu = (E_2 - E_1)/h$. Since the atomic electron can be found at any location where the probability density has an appreciable value, the charge it carries is not confined to a particular location. In effect, the atom has a charge distribution which is proportional to its probability density. Thus when the atom is in a mixture of two quantum states its charge distribution oscillates at precisely the frequency of the photon emitted in the transition between the states. This is true since the photon carries away the excess energy $E_2 - E_1$, and so has frequency $\nu = (E_2 - E_1)/h$.

The simplest aspect of the atom's charge distribution that can be oscillating is the electric dipole moment. This is the product of the electron charge and the expectation value of its displacement vector from the essentially fixed massive nucleus. The electric dipole moment is a measure of the separation of the center of the electron charge distribution from the nuclear center of the atom. Even in classical physics, a charge distribution that is constant in time will not emit electromagnetic radiation, while a charge distribution with an oscillating electric dipole moment emits radiation of frequency equal to the oscillation frequency. In fact, an oscillating electric dipole is the most efficient radiator. We can actually use the classical formula for the rate of emission of energy by an oscillating electric dipole to obtain the important factors in the formula for atomic transition rates.
In Appendix B it is shown that the dipole radiates electromagnetic energy at the average rate $\mathcal{R}$, where

$$\mathcal{R} = \frac{4\pi^3\nu^4}{3\epsilon_0 c^3}\,p^2 \tag{8-39}$$

with $p$ the amplitude of its oscillating electric dipole moment and $\nu$ the frequency of oscillation. Since the energy is carried off by photons whose energies are of magnitude $h\nu$, the rate of emission of photons, $R$, is

$$R = \frac{\mathcal{R}}{h\nu} = \frac{4\pi^3\nu^3}{3\epsilon_0 hc^3}\,p^2 \tag{8-40}$$

This probability per second that a photon is emitted is just equal to the probability per second that the atom has undergone the transition. Thus $R$ is also the atomic transition rate.

As discussed in Section 4-11, some of the selection rules could be given some justification in the old quantum theory by using the correspondence principle to invoke certain restrictions that apply in the classical limit; but the predictions of this technique were not reliable. Furthermore, the old quantum theory had nothing at all to say about atomic transition rates. A transition rate is the probability per second that an atom in a certain energy level will make a transition to some other energy level. It is easy to measure a transition rate by measuring the probability per second of detecting a photon of the corresponding frequency, since this is proportional to the intensity of the corresponding spectral line. So it should certainly be possible to calculate a transition rate from atomic theory. An impressive feature of the Schroedinger quantum mechanics is that this can be done with no difficulty, using the atomic eigenfunctions.
Of course all the selection rules can be obtained from transition rate calculations, since a selection rule just specifies which transitions have rates so small that they are not normally observed.

Relative to an origin at the essentially fixed nucleus, the electric dipole moment p of the one-electron atom is defined as

$$\mathbf{p} = -e\mathbf{r} \tag{8-41}$$

where $-e$ is the charge of the electron and r is its position vector from the nucleus at the origin. To obtain an expression for the amplitude of the oscillating electric dipole moment of the atom when it is in a mixture of two states, we calculate the expectation value of p, using the mixed state probability density obtained in Example 5-13

$$\Psi^*\Psi = c_1^*c_1\psi_1^*\psi_1 + c_2^*c_2\psi_2^*\psi_2 + c_2^*c_1\psi_2^*\psi_1 e^{i(E_2 - E_1)t/\hbar} + c_1^*c_2\psi_1^*\psi_2 e^{-i(E_2 - E_1)t/\hbar}$$

There is no way, from the present argument, for us to determine precisely what values of the adjustable constants $c_1$ and $c_2$ should be used to specify how much of the two quantum states are mixed together. But the results we seek are independent of their values, as will be seen shortly, so for simplicity we set them both equal to 1. Then we have

$$\Psi^*\Psi = \psi_f^*\psi_f + \psi_i^*\psi_i + \psi_i^*\psi_f e^{i(E_i - E_f)t/\hbar} + \psi_f^*\psi_i e^{-i(E_i - E_f)t/\hbar}$$

where we have replaced the labels 2 and 1 by $i$ and $f$, for initial and final. As this probability density is not normalized, when we use it to evaluate the expectation value of p we obtain only a proportionality, but this will suffice. That is, we have

$$\mathbf{p} \propto \int \Psi^*(-e\mathbf{r})\Psi\,d\tau$$

or

$$\mathbf{p} \propto -\left[\int \psi_f^*\,e\mathbf{r}\,\psi_f\,d\tau + \int \psi_i^*\,e\mathbf{r}\,\psi_i\,d\tau + e^{i(E_i - E_f)t/\hbar}\int \psi_i^*\,e\mathbf{r}\,\psi_f\,d\tau + e^{-i(E_i - E_f)t/\hbar}\int \psi_f^*\,e\mathbf{r}\,\psi_i\,d\tau\right]$$

where we have sandwiched the term $e\mathbf{r}$ between the other terms of the integrands to conform with accepted notation, and where the integrals are three-dimensional. Now the first two integrals on the right are not associated with an oscillating p; in fact both integrals yield zero.
The last two integrals are each multiplied by complex exponentials with a time dependence that oscillates at the frequency $\nu = (1/2\pi)(E_i - E_f)/\hbar = (E_i - E_f)/h$. These two terms describe oscillations in the electric dipole moment expectation value, of amplitude which is measured by the magnitude of the integral in either term. Thus we find that the amplitude of the oscillating electric dipole moment is proportional to the quantity $p_{fi}$, where

$$p_{fi} \equiv \left|\int \psi_f^*\,e\mathbf{r}\,\psi_i\,d\tau\right| \tag{8-42}$$

This quantity is called the matrix element of the electric dipole moment taken between the initial and final states. Note that its value depends on the behavior of the atom in both the initial state, through $\psi_i$, and in the final state, through $\psi_f^*$. This is reasonable because the radiating atom is in a mixture of the two states.

Setting the $p$ in (8-40) proportional to $p_{fi}$, we obtain

$$R \propto \frac{\nu^3 p_{fi}^2}{\epsilon_0 hc^3}$$

where $R$ is the transition rate. We have obtained the factors $\nu^3$ and $p_{fi}^2$, as well as the constants $\epsilon_0 hc^3$, in the expression for the transition rate by a partly classical argument. A much more sophisticated argument which uses only Schroedinger quantum mechanics (and is based on the last equation derived in Appendix K) leads to the same result, except that the numerical proportionality constant is determined. The result is

$$R = \frac{16\pi^3\nu^3 p_{fi}^2}{3\epsilon_0 hc^3} \tag{8-43}$$

Figure 8-13 A schematic illustration of the emission of a photon by an atom. Electromagnetic radiation impinging on the atom induces dipole charge oscillations in the atom. Then the atom emits electromagnetic radiation. [Panels: Before (inducing photon), During, After (emitted photon).]

The same equation can be derived in an even more rigorous manner from the theory of quantum electrodynamics, which provides an exact treatment of the quantization properties of electromagnetic fields.
Although the results are not different, quantum electrodynamics gives a more complete picture of the emission of photons by excited atoms. In particular, it explains how the radiating atom gets into the mixed state. This happens through a kind of resonance interaction between vibrations of the appropriate frequency, in a surrounding field of electromagnetic radiation, and an atom in the initial state. The interaction induces the charge oscillations of that frequency, which are characteristic of the mixed state, and then the atom emits electromagnetic radiation of the same frequency. The process is indicated schematically in Figure 8-13. The emission of photons by atoms, under the influence of the photons that comprise an electromagnetic field applied to the atom, is a phenomenon called stimulated emission. Atoms also emit photons when an electromagnetic field is not applied, in a phenomenon called spontaneous emission. Quantum electrodynamics shows that spontaneous emission takes place because there is always some electromagnetic field present in the vicinity of an atom, even if a field is not applied! The reason is that the electromagnetic field has an energy content which is discretely quantized because the energy, at any particular frequency, is given by the number of photons of that frequency. Like any other system with discretely quantized energy, the electromagnetic field has a zero point energy. The quantum electrodynamics shows that there will always be some electromagnetic field vibrations present, of whatever frequency is required to induce the charge oscillations that cause the atom to radiate "spontaneously." We can see that spontaneous and stimulated emission are qualitatively similar. In spontaneous emission, the electromagnetic field surrounding the atom is in its zero-point energy state. In stimulated emission an additional field is applied so that the electromagnetic field surrounding the atom is in a higher energy state. 
Then more intense field vibrations of the required frequency are present, and there is more chance that the atom will be stimulated to radiate. From this argument, it is apparent that the transition rate for stimulated emission is proportional to the intensity of the applied electromagnetic field. For intense fields it becomes very large and the atom radiates very efficiently. This has important practical consequences in the laser, a device to produce extremely bright beams of coherent light that will be discussed in Chapter 11. In that chapter we shall go more deeply into the relation between stimulated and spontaneous emission, but here we shall consider only spontaneous emission.

The transition rate for spontaneous emission, evaluated in (8-43), is independent of whether or not an external field is applied. It depends only on the properties of the atomic eigenfunctions. Since the eigenfunctions are known, the electric dipole moment matrix elements between various pairs of levels can be obtained by calculating the value of the associated integral (8-42). Then the rates for transitions between these levels can be calculated from (8-43). It is found that the agreement between the predictions and the measurements is quite good, even though the transition rates vary appreciably from one case to the next. For the transition of the hydrogen atom from its first excited state to its ground state, the transition rate has the value $R \simeq 10^8\ \text{sec}^{-1}$. This means that in about $10^{-8}$ sec the probability that the transition has occurred is about equal to one. It is said that the first excited state has a lifetime $t = 1/R \simeq 10^{-8}$ sec. Although the $\nu^3$ dependence in (8-43) leads to a range of values of $R$, the value just quoted is typical of the orders of magnitude encountered in atomic transition rates, except that the transition rates between certain pairs of levels are essentially zero.
These are the transitions for which the spectral lines are observed to be absent, or extremely weak. The transition rates are predicted to be zero in these cases because the integral in the electric dipole matrix element yields zero. Thus the selection rules are a set of conditions on the quantum numbers of the eigenfunctions of the initial and final energy levels, such that the electric dipole matrix elements are zero when calculated with a pair of eigenfunctions whose quantum numbers violate these conditions.

Example 8-6. When a hydrogen atom is placed in a very strong external magnetic field, the spin-orbit interaction coupling of its orbital angular momentum L to its spin angular momentum S is overwhelmed, and both vectors precess independently about the direction of the external field with constant z components $L_z = m_l\hbar$ and $S_z = m_s\hbar$. That is, $m_l$ and $m_s$ are good quantum numbers under these circumstances. Spectrum measurements made on such atoms show the existence of a selection rule $\Delta m_l = 0, \pm 1$. Obtain this selection rule by evaluating the appropriate electric dipole matrix element.

Written in full, the matrix element is

$$\int_0^\infty\!\!\int_0^\pi\!\!\int_0^{2\pi} \psi_f^*(r,\theta,\varphi)\,e\mathbf{r}\,\psi_i(r,\theta,\varphi)\,r^2\sin\theta\,dr\,d\theta\,d\varphi$$

The triple integral factors into the product of three single integrals.
Of these three integrals, the one that is interesting, because it leads to the selection rule, is

$$\mathbf{I} = \int_0^{2\pi} \Phi_f^*(\varphi)\, \mathbf{r}\, \Phi_i(\varphi)\, d\varphi$$

This is a vector quantity, which has components

$$I_x = \int_0^{2\pi} \Phi_f^*(\varphi)\, x\, \Phi_i(\varphi)\, d\varphi \qquad I_y = \int_0^{2\pi} \Phi_f^*(\varphi)\, y\, \Phi_i(\varphi)\, d\varphi \qquad I_z = \int_0^{2\pi} \Phi_f^*(\varphi)\, z\, \Phi_i(\varphi)\, d\varphi$$

If we use the relations

$$x = r \sin\theta \cos\varphi \qquad y = r \sin\theta \sin\varphi \qquad z = r \cos\theta$$

which can be verified by inspecting Figure 7-2, and also evaluate $\Phi_i(\varphi)$ and $\Phi_f(\varphi)$ from (7-19), we obtain

$$I_x = r \sin\theta \int_0^{2\pi} \cos\varphi \; e^{i(m_{l_i} - m_{l_f})\varphi}\, d\varphi$$

$$I_y = r \sin\theta \int_0^{2\pi} \sin\varphi \; e^{i(m_{l_i} - m_{l_f})\varphi}\, d\varphi$$

$$I_z = r \cos\theta \int_0^{2\pi} e^{i(m_{l_i} - m_{l_f})\varphi}\, d\varphi$$

Any table of definite integrals will show that the integral in $I_z$ equals zero, unless $m_{l_i} - m_{l_f} = 0$, or $\Delta m_l = 0$. The integral in $I_x$ can be rewritten, to yield

$$I_x = \frac{r \sin\theta}{2} \int_0^{2\pi} \left[ e^{i(m_{l_i} - m_{l_f} - 1)\varphi} + e^{i(m_{l_i} - m_{l_f} + 1)\varphi} \right] d\varphi$$

This definite integral equals zero, unless $m_{l_i} - m_{l_f} = \pm 1$, or $\Delta m_l = \pm 1$. The same result is obtained from the integral in $I_y$. Therefore, unless $\Delta m_l = 0$, or $\pm 1$, there will be no components of $\mathbf{I}$ that are not zero. Since this will also be true of the electric dipole matrix element, we have obtained the selection rule.

Physically, the selection rules arise because of symmetry properties of the oscillating charge distribution of the atom. The atom cannot radiate like an electric dipole unless the electric dipole moment of its electron charge distribution is oscillating. A classical analogy is found in a very short antenna, which is center-fed from high frequency sources of alternating current, as illustrated in Figure 8-14. If the leads to the antenna are fed out of phase, so that charge flows into one end at the same time it flows out of the other, the antenna will radiate relatively efficiently. But if the leads are fed in phase, so that charge flows into or out of both ends in unison, the antenna will hardly radiate at all.

Figure 8-14 Upper diagrams: Center-fed antennas driven out of phase. Lower diagrams: Driven in phase. Left diagrams: The charge distributions are shown at some initial time. Right diagrams: At half a period later. The antenna driven in phase will emit very little radiation if its length is short compared to a wavelength, and if the distance to the ground plane is long compared to a wavelength.

Mathematically, it is the symmetry properties of the eigenfunctions in the matrix element that are responsible for the selection rules. Some idea of this can be obtained in an easy way by considering the parities of the eigenfunctions. In Section 6-8 we defined the parity of a one-dimensional eigenfunction as the quantity which describes the behavior of the eigenfunction when the sign of the coordinate is changed. The definition can be extended immediately to three dimensions. That is, eigenfunctions satisfying the relation

$$\psi(-x,-y,-z) = +\psi(x,y,z) \tag{8-44}$$

are said to be of even parity, and eigenfunctions satisfying the relation

$$\psi(-x,-y,-z) = -\psi(x,y,z) \tag{8-45}$$

are said to be of odd parity. All eigenfunctions that are bound-state solutions to time-independent Schroedinger equations for a potential that can be written as $V(r)$, like the Coulomb potential, have definite parities, either even or odd. The reason is that the probability densities $\psi^*\psi$ will then have the same value at the point $(-x,-y,-z)$ that they have at the point $(x,y,z)$, which is a requirement of the fact that the potential has the same value at these points. An example is found in the one-electron atom eigenfunctions of Table 7-2.
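The definitions (8-44) and (8-45) can be applied directly to a given function by comparing its values at diametrically opposite points. The sketch below does this for two one-electron eigenfunctions, written without their normalization constants and in units where $Z = 1$ and $a_0 = 1$; these simplifications, the sampling approach, and all function names are assumptions made for illustration only.

```python
import math
import random


def psi_200(x, y, z):
    """Unnormalized n=2, l=0 eigenfunction: (2 - r) exp(-r/2)."""
    r = math.sqrt(x * x + y * y + z * z)
    return (2.0 - r) * math.exp(-r / 2.0)


def psi_210(x, y, z):
    """Unnormalized n=2, l=1, m_l=0 eigenfunction: r cos(theta) exp(-r/2)."""
    r = math.sqrt(x * x + y * y + z * z)
    return z * math.exp(-r / 2.0)  # r cos(theta) = z


def parity(f, trials=200, tol=1e-12, seed=0):
    """Return +1 (even), -1 (odd), or 0 (no definite parity).

    Samples random points and tests f(-x,-y,-z) = +f or -f at each one.
    """
    rng = random.Random(seed)
    even = odd = True
    for _ in range(trials):
        x, y, z = (rng.uniform(-5.0, 5.0) for _ in range(3))
        here, there = f(x, y, z), f(-x, -y, -z)
        even = even and abs(there - here) <= tol
        odd = odd and abs(there + here) <= tol
    if even and not odd:
        return +1
    if odd and not even:
        return -1
    return 0


print(parity(psi_200), parity(psi_210))
```

A sampled check of this kind cannot prove a symmetry, but it is a quick way to see that the $l = 0$ function behaves as an even function and the $l = 1$ function as an odd one.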
To see that these eigenfunctions have definite parities, inspect Figure 8-15, which shows that when the signs of the rectangular coordinates are changed in the parity operation the behavior of the spherical polar coordinates is

$$r \to r, \qquad \theta \to \pi - \theta, \qquad \varphi \to \pi + \varphi \tag{8-46}$$

By carrying out these changes on several of the eigenfunctions, it is easy to demonstrate that

$$\psi_{nlm_l}(r, \pi - \theta, \pi + \varphi) = (-1)^l\, \psi_{nlm_l}(r,\theta,\varphi) \tag{8-47}$$

The parity is determined by $(-1)^l$; it is even if the orbital angular momentum quantum number $l$ is even, and odd if $l$ is odd. This is true for all eigenfunctions, bound or unbound, of any spherically symmetrical potential $V(r)$, since the only significant assumption that is used to obtain (8-47) is that $V$ can be written as $V(r)$.

Figure 8-15 Illustrating the parity operation.

Now consider the matrix element of the electric dipole moment

$$\mathbf{p}_{fi} = \int \psi_f^*\, e\mathbf{r}\, \psi_i \, d\tau$$

The parity of $e\mathbf{r}$ is odd since the vector $\mathbf{r}$ changes into its negative when the signs of the rectangular coordinates are changed. Therefore, if the initial and final eigenfunctions $\psi_i$ and $\psi_f$ are of the same parity, both even or both odd, the entire integrand will be of odd parity. If this is the case the integral will yield zero because the contribution from any volume element will be cancelled by the contribution from the diametrically opposite volume element. Then the transition rate will also be zero. Therefore, the parity of the final eigenfunction must differ from the parity of the initial eigenfunction in an electric dipole transition. Since the parities are determined by $(-1)^l$, we can understand why transitions for $\Delta l = 0$, or $\pm 2$, are not allowed, in agreement with the $\Delta l = \pm 1$ selection rule of (8-37). The reason is that in such transitions the parities of the initial and final eigenfunctions would be the same.

Quantum electrodynamics shows, and experiments verify, that a photon carries angular momentum as well as linear momentum. In particular, the theory shows that the angular momentum carried by a photon emitted in an electric dipole transition is, in units of $\hbar$, equal to 1. From this point of view, the total angular momentum quantum number selection rule $\Delta j = 0, \pm 1$ of (8-38) represents the requirements of angular momentum conservation, which is fundamentally a symmetry property, by restricting electric dipole transitions to pairs of states where the change in the total angular momentum of the atom can be compensated for by the angular momentum carried by the photon it emits. (When $\Delta j = 0$ angular momentum conservation is satisfied by a change in the orientation in space of the total angular momentum vector of the atom at the time the photon is emitted.) This point of view also makes it apparent that $\Delta l = \pm 3$ electric dipole transitions cannot occur because they would lead to too large a change in the total angular momentum, even though they would be all right as far as parity is concerned.

It should be mentioned that selection rules do not absolutely prohibit transitions that violate them, but only make such transitions very unlikely. If a transition cannot take place by the normal means of emission of radiation from an oscillating electric dipole moment, there is a very small probability (typically reduced by a factor of about $10^{-4}$) that it will take place by emission of radiation from an oscillating magnetic dipole moment. This may occur through oscillations in orientation of electron spin angular momentum and magnetic dipole moment. Transitions can also take place with very small probabilities (typically reduced by approximately a factor of $10^{-6}$) by emission of radiation from an oscillating electric quadrupole moment. This involves oscillations in the electron charge distribution of the atom between an elongated ellipsoid and a flattened ellipsoid. If an atom is excited to a state from which it can return to its ground state only by one of these highly inhibited transitions, it may remain in the excited state for an appreciable fraction of a second, instead of the lifetime of $10^{-8}$ sec corresponding to the typical transition rate of $10^{8}\ \mathrm{sec}^{-1}$. The excited state is said to be metastable, and the delayed emission of a photon is a form of phosphorescence. In practice, phosphorescence of atoms is rarely observed because the metastable state is deexcited, without the emission of a photon, when the atom collides with the wall of its container and gives up its excess energy directly to the atoms of the wall. A process completely analogous to phosphorescence is commonly observed in nuclei, however.

8-8 A COMPARISON OF THE MODERN AND OLD QUANTUM THEORIES

We shall very briefly summarize the last chapters by making a comparison between the modern quantum theories (Schroedinger, Dirac, and quantum electrodynamics) and the old quantum theories (Bohr and Sommerfeld). One of the most striking aspects of the modern quantum theories is the way they lead progressively to more and more accurate treatments of the hydrogen atom. The Schroedinger theory without electron spin accounts for the energy levels of the atom that are observed in spectroscopic measurements of moderate resolution. Measurements of high resolution reveal the fine-structure splitting of the energy levels. They can be explained almost completely by adding to the Schroedinger theory corrections for the electron spin-orbit interaction and for the relativistic dependence of mass on velocity. They can be explained completely by the Dirac theory. Spectroscopic measurements of very high resolution show the Lamb shift, which can be understood in terms of quantum electrodynamics.
Extremely high-resolution measurements show the hyperfine splittings, which can be accounted for in the Schroedinger theory by an interaction involving the nuclear spin. Another great success of the modern quantum theories is their ability to give very satisfactory treatments of the transition rates and selection rules observed in the measurements of the spectra emitted by hydrogen atoms, and all other one-electron and multielectron atoms. The record of the old quantum theory is spotty. The Bohr model leads to correct values for the energies of the unsplit hydrogen atom levels. Sommerfeld's relativistic modification of the model agrees with the fine-structure splittings in hydrogen, but the agreement is accidental. The relativistic modification cannot account for the Lamb shift, nor for hyperfine splittings. Furthermore, it disagrees by orders of magnitude with the fine-structure splittings seen in typical multielectron atoms. In fact, the Bohr model itself fails completely to explain many of the most obvious features of the energy levels of multielectron atoms; it is already in serious trouble with the helium atom that contains only two electrons. The old quantum theory is unreliable in explaining selection rules, and incapable of explaining transition rates. A particularly helpful feature of the Schroedinger theory is that almost all of the work done in applying it to one-electron atoms carries over directly when it is applied to multielectron atoms. And the theory is certainly accurate enough to explain every important feature of multielectron atoms. Furthermore, it is not very much more complicated to apply Schroedinger quantum mechanics to such atoms than it is to apply it to one-electron atoms. As we shall see in the next two chapters, part of the reason that this is true is that most of the electrons in a multielectron atom group together with other electrons to form symmetrical and inert shells in which they do not have to be treated individually. 
Only the few electrons in the atom which are not in such shells require detailed treatment.

QUESTIONS

1. Why, in discussing Figures 8-1 and 8-4, do we speak of fictitious magnetic poles?
2. Why does the torque acting on a magnetic dipole in a magnetic field cause the dipole to precess about the field, instead of lining up with the field?
3. It is not possible to do a Stern-Gerlach experiment on a free electron to measure its spin magnetic dipole moment; it is only possible if the electron is in a neutral atom. Explain why. (Hint: There is a superficial answer, which has a superficial rebuttal. A complete answer involves the uncertainty principle.)
4. Exactly why do we conclude that the spin quantum numbers are half-integral?
5. Is it fair to criticize Schroedinger quantum mechanics for not predicting electron spin?
6. Are there conceptual difficulties with the idea of a point electron?
7. Is the electron the "ultimate magnetic particle"?
8. Explain in simple terms why an electron in a hydrogen atom experiences a magnetic field. Does it experience a field in all quantum states?
9. Just what is the spin-orbit interaction? How does it lead to the observed fine-structure splitting of the spectral lines of the hydrogen atom?
10. When the spin-orbit interaction is taken into account, it is sometimes said that $m_l$ and $m_s$ are no longer "good quantum numbers." Explain why this terminology is appropriate. What are the good quantum numbers for the one-electron atom when the spin-orbit interaction is taken into account?
11. What are good quantum numbers for a one-electron atom in an external magnetic field which, compared to the internal field, is very weak? Extremely strong?
12. Why is the spin-orbit interaction particularly sensitive to the form of the potential $V(r)$ for small $r$? How can this be used to study experimentally the potentials of multielectron atoms?
13. What is the justification of performing vector additions, as in Figure 8-10, with vectors whose lengths are proportional to the quantum numbers specifying the angular momenta, instead of with the angular momentum vectors themselves?
14. Describe briefly all the features of the hydrogen atom energy-level diagram in Figure 8-11, and explain the origin of these features. What features are not shown?
15. Can there be electromagnetic radiation emitted from an oscillating electric monopole (i.e., emitted from a charge of oscillating magnitude at a fixed location)?
16. There are similarities between the emission of electromagnetic radiation by a system of oscillating charges, and the emission of gravitational radiation by a system of oscillating masses, but dipole gravitational radiation cannot be emitted. Why?
17. What experimental evidence do you know of that is in contradiction to the presence of zero-point energy vibrations of the electromagnetic field? In support of its presence?
18. What is the relation between spontaneous and stimulated emission?
19. Explain in physical terms the origin of the selection rules.
20. Do all atoms of a certain species take the same time to make a transition between a certain pair of levels?

PROBLEMS

1. Evaluate the magnetic field produced by a circular current loop at a point on the axis of symmetry far from the loop. Then evaluate the magnetic field produced at the same point by a dipole formed from two separated magnetic monopoles located at the center of the loop and lying along the axis of symmetry. Show that the fields are the same if the current in the loop and its area are related to the magnetic moment of the dipole by (8-2). Can you see how to extend the argument to show that the fields will be the same at all points far from the loop or dipole, and independent of the shape of the loop?
2. (a) Evaluate the ratio of the orbital magnetic dipole moment to the orbital angular momentum, $\mu_l/L$, for an electron moving in an elliptical orbit of the Bohr-Sommerfeld atom discussed in Section 4-10. (Hint: The area swept out by the radius vector of length $r$, when the angular coordinate increases by the increment $d\theta$, is $dA = r^2\, d\theta/2$. Use $L = mr^2\, d\theta/dt$ to evaluate $d\theta$ in terms of the time increment $dt$, and then make the trivial integration.) (b) Compare the results with those of (8-5) for a circular orbit.
3. The field of an electromagnet is given by $B = 0.02 + 0.0115z^2$, with $B$ in tesla and $z$ = distance in cm from the north pole of the magnet. A magnetic dipole whose moment has magnitude $1.34 \times 10^{-23}$ amp-m$^2$ is located 8.00 cm from the north pole, the dipole moment vector at 40° to the local magnetic field direction. What are (a) the torque on the dipole, (b) the force on the dipole, and (c) the energy released if the magnetic dipole is turned parallel to the field?
4. A beam of hydrogen atoms in their ground state is sent through a Stern-Gerlach magnet, which splits it into two components according to the two spin orientations. One component is stopped by a diaphragm at the end of the magnet, and the other continues into a second Stern-Gerlach magnet which is coaxial with the beam leaving the first magnet, but is rotated relative to the first magnet about their approximately common axes through an angle $\alpha$. There is a second diaphragm fixed on the end of the second magnet which also allows only one component to pass. Describe qualitatively how the intensity of the beam passing the second diaphragm depends on $\alpha$.
5. Determine the field gradient of a 50 cm long Stern-Gerlach magnet that would produce a 1 mm separation at the end of the magnet between the two components of a beam of silver atoms emitted with typical kinetic energy from a 960°C oven. The magnetic dipole moment of silver is due to a single $l = 0$ electron, just as for hydrogen.
6. If a hydrogen atom is placed in a magnetic field which is very strong compared to its internal field, its orbital and spin magnetic dipole moments precess independently about the external field, and its energy depends on the quantum numbers $m_l$ and $m_s$ which specify their components along the external field direction. (a) Evaluate the splitting of the energy levels according to the values of $m_l$ and $m_s$. (b) Draw the pattern of split levels originating from the $n = 2$ level, enumerating the quantum numbers of each component of the pattern. (c) Calculate the strength of the external magnetic field that would produce an energy difference between the most widely separated $n = 2$ levels which equals the difference between the energies of the $n = 1$ and $n = 2$ levels in the absence of the field.
7. Use the procedure of Example 8-3 to estimate the spin-orbit interaction energy in the $n = 2$, $l = 1$ state of a muonic atom, defined in Example 4-9.
8. Prove that the only possible values of the quantum number $j$ from the series $j = l + 1/2,\ l - 1/2,\ l - 3/2, \ldots$, that satisfy the inequality $\sqrt{j(j+1)} \geq \left| \sqrt{l(l+1)} - \sqrt{s(s+1)} \right|$ with $s = 1/2$, are $j = l + 1/2,\ l - 1/2$, if $l \neq 0$, or $j = 1/2$, if $l = 0$.
9. (a) Enumerate the possible values of $j$ and $m_j$, for the states in which $l = 1$, and, of course, $s = 1/2$. (b) Draw the corresponding "vector model" figures. (c) Draw a figure illustrating the angular momentum vectors for a typical state. (d) Show also the spin and orbital magnetic dipole moment vectors, and their sum the total magnetic dipole moment vector. (e) Is the total magnetic dipole moment vector antiparallel to the total angular momentum vector?
10. Consider the states in which $l = 4$ and $s = 1/2$. For the state with the largest possible $j$ and largest possible $m_j$, calculate (a) the angle between $\mathbf{L}$ and $\mathbf{S}$, (b) the angle between $\boldsymbol{\mu}_l$ and $\boldsymbol{\mu}_s$, and (c) the angle between $\mathbf{J}$ and the $+z$ axis.
11. Enumerate the possible values of $j$ and $m_j$ for states in which $l = 3$ and $s = 1/2$.
12. The relativistic shift in the energy levels of a hydrogen atom due to the relativistic dependence of mass on velocity can be determined by using the atomic eigenfunctions to calculate the expectation value $\overline{\Delta E_{rel}}$ of the quantity $\Delta E_{rel} = E_{rel} - E_{clas}$, the difference between the relativistic and classical expressions for the total energy $E$. Show that for $p$ not too large

$$\Delta E_{rel} = -\frac{p^4}{8m^3c^2} = -\frac{E^2 + V^2 - 2EV}{2mc^2}$$

so that

$$\overline{\Delta E_{rel}} = -\frac{E_n^2}{2mc^2} - \frac{e^4}{(4\pi\epsilon_0)^2\, 2mc^2} \int \psi_{nljm_j}^*\, \frac{1}{r^2}\, \psi_{nljm_j}\, d\tau - \frac{E_n e^2}{4\pi\epsilon_0\, mc^2} \int \psi_{nljm_j}^*\, \frac{1}{r}\, \psi_{nljm_j}\, d\tau$$

13. (a) Draw the hydrogen energy-level diagram for all states through $n = 2$ as in the righthand part of Figure 8-11, but with the splitting according to $l$ also shown. (b) With arrows connecting pairs of levels, show all the transitions that are allowed by the selection rules.
14. Verify that the parities of the one-electron atom eigenfunctions $\psi_{300}$, $\psi_{310}$, $\psi_{320}$, and $\psi_{322}$ are determined by $(-1)^l$.
15. (a) Use parity considerations to prove that the first two integrals of the display equation preceding (8-42) both yield zero. (b) Interpret what this means about the existence of atomic electric dipole moments which are static in time.
16. By a straightforward evaluation of the electric dipole matrix elements for the eigenfunctions of Table 7-2, show that the selection rule $\Delta l = \pm 1$ of (8-37) is valid for the $n = 2 \to n = 1$ transitions of the hydrogen atom.
17. Consider the electric dipole moment matrix elements for a charged one-dimensional simple harmonic oscillator making the transitions $n_i = 3$, $n_f = 0$; $n_i = 2$, $n_f = 0$; $n_i = 1$, $n_f = 0$. Use the eigenfunctions of Table 6-1 to show that the matrix elements which are not zero agree with the selection rule $\Delta n = \pm 1$, discussed in Section 4-11. (Hint: Use parity considerations whenever you can.)
18. (a) Calculate the rate for spontaneous transitions between the $n = 1$ and $n = 0$ states of a simple harmonic oscillator, carrying charge $e$.
Take the mass of the oscillator to be equal to the mass of an atom of some typical ionic molecule, and the restoring force constant $C$ to be $10^3$ joules/m$^2$, which is typical for such a molecule. (Hint: Normalized eigenfunctions must be used.) (b) From the transition rate, estimate the average time required to complete the transition. This is the lifetime of the $n = 1$ vibrational state of the molecule.
19. Consider enough of the electric dipole moment matrix elements for a charged particle in an infinite square well potential, using the eigenfunctions of Section 6-8, to see if there is a selection rule for this system and, if so, to determine what it is.
20. Find the selection rule for a rigid rotator carrying charge $-e$. Use the eigenfunctions found in Problem 23 of Chapter 7. (Note: the selection rule to be found is $\Delta m = \pm 1$, not $\Delta m = 0, \pm 1$.)
21. Use the result of Problem 8-20 to find the ratio $R_{12}/R_{01}$ of the rates of transition from states 2 to 1 and 1 to 0.

9 MULTIELECTRON ATOMS - GROUND STATES AND X-RAY EXCITATIONS

9-1 INTRODUCTION
procedure to be used in analyzing a complicated system by a series of not too complicated steps

9-2 IDENTICAL PARTICLES
relation to multielectron atoms; distinguishability of identical particles in classical physics; indistinguishability in quantum physics; time-independent Schroedinger equation for two noninteracting identical particles; necessity of, and difficulty with, labeling particles; eigenfunctions whose probability densities are unchanged in relabeling; symmetric and antisymmetric eigenfunctions for two identical independent particles in box; orthogonality

9-3 THE EXCLUSION PRINCIPLE
weak and strong statements of principle; Slater determinants; fermions and bosons; relation between spin and symmetry

9-4 EXCHANGE FORCES AND THE HELIUM ATOM
separation of space and spin eigenfunctions for two noninteracting electrons; general form of symmetric and antisymmetric space eigenfunctions; specific forms of singlet antisymmetric spin eigenfunction and of triplet symmetric spin eigenfunctions; total spin; quantum numbers $s'$ and $m_s$; geometrical interpretation of singlet and triplet spin states; correlation between spin and space coordinates; exchange forces; low-lying excited states of helium; helium ground state and Pauli's discovery of exclusion principle

9-5 THE HARTREE THEORY
necessity of treating atomic electrons as moving independently in net potential; self-consistent determination of net atomic potential; Hartree's procedure; Fock's calculation

9-6 RESULTS OF THE HARTREE THEORY
multielectron atom eigenfunction angular dependence; radial and total radial probability densities; argon atom results; shells; effective Z; shielding; shell radii and energies described by using effective Z in one-electron atom equations; $l$ dependence of electron energies; its physical origin; subshells

9-7 GROUND STATES OF MULTIELECTRON ATOMS AND THE PERIODIC TABLE
significance of periodic table; energy ordering of outer filled subshells; spectroscopic notation; electron configuration; exclusion principle construction of periodic table; exceptional configurations; origin of properties of noble gases, alkalis, halogens, transition elements, lanthanides, and actinides; ionization energy; electron affinity

9-8 X-RAY LINE SPECTRA
x-ray tubes; production of line spectra; holes; x-ray energy levels; x-ray notation; selection rules; effective Z estimates of x-ray wavelengths and their relation to Moseley's experiment and interpretation; determination of atomic number; x-ray absorption and absorption edges

QUESTIONS

PROBLEMS

9-1 INTRODUCTION

In this chapter we shall use Schroedinger's quantum mechanics to study multielectron atoms from helium to uranium. First we shall discuss in a general way the interesting properties of quantum mechanical systems containing several identical particles, such as electrons.
This will lead us to the so-called exclusion principle, which is of dominant importance in determining the structure of multielectron atoms. Then we shall consider multielectron atoms in their ground states, and the systematic description of these atoms provided by the periodic table of the elements. We shall see that quantum mechanics gives a complete explanation of the periodic table, which is the basis of inorganic chemistry and much of organic chemistry and solid state physics. Finally, we shall consider the high-energy excited states of multielectron atoms that are involved in the emission of x rays by these atoms. A multielectron atom of atomic number Z contains a nucleus of charge + Ze surrounded by Z electrons each of charge —e. Every electron moves under the influence of an attractive Coulomb interaction exerted by the nucleus and the repulsive Coulomb interactions exerted by all the other Z — 1 electrons, as well as certain weaker interactions involving the angular momenta. The quantum mechanical treatment of this complicated system is easier than might be supposed. One reason is that the various interactions experienced by an atomic electron are of different strengths, so it is possible to deal with them one or two at a time in order of decreasing strength. In the first step, which we consider in this chapter, an approximate description which takes into account only the strongest interactions is developed. In subsequent steps, which we consider in the next chapter, the description is made more and more exact by successively taking into account the weaker interactions. We shall find that with this procedure it is not difficult to obtain a qualitative understanding of the behavior of multielectron atoms. Quantitative information about multielectron atoms can be obtained from this approximation procedure, but the required calculations must be carried out on large computers. Of course, we shall not be able to reproduce such calculations. 
However, in this chapter and the next we shall describe the calculations and their results. We shall also compare the results with the properties of multielectron atoms observed by experiment. Our description will be based, in major part, on the theory of the one-electron atom developed in the preceding chapters.

9-2 IDENTICAL PARTICLES

Before studying multielectron atoms, we must discuss an important topic of quantum mechanics that does not enter into the theory of one-electron atoms. This concerns the question of how to give an accurate quantum mechanical description of a system containing two or more identical particles, such as electrons. Discussing this question will lead us to quantum mechanical phenomena that have absolutely no classical analogues. In fact, the discussion will bring out some of the most striking differences between classical and quantum mechanics.

The nature of the question can best be illustrated by a specific example. Consider a box containing two electrons. These two identical particles move around in the box, bouncing from the walls and occasionally scattering from each other. In a classical description of this system, the electrons travel in sharply defined trajectories so that constant observation of the system allows us to distinguish between the two electrons, even though they are identical particles. For instance, in classical physics we can follow the development of the system, without disturbing it, by taking motion pictures of the system. If on a certain frame of the film we label the image of one of the electrons 1, and label the image of the other electron 2, we can follow the motion of the electrons through subsequent frames and always be able to say which electron is 1 and which electron is 2. The procedure is indicated in Figure 9-1. Of course, we cannot label the electrons themselves any more than we can paint one red and the other green.
Electrons are identical particles; any electron is exactly the same as any other electron. Nevertheless, in classical physics identical particles can be distinguished from each other by procedures which do not otherwise affect their behavior, and so it is possible to assign labels to the particles. In quantum mechanics this cannot be done because the uncertainty principle does not allow us to observe constantly the motion of the electrons without changing their behavior. As we have seen in Section 3-3, the photons which we must use to illuminate the scene for the motion picture camera interact with the electrons in a significant and unpredictable manner. The behavior of the electrons is seriously affected by any attempt to distinguish them. An equivalent, but more formal, statement is that in quantum mechanics the finite extent of the wave functions associated with each electron may lead to an overlapping of these wave functions that makes it difficult to tell which wave function was associated with which electron. A good example is provided by the helium atom. The wave functions of the two electrons overlap highly in all quantum states, and so the electrons cannot be distinguished. There is also an overlap of the wave functions associated with the electron and the proton of a hydrogen atom. But this does not lead to any problems in distinguishing one particle from the other because an electron and a proton are not identical; they can be distinguished by the differences in their mass, charge, etc.

Figure 9-1 Top: A sequence of ten frames from a motion picture of two electrons moving in a box, according to classical physics. If labels were assigned to their images in the first frame, there would be no ambiguity in assigning the same labels to their images in any subsequent frame, although it may be necessary to use high magnification and "slow motion." Bottom: An enlarged superposition of all ten frames, showing the trajectories of the electrons.

We see that there is a fundamental distinction between the classical and quantum mechanical description of a system containing identical particles. An accurate quantum mechanical treatment of these systems must be formulated in such a way that the indistinguishability of identical particles is explicitly taken into account. That is, measurable results obtained from accurate quantum mechanical calculations should not depend on the assignment of labels to identical particles. This property leads to important effects which have no classical analogies because indistinguishability itself is purely quantum mechanical.

Since it is the eigenfunctions that carry the burden of describing quantum mechanical systems, we must look for a way of writing them so that they contain a mathematical expression of the qualitative ideas developed above. We continue considering two identical particles (e.g., two electrons, or two protons, or two $\alpha$ particles, or two helium atoms) in a box. To simplify the argument, we assume that we can neglect the interactions between the particles. Then they will bounce between the walls of the box, but they will not scatter from each other. Despite this simplification, the results of the following discussion are of quite general validity. The time-independent Schroedinger equation for our system of two noninteracting particles in three dimensions can be written

$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2 \psi_T}{\partial x_1^2} + \frac{\partial^2 \psi_T}{\partial y_1^2} + \frac{\partial^2 \psi_T}{\partial z_1^2}\right) - \frac{\hbar^2}{2m}\left(\frac{\partial^2 \psi_T}{\partial x_2^2} + \frac{\partial^2 \psi_T}{\partial y_2^2} + \frac{\partial^2 \psi_T}{\partial z_2^2}\right) + V_T \psi_T = E_T \psi_T \tag{9-1}$$

where

$m$ = the mass of either particle
$x_1, y_1, z_1$ = the coordinates of particle 1
$x_2, y_2, z_2$ = the coordinates of particle 2

This equation can be obtained immediately by writing the classical expression for the total energy of the system, replacing the dynamical quantities by their associated quantum mechanical operators to obtain the Schroedinger equation, and then separating out the time dependence. Since the procedure is a simple extension of that used to obtain the time-independent Schroedinger equation for one particle in three dimensions, (7-10), and since the validity of (9-1) is quite obvious anyway, we shall not include the details here. It is more important to point out that (9-1) does use labels, which specify the identity of the two particles as 1 and 2. The language of mathematics forces us to use such labels because there would otherwise be hopeless confusion between the symbols; we challenge the student to devise a way to write an unambiguous equation, analogous to (9-1), without employing particle labels. In using (9-1), we clearly stand a chance of violating the quantum mechanical requirements of indistinguishability. We shall see later that this does happen, but that it is possible to arrange things in such a way as to remove the difficulty. We shall do this by finding certain linear combinations of labeled eigenfunctions which lead to measurable predictions that are independent of the assignment of the labels.

In the time-independent Schroedinger equation, (9-1),

$\psi_T(x_1, \ldots, z_2)$ = the eigenfunction for the total system
$V_T(x_1, \ldots, z_2)$ = the potential energy for the total system
$E_T$ = the total energy for the total system

Since we have assumed that there is no interaction between the two particles, the particles move independently. The potential energy of the total system is then simply the sum of the potential energies of each particle in its interaction with the walls of the box. Each potential energy will depend only on the coordinates of one particle and, since the particles are identical, the two potential energy functions are the same. Thus

$$V_T(x_1, \ldots, z_2) = V(x_1,y_1,z_1) + V(x_2,y_2,z_2) \tag{9-2}$$

It is easy to show, by applying the technique of separation of variables, that for the potential of (9-2), there are solutions to (9-1) of the form

$$\psi_T(x_1, \ldots, z_2) = \psi(x_1,y_1,z_1)\, \psi(x_2,y_2,z_2) \tag{9-3}$$

where $\psi(x_1,y_1,z_1)$ and $\psi(x_2,y_2,z_2)$ satisfy identical one-particle time-independent Schroedinger equations. Note that the total eigenfunction is written as a product of the two eigenfunctions describing the independently moving particles.

Each of the eigenfunctions describing one of the particles requires three quantum numbers to specify the mathematical form of its dependence on its three space coordinates. In addition, each requires one more quantum number to specify the orientation of the spin of the particle. We shall shorten the notation by using a single symbol, such as $\alpha$, or $\beta$, or $\gamma$, etc., to designate a particular set of the four quantum numbers required to specify the space and spin quantum state of one of the particles. Thus $\alpha$, for example, stands for a certain set of values of the four quantum numbers. Then a particular eigenfunction for particle 1 would be written

$$\psi_\alpha(x_1,y_1,z_1)$$

We further shorten the notation by writing this as

$$\psi_\alpha(1)$$

This eigenfunction contains the information that particle 1 is in the space and spin quantum state described by $\alpha$. Numerically, it is the function of the form specified by $\psi_\alpha$, evaluated at the coordinates of particle 1. An eigenfunction indicating that particle 2 is in the space and spin quantum state $\beta$ would be written

$$\psi_\beta(2)$$

The total eigenfunction $\psi_T(x_1, \ldots, z_2)$ for the case in which particle 1 is in the state $\alpha$, and particle 2 is in the state $\beta$, is

$$\psi_T(x_1, \ldots, z_2) = \psi_\alpha(1)\, \psi_\beta(2) \tag{9-4}$$

An eigenfunction indicating that particle 1 is in the state $\beta$, and particle 2 is in the state $\alpha$, has the quantum number symbols interchanged $\psi_T(x_1, \ldots$
,Z2 ) = 'i ( 1 )Y'Œ(2) (9-5) Now let us see whether measurable quantities, evaluated from these total eigenfunctions, depend on the assignment of the particle labels. The simplest measurable is the probability density function. For the eigenfunction of (9-4), it is (9-6) ^T OT = 0a( 1)Y^/3(2)YIa( 1)0p(2) and for the eigenfunction of(9-5), it is ,/^ Tk T = i^(1)^a(2)Y'#( 1)0a(2) (9-7) Since the two identical particles are indistinguishable, we should be able to exchange their labels without changing a measurable quantity such as the probability density. As an example, we carry out this operation on (9-6), obtaining 4( 1)02)ia( 1)(frQ(2) 1-*2' ia(2)14( 1 )ia(2)1kQ( 1) 21 where the arrows mean that the expression on the left changes into the expression on the right when 1 changes into 2 and 2 changes into 1. But it is apparent that the relabeled probability density function is not equal to the original probability density function. For instance, the first term in the relabeled function (expression on the right) is 1//« evaluated at the coordinates x 2 , y 2 , z 2 , while the first term in the original function (expression on the left) is tfrâ evaluated at the coordinates x i , y i , z i . Thus a relabeling of the particles actually does change the probability density function calculated from the eigenfunction of (9-4). The same is true for the eigenfunction of (9-5). Therefore, we must conclude that these are not acceptable eigenfunctions for the accurate description of a system containing two identical particles. The suspicion which we expressed after writing the time-independent Schroedinger equation, (9-1), has been justified. It is, however, possible to construct an eigenfunction which satisfies the timeindependent Schroedinger equation, and yet has the acceptable property that its probability density function is not changed by a relabeling of the particles. In fact, there are two ways of doing this. 
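The relabeling test just described can be tried numerically. The sketch below is only an illustration under stated assumptions: the particles are taken to be spinless and to move in a one-dimensional infinite square well of width `A`, with the ground state standing in for $\alpha$ and the first excited state for $\beta$ (none of these choices, nor the function names, come from the text). It compares the probability density before and after the exchange of labels for the labeled product of (9-4), and for the normalized sum and difference of the two labeled products.

```python
import math

A = 1.0  # box width in arbitrary units (assumed value, for illustration only)

def psi_alpha(x):
    """Ground state of a 1-D infinite well on [-A/2, A/2]; stands in for state alpha."""
    return math.sqrt(2 / A) * math.cos(math.pi * x / A) if abs(x) <= A / 2 else 0.0

def psi_beta(x):
    """First excited state of the same well; stands in for state beta."""
    return math.sqrt(2 / A) * math.sin(2 * math.pi * x / A) if abs(x) <= A / 2 else 0.0

def product(x1, x2):
    """Labeled product as in (9-4): particle 1 in alpha, particle 2 in beta."""
    return psi_alpha(x1) * psi_beta(x2)

def symmetric(x1, x2):
    """Normalized sum of the two labeled products."""
    return (product(x1, x2) + product(x2, x1)) / math.sqrt(2)

def antisymmetric(x1, x2):
    """Normalized difference of the two labeled products."""
    return (product(x1, x2) - product(x2, x1)) / math.sqrt(2)

def density_changes_on_relabel(psi, x1, x2, tol=1e-12):
    """True if the probability density differs after exchanging labels 1 and 2."""
    return abs(psi(x1, x2) ** 2 - psi(x2, x1) ** 2) > tol

x1, x2 = 0.10, 0.30  # two arbitrary, different positions inside the box
for name, psi in [("product", product),
                  ("symmetric", symmetric),
                  ("antisymmetric", antisymmetric)]:
    print(name, "density changes on relabeling:",
          density_changes_on_relabel(psi, x1, x2))
```

Only the labeled product fails the test; the sum and the difference both pass it, which is the sense in which there are two acceptable constructions.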
Consider the following two linear combinations of the eigenfunctions of (9-4) and (9-5)

$$\psi_S = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) + \psi_\beta(1)\psi_\alpha(2)\right] \tag{9-8}$$

and

$$\psi_A = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) - \psi_\beta(1)\psi_\alpha(2)\right] \tag{9-9}$$

The first is called the symmetric total eigenfunction, and the second the antisymmetric total eigenfunction (for reasons that will become apparent soon). Now the total energy of a system containing a particle in a quantum state $\alpha$ and another particle in a quantum state $\beta$ will not depend on which particle is in which state, if the particles are identical. Thus both $\psi_T = \psi_\alpha(1)\psi_\beta(2)$ and $\psi_T = \psi_\beta(1)\psi_\alpha(2)$ are solutions to the time-independent Schroedinger equation, (9-1), corresponding to the same value of the total energy $E_T$. Because that equation is linear in $\psi_T$, it follows immediately that the linear combinations $\psi_S$ and $\psi_A$, of the two forms of $\psi_T$, are also solutions. Since they correspond to the same value of $E_T$, they are degenerate solutions; that is, $\psi_S$ and $\psi_A$ are different eigenfunctions corresponding to precisely the same eigenvalue. The phenomenon is called exchange degeneracy since the difference between the degenerate eigenfunctions has to do with exchange of the particle labels. The factor of $1/\sqrt{2}$ ensures that $\psi_S$ and $\psi_A$ will be normalized if $\psi_T = \psi_\alpha(1)\psi_\beta(2)$ and $\psi_T = \psi_\beta(1)\psi_\alpha(2)$ are normalized.

It is easy to evaluate the probability density functions for $\psi_S$ and $\psi_A$, and then show that in both cases their values are not changed by an exchange of the particle labels. We shall obtain this result by investigating the effect of an exchange of the particle labels on the eigenfunctions themselves.
Carrying out the operation, we have

$$\psi_S = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) + \psi_\beta(1)\psi_\alpha(2)\right] \xrightarrow[2\to1]{1\to2} \frac{1}{\sqrt{2}}\left[\psi_\alpha(2)\psi_\beta(1) + \psi_\beta(2)\psi_\alpha(1)\right] = \psi_S \tag{9-10}$$

and

$$\psi_A = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) - \psi_\beta(1)\psi_\alpha(2)\right] \xrightarrow[2\to1]{1\to2} \frac{1}{\sqrt{2}}\left[\psi_\alpha(2)\psi_\beta(1) - \psi_\beta(2)\psi_\alpha(1)\right] = -\psi_A \tag{9-11}$$

We see that the symmetric total eigenfunction $\psi_S$ is unchanged by an exchange of the particle labels, and that the antisymmetric total eigenfunction $\psi_A$ is multiplied by minus one by an exchange of the particle labels. (These properties give rise to their names.) We then have for the probability densities

$$\psi_S^*\psi_S \xrightarrow[2\to1]{1\to2} \psi_S^*\psi_S \tag{9-12}$$

and

$$\psi_A^*\psi_A \xrightarrow[2\to1]{1\to2} (-1)^2\,\psi_A^*\psi_A = \psi_A^*\psi_A \tag{9-13}$$

Hence, for both the symmetric and antisymmetric total eigenfunctions, the probability density functions are not changed by an exchange of the particle labels. The change in sign of the antisymmetric eigenfunction under an exchange of the particle labels is, of course, not objectionable since an eigenfunction itself is not measurable. It can be shown that any measurable quantity that can be obtained from the symmetric, or antisymmetric, total eigenfunctions is not affected by an exchange of the particle labels. Thus these two eigenfunctions provide an accurate description of a system containing two identical particles. Although the labels 1 and 2 do appear in the expressions for $\psi_S$ and $\psi_A$, this labeling does not violate the requirements of indistinguishability because the value of any measurable quantity obtained from the eigenfunctions is independent of the assignment of the labels.

Example 9-1. Two identical particles move independently in a one-dimensional box of length $a$, one being in the ground state of the infinite square well potential describing the box and the other being in the first excited state of that potential.
For simplicity, assume that the particles have no spin, so that the total eigenfunctions for the system are just space eigenfunctions. (a) Evaluate the symmetric and antisymmetric total eigenfunctions of (9-8) and (9-9), and verify that the factor $1/\sqrt{2}$ in these equations does properly normalize them.

Using the general forms of (6-79) and (6-80) for the eigenfunctions for one particle in an infinite square well potential, and also using the normalization constant evaluated in Example 5-10, we find that the normalized space eigenfunction of the particle in the ground state is $\sqrt{2/a}\,\cos(\pi x/a)$ and the normalized space eigenfunction for the particle in the first excited state is $\sqrt{2/a}\,\sin(2\pi x/a)$. Thus writing the symmetric and antisymmetric space eigenfunctions for the two particle system as $\psi_+$ and $\psi_-$, we have from (9-8) and (9-9)

$$\psi_+ = \frac{1}{\sqrt{2}}\,\frac{2}{a}\left[\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_2}{a} + \sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a}\right]$$

$$\psi_- = \frac{1}{\sqrt{2}}\,\frac{2}{a}\left[\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_2}{a} - \sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a}\right]$$

when both $x_1$ and $x_2$ lie within the range $-a/2$ to $a/2$. When either $x_1$ or $x_2$ lies outside that range both $\psi_+$ and $\psi_-$ are zero since the one-particle eigenfunctions have zero value there. The normalization integral for $\psi_+$ is

$$\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\psi_+^*\psi_+\,dx_1\,dx_2 = \frac{1}{2}\left(\frac{2}{a}\right)^2\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\left[\cos^2\frac{\pi x_1}{a}\sin^2\frac{2\pi x_2}{a} + \sin^2\frac{2\pi x_1}{a}\cos^2\frac{\pi x_2}{a} + \cos\frac{\pi x_1}{a}\sin\frac{2\pi x_2}{a}\sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a} + \sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a}\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_2}{a}\right]dx_1\,dx_2$$

$$= \frac{1}{2}\left[\int_{-a/2}^{a/2}\frac{2}{a}\cos^2\frac{\pi x_1}{a}\,dx_1\int_{-a/2}^{a/2}\frac{2}{a}\sin^2\frac{2\pi x_2}{a}\,dx_2 + \int_{-a/2}^{a/2}\frac{2}{a}\sin^2\frac{2\pi x_1}{a}\,dx_1\int_{-a/2}^{a/2}\frac{2}{a}\cos^2\frac{\pi x_2}{a}\,dx_2 + \int_{-a/2}^{a/2}\frac{2}{a}\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_1}{a}\,dx_1\int_{-a/2}^{a/2}\frac{2}{a}\sin\frac{2\pi x_2}{a}\cos\frac{\pi x_2}{a}\,dx_2 + \int_{-a/2}^{a/2}\frac{2}{a}\sin\frac{2\pi x_1}{a}\cos\frac{\pi x_1}{a}\,dx_1\int_{-a/2}^{a/2}\frac{2}{a}\cos\frac{\pi x_2}{a}\sin\frac{2\pi x_2}{a}\,dx_2\right]$$

Now each of the first two terms in the bracket yields one since in each both integrals are just the normalization integrals for the normalized one-particle eigenfunctions $\sqrt{2/a}\,\cos(\pi x/a)$ and $\sqrt{2/a}\,\sin(2\pi x/a)$.
Furthermore, each of the last two terms in the bracket yields zero since both are the product of two integrals of the form, and value

$$\int_{-a/2}^{a/2}\cos\frac{\pi x}{a}\,\sin\frac{2\pi x}{a}\,dx = 0$$

The value can be verified in any table of definite integrals. Thus the normalization integral for $\psi_+$ yields $(1/2)[1 + 1]$, where the 1/2 came from squaring the factor $1/\sqrt{2}$ in (9-8). So we find that that factor does properly normalize $\psi_+$ by making its normalization integral equal one. We can also immediately show that the same conclusion is obtained for $\psi_-$.

Inspection of a table of definite integrals will further show that the integral from $-a/2$ to $a/2$ of the product of any two different sinusoidal eigenfunctions for a particle in an infinite square well potential has the value zero. In fact, it can be proven from general considerations that the integral over all $x$ of the product of any two different eigenfunctions of any particular potential has the value zero. This property is called orthogonality. Because of the orthogonality of one-particle eigenfunctions, only 2 of the $2^2$ terms in the normalization integral for any symmetric or antisymmetric two-particle eigenfunction have nonzero values; and because of the normalization of one-particle eigenfunctions, those two values are both equal to 1. Therefore, the factor $1/\sqrt{2}$ in (9-8) and (9-9) ensures that these total eigenfunctions are normalized in all cases.

(b) Write expressions for the expectation value of the separation distance $D$ between the particles for the case in which the space eigenfunction for the two-particle system is symmetric, and for the case in which it is antisymmetric. Then show that in neither of these cases is this expectation value affected by an exchange of the particle labels.

The separation distance $D$ is the absolute value of the difference in their $x$ coordinates. That is, $D = |x_2 - x_1| = |x_1 - x_2|$.
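With $D$ defined, both the normalization result of part (a) and the expectation value asked for in part (b) can be estimated numerically. The following is a rough sketch, not part of the original example: it assumes a box width `A = 1` and a simple midpoint rule on an `N` by `N` grid (both arbitrary choices), and the helper names are made up for this illustration.

```python
import math

A = 1.0   # box width (assumed value; lengths scale with A)
N = 120   # grid points per coordinate for the midpoint rule (assumed resolution)

def ground(x):
    """Normalized ground state, sqrt(2/a) cos(pi x / a), on [-A/2, A/2]."""
    return math.sqrt(2 / A) * math.cos(math.pi * x / A)

def excited(x):
    """Normalized first excited state, sqrt(2/a) sin(2 pi x / a)."""
    return math.sqrt(2 / A) * math.sin(2 * math.pi * x / A)

def psi(x1, x2, sign):
    """sign=+1 gives psi_plus, sign=-1 gives psi_minus, following (9-8) and (9-9)."""
    return (ground(x1) * excited(x2) + sign * excited(x1) * ground(x2)) / math.sqrt(2)

def integrate(f):
    """Midpoint-rule double integral of f(x1, x2) over the square [-A/2, A/2]^2."""
    h = A / N
    pts = [-A / 2 + (i + 0.5) * h for i in range(N)]
    return sum(f(x1, x2) for x1 in pts for x2 in pts) * h * h

def norm(sign):
    """Normalization integral of psi_plus (sign=+1) or psi_minus (sign=-1)."""
    return integrate(lambda x1, x2: psi(x1, x2, sign) ** 2)

def mean_separation(sign, swap=False):
    """Expectation value of D = |x2 - x1|; swap=True exchanges the particle labels."""
    if swap:
        return integrate(lambda x1, x2: abs(x1 - x2) * psi(x2, x1, sign) ** 2)
    return integrate(lambda x1, x2: abs(x2 - x1) * psi(x1, x2, sign) ** 2)

for sign, label in [(+1, "symmetric"), (-1, "antisymmetric")]:
    print(label,
          "| norm:", round(norm(sign), 6),
          "| <D>:", round(mean_separation(sign), 4),
          "| <D> with labels exchanged:", round(mean_separation(sign, swap=True), 4))
```

The two normalization integrals should come out equal to one within rounding, each expectation value should be unchanged by the exchange of labels, and the antisymmetric case should give the larger average separation.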
The expectation value $\bar{D}$ is, for the case of $\psi_+$

$$\bar{D} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi_+^* D\,\psi_+\,dx_1\,dx_2 = \int_{-a/2}^{a/2}\int_{-a/2}^{a/2}\psi_+^* D\,\psi_+\,dx_1\,dx_2$$

$$= \frac{1}{2}\left(\frac{2}{a}\right)^2\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}|x_2 - x_1|\left[\cos^2\frac{\pi x_1}{a}\sin^2\frac{2\pi x_2}{a} + \sin^2\frac{2\pi x_1}{a}\cos^2\frac{\pi x_2}{a} + 2\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a}\sin\frac{2\pi x_2}{a}\right]dx_1\,dx_2$$

Similarly, for the case of $\psi_-$

$$\bar{D} = \frac{1}{2}\left(\frac{2}{a}\right)^2\int_{-a/2}^{a/2}\int_{-a/2}^{a/2}|x_2 - x_1|\left[\cos^2\frac{\pi x_1}{a}\sin^2\frac{2\pi x_2}{a} + \sin^2\frac{2\pi x_1}{a}\cos^2\frac{\pi x_2}{a} - 2\cos\frac{\pi x_1}{a}\sin\frac{2\pi x_1}{a}\cos\frac{\pi x_2}{a}\sin\frac{2\pi x_2}{a}\right]dx_1\,dx_2$$

Some work would be required to evaluate the integrals for these two cases. But we can see immediately that in both the values are not affected by exchanging the particle labels. The reason is that in both integrals neither the factor $|x_2 - x_1|$ nor the third term in the square brackets is changed and, although the first term in the square bracket changes into the second term, the second term changes into the first term. We can also see that the value of $\bar{D}$ obtained with the symmetric space eigenfunction is different from the value obtained with the antisymmetric space eigenfunction, because of the difference of the sign of the third term in the square bracket. In other words, the average separation between the particles in a state in which the space eigenfunction is symmetric is different from what it is in a state in which the space eigenfunction is antisymmetric. In Section 9-4 we shall give further interpretation to these results, and we shall see that they have very interesting consequences.

9-3 THE EXCLUSION PRINCIPLE

As a result of an analysis of data concerning the energy levels of atoms, which we shall study soon, in 1925 Pauli was led to his famous exclusion principle (weaker condition): In a multielectron atom there can never be more than one electron in the same quantum state. He then established from the analysis of other experimental data that the exclusion principle represents a property of electrons and not, specifically, of atoms.
The exclusion principle operates in any system containing electrons.

Now consider the antisymmetric total eigenfunction of (9-9), for a case in which both particles are in the same space and spin quantum state $\alpha$. It is

$$\psi_A = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\alpha(2) - \psi_\alpha(1)\psi_\alpha(2)\right] = 0 \tag{9-14}$$

The eigenfunction is identically equal to zero. Hence, if two particles are described by the antisymmetric total eigenfunction, they cannot both be in a state with the same space and spin quantum numbers.

The eigenfunctions we have been dealing with were obtained under the assumption that there are two identical particles, and that the interactions between them can be neglected. If there are more than two identical particles and/or if their interactions must be taken into account, the total eigenfunctions have different forms, as we shall see in Examples 9-2 and 9-3. But they can still be used to make linear combinations of definite symmetry, either symmetric or antisymmetric, and the antisymmetric linear combinations still have values identically equal to zero if any two particles are in the same quantum state. In other words, all antisymmetric total eigenfunctions have properties which conform to the requirements of the exclusion principle. So we conclude there is an alternative expression of the exclusion principle (stronger condition): A system containing several electrons must be described by an antisymmetric total eigenfunction.

The condition specified by the second statement of the exclusion principle is stronger than the condition specified by the first statement, because it satisfies that condition, and it also satisfies the requirements of indistinguishability which demand total eigenfunctions of a definite symmetry. The stronger condition must be used in quantum mechanical calculations that aim for complete accuracy, but the weaker condition, which is much easier to apply, is often used in approximate calculations.
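The statement that antisymmetric linear combinations vanish whenever two particles share a quantum state can be illustrated for more than two particles. The sketch below makes several assumptions that are not in the text: the particles are spinless, noninteracting, and confined to a one-dimensional infinite well on the interval from 0 to `A`, with one-particle eigenfunctions $\sqrt{2/A}\,\sin(n\pi x/A)$. The antisymmetric combination is built as a signed sum over all assignments of the states to the particle labels; the function names are hypothetical.

```python
import math
from itertools import permutations

A = 1.0  # well width (assumed value)

def state(n, x):
    """n-th eigenfunction (n = 1, 2, 3, ...) of a 1-D infinite well on [0, A]."""
    return math.sqrt(2 / A) * math.sin(n * math.pi * x / A)

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (found by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def antisymmetric(states, coords):
    """Signed sum over all ways of assigning the listed states to the particle labels."""
    total = 0.0
    for perm in permutations(range(len(states))):
        term = 1.0
        for particle, k in enumerate(perm):
            term *= state(states[k], coords[particle])
        total += parity(perm) * term
    return total / math.sqrt(math.factorial(len(states)))

coords = (0.21, 0.48, 0.77)      # arbitrary positions of particles 1, 2, 3
exchanged = (0.48, 0.21, 0.77)   # the same positions with labels 1 and 2 exchanged

print("three different states:   ", antisymmetric((1, 2, 3), coords))
print("labels 1 and 2 exchanged: ", antisymmetric((1, 2, 3), exchanged))
print("two particles in one state:", antisymmetric((1, 2, 2), coords))
```

Exchanging two labels reverses the sign of the combination, and listing the same state twice makes every term cancel against a partner of opposite sign, so the result is zero to within rounding error.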
In Section 9-5 we shall discuss the use of these conditions in the treatment of multielectron atoms, and we shall compare the results obtained from the stronger one with those obtained from the weaker.

In discovering the exclusion principle, Pauli found the answer to a long-standing problem concerning the structure of multielectron atoms. He has written:

"The question as to why all electrons for an atom in its ground state were not bound in the innermost shell had already been emphasized by Bohr as a fundamental problem in his earlier works.... However, no convincing explanation of this phenomenon could be given on the basis of classical mechanics. It made a strong impression on me that Bohr at that time and in later discussions was looking for a general explanation."

Pauli's explanation of the problem was certainly general. All the electrons cannot be bound in the same quantum state represented by the innermost shell of the atom because the system must be described by antisymmetric total eigenfunctions, which vanish if even two electrons are in the same quantum state. To emphasize just how fundamental the problem is, we jump a little ahead of our development to state that if all the electrons in an atom were in the innermost shell, then the atom would be essentially like a noble gas. The atom would be inert, and it would not combine with other atoms to form molecules. If electrons did not obey the exclusion principle this would be true of all atoms. Then the entire universe would be radically different. For instance, with no molecules there would be no life!

Example 9-2. Determine the form of the normalized antisymmetric total eigenfunction for a system of three particles, in which the interactions between the particles can be ignored.

This is easy to do if it is noted that the two-particle antisymmetric total eigenfunction

$$\psi_A = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) - \psi_\beta(1)\psi_\alpha(2)\right]$$

can also be written as a so-called Slater determinant

$$\psi_A = \frac{1}{\sqrt{2!}}\begin{vmatrix}\psi_\alpha(1) & \psi_\alpha(2)\\ \psi_\beta(1) & \psi_\beta(2)\end{vmatrix}$$

where $2! = 2 \times 1 = 2$. The identity of these two expressions can be verified by expanding the determinant. In determinantal form, the extension to three particles is obvious

$$\psi_A = \frac{1}{\sqrt{3!}}\begin{vmatrix}\psi_\alpha(1) & \psi_\alpha(2) & \psi_\alpha(3)\\ \psi_\beta(1) & \psi_\beta(2) & \psi_\beta(3)\\ \psi_\gamma(1) & \psi_\gamma(2) & \psi_\gamma(3)\end{vmatrix}$$

where $3! = 3 \times 2 \times 1 = 6$. Expansion of this determinant yields

$$\psi_A = \frac{1}{\sqrt{3!}}\left[\psi_\alpha(1)\psi_\beta(2)\psi_\gamma(3) + \psi_\beta(1)\psi_\gamma(2)\psi_\alpha(3) + \psi_\gamma(1)\psi_\alpha(2)\psi_\beta(3) - \psi_\gamma(1)\psi_\beta(2)\psi_\alpha(3) - \psi_\beta(1)\psi_\alpha(2)\psi_\gamma(3) - \psi_\alpha(1)\psi_\gamma(2)\psi_\beta(3)\right]$$

Each term of this linear combination is a solution, for the same total energy, to the time-independent Schroedinger equation for a potential energy function in which the variables can be grouped into a sum of terms that each depend on the coordinates of only one particle, as in (9-2). Therefore, the linear combination is also a solution. By exchanging the appropriate particle labels, as we did in (9-11) for a system of two particles, it is easy to verify that it is antisymmetric with respect to the exchange of any pair of labels. It also has the property of being identically equal to zero if any two particles are in the same space and spin quantum state. The vanishing property can be seen most easily from the determinant itself, since it is a well known property of determinants that they vanish if any two rows are identical. It is not difficult to follow the procedures of Example 9-1 and to show that $\psi_A$ is normalized if $\psi_\alpha(1)\psi_\beta(2)\psi_\gamma(3)$, and similar terms, are normalized.

As is the case for electrons, the symmetry character of other kinds of particles is a question settled by experiment. It is found that systems of protons, or of neutrons, or of certain other particles, must also be described by antisymmetric total eigenfunctions. On the other hand, it is found that systems of photons, helium atoms, and certain other particles, must be described by symmetric total eigenfunctions. There are important phenomena associated with the symmetry character of the symmetric particles. The most spectacular example is the "superfluid" behavior of liquid helium at temperatures near absolute zero. This, and other examples, will be discussed in Chapter 11, which treats the general properties of systems containing a large number of symmetric, or antisymmetric, particles.

Table 9-1 The Symmetry Character of Various Particles

Particle                  Symmetry        Generic Name   Spin (s)
Electron                  Antisymmetric   Fermion        1/2
Positron                  Antisymmetric   Fermion        1/2
Proton                    Antisymmetric   Fermion        1/2
Neutron                   Antisymmetric   Fermion        1/2
Muon                      Antisymmetric   Fermion        1/2
α particle                Symmetric       Boson          0
He atom (ground state)    Symmetric       Boson          0
π meson                   Symmetric       Boson          0
Photon                    Symmetric       Boson          1
Deuteron                  Symmetric       Boson          1

Table 9-1 lists several kinds of particles, their symmetry character, and also the value of the quantum number $s$ that specifies the magnitude of their spin angular momentum. Also indicated are the two names, fermion and boson, sometimes used to distinguish the two classes of particles according to their symmetry character. It is very interesting to note that there must be some connection between the symmetry character of a particle and its spin. The point is that all the antisymmetric particles have half-integral spin just as the electron has, while all the symmetric particles have zero or integral spin. This connection has been studied by Pauli, and others, using very sophisticated forms of quantum mechanics. Some understanding of its origin has been obtained, but at the level of this book it is appropriate to say that the symmetry character of a particle should be considered as a basic property, like mass, charge, and spin, which is determined by experiment.
An exception to this statement is that the symmetry of a well-bound composite particle, like a helium atom, can be predicted immediately from the symmetries of its constituents. (If the composite particle has an even number of antisymmetric constituents, it is symmetric.)

Example 9-3. Determine the form of the normalized symmetric total eigenfunction for a system of three particles, in which the interactions between the particles can be ignored.

In analogy to the relation between (9-8) and (9-9), the required eigenfunction can be obtained immediately by writing the linear combination found in Example 9-2 with all the signs positive. That is

$$\psi_S = \frac{1}{\sqrt{3!}}\left[\psi_\alpha(1)\psi_\beta(2)\psi_\gamma(3) + \psi_\beta(1)\psi_\gamma(2)\psi_\alpha(3) + \psi_\gamma(1)\psi_\alpha(2)\psi_\beta(3) + \psi_\gamma(1)\psi_\beta(2)\psi_\alpha(3) + \psi_\beta(1)\psi_\alpha(2)\psi_\gamma(3) + \psi_\alpha(1)\psi_\gamma(2)\psi_\beta(3)\right]$$

It is immediately apparent that this linear combination is symmetric with respect to the exchange of any two particle labels. The normalization can be verified by the procedure used in Example 9-1.

9-4 EXCHANGE FORCES AND THE HELIUM ATOM

We turn now to a property of indistinguishable particles which is, to say the least, very strange. Consider a pair of electrons in a system in which we can ignore any explicit interactions (like the Coulomb interaction) between the two particles. According to (9-9), the total eigenfunction for the system can be written

$$\psi_A = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) - \psi_\beta(1)\psi_\alpha(2)\right]$$

This antisymmetric total eigenfunction depends on both the space variables and the spin variables of the two electrons since the symbols $\alpha, \beta, \gamma, \ldots$ specify sets of three space quantum numbers plus one spin quantum number. For the present discussion we rewrite it in such a way that the space and spin variables occur in separate factors, i.e.

(total eigenfunction) = (space eigenfunction) × (spin eigenfunction)

We also make both factors have a definite symmetry with respect to exchange of the particle labels.
Antisymmetry of the total eigenfunction can then be obtained by multiplying a symmetric space eigenfunction times an antisymmetric spin eigenfunction, or by multiplying an antisymmetric space eigenfunction times a symmetric spin eigenfunction.

The normalized symmetric and antisymmetric space eigenfunctions have the forms we used in Example 9-1

symmetric space eigenfunction:
$$\frac{1}{\sqrt{2}}\left[\psi_a(1)\psi_b(2) + \psi_b(1)\psi_a(2)\right] \tag{9-15}$$

antisymmetric space eigenfunction:
$$\frac{1}{\sqrt{2}}\left[\psi_a(1)\psi_b(2) - \psi_b(1)\psi_a(2)\right] \tag{9-16}$$

where $\psi_a(1)\psi_b(2)$ and $\psi_b(1)\psi_a(2)$ are normalized. Each symbol from the series $a, b, c, \ldots$ represents a particular set of the three space quantum numbers only (in contrast to the $\alpha, \beta, \gamma, \ldots$, which represent sets of three space and one spin quantum number). Of course these forms are very general, there being a wide variety of different $\psi_a$ and $\psi_b$ for different systems.

The forms of the symmetric and antisymmetric spin eigenfunctions are quite another matter. The reason is that the spin variable is not continuous like a space variable, but instead is discrete. For instance, the spin of a single electron can have only two discrete orientations relative to any z axis since its z component is either $+1/2$ or $-1/2$, in units of $\hbar$. Continuous functions, such as those displayed in the one-electron atom space eigenfunctions of Table 7-2, therefore cannot be used for spin eigenfunctions. For the case of two noninteracting electrons, each of which has two possible spin orientations, there are only four possible spin states for the system, and therefore only four possible spin eigenfunctions. Because there are so few we can display their specific forms. If these four spin eigenfunctions for the system are written so as to have definite symmetries, then one will be antisymmetric and the other three symmetric.
Matrices are frequently employed to write mathematical expressions for the spin eigenfunctions, but here we shall write them in terms of combinations of the symbols $+1/2$ and $-1/2$ because their interpretations will be more obvious. The only possible antisymmetric spin eigenfunction for two noninteracting electrons is

antisymmetric spin eigenfunction (singlet):
$$\frac{1}{\sqrt{2}}\left[(+1/2, -1/2) - (-1/2, +1/2)\right] \tag{9-17}$$

This is a linear combination of a symbol $(+1/2, -1/2)$ that specifies a state where the z components of the spins have values, in units of $\hbar$, of $+1/2$ for electron 1 and $-1/2$ for electron 2, minus a symbol $(-1/2, +1/2)$ that specifies a state where the z components are $-1/2$ for electron 1 and $+1/2$ for electron 2. Due to the minus sign between the symbols, the linear combination is antisymmetric in an exchange of the labels of the two electrons since such an exchange would convert the first symbol to $(-1/2, +1/2)$ and the second symbol to $(+1/2, -1/2)$, thereby changing the overall sign of the linear combination. We shall not need to further manipulate these symbols and their linear combinations, and we shall only use them to describe spin states. So it will not be necessary for us to further specify their mathematical (i.e., matrix) properties.

There are three possible symmetric spin eigenfunctions

symmetric spin eigenfunctions (triplet):
$$(+1/2, +1/2)$$
$$\frac{1}{\sqrt{2}}\left[(+1/2, -1/2) + (-1/2, +1/2)\right] \tag{9-18}$$
$$(-1/2, -1/2)$$

Their symmetry is obvious since for each an exchange of labels results in no change in the eigenfunction. These three describe the so-called triplet states, and the antisymmetric eigenfunction describes the so-called singlet state. All four of these spin eigenfunctions are normalized.

A physical interpretation of the singlet and triplet states can be obtained by evaluating, for each state, the magnitude $S'$ and z component $S'_z$ of the total spin angular momentum $\mathbf{S}'$.
This vector is

$$\mathbf{S}' = \mathbf{S}_1 + \mathbf{S}_2 \tag{9-19}$$

the sum of the spin angular momenta of the two electrons. As is true for all angular momenta in quantum mechanics, $S'$ and $S'_z$ are quantized according to the relations

$$S' = \sqrt{s'(s'+1)}\,\hbar \qquad S'_z = m'_s\hbar \tag{9-20}$$

Figure 9-2 Vector diagrams representing the rules for adding the quantum numbers $s_1 = 1/2$ and $s_2 = 1/2$ to obtain the possible values for the quantum numbers $s'$ and $m'_s$. Left (triplet): The maximum possible value of $s'$ is obtained when a vector of magnitude $s_1$ is added to a parallel vector of magnitude $s_2$, yielding $s' = s_1 + s_2 = 1/2 + 1/2 = 1$. The maximum possible z component of this vector gives the maximum possible value of the quantum number $m'_s$, and the minimum possible z component gives the minimum possible value of $m'_s$. The intermediate values of $m'_s$ (only one in this case) differ by integers. Thus the possible values are $m'_s = +1, 0, -1$. Right (singlet): A vector of magnitude $s_1 = 1/2$ is added to an antiparallel vector of magnitude $s_2 = 1/2$ to yield a vector of magnitude $s' = s_1 - s_2 = 1/2 - 1/2 = 0$. A vector whose length is zero must have z component zero as well, so the only possible value for $m'_s$ is zero. The term triplet refers to the state $s' = 1$ where three possible values of $m'_s$ arise; the term singlet refers to the state $s' = 0$ where only one possible value of $m'_s$ arises.

Figure 9-3 Triplet state: Two spin angular momentum vectors of magnitudes $S_1 = S_2 = \sqrt{(1/2)(1/2 + 1)}\,\hbar$. Either can be found with equal likelihood anywhere on a cone symmetrical about the vertical z axis. But their orientations are correlated so that if one is found to be pointing in a particular direction the other will be found to be pointing in the same general direction. If their z components are both positive, $S_{1z} = S_{2z} = +(1/2)\hbar$, or both negative, $S_{1z} = S_{2z} = -(1/2)\hbar$, their sum is a total spin vector of magnitude $S' = \sqrt{1(1 + 1)}\,\hbar$ and positive z component, $S'_z = +1\hbar$, or negative z component, $S'_z = -1\hbar$. If the spin vectors have z components of opposite sign, but point in the same general direction, the total spin vector has a zero z component, $S'_z = 0$, but still has magnitude $S' = \sqrt{1(1 + 1)}\,\hbar$, because it will be found lying in the plane perpendicular to the z axis. These possibilities are the three which can occur in the triplet state. Singlet state: If the two spin vectors have z components of opposite sign and point in essentially opposite directions the total spin vector has zero z component, $S'_z = 0$, because it has zero magnitude, $S' = 0$. This is the singlet state. In a certain sense, the two spin vectors are out of phase in this state. In the same sense, the two vectors are in phase in the $S'_z = 0$ triplet state. These phases are related to the minus and plus signs occurring between the terms in the linear combinations of the total spin eigenfunctions of (9-17) and (9-18).

The quantum numbers satisfy the relations

$$m'_s = -s', \ldots, +s' \qquad s' = 0, 1 \tag{9-21}$$

The relations between the quantum numbers, obtained when $S'$ and $S'_z$ are evaluated, can be represented and explained by the rules of vector addition used in Section 8-5. Figure 9-2 shows two vectors of length $s = 1/2$ added to form a vector of length $s' = 0$ or 1, which can have, in the latter case, z components of $+1, 0, -1$. As we have warned the student before, these vector addition diagrams must be interpreted cautiously since the vectors are not really angular momenta. But they do convey correctly the impression that in the three triplet states, which correspond to $s' = 1, m'_s = +1$; $s' = 1, m'_s = 0$; $s' = 1, m'_s = -1$, the electron spins are essentially parallel. In the singlet state, $s' = 0, m'_s = 0$, the electron spins are essentially antiparallel.
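The quantum numbers assigned to the singlet and triplet states can be checked by letting the operator for the square of the total spin act on the four spin eigenfunctions of (9-17) and (9-18). The sketch below is one possible bookkeeping, not the matrix treatment the text alludes to: a spin state is stored as a dictionary from the pair of z components $(m_1, m_2)$ to an amplitude, and it relies on the standard spin-1/2 identity $S'^2 = S_1^2 + S_2^2 + 2S_{1z}S_{2z} + (S_{1+}S_{2-} + S_{1-}S_{2+})$ in units of $\hbar^2$. All function names are invented for the illustration.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)
UP, DOWN = +HALF, -HALF   # z component of one electron spin, in units of hbar

def apply_S2(state):
    """Apply S'^2 = (S1 + S2)^2, in units of hbar^2, to a two-spin state.

    A state is a dict {(m1, m2): amplitude}. For spin 1/2,
    S1^2 + S2^2 = 3/4 + 3/4, and the raising/lowering term simply
    exchanges the two z components when they are opposite.
    """
    out = {}
    for (m1, m2), amp in state.items():
        diagonal = Fraction(3, 2) + 2 * m1 * m2
        out[(m1, m2)] = out.get((m1, m2), 0) + diagonal * amp
        if m1 != m2:
            out[(m2, m1)] = out.get((m2, m1), 0) + amp
    return out

def eigenvalue(op_state, state):
    """Return lam if op_state equals lam * state within rounding, else None."""
    ref = max(state, key=lambda k: abs(state[k]))
    lam = op_state.get(ref, 0) / state[ref]
    keys = set(state) | set(op_state)
    ok = all(abs(op_state.get(k, 0) - lam * state.get(k, 0)) < 1e-12 for k in keys)
    return lam if ok else None

def total_m(state):
    """Return m'_s if every component has the same m1 + m2, else None."""
    values = {m1 + m2 for (m1, m2), amp in state.items() if amp != 0}
    return values.pop() if len(values) == 1 else None

r = 1 / math.sqrt(2)
states = {
    "singlet": {(UP, DOWN): r, (DOWN, UP): -r},
    "triplet +1": {(UP, UP): 1.0},
    "triplet 0": {(UP, DOWN): r, (DOWN, UP): r},
    "triplet -1": {(DOWN, DOWN): 1.0},
}
for name, st in states.items():
    lam = eigenvalue(apply_S2(st), st)            # lam = s'(s' + 1)
    s_total = (-1 + math.sqrt(1 + 4 * lam)) / 2   # solve s'(s' + 1) = lam for s'
    print(name, "| s' =", round(s_total, 6), "| m's =", total_m(st))
```

Each of the four states turns out to be an eigenstate, with $s'(s'+1) = 0$ for the singlet and $s'(s'+1) = 2$ for the three triplet states, in agreement with (9-20) and (9-21).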
Figure 9-3 attempts to show the angular momenta; but as it cannot truly represent the linear combinations in (9-17) and (9-18) it oversimplifies somewhat.

Now we shall employ these ideas to explain a fundamental property of a system containing two electrons. If the spins of the two electrons are "parallel" and the spin eigenfunction is one of the symmetric triplets of (9-18), the space eigenfunction must be antisymmetric as in (9-16), in order to have the total eigenfunction antisymmetric. Let us consider such a situation for a case in which the space variables of the two electrons happen to have almost the same values. Then

$$\psi_a(1) \simeq \psi_a(2)$$

since the left-hand side is evaluated at the coordinates of electron 1, which are almost equal to the coordinates of electron 2 where the right-hand side is evaluated. For the same reason, $\psi_b(1) \simeq \psi_b(2)$. As a consequence

$$\psi_a(1)\psi_b(2) \simeq \psi_b(1)\psi_a(2)$$

In this case the value of the antisymmetric space eigenfunction is

$$\frac{1}{\sqrt{2}}\left[\psi_a(1)\psi_b(2) - \psi_b(1)\psi_a(2)\right] \simeq \frac{1}{\sqrt{2}}\left[\psi_b(1)\psi_a(2) - \psi_b(1)\psi_a(2)\right] = 0$$

The result is that the probability density will be very small when the triplet state electrons have similar coordinates, i.e., when they are close together. Since there is little chance of finding them close together, the triplet state electrons act as if they repel each other. This has nothing to do with a Coulomb repulsion because we assumed at the very beginning of our treatment that there is no explicit interaction between the electrons. Instead, it has to do with the properties of antisymmetric space eigenfunctions.

Symmetric space eigenfunctions have inverse properties. If the space eigenfunction for the two electrons is symmetric, and they happen to have almost the same coordinates, then that eigenfunction is

$$\frac{1}{\sqrt{2}}\left[\psi_a(1)\psi_b(2) + \psi_b(1)\psi_a(2)\right] \simeq \frac{1}{\sqrt{2}}\left[\psi_b(1)\psi_a(2) + \psi_b(1)\psi_a(2)\right] = \frac{2}{\sqrt{2}}\,\psi_b(1)\psi_a(2) = \sqrt{2}\,\psi_b(1)\psi_a(2)$$

since we shall again have $\psi_a(1) \simeq \psi_a(2)$ and $\psi_b(1) \simeq \psi_b(2)$.
Thus the probability density will have the value $2\psi_b^*(1)\psi_a^*(2)\psi_b(1)\psi_a(2)$ when the two electrons with a symmetric space eigenfunction are close together. This is twice the average value over all space of the probability density for the symmetric space eigenfunction (because $\psi_b(1)\psi_a(2)$ is normalized so the integral of $\psi_b^*(1)\psi_a^*(2)\psi_b(1)\psi_a(2)$ over all space equals one, as does the integral over all space of the symmetric space eigenfunction probability density). So there is a particularly large chance of finding the two noninteracting electrons close together if their space eigenfunction is symmetric. Thus, if the spins of the two electrons are "antiparallel" and the spin eigenfunction is the antisymmetric singlet, as in (9-17), the space eigenfunction must be symmetric, as in (9-15), and the singlet state electrons act as if they attract each other since there is a large chance of finding them close together.

Figure 9-4 illustrates the symmetries of surfaces representing the $x_1$ and $x_2$ dependences of a typical antisymmetric, or symmetric, space eigenfunction for a one-dimensional system containing two identical noninteracting particles. The particular simple case shown is for one particle being in the ground state of an infinite square well potential of width $a$, for which the eigenfunction has the form of one-half of a cosine wave, and the other particle being in the first excited state of that potential, for which the eigenfunction has the form of one full sine wave. The top surface represents a situation in which the particle whose coordinate is written $x_1$ is in the ground state (note the half cosine in the $x_1$ direction), and the particle whose coordinate is $x_2$ is in the first excited state (note the full sine in the $x_2$ direction).
Since identical particles are indistinguishable, it is equally possible that the system is in a situation in which the particle with coordinate $x_1$ is in the first excited state and the particle with coordinate $x_2$ is in the ground state. This situation is described by the second surface from the top. In quantum mechanics, both situations are allowed for by taking the eigenfunction for the system to be a linear combination of equal parts of the eigenfunctions describing either of them. This can be done either by adding or subtracting. In subtracting, we obtain the antisymmetric space eigenfunction for the system, which is illustrated by the third surface; in adding, we obtain the symmetric space eigenfunction for the system, illustrated by the bottom surface. The point of particular interest here is that the antisymmetric space eigenfunction is zero along the line $x_1 = x_2$ corresponding to the two particles being in the same location, while the symmetric space eigenfunction has its maximum magnitudes along the line. Thus the probability density $\psi^*\psi$ will be very small for the antisymmetric case, and very large for the symmetric case, when evaluated for coordinates of the two particles which are nearly the same.

Figure 9-4 Depicting the antisymmetric and symmetric space eigenfunctions of Example 9-1, $\psi_-$ and $\psi_+$, for a system of two noninteracting identical particles in a one-dimensional infinite square well potential of width $a$ when one particle is in the ground state with eigenfunction $\sqrt{2/a}\,\cos(\pi x/a)$ and the other is in the first excited state with eigenfunction $\sqrt{2/a}\,\sin(2\pi x/a)$. Top: The first term of $\psi_-$ is shown by constructing the surface whose distance above or below the $x_1, x_2$ plane is the positive or negative value of $(2/a)\cos(\pi x_1/a)\sin(2\pi x_2/a)$. Upper middle: The surface describing the second term of $\psi_-$, i.e., $(2/a)\sin(2\pi x_1/a)\cos(\pi x_2/a)$. Lower middle: $1/\sqrt{2}$ times the first term minus the second term, which shows the geometry of $\psi_-$ itself. It is apparent that the value of $\psi_-$ is zero along the line $x_1 = x_2$, and it is small everywhere near that line. Thus the probability density $\psi_-^*\psi_-$ is very small wherever $x_1 \simeq x_2$, and so the probability is very small that this condition will be achieved. Bottom: $1/\sqrt{2}$ times the sum of the term $(2/a)\cos(\pi x_1/a)\sin(2\pi x_2/a)$ and the term $(2/a)\sin(2\pi x_1/a)\cos(\pi x_2/a)$, showing the symmetric space eigenfunction $\psi_+$ for the system. This eigenfunction has its maximum magnitudes along the line $x_1 = x_2$. The probability density $\psi_+^*\psi_+$ therefore has its largest magnitudes if the two particles are in the same location in their one-dimensional well, and so we conclude that there is a large chance of finding them close together.

In classical mechanics a roughly analogous situation could arise in a system containing two identical particles, if no effort were made to distinguish them by measurement, in that the probability function describing the system would be a linear combination of equal parts (one for particle 1 being in a lower energy state and particle 2 in a higher energy state and the other for particle 1 being in the higher state and particle 2 being in the lower state). But the single possible result for this situation has no analogy to the two distinctly different quantum results, because in quantum mechanics we deal with eigenfunctions that can exhibit interferences since they can be of either sign (or even complex), and then we calculate probabilities from them, whereas in classical mechanics we deal directly with probabilities which are necessarily positive and so cannot interfere.
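The behavior of the two space eigenfunctions near the line $x_1 = x_2$ can be checked numerically. The following is a minimal sketch for the square-well case of Figure 9-4, assuming a well of width $a = 1$ (arbitrary units) centered on the origin; the function names and the simple midpoint-rule integration are illustrative choices, not part of the text.

```python
import math

A = 1.0  # well width a (arbitrary length unit; an assumption for illustration)


def psi_ground(x):
    """Ground state of the infinite square well centred on x = 0."""
    return math.sqrt(2 / A) * math.cos(math.pi * x / A)


def psi_excited(x):
    """First excited state of the same well."""
    return math.sqrt(2 / A) * math.sin(2 * math.pi * x / A)


def psi_minus(x1, x2):
    """Antisymmetric space eigenfunction."""
    return (psi_ground(x1) * psi_excited(x2)
            - psi_excited(x1) * psi_ground(x2)) / math.sqrt(2)


def psi_plus(x1, x2):
    """Symmetric space eigenfunction."""
    return (psi_ground(x1) * psi_excited(x2)
            + psi_excited(x1) * psi_ground(x2)) / math.sqrt(2)


def prob_within(psi, delta, n=200):
    """Midpoint-rule estimate of the probability that |x1 - x2| < delta."""
    h = A / n
    grid = [-A / 2 + (i + 0.5) * h for i in range(n)]
    total = 0.0
    for x1 in grid:
        for x2 in grid:
            if abs(x1 - x2) < delta:
                total += psi(x1, x2) ** 2 * h * h  # eigenfunctions are real
    return total


if __name__ == "__main__":
    print("psi_minus on the diagonal:", psi_minus(0.2, 0.2))
    print("P(|x1 - x2| < 0.1 a), antisymmetric:", prob_within(psi_minus, 0.1 * A))
    print("P(|x1 - x2| < 0.1 a), symmetric:    ", prob_within(psi_plus, 0.1 * A))
```

The antisymmetric function vanishes on the diagonal, while the symmetric probability density there is twice the simple product density, so the estimated probability of finding the particles within a small distance of each other is much smaller in the antisymmetric case.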
A student who visualizes similar figures will be able to see why the same striking difference between the antisymmetric and symmetric space eigenfunctions is found when the particles are in any two different states of the infinite square well potential, or any other one-, two-, or three-dimensional potential. For a system containing more than two identical particles, these conclusions are also obtained for space eigenfunctions which are antisymmetric, or symmetric, with respect to the exchange of any two particle labels, since the geometry of the terms in the eigenfunctions that involve the two labels can be analyzed in the same way as for a system containing only two particles.

The triplet and singlet cases for a system of two electrons are illustrated schematically in Figure 9-5. The requirement that an accurate description of the system must use a total eigenfunction which is antisymmetric in an exchange of their labels leads to a coupling between their spin and space variables. They act as if they move under the influence of a force whose sign depends on the relative orientation of their spins. This is called an exchange force. It is a purely quantum mechanical effect and has no classical analogy.

Exchange forces do not arise between two electrons which are always constrained to remain far apart. An example is the electrons in two hydrogen atoms which are well separated from each other. In fact, none of the requirements of indistinguishability need be taken into account for a pair of identical particles which are so widely separated that their wave functions do not overlap. The reason is simply that these particles can be distinguished from each other by appropriate measurements. Exchange forces do arise between two electrons in the same atom, or two neutrons or protons in the same nucleus. We shall show this by considering the low-lying energy levels of the helium atom.

Example 9-4.
The simplest, but least accurate, treatment of the helium atom involves ignoring the Coulomb interaction between its two electrons, and taking the total energy of the atom to be the sum of the one-electron atom energies of each electron moving about the $Z = 2$ nucleus. Use this treatment to predict the energies of the ground and first excited states of the atom.

From (7-22) for the one-electron atom eigenvalues, we have

$$E = -\frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2 2\hbar^2 n_1^2} - \frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2 2\hbar^2 n_2^2} = -\frac{4 \times 13.6\ \text{eV}}{n_1^2} - \frac{4 \times 13.6\ \text{eV}}{n_2^2}$$

where we have set $Z = 2$. In the ground state, the quantum numbers $n_1$ and $n_2$ are both equal to 1, and we obtain

$$E = -(4 + 4) \times 13.6\ \text{eV} = -109\ \text{eV}$$

In the first excited state, one of these quantum numbers equals 1, and the other equals 2. For this we obtain

$$E = -(4 + 1) \times 13.6\ \text{eV} = -68\ \text{eV}$$

The energies predicted are shown on the left side of the energy-level diagram of Figure 9-6. The right side of that figure shows the energies of the first few levels of helium obtained from measurements of the optical spectrum emitted by that atom. The predictions are quite inaccurate because the Coulomb interaction between the two electrons in the atom is really not negligible compared to the Coulomb interactions between each electron and the nucleus, as was assumed in this simple treatment, and also because the treatment ignores exchange forces.

Figure 9-5 A schematic illustration of the tendency for electrons in a triplet spin state to be relatively far apart, and the tendency for electrons in a singlet spin state to be relatively close together.

Figure 9-6 Left: Helium energy levels predicted by a treatment in which the electron-electron interaction is ignored. Right: The ground state and first four excited states of helium, as determined from the observed spectrum.

Figure 9-7 The low-lying energy levels of helium. Left: The levels that would be found if there were no Coulomb interaction between its electrons. Center: The levels that would be found if there were a Coulomb interaction but no exchange force. Right: The levels that would be found if there were a Coulomb interaction and an exchange force. These levels are in excellent agreement with the experimentally observed levels shown on the right in Figure 9-6.

Figure 9-7 indicates the origin of the first few energy levels of the helium atom. The left side of the figure shows the energies of the levels that would be found, as in Example 9-4, if there were no Coulomb interaction between its electrons. If this were the case, the total energy would be just the sum of the one-electron atom energies of each electron moving about the $Z = 2$ nucleus in states described by the one-electron atom eigenfunctions with the quantum numbers indicated. The center of the figure shows, in part, the effect of the Coulomb interaction between the electrons. Since this interaction energy is positive because both electron charges have the same sign, the levels are raised. Furthermore, the upper level is split into two. The reason is that the two electrons are somewhat more widely separated on the average when one has $n = 1, l = 0$, and the other has $n = 2, l = 0$, than when one has $n = 1, l = 0$ and the other has $n = 2, l = 1$. This can be seen by inspecting the one-electron atom radial probability densities of Figure 7-5. As the energy associated with the Coulomb interaction between the electrons is inversely proportional to their separation, the energy of the atom is raised less for the first set of quantum numbers, and the degeneracy with respect to the $l$ quantum number (found in one-electron atoms) is removed by this interaction.
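The arithmetic of Example 9-4 is easy to reproduce. The sketch below simply evaluates the sum of two one-electron energies with the 13.6 eV value quoted in the text; the function name `helium_noninteracting_energy_ev` is a hypothetical helper, and the model deliberately ignores both the electron-electron Coulomb interaction and the exchange force, exactly as the example does.

```python
BOHR_ENERGY_EV = 13.6  # magnitude of the hydrogen ground-state energy, from the text


def helium_noninteracting_energy_ev(n1, n2, Z=2):
    """Total energy (eV) of two electrons that do not interact with each other.

    Each electron contributes the one-electron atom energy -Z^2 * 13.6 eV / n^2.
    """
    return -(Z ** 2) * BOHR_ENERGY_EV * (1 / n1 ** 2 + 1 / n2 ** 2)


if __name__ == "__main__":
    print("ground state (n1 = 1, n2 = 1):", helium_noninteracting_energy_ev(1, 1), "eV")
    print("first excited (n1 = 1, n2 = 2):", helium_noninteracting_energy_ev(1, 2), "eV")
```

The two values are about -109 eV and -68 eV, the levels drawn on the left side of Figure 9-6.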
The right side of Figure 9-7 shows the effect of the exchange force. In the triplet states the electrons tend to keep apart, and in the singlet state they tend to keep together. Therefore, the Coulomb interaction between them is relatively less effective in raising the energy of the atom in the triplet states, and relatively more effective in the singlet state. Part of the $m_s$ degeneracy (of one-electron atoms) is also removed by the Coulomb interaction between the electrons, and the levels are further split into singlet state and triplet state levels. These are the energy levels that are observed from measurements of the spectrum of the helium atom. Quantitative results in good agreement with the measurements can be obtained from quantum mechanics by adding to the energies obtained in Example 9-4 the expectation values of the energies due to the Coulomb repulsion between the two electrons. Antisymmetric total eigenfunctions, composed of one-electron atom eigenfunctions for $Z = 2$, are used to calculate the expectation values.

It is particularly interesting to note from Figure 9-7 that there is no triplet level corresponding to the singlet level in the ground state of helium. It is absent because the antisymmetric space eigenfunction, which must be used to multiply the symmetric triplet spin eigenfunction, has the form

$$\frac{1}{\sqrt{2}}\left[\psi_a(1)\psi_a(2) - \psi_a(1)\psi_a(2)\right] = 0$$

The value is identically equal to zero in the ground state since the space quantum numbers for both electrons have the same values, $n = 1, l = 0, m_l = 0$. In agreement with the exclusion principle, only the singlet level is found in the ground state since the spin quantum numbers of the two electrons must be different, i.e., the two electrons must have "antiparallel" spins. Historically the argument was made in the opposite order. The experimental fact that the helium spectrum shows this triplet level to be absent provided the primary evidence that led Pauli to the discovery of the exclusion principle.

9-5 THE HARTREE THEORY

We begin here the quantum mechanical study of multielectron atoms that will occupy us for the remainder of this chapter, and the next chapter. Compared to simplified one-dimensional systems, or even to the one-electron atom, multielectron atoms are quite complicated. But it is possible to treat them in a reasonable way by using a succession of approximations. Only the most important interactions experienced by the atomic electrons are treated in the first approximation, and then the treatment is made more exact in succeeding approximations that take into account the less important interactions. In this way the treatment is broken into a series of steps, none of which is too difficult. The results obtained will certainly justify the effort expended because we shall have a detailed understanding of the atoms that are the constituents of everything in the universe. Furthermore, the procedures used are worth studying for their own sake because they are typical of those used in solving the real problems of professional science and engineering, in contrast to the artificial problems of much textbook science and engineering.

In the first approximation used in treating a multielectron atom of atomic number $Z$, we must consider the Coulomb interaction between each of its $Z$ electrons of charge $-e$ and its nucleus of charge $+Ze$. Due to the magnitude of the nuclear charge, this is the strongest single interaction felt by each electron. But even in the first approximation we must also consider the Coulomb interactions between each electron and all the other electrons in the atom. These interactions are individually weaker than the interaction between each electron and the nucleus, but, as we saw for the case of the helium atom in Example 9-4, they are certainly not negligible. Furthermore, in a typical multielectron atom there are so many interactions between an electron and all the other electrons that their net effect is very strong except if the electron is quite near the nucleus. This is illustrated in Figure 9-8.

Figure 9-8 Left: The strong attractive force exerted by the nucleus on an electron near the surface of an atom, and the weak repulsive forces exerted by the other electrons. The net effect of the repulsive forces is important because they tend to reinforce each other. Right: The very strong attractive force exerted by the nucleus on an electron near the center of an atom, and the weak repulsive forces exerted by the other electrons. Here the repulsive forces tend to cancel each other.

On the other hand, the first approximation must not be so complicated that the Schroedinger equation to which it leads is unsolvable. In practice, this requirement means that in the first approximation the atomic electrons must be treated as moving independently so that the motion of one electron does not depend on the motion of the others. Then the time-independent Schroedinger equation for the system can be separated into a set of equations, one for each electron, which can be solved without too much difficulty since each involves the coordinates of a single electron only. Note that this is how the solutions, (9-3), were obtained to the time-independent Schroedinger equation, (9-1), for two particles moving independently in a box.

The requirements of the last two paragraphs are in conflict: the Coulomb interactions between the electrons must be considered, but the electrons must be treated as moving independently.
A compromise between the requirements is obtained by assuming each electron to move independently in a spherically symmetrical net potential $V(r)$, where $r$ is the radial coordinate of the electron with respect to the nucleus. The net potential is the sum of the spherically symmetrical attractive Coulomb potential due to the nucleus and a spherically symmetrical repulsive potential which represents the average effect of the repulsive Coulomb interactions between a typical electron and its $Z - 1$ colleagues. It can be seen from Figure 9-8 that very near the center of the atom the behavior of the net potential acting on an electron should be essentially like that of the Coulomb potential due to the nuclear charge $+Ze$. The reason is that in this region the interactions of the electron with the other electrons tend to cancel. It can also be seen from the figure that very far from the center the behavior of the net potential should be essentially like that of the Coulomb potential due to a net charge $+e$, which represents the nuclear charge $+Ze$ shielded by the charge $-(Z - 1)e$ of the other electrons. The procedure of introducing a net potential is one that is encountered in the study of many fields of physics. For instance, in Chapter 15 we shall find that a net potential is the basis of the "shell model" which provides a relatively simple, but very useful, description of the behavior of neutrons and protons in a nucleus.

It might seem that there is no way to find the net potential of an atom at intermediate distances from its center. The problem is that it obviously depends on the details of the charge distribution of the atomic electrons, and this is not known until solutions have been obtained to the Schroedinger equation that contains the net potential. But it can be taken care of by demanding that the net potential be self-consistent.
That is, if we calculate the electron charge distribution from the correct net potential, and then evaluate the net potential from the charge distribution, we demand that the potential with which we end up must be the same as the potential with which we started. As we shall see, this condition of self-consistency is enough to determine the correct net potential.

Most of the work in this field has been done by Douglas Hartree and collaborators, starting in 1928 and continuing to this day. It involves solving the time-independent Schroedinger equation for a system of $Z$ electrons moving independently in the atom. This equation is analogous to the equation for two electrons moving independently in a box, (9-1), in that the total potential of the atom can be written as the sum of a set of $Z$ identical net potentials $V(r)$, each depending on the radial coordinate $r$ of one electron only. Consequently, the equation can be separated into a set of $Z$ time-independent Schroedinger equations, all of which are of the same form, and each of which describes one electron moving independently in its net potential. A typical time-independent Schroedinger equation for one electron is

$$-\frac{\hbar^2}{2m}\nabla^2\psi(r,\theta,\varphi) + V(r)\psi(r,\theta,\varphi) = E\psi(r,\theta,\varphi) \qquad (9\text{-}22)$$

Here $r, \theta, \varphi$ are the spherical polar coordinates of the typical electron; $\nabla^2$ is the Laplacian operator in these coordinates, of (7-13); $E$ is the total energy of the electron; $V(r)$ is its net potential; and $\psi(r,\theta,\varphi)$ is the eigenfunction of the electron. The total energy of the atom is the sum of $Z$ of these total energies. The total eigenfunction for the atom is composed of products of $Z$ of these eigenfunctions that describe the independently moving electrons.

Initially, the exact form of the net potential $V(r)$ experienced by the typical electron is not known, but it can be found by going through a self-consistent treatment consisting of the following steps:

1. A first guess at the form of $V(r)$ is obtained by taking

$$V(r) = \begin{cases} -\dfrac{Ze^2}{4\pi\epsilon_0 r} & r \to 0 \\[2ex] -\dfrac{e^2}{4\pi\epsilon_0 r} & r \to \infty \end{cases} \qquad (9\text{-}23)$$

and by taking any reasonable interpolation for intermediate values of $r$. This guess is based on the idea, mentioned previously, that an electron very near the nucleus feels the full Coulomb attraction of its charge $+Ze$, while an electron very far from the nucleus feels a net charge of $+e$ because the nuclear charge is shielded by the charge $-(Z - 1)e$ of the other electrons surrounding the nucleus.

2. The time-independent Schroedinger equation for a typical electron, (9-22), is solved for the net potential $V(r)$ obtained in the previous step. This is not easy to do because the radial part of the equation must be solved by numerical integration, as in Appendix G, since $V(r)$ is a complicated function. The eigenfunctions for a typical electron, found in this step, are: $\psi_\alpha(r,\theta,\varphi), \psi_\beta(r,\theta,\varphi), \psi_\gamma(r,\theta,\varphi), \ldots$. They are listed in order of increasing energy of the corresponding eigenvalues: $E_\alpha, E_\beta, E_\gamma, \ldots$. Each of the symbols $\alpha, \beta, \gamma, \ldots$ stands for a complete set of three space and one spin quantum numbers for the electron.

3. To obtain the ground state of the atom, the quantum states of its electrons are filled in such a way as to minimize the total energy and yet satisfy the weaker condition of the exclusion principle. That is, the states are filled in order of increasing energy, with one electron in each state, as illustrated schematically in Figure 9-9. Then the eigenfunction for the first electron will be $\psi_\alpha(r_1,\theta_1,\varphi_1)$, the eigenfunction for the second will be $\psi_\beta(r_2,\theta_2,\varphi_2)$, and so forth through the $Z$ eigenfunctions corresponding to the $Z$ lowest eigenvalues, obtained in the previous step.

4. The electron charge distributions of the atom are then evaluated from the eigenfunctions specified in the previous step. This is done by taking the charge distribution for each electron as the product of its charge $-e$ times its probability density function $\psi^*\psi$. The justification is that $\psi^*\psi$ determines the probability that the charge would be found in various locations in the atom. The charge distributions of $Z - 1$ representative electrons are added to the nuclear charge distribution, a point charge $+Ze$ at the origin, to determine the total charge distribution of the atom as seen by a typical electron.

5. Gauss's law of electrostatics is used to calculate the electric field produced by the total charge distribution obtained in the previous step. The integral of this electric field is then evaluated to obtain a more accurate estimate of the net potential $V(r)$ experienced by a typical electron. The new $V(r)$ that is found generally differs from the estimate made in step 1.

6. If it is appreciably different, the entire procedure is repeated, starting at step 2 and using the new $V(r)$. After several cycles (2 → 3 → 4 → 5 → 2 → 3 → 4 → 5 → ...) the $V(r)$ obtained at the end of a cycle is essentially the same as that used in the beginning. Then this $V(r)$ is the self-consistent net potential, and the eigenfunctions calculated from this potential describe the electrons in the ground state of the multielectron atom.

Figure 9-9 A schematic energy-level diagram illustrating the effect of the exclusion principle in limiting the population of each quantum state of an atom with six electrons. Note that the total energy of the atom would be much more negative if the exclusion principle did not operate. The diagram does not indicate that many quantum states are actually degenerate, nor are the spacings between the levels meant to be realistic.

In the Hartree procedure, the weaker condition of the exclusion principle is satisfied by the requirement of step 3 that only one electron populates each quantum state. But the stronger condition is not satisfied since antisymmetric total eigenfunctions are not used.
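Step 1 of the procedure fixes only the two limits of (9-23) and leaves the interpolation between them open. As a minimal sketch of what such a first guess could look like, the code below assumes a purely illustrative exponential interpolation for the effective charge; the functional form, the `screening_length` parameter, and the function names are hypothetical and are not the choice made in Hartree's calculations. Distances are in units of the Bohr radius $a_0$, and the energy scale $e^2/4\pi\epsilon_0 a_0$ is taken as twice the 13.6 eV Bohr energy.

```python
import math

HARTREE_EV = 2 * 13.6  # e^2 / (4 pi eps0 a0) in eV, twice the 13.6 eV Bohr energy


def z_guess(r, Z, screening_length=1.0):
    """Hypothetical effective charge seen by an electron at radius r (units of a0).

    Tends to Z as r -> 0 (unshielded nucleus) and to 1 as r -> infinity
    (nucleus shielded by the other Z - 1 electrons). The exponential form is
    an illustrative assumption only.
    """
    return 1.0 + (Z - 1.0) * math.exp(-r / screening_length)


def v_guess_ev(r, Z, screening_length=1.0):
    """First-guess net potential energy in eV at radius r (units of a0)."""
    return -z_guess(r, Z, screening_length) * HARTREE_EV / r


if __name__ == "__main__":
    for r in (0.01, 0.1, 1.0, 10.0):
        print(f"r = {r:5.2f} a0   Z(r) = {z_guess(r, 18):6.2f}   "
              f"V = {v_guess_ev(r, 18):10.1f} eV")
```

Any smooth function with the same two limits would serve equally well as a starting point, since steps 2 through 6 are repeated until the potential no longer changes appreciably.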
The reason antisymmetric total eigenfunctions are not used in the Hartree procedure is that an antisymmetric eigenfunction would involve a linear combination of $Z! = Z(Z - 1)(Z - 2) \cdots 1$ terms, which is an extremely large number for all atoms except those of very small $Z$. The procedure is difficult enough as is, and the use of antisymmetric eigenfunctions would make it very much more difficult. Anyway, the main effect of using antisymmetric total eigenfunctions would be to decrease the separation between certain pairs of electrons, and increase it between others. This leaves the average electron charge distribution of the atom essentially unchanged. Since the average electron charge distribution is the important quantity in the approximation treated by Hartree, the use of eigenfunctions which are not of a definite symmetry does not introduce a significant error. This has been verified by Fock. He made calculations using antisymmetric total eigenfunctions for a restricted selection of atoms, and he compared his results with those obtained by Hartree. When we discuss in the next chapter the excited states of atoms, however, it will be necessary for us to take into account the fact that antisymmetric total eigenfunctions must be used to give a completely accurate description of a system of electrons. Fock's calculations, and the ones we shall consider in the next chapter, are feasible because, for reasons we shall see, it is really only necessary to antisymmetrize the part of the total eigenfunction describing the behavior of a limited number of electrons in a "partially filled subshell."

It is an interesting bit of history to recall that one of the first large digital computers was employed to perform Hartree calculations. It used relays as switching elements, instead of the transistors of modern computers. But even with modern computers the calculations are so time consuming that results for a wide variety of atoms were obtained only in the 1960s by Herman and Skillman.
These results provide a very satisfactory explanation of the essential features of all multielectron atoms in their ground states. As we shall find, the explanation is not unduly complicated. 9 6 RESULTS OF THE HARTREE THEORY - The eigenfunctions that are found in the Hartree theory, for the electron in the spherically symmetrical net potential of a multielectron atom, are closely related to the eigenfunctions discussed in Chapter 7 for the electron in a one-electron atom. In fact, all the discussion of Chapter 7 concerning the 0 and cp dependence of the eigenfunctions for an electron in a one-electron atom applies directly to the 0 and cp dependence of the eigenfunctions for an electron in a multielectron atom. As an example, (7-32) shows that the sum of the probability densities for the oneelectron atom eigenfunctions with n = 2, l = 1, and all possible values of m l, is spherically symmetrical. This statement is certainly also true for n = 2, l = 0, and it can be shown to be true for any given n and 1. From the previous discussion, we conclude that the same statement applies to the eigenfunctions for a multielectron atom. Now, when a multielectron atom is in its ground state, the lowest energy quantum states of its electrons are completely filled. This means that for almost all values of n and / there are electrons in states with all possible values of m I . Since the sum of the probability densities for these electrons is spherically symmetrical, their total charge distribution is also. At most, only a few electrons in the highest energy states, that is states where all possible values of m I might not be filled, can contribute to any asymmetries in the charge distribution. In step 4 of the Hartree procedure the charge distribution used is taken to be completely spherically symmetrical; i.e., it is the best fit of a spherically symmetrical distribution to the distribution actually obtained. 
The r dependence of the eigenfunctions for an electron in a multielectron atom is not the same as for an electron in a one-electron atom. The reason is that the net potential V(r), which enters the differential equation that determines the functions R„I(r), does not have the same r dependence as the Coulomb potential. Typical examples of the radial behavior of the multielectron atom eigenfunctions are shown in Figure 9-10. In this figure we plot the results of a Hartree calculation for the argon atom, Z = 18, in terms of the quantities 2(21 + 1)4nr2Ri(r) = 2(2l + 1)P„I(r). Here P„I(r) is the radial probability density of (7-28), which specifies the probability of finding an electron, with quantum numbers n and 1, in a location with a radial coordinate near r. Since there are (21 + 1) possible values of m I for each 1, and since for each of these there are 2 possible values of ms, the quantity 2(2/ + 1)P„I(r) is the radial probability density for the quantum states with quantum numbers n and 1, times the total number of electrons which the exclusion principle allows to populate those states. In the ground state of argon, two electrons populate the states for n = 1, l = 0; two for n = 2, l = 0; six for n = 2, l = 1; two for n = 3, l = 0; and six for n = 3, 1 = 1. These are the states which are filled in the ground state of the atom because, as we shall see later, they have the lowest energy. Figure 9-11 shows the total radial probability density P(r) for the argon atom. This is the sum, over the n and / values populated in the atom, of the radial probability density for each state times the number of electrons it contains. That is, P(r) gives the probability of finding some electron with radial coordinate in the region of r. Figure 9-11 also shows the radial dependence of the net potential V(r) in which each electron of the argon atom is moving, as obtained from Hartree calculations A1:1O3H1 3H1. 
The Hartree eigenfunctions can be written

$$\psi_{nlm_lm_s}(r,\theta,\varphi) = R_{nl}(r)\,\Theta_{lm_l}(\theta)\,\Phi_{m_l}(\varphi)\,(m_s) \qquad (9\text{-}24)$$

The eigenfunctions are labeled by the same set of quantum numbers $n$, $l$, $m_l$, $m_s$ as are used for the one-electron atom eigenfunctions, and these quantum numbers are related to each other just as before. The spin eigenfunction, which we indicate schematically as $(m_s)$, is exactly the same as for a one-electron atom. Furthermore, the functions describing the angular dependence, $\Theta_{lm_l}(\theta)$ and $\Phi_{m_l}(\varphi)$, are also exactly the same. The reason is that the time-independent Schroedinger equation for an electron in a spherically symmetrical net potential, (9-22), is of exactly the same form as the time-independent Schroedinger equation for an electron in the spherically symmetrical Coulomb potential, (7-12), as far as $\theta$ and $\varphi$ are concerned. Therefore, (9-22) leads directly to (7-15) and (7-16), whose solutions are $\Theta_{lm_l}(\theta)$ and $\Phi_{m_l}(\varphi)$.

Figure 9-10 The Hartree theory radial probability densities for the filled quantum states of the argon atom, plotted as functions of $r/a_0$, the radial coordinate in units of the hydrogen atom first Bohr orbit radius $a_0$. For each $n$ the probability density is largely concentrated in a restricted range of $r/a_0$, called a shell. Note that the characteristic radius of the outermost shell ($n = 3$) has an $r/a_0$ value only a little larger than 1.0, while the characteristic radius of the innermost shell ($n = 1$) has an $r/a_0$ value much smaller than 1.0. That is, the outermost shell of argon is only a little larger in radius than $a_0$, which is the radius of the single shell in hydrogen. The innermost shell of argon is of much smaller radius than the hydrogen shell.
Figure 9-11 The total radial probability density $P(r)$ of the argon atom, and the quantity $Z(r)$ that specifies its net potential.

The net potential is not displayed directly, but indirectly in terms of a convenient quantity $Z(r)$. The relation between the two is given by the equation

$$V(r) = -\frac{Z(r)e^2}{4\pi\epsilon_0 r} \qquad (9\text{-}25)$$

The Coulomb potential used to approximate the net potential within the shell labeled by $n$ is

$$V_n(r) = -\frac{Z_n e^2}{4\pi\epsilon_0 r} \qquad (9\text{-}26)$$

where $Z_n$ is a constant equal to $Z(r)$ evaluated at the average value of $r$ for the shell (the "radius" of the shell). In the crude approximation of (9-26), the one-electron atom equations specifying the total energy, and other quantities of interest, can be used if we replace $Z$ by $Z_n$. The quantity $Z_n$ is sometimes called the effective $Z$ for the shell. This approximation is useful because it allows us to discuss many results of the Hartree theory in terms of some very simple equations with easily understandable properties, although the Hartree theory actually uses purely numerical procedures and so leads to results which must be expressed in cumbersome tables or graphs.

Example 9-5. Determine the values of $Z_n$ for the argon atom, and then use these values to estimate the total energy of the electrons in the three shells populated in the ground state of the atom.

Inspecting Figure 9-11 to estimate the average values of $r$ characteristic of the populated shells, obtaining the values of $Z(r)$ for these $r$ from the same figure, and setting the $Z_n$ equal to these values of $Z(r)$, we find that for the argon atom with $Z = 18$

$$Z_1 \simeq 16, \qquad Z_2 \simeq 8, \qquad Z_3 \simeq 3$$

As indicated earlier, we may use the one-electron atom energy formula, (7-22), with $Z = Z_n$

$$E \simeq -\frac{\mu Z_n^2 e^4}{(4\pi\epsilon_0)^2 2\hbar^2 n^2} = -\left(\frac{Z_n}{n}\right)^2 \times 13.6\ \text{eV}$$

to obtain an estimate of the electron energies yielded by the Hartree theory calculations. Doing this, we obtain

$$E_1 \simeq -\left(\frac{16}{1}\right)^2 \times 13.6\ \text{eV} \simeq -3500\ \text{eV}$$

$$E_2 \simeq -\left(\frac{8}{2}\right)^2 \times 13.6\ \text{eV} \simeq -220\ \text{eV}$$

$$E_3 \simeq -\left(\frac{3}{3}\right)^2 \times 13.6\ \text{eV} \simeq -14\ \text{eV}$$

These energies agree within something like 20% with the Hartree results.
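The arithmetic of Example 9-5 can be reproduced with a few lines of code. This is only a sketch of the effective-$Z$ estimate, using the $Z_n$ values quoted in the example; the names `shell_energy_ev` and `argon_z_eff` are hypothetical.

```python
# Sketch of the effective-Z energy estimate of Example 9-5.
# Function and variable names are illustrative assumptions.

RYDBERG_EV = 13.6  # magnitude of the hydrogen ground state energy, in eV

def shell_energy_ev(z_eff, n):
    """One-electron atom energy formula with Z replaced by the effective Z_n."""
    return -((z_eff / n) ** 2) * RYDBERG_EV

# Effective Z values for the three populated shells of argon, from Example 9-5.
argon_z_eff = {1: 16, 2: 8, 3: 3}

argon_energies = {n: shell_energy_ev(z, n) for n, z in argon_z_eff.items()}

for n, energy in argon_energies.items():
    print(f"n = {n}: E is about {energy:.1f} eV")
```

Rounded to two significant figures, the printed values are the estimates quoted in the example.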
Note that Figure 9-11 shows $Z(r) \to Z$ as $r \to 0$, and $Z(r) \to 1$ as $r \to \infty$, in agreement with the ideas discussed in connection with (9-23). By inspecting the plots of $P_{nl}(r)$ in Figure 9-10, we see that, for all the electrons in states with common values of the quantum number $n$, the probability densities are large only in essentially the same range of $r$. All these electrons are said to be in the same shell, terminology we have used before in connection with one-electron atoms. Furthermore, the range of $r$ in which the probability densities are large (the "thickness" of each shell) is restricted enough that $Z(r)$ has a reasonably well-defined value in that range. These circumstances form the basis of a crude, but useful, approximate description of the results of the Hartree theory, in which all the electrons in the shell labeled by $n$ of a multielectron atom are considered to be moving in the Coulomb potential $V_n(r)$ of (9-26).

In Example 9-5 we found that for the argon atom, with $Z = 18$, the effective $Z$ of the innermost shell ($n = 1$) is $Z_1 \simeq 16$. Hartree calculations show that in all multielectron atoms $Z_1$ has a value of about $Z_1 \simeq Z - 2$. The reason is that for all atoms a sphere surrounding the nucleus, of radius equal to the average radial coordinate of an electron in the $n = 1$ shell, contains a negative charge of about $-2e$, due to the charge distributions of all the other electrons. According to Gauss's law of electrostatics, this spherically symmetrical distribution of negative charge shields the $n = 1$ electron from part of the nuclear charge $+Ze$, effectively reducing it to about $+Ze - 2e = +(Z - 2)e$. Thus the $n = 1$ electron experiences an effective $Z$ of about $Z_1 \simeq Z - 2$. We also found in Example 9-5 that for the outermost shell of the argon atom ($n = 3$), the effective $Z$ has the small value $Z_n \simeq 3$.
This is because an electron in the outermost shell is almost completely shielded from the nuclear charge by the intervening charge distributions of all the other electrons. The result is comparable to what is found in all Hartree calculations. But with increasing $Z$ the value of $Z_n$ obtained from the calculations for the outermost shell slowly increases; i.e., it increases about as slowly as the increase in $n$ itself. The reason it increases is that the shielding of the nuclear charge by the electrons in the intervening shells is not perfect. To an accuracy consistent with the crude approximation we are considering, we may describe these results by saying that in all multielectron atoms $Z_n$ has a value of about $Z_n \simeq n$, if $n$ specifies the outermost shell populated in the atom.

We shall now use the facts stated in the last two paragraphs to describe and explain a number of important results of the Hartree theory:

1. In multielectron atoms the inner shells of small $n$ are of very small radii because for these shells there is little shielding, and the electrons feel the full Coulomb attraction of the highly charged nucleus. In fact, the Hartree theory predicts that the radius of the $n = 1$ shell is smaller than that of the $n = 1$ shell of hydrogen by approximately a factor of $1/(Z - 2)$. (This prediction is not too accurate for atoms of very large $Z$ because of relativistic effects, not taken into account in the Hartree theory, which become important because inner shell electrons in large atoms have energies comparable to their rest mass energies $mc^2 \simeq 5 \times 10^5$ eV.) The prediction can be understood in our crude description of the Hartree theory results by setting $Z = Z_1 \simeq Z - 2$ and $n = 1$ in the one-electron atom equation for the radial coordinate expectation value, (7-29)

$$\bar{r} \simeq \frac{n^2 a_0}{Z}$$

yielding

$$\bar{r} \simeq \frac{\bar{r}_{\text{hydrogen}}}{Z_1} \simeq \frac{\bar{r}_{\text{hydrogen}}}{Z - 2}$$

2. The electrons in the inner shells are in a region of large negative potential energy, so their total energies are correspondingly large and negative.
The results of the Hartree theory predict that the total energy of an electron in the $n = 1$ shell is more negative than that of an electron in the $n = 1$ shell of hydrogen by approximately a factor of $(Z - 2)^2$. (Relativistic effects limit the accuracy for high $Z$.) This can be understood by setting $Z = Z_1 \simeq Z - 2$ and $n = 1$ in the one-electron atom energy equation, (7-22)

$$E = -\frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2 2\hbar^2 n^2}$$

yielding

$$E \simeq Z_1^2 E_{\text{hydrogen}} \simeq (Z - 2)^2 E_{\text{hydrogen}}$$

3. Electrons in the outer shells of large $n$ are almost completely shielded from the nucleus, and so they feel an attraction to it not so different from that felt by an electron to the singly charged nucleus of a hydrogen atom. The radius of the outermost shell can be obtained from our crude description by setting $Z = Z_n \simeq n$ in the one-electron atom radial expectation value equation, yielding

$$\bar{r} \simeq \frac{n^2 a_0}{Z_n} \simeq \frac{n^2 a_0}{n} = n a_0$$

With $Z = Z_n$, the one-electron atom energy equation reads

$$E \simeq -\frac{\mu Z_n^2 e^4}{(4\pi\epsilon_0)^2 2\hbar^2 n^2} \qquad (9\text{-}27)$$

and if in this we set $Z_n \simeq n$, we obtain a predicted energy which is approximately equal to the ground state hydrogen energy. The basic reason for this is the shielding of the outer shell electron from the full nuclear charge by the charges of the intervening inner shell electrons.

5. Finally, we can use (9-27) to describe crudely the dependence, for a given atom, of the total energy of an electron on its quantum number $n$. Due both to the $Z_n^2$ in the numerator and the $n^2$ in the denominator, $E$ becomes less negative with increasing $n$ in going through the shells of a given atom. The total energy of an electron in a given multielectron atom becomes less negative very rapidly with increasing $n$ for small $n$, but much less rapidly for large $n$. The behavior for large $n$ reflects the fact that the energy cannot become positive since the electron is bound. This prediction of the Hartree theory, and all the others just mentioned, are verified by experiment.
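The effective-$Z$ estimates used in this list reduce to two one-line formulas, $\bar{r} \simeq n^2 a_0/Z_n$ and $E \simeq (Z_n/n)^2 E_{\text{hydrogen}}$. The sketch below evaluates them for the innermost shell (taking $Z_1 \simeq Z - 2$) and the outermost shell (taking $Z_n \simeq n$). The function names are hypothetical, and the numerical constants are the rounded values $a_0 = 0.529$ Å and $E_{\text{hydrogen}} = -13.6$ eV.

```python
# Sketch of the crude effective-Z description of the Hartree results.
# Function names are illustrative assumptions; constants are rounded values.

A0_ANGSTROM = 0.529      # Bohr radius, in angstroms
E_HYDROGEN_EV = -13.6    # hydrogen ground state energy, in eV

def shell_radius(n, z_eff):
    """Radial expectation value estimate n^2 a0 / Z_n, in angstroms."""
    return n ** 2 * A0_ANGSTROM / z_eff

def shell_energy(n, z_eff):
    """Energy estimate (Z_n / n)^2 times the hydrogen ground state energy, in eV."""
    return (z_eff / n) ** 2 * E_HYDROGEN_EV

def inner_shell(Z):
    """n = 1 shell with effective Z of about Z - 2."""
    return shell_radius(1, Z - 2), shell_energy(1, Z - 2)

def outer_shell(n):
    """Outermost populated shell with effective Z of about n."""
    return shell_radius(n, n), shell_energy(n, n)

# Argon: Z = 18, outermost populated shell n = 3.
print("inner shell (radius, energy):", inner_shell(18))
print("outer shell (radius, energy):", outer_shell(3))
```

For the outer shell the energy estimate is the hydrogen ground state energy for any $n$, while the radius estimate grows only linearly with $n$. These are rough estimates only; the chapter itself notes that the radius estimate is too large by about a factor of 2 when compared with the Hartree results for argon.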
We close our discussion of the results of the Hartree theory by describing its predictions for the total energies of the atomic electrons more accurately than can be done on the basis of the crude description we have been using. In a one-electron atom, all the quantum states corresponding to a certain shell have exactly the same total energy, if the very small energy associated with the spin-orbit interaction is ignored. That is, all states in a shell of a particular $n$ are degenerate since the total energy depends only on $n$. But in a multielectron atom this is not the case. As mentioned in Section 7-5, the fact that the total energy of a one-electron atom does not depend on $l$ is a consequence of the fact that its potential is Coulombic, i.e., exactly proportional to $-1/r$. In a multielectron atom the electrons are moving in a net potential $V(r)$ which is definitely not proportional to $-1/r$, and so the total energy of these electrons depends on $l$ as well as on $n$. (Since we are here ignoring the spin-orbit and certain other weak interactions, the total energy of the electrons does not depend on the quantum number $m_s$ which determines the space orientation of the spin, nor on the quantum number $m_l$ which determines the space orientation of the "orbit".)

If we check the predictions of the equation $\bar{r} \simeq na_0$ with the actual Hartree results for the argon atom shown in Figure 9-10, we see that the equation overestimates the radius of the outermost shell by a factor of 2. About the same factor of 2 overestimate is found in a similar comparison with Hartree results for elements of the highest atomic number. The effective $Z$ description of the Hartree results is crude, but still useful, because it correctly describes the fact that the radius of the outermost populated shell increases only very slowly with increasing atomic number.
The Hartree results themselves show that this radius is only about three times larger for elements of the highest atomic number than it is for hydrogen. Since the radius of the outermost populated shell is essentially the size of the atom, the previous statements apply directly to the sizes of various atoms. Nevertheless, it is a common misconception to think that atoms of high atomic number are very much larger than atoms of low atomic number. Measurements made on atoms, molecules, and solids show this is not true. The Hartree theory explains that it is not true, basically because, as the nuclear charge $Z$ increases in going from one atom to the next, the inner atomic shells rapidly contract.

4. We can also see, from our crude description of the Hartree theory results, that the theory predicts that the total energy of an electron in the outermost populated shell of any atom is comparable to that of an electron in the ground state of hydrogen. If we set $Z = Z_n$ in the one-electron atom energy equation, we obtain (9-27).

The results of the Hartree theory show that the total energy of an atomic electron is actually somewhat more negative than would be predicted from (9-27), the energy equation obtained from our crude description of the theory. The difference is largest for $l = 0$, and it diminishes progressively with increasing $l$. Thus in the Hartree approximation we write the energy of an electron in a multielectron atom as $E_{nl}$, to indicate that it depends on both $n$ and $l$. The explanation for the $l$ dependence concerns the behavior of the electron probability density $\psi^*\psi$ in the region of small $r$ near the nucleus of the multielectron atom. According to (7-31)

$$\psi^*\psi \propto r^{2l} \qquad r \to 0$$

This was demonstrated for one-electron atom eigenfunctions, but it is equally true for multielectron atom eigenfunctions.
The reason can be seen by inspecting (7-17), which is the differential equation for the function $R$ governing the radial behavior of the eigenfunctions. Note that as $r \to 0$ the term $[l(l + 1)/r^2]R$ completely dominates the other term $(2\mu/\hbar^2)[E - V(r)]R$ since the factor $1/r^2$ makes it increase so rapidly with decreasing $r$ for small $r$. Consequently, for small $r$ the exact form of $V(r)$ is unimportant as long as it increases in magnitude less rapidly than $1/r^2$. In all atoms the eigenfunctions have a radial dependence proportional to $r^l$ for small $r$, and therefore the probability density is proportional to $r^{2l}$ for small $r$. So if we consider, as an example, two electrons in the same shell $n$ of a multielectron atom, one with $l = 0$ and the other with $l = 1$, there is much more chance of finding the $l = 0$ electron in the region of small $r$ than of finding the $l = 1$ electron in that region. This is true since $r^0 \gg r^2$ for small $r$. Similarly, the chance of finding an $l = 1$ electron is much larger than the chance of finding an $l = 2$ electron of the same $n$ at small $r$ since there $r^2 \gg r^4$, etc. This property can be seen by carefully inspecting Figure 9-10. Before using the property to explain the dependence of $E_{nl}$ on $l$, we indicate its physical origin by going through a semiclassical argument involving Figure 9-12. An electron with quantum number $l$ has an orbital angular momentum of fixed magnitude $L = \sqrt{l(l + 1)}\,\hbar$. But $L = rp_\perp$, where $p_\perp$ is the magnitude of its component of linear momentum perpendicular to its radial coordinate vector whose length is $r$. If the electron moves into a region where $r$ becomes small, then $p_\perp$ must become large. Since the kinetic energy $K$ of the electron contains a term proportional to $p_\perp^2$, it becomes more positive with decreasing $r$ in proportion to $1/r^2$, for small $r$.
But for small $r$ the net potential approaches the Coulomb potential of an unshielded nuclear charge, so the potential energy $V$ of the electron becomes more negative with decreasing $r$ in proportion to $1/r$. Since $K \propto +1/r^2$ and $V \propto -1/r$ for small $r$, its kinetic energy increases more rapidly than its potential energy decreases, as $r \to 0$. Thus the electron avoids that region because there it cannot maintain a constant value of its total energy $E = K + V$, as is required by energy conservation. However, the tendency to avoid the region of small $r$ is not present for $l = 0$ since then $L = 0$. So there is much more chance of finding an $l = 0$ electron at small $r$ than of finding an $l = 1$ electron in that region. Since the tendency to avoid small $r$ is more pronounced with increasing $l$, there is much more chance of finding an $l = 1$ electron than an $l = 2$ electron at small $r$, etc.

Figure 9-12 Top: The linear momentum $p$ of an electron can be decomposed into a component $p_\parallel$ parallel to the radial vector $r$ from the nucleus, and a component $p_\perp$ perpendicular to the radial vector. The product of $p_\perp$ and $r$ is equal to the constant magnitude of the angular momentum $L$. Bottom: An electron moving about a nucleus with constant $L$. When the electron is relatively near the nucleus (illustrated on the left), $r$ is small so $p_\perp$ must be large. When the electron is relatively far away (illustrated on the right), $p_\perp$ is smaller. Note that the magnitude of the total momentum $p$ will also be large when $p_\perp$ is large. Therefore the kinetic energy of the electron will be large when it is near the nucleus, in order to allow the angular momentum to be a constant of the motion.

Now we can understand the $l$ dependence of $E_{nl}$. The crude description of the results of the Hartree theory underestimates how negative the total energy of an atomic electron is because it assumes essentially that the electron stays within its shell. In fact, there is a small probability that the electron will be found inside its shell in the region of small $r$ near the nucleus. When the electron is in this region it has penetrated the intervening charge distributions of the other electrons, and it feels nearly the full unshielded nuclear charge. Then it has a very much more negative potential energy than it has when it is in its shell. The electron will also occasionally be found outside its shell where its potential energy is less negative than in its shell, but the change is considerably smaller than the change in potential energy occurring when it is inside its shell. The overall effect of the excursions of an electron inside and outside its shell is to make the expectation value of its potential energy somewhat more negative, and therefore to make its total energy somewhat more negative than it would be if it stayed in its shell. Since we have learned that the probability of an electron with a given $n$ being inside the shell in the region near the nucleus is larger the smaller its value of $l$, we can see that for a given value of $n$, the total energy $E_{nl}$ of an electron in a multielectron atom is more negative for $l = 0$ than for $l = 1$, more negative for $l = 1$ than for $l = 2$, etc. For outer shells with large values of $n$, where the $n$ dependence is not very strong, the values of $E_{nl}$ can actually depend in a more sensitive way on $l$ than on $n$. But for a one-electron atom there is no $l$ dependence at all in the total energy because there is never any shielding, so an electron always feels the full nuclear charge, and the expectation value of its potential energy is independent of $l$. All the electrons in a particular shell have radial probability densities which are of approximately the same form in the region of the shell, but which are significantly different in the region of small $r$. We have seen that the second property causes the total energies of the electrons in the shell to depend on $l$.
Consequently, it is convenient to speak of each shell as being composed of a number of subshells, one for each value of $l$. All the electrons in the same subshell have the same quantum numbers $n$ and $l$. Therefore, all have exactly the same total energy (in the Hartree approximation which neglects spin-orbit and other weak interactions). Also, all the electrons in the same subshell have exactly the same radial probability density $P_{nl}(r)$.

Figure 9-13 The periodic table of the elements, showing the electron configuration for each element.
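The semiclassical argument given with Figure 9-12 can be made concrete by comparing the angular part of the kinetic energy, $L^2/2mr^2$ with $L^2 = l(l + 1)\hbar^2$, against the Coulomb potential energy near the nucleus. The sketch below is an illustration under stated assumptions: it works in atomic units ($\hbar = m = e^2/4\pi\epsilon_0 = 1$, lengths in units of $a_0$), treats the nucleus as unshielded at small $r$, and uses hypothetical function names.

```python
# Sketch of the semiclassical small-r argument, in atomic units
# (hbar = m = e^2/(4 pi eps0) = 1; lengths in units of a0).
# Assumptions: unshielded nuclear charge Z at small r; names are illustrative.

def angular_kinetic_energy(l, r):
    """Kinetic energy term L^2 / (2 m r^2) with L^2 = l(l + 1) hbar^2."""
    return l * (l + 1) / (2.0 * r ** 2)

def coulomb_energy(Z, r):
    """Potential energy of an electron at distance r from a charge +Ze."""
    return -Z / r

def crossover_radius(l, Z):
    """Radius below which the 1/r^2 kinetic term exceeds the 1/r attraction."""
    return l * (l + 1) / (2.0 * Z)

Z = 18  # argon nucleus, unshielded at small r
for l in (0, 1, 2):
    print(f"l = {l}: crossover radius = {crossover_radius(l, Z):.4f} a0")
```

For $l = 0$ the crossover radius is zero, so nothing keeps the electron away from the nucleus. For larger $l$ the region dominated by the $1/r^2$ term grows, which is the tendency described in the text.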
9-7 GROUND STATES OF MULTIELECTRON ATOMS AND THE PERIODIC TABLE

Most of the properties of the chemical elements are periodic functions of the atomic number $Z$ that specifies the number of electrons in an atom of the element. It was first emphasized by Mendeleev in 1869 that these periodicities can be made most apparent by constructing a periodic table of the elements. A modern version of his table is presented in Figure 9-13. Each element is represented in the table by its chemical symbol, and also by its atomic number. Elements with similar chemical and physical properties are in the same column. For instance, all elements in the first column are alkalis and have a valence of plus one; all elements in the last column are noble gases and have a valence of zero.

The discovery of the periodic table was a great breakthrough of chemistry. Its interpretation was an equally significant development of physics. We assume that the student has some familiarity with the periodic properties of the elements from a study of elementary chemistry. For this reason, we do not need to stress their importance to chemistry. Our task here is to interpret these properties in terms of the Hartree theory of multielectron atoms. That is, in this section we shall present the quantum mechanical interpretation of the basis of inorganic chemistry, plus that of much organic chemistry and solid state physics.

Table 9-2 The Energy Ordering of the Outer Filled Subshells
(rows run from highest energy, least negative, at the top to lowest energy, most negative, at the bottom)

Quantum Numbers n, l    Designation of Subshell    Capacity of Subshell 2(2l + 1)
6, 2                    6d                         10
5, 3                    5f                         14
7, 0                    7s                         2
6, 1                    6p                         6
5, 2                    5d                         10
4, 3                    4f                         14
6, 0                    6s                         2
5, 1                    5p                         6
4, 2                    4d                         10
5, 0                    5s                         2
4, 1                    4p                         6
3, 2                    3d                         10
4, 0                    4s                         2
3, 1                    3p                         6
3, 0                    3s                         2
2, 1                    2p                         6
2, 0                    2s                         2
1, 0                    1s                         2
The interpretation of the periodic table is based on information about the ordering according to energy of the outer filled subshells of multielectron atoms. The required information can be obtained from the results of the Hartree calculations, described in the last section, which yield the ordering according to energy of the outer filled subshells as is shown in Table 9-2. The first column identifies the subshell by the quantum numbers $n$ and $l$. The second column of Table 9-2 identifies the subshells by giving the spectroscopic notation for $n$ and $l$. This notation is commonly used in discussing the spectra and energy levels of atoms. The number gives the value of $n$, and the letter gives the value of $l$ according to the scheme shown in Table 9-3. In this scheme the $l = 0$ state is called an s state; the $l = 1$ state is called a p state; etc. The third column of Table 9-2 is equal to $2(2l + 1)$. As mentioned in the last section, that quantity is the number of possible combinations of $m_l$ and $m_s$, for the value of $l$ characteristic of the subshell. Thus the third column gives the maximum number of electrons that can occupy different states in the same subshell without violating the exclusion principle.

Table 9-3 The Spectroscopic Notation for l

l                         0    1    2    3    4    5    6
Spectroscopic notation    s    p    d    f    g    h    i

In our discussion of the last section we found that the Hartree theory predicts that the energy of the subshell becomes more negative with decreasing values of $n$ and with decreasing values of $l$. We see this immediately in Table 9-2. The 1s subshell, which is the only subshell in the $n = 1$ shell, has the lowest energy. The two subshells of the $n = 2$ shell are both of higher energy and, of these, the 2s subshell is of lower energy than the 2p subshell. In the $n = 3$ shell the subshells 3s, 3p, 3d are also ordered in energy according to the predictions of the Hartree theory.
However, the energy of the 4s subshell is actually lower than the energy of the 3d subshell because, for reasons described in the last section, the $l$ dependence of the energy $E_{nl}$ of the subshells can be more important than the $n$ dependence for outer subshells with large values of $n$. Continuing up the list, we see that the ordering of the outer subshells always satisfies the following rule: For a given $n$, the outer subshell with the lowest $l$ has the lowest energy. For a given $l$, the outer subshell with the lowest $n$ has the lowest energy. Near the top of the list, the $l$ dependence of $E_{nl}$ becomes so much stronger than the $n$ dependence that the energy of the 7s subshell is lower than the energy of the 5f subshell. It should be noted that Table 9-2 does not necessarily give the energy ordering of all subshells in any particular atom, but only the energy ordering of the subshells which happen to be the outer subshells for that atom. For instance, the energy of the 4s subshell is lower than that of the 3d subshell for K atoms and the next few atoms of the periodic table. But for atoms further up in the periodic table the 3d subshell is of lower energy than the 4s subshell because for these atoms they are inner subshells and the $n$ dependence of $E_{nl}$ is so strong that it dominates the $l$ dependence. Additional information of this type is presented in Figure 9-14. Now the characteristics of an atom depend on the behavior of its electrons. The behavior of an electron is specified by the set of four quantum numbers which specify its quantum state. However, in the approximation represented by the Hartree theory only the quantum numbers $n$ and $l$ are important. Therefore, in this approximation an atom can be characterized by specifying the $n$ and $l$ quantum numbers of all the electrons. This specification of the subshells occupied by the various electrons is called the configuration of the atom.
The ordering according to energy of the outer filled subshells being known, it is trivial to determine the configuration of any atom in its ground state. In the ground state the electrons must fill all the subshells in such a way as to minimize the total energy of the atom and yet not exceed the capacity $2(2l + 1)$ of any subshell. The subshells will fill in order of increasing energy, as listed in Table 9-2. Consider first the H atom. The single electron occupies the 1s subshell, with its spin either "up" or "down". For the He atom both electrons are in the 1s subshell, one with spin "up" and the other with spin "down". The configuration of H is written

$$^{1}\text{H}:\ 1s^1$$

The configuration of He is written

$$^{2}\text{He}:\ 1s^2$$

The superscript on the subshell designation specifies the number of electrons which it contains; the superscript on the chemical symbol specifies the $Z$ value for the atom. In the $^{3}$Li atom one of the electrons must be in the 2s subshell because the capacity of the 1s subshell is only 2. The configuration of this atom is

$$^{3}\text{Li}:\ 1s^2 2s^1$$

The $^{4}$Be atom completes the 2s subshell and has the configuration

$$^{4}\text{Be}:\ 1s^2 2s^2$$

In the six elements from $^{5}$B to $^{10}$Ne the additional electrons fill the 2p subshell.

Figure 9-14 A schematic representation of the energy ordering of all the subshells in an atom, as a function of its atomic number $Z$. Each curve begins at the $Z$ for which the subshell begins to be occupied. Only subshells occupied in atoms through mercury are shown, so all curves stop at $Z = 80$. The ordering of the outer filled subshells in various atoms is found on the left side of the diagram. The ordering of all filled subshells in mercury is found on the right side of the diagram. The energy scale is non-linear and, furthermore, varies with $Z$.
The configurations of $^{5}$B and $^{10}$Ne are

$$^{5}\text{B}:\ 1s^2 2s^2 2p^1$$

$$^{10}\text{Ne}:\ 1s^2 2s^2 2p^6$$

Note that the periodic table of the elements presented in Figure 9-13 is divided vertically into a series of blocks with each row labeled by the subshell which, according to Table 9-2, the elements of the row are filling. Knowing this, it is easy to write the configuration of any atom, with a procedure that will become more apparent in Example 9-6. But there are certain atoms for which the last few electrons are observed to be in different subshells than would be predicted by this scheme. The configurations for these atoms are indicated in the periodic table by the entries below their chemical symbol.

Example 9-6. Write the configurations for the ground states of $^{19}$K, $^{23}$V, $^{24}$Cr, $^{43}$Tc, $^{44}$Ru, $^{46}$Pd, $^{57}$La, $^{58}$Ce, and $^{59}$Pr.

From the absence of any entry below $^{19}$K in the periodic table of Figure 9-13, we conclude that there is nothing exceptional about its configuration. The configuration is then obtained by inspecting the periodic table and listing in order the lowest energy subshells, and their populations, for the 19 electrons of the atom. It is

$$^{19}\text{K}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^1$$

The first 18 electrons completely fill the subshells of lowest energy, and the last electron partly fills the 4s subshell. Adding four more electrons to obtain $^{23}$V completes the filling of the 4s subshell and puts three electrons in the 3d subshell, which is the one of next highest energy. The configuration is

$$^{23}\text{V}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^3$$

The entry $4s^1 3d^5$ for $^{24}$Cr in Figure 9-13 means that the configuration of this atom does not end with the symbols $4s^2 3d^4$, as would be expected, but instead is

$$^{24}\text{Cr}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^1 3d^5$$

The reason for this behavior will be explained later.
Inspection shows that the configurations of the other atoms of interest are

$$^{43}\text{Tc}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 5s^2 4d^5$$

$$^{44}\text{Ru}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 5s^1 4d^7$$

$$^{46}\text{Pd}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 4d^{10}$$

$$^{57}\text{La}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 5s^2 4d^{10} 5p^6 6s^2 5d^1$$

$$^{58}\text{Ce}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 5s^2 4d^{10} 5p^6 6s^2 4f^2$$

$$^{59}\text{Pr}:\ 1s^2 2s^2 2p^6 3s^2 3p^6 4s^2 3d^{10} 4p^6 5s^2 4d^{10} 5p^6 6s^2 4f^3$$

We see from Example 9-6 that in certain cases the actual configurations observed for the elements do not strictly adhere to the predictions of Table 9-2. For instance, this table says that the energy of the 3d subshell is greater than the energy of the 4s subshell when these subshells are filling. Yet in $^{24}$Cr, and also in $^{29}$Cu, one of the electrons that could be in the 4s subshell is actually in the 3d subshell. Similar situations are observed to occur for the 5s and 4d subshells. In $^{43}$Tc the 5s subshell is filled in the normal manner. But in $^{45}$Rh there is only one electron in the 5s subshell; in $^{46}$Pd both electrons have left the 5s subshell and moved to the 4d subshell. The $^{78}$Pt and $^{79}$Au configurations show that the same kind of thing can happen for the 6s and 5d subshells. From these circumstances we conclude that the energy separations between the 4s and 3d, the 5s and 4d, and the 6s and 5d subshells must be so small while they are being filled that, although generally the ordering of these subshells is as shown in Table 9-2, in certain cases the ordering can actually be reversed. This can be seen in Figure 9-14. Configurations which disagree with Table 9-2 are also observed in $^{57}$La and in the lanthanides ($Z = 58$ to 71), more commonly called the rare earths. Table 9-2 predicts that after the completion of the 6s subshell the 4f subshell should fill, but in two of the rare earths there is one 5d electron. A similar situation occurs in the group of elements following $^{89}$Ac, which are called the actinides ($Z = 90$ to 103).
From the same argument we used previously, we interpret these observations to mean that the energy differences between the 5d and 4f subshells, and between the 6d and 5f subshells, are very small while these subshells are being filled.

On the other hand, certain predictions of Table 9-2 are always obeyed. Since none of the configurations is exceptional for elements in the first two and last six columns of the periodic table, we conclude that every p subshell is always of higher energy than the preceding s or d subshell while these subshells are being filled, and that in these circumstances every s subshell is always of higher energy than the preceding p subshell. Therefore there must be large energy differences between the subshells concerned while they are being filled. In fact, the energy differences between every s subshell and the preceding p subshell are particularly large, as can be seen in Figure 9-14, and it is easy to understand why. Since for a given n the energy of a subshell becomes higher with increasing l, an s subshell is always the first subshell to be occupied in a new shell. Consequently, when an electron is added to a configuration with a completed p subshell and goes into the subshell of next highest energy, which according to Table 9-2 is always an s subshell, the electron will be the first one in a new shell. Compared to the electrons in the preceding subshell, its average radial coordinate will be considerably larger, its average potential energy will be considerably less negative, and its total energy will be considerably higher, much higher than for the usual increase in total energy in going from one subshell to the next.

The fact that there is a particularly large energy difference between every s subshell and the preceding p subshell has some important consequences. Consider atoms of the elements $^{10}$Ne, $^{18}$A, $^{36}$Kr, $^{54}$Xe, and $^{86}$Rn, in which a p subshell is just completed.
Because of the very large difference between the energy of an electron in the p subshell and the energy it would have if it were in the s subshell, the first excited state of these atoms is unusually far above the ground state. As a result, these atoms are particularly difficult to excite. Gauss's law shows that in their ground state they produce no electric field external to the atom: they consist of sets of completely filled subshells, so their charge distributions are spherically symmetrical, and the net charge is zero because the atoms are neutral overall. Furthermore, these atoms produce no external magnetic fields in their ground state since, as we shall see later, the total angular momenta of electrons in completely filled subshells couple to zero, and this coupling yields zero total magnetic dipole moment. Because of the absence of external fields (at least on a time-averaged basis), it is very difficult for these atoms to interact with other atoms to produce chemical compounds. They also have very low boiling and freezing points because they have little tendency to condense into liquid or solid form. These are the noble gas elements.

The atom $^{2}$He is also a noble gas because for it the first unfilled subshell is an s subshell (even though it does not contain a filled p subshell), so it has an unusually high first excited state, and because in its ground state the atom consists of completely filled subshells and so produces no external fields. That $^{2}$He is a noble gas is indicated by its being listed in the last column of the periodic table instead of the second column. An element such as $^{20}$Ca is not a noble gas, even though it consists of completely filled subshells, because in its first excited state an electron goes to a 3d subshell. So the excited state is not far above the ground state, and very little energy is required to make the atom produce an external field which will allow it to interact with other atoms.
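The filling scheme of Example 9-6, and the list of atoms in which a p subshell is just completed, can be reproduced with a short sketch. The subshell order below is the one the text attributes to Table 9-2; the table itself is not reproduced in this section, so that list is an assumption, and the function names are illustrative only. The sketch fills subshells strictly in order, so it does not reproduce exceptional configurations such as that of $^{24}$Cr.

```python
# Subshell filling order as the text describes it for Table 9-2 (assumed here,
# since the table itself is not shown in this section).
FILLING_ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p", "5s", "4d",
                 "5p", "6s", "4f", "5d", "6p", "7s", "5f", "6d"]

# Maximum population 2(2l + 1) of a subshell with l = 0, 1, 2, 3.
CAPACITY = {"s": 2, "p": 6, "d": 10, "f": 14}


def predicted_configuration(Z):
    """Place Z electrons in subshells, strictly following FILLING_ORDER."""
    config = []
    remaining = Z
    for subshell in FILLING_ORDER:
        if remaining == 0:
            break
        population = min(remaining, CAPACITY[subshell[-1]])
        config.append((subshell, population))
        remaining -= population
    if remaining > 0:
        raise ValueError("Z exceeds the capacity of the listed subshells")
    return config


def format_configuration(config):
    """Render a configuration as plain text, e.g. '1s2 2s2 2p6'."""
    return " ".join(f"{subshell}{population}" for subshell, population in config)


def closed_p_subshell_Z():
    """Atomic numbers at which a p subshell has just been completed."""
    result = []
    total = 0
    for subshell in FILLING_ORDER:
        total += CAPACITY[subshell[-1]]
        if subshell.endswith("p"):
            result.append(total)
    return result


if __name__ == "__main__":
    for Z in (19, 23):
        print(Z, format_configuration(predicted_configuration(Z)))
    print(closed_p_subshell_Z())
```

Filling 19 electrons gives 1s2 2s2 2p6 3s2 3p6 4s1, the configuration found for $^{19}$K, and the closed p-subshell atomic numbers come out as 10, 18, 36, 54, and 86. $^{2}$He does not appear in that list because its filled subshell is an s subshell, which is why it needs the separate argument given in the text.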
Another aspect of the particular inertness of the noble gases can be obtained by plotting, for the various elements, the measured values of the magnitude of the total energy of an electron in the highest-energy filled subshell. This is equal to the energy required to remove the electron from the atom, which is the ionization energy of the atom. Figure 9-15 shows such a plot. We see that the ionization energy oscillates about an average value which is essentially independent of Z, in agreement with our conclusion of the previous section that the total energy of electrons in the outer shells is roughly the same throughout the periodic table. The oscillations are quite pronounced, however, and it is apparent that the total energy of an electron in the highest-energy filled subshell of a noble gas is considerably more negative than average. These electrons are very tightly bound, and the atoms are very difficult to ionize.

Figure 9-15 The measured ionization energies of the elements. (Plot of ionization energy, 0 to 25 eV, versus Z from 0 to 100; the labeled maxima are He, Ne, A, Kr, Xe, and Rn, and the labeled minima are Li, Na, K, Rb, and Cs.)

We also see that the ionization energy is particularly small for the elements $^{3}$Li, $^{11}$Na, $^{19}$K, $^{37}$Rb, $^{55}$Cs, and $^{87}$Fr. These are the alkalis. They contain a single weakly bound electron in an s subshell. Alkali elements are very active chemically because it is energetically favorable for them to get rid of the weakly bound electron and revert to the more stable arrangement obtained with completely filled subshells. These elements are said to have one valence electron, and a valence of plus one. At the other extreme are the halogens, $^{9}$F, $^{17}$Cl, $^{35}$Br, $^{53}$I, and $^{85}$At, which have one less electron than is required to fill their p subshell.
These elements have a high electron affinity; i.e., they are very prone to capture an electron. They have a valence of minus one.

In 1962 it was discovered that in special circumstances noble gases could be made to combine with the halogen $^{9}$F to form stable molecules. Before that time it was believed that the noble gases were completely inert. These molecules can be formed only because $^{9}$F has such a high electron affinity that it can remove one of the very tightly bound electrons from the filled outer subshells of the noble gases.

For the first three rows of the periodic table, the properties of the elements, such as valence and ionization energy, change uniformly from the alkali element with which the row begins to the noble gas with which it ends. In the fourth row of the periodic table this situation is no longer always true. The elements $^{21}$Sc through $^{28}$Ni, which are called the first transition group, have quite similar chemical properties and almost the same ionization energies. These elements occur during the filling of the 3d subshell. The radius of this subshell is considerably less than that of the 4s subshell, which is completely filled for all the first transition group except $^{24}$Cr. The filled 4s subshell tends to shield the 3d electrons from external influences, and so the chemical properties of these elements are all quite similar, independent of exactly how many 3d electrons they contain. The point is that the chemical properties of the elements depend on the electrons in the outer subshells of their atoms, since these are the electrons responsible for producing the electric and magnetic fields that interact with electrons in other atoms. The chemical properties of $^{29}$Cu are somewhat different from those of the first transition group because it has only a single 4s electron in the outermost subshell. To a lesser extent this is also true for $^{24}$Cr.
The element $^{30}$Zn consists of a set of completely filled subshells and so is somewhat more inert, as can be seen from its ionization energy. Similar transition groups occur in the filling of the 4d and 5d subshells.

An extreme example of the same situation is found in the rare earths $^{58}$Ce through $^{71}$Lu. These are the elements in which the 4f subshell is filling. This subshell lies deep within the 6s subshell, which is completely filled in all the rare earths. The 4f electrons are so well shielded from the external environment that the chemical properties of these elements are almost identical. The same thing happens in the actinides, $^{90}$Th through $^{103}$Lw. In this group the 5f subshell is filling inside the filled 7s subshell. Some of the most exciting work in contemporary chemistry is the study of the actinides of highest atomic number, which have only recently been discovered.

It is appropriate to close our discussion by emphasizing the importance of the exclusion principle. If it were not obeyed, all the electrons in a multielectron atom would be in the 1s subshell because this is the subshell of lowest energy. If this were the case, all atoms would have spherically symmetrical charge distributions of very small radii that would produce no external electric fields, and furthermore they would also have very high first excited states. Then all atoms would be much like noble gases, and therefore there would be no molecules. In fact, the entire universe would be completely different if electrons did not obey the exclusion principle!

Example 9-7. Make an order of magnitude estimate of the ionization energy of $^{92}$U, if the exclusion principle did not operate so that all of its electrons were in its n = 1 shell. For this purpose assume that the typical electron feels the nuclear charge shielded by the charge of half the other electrons in the shell. Compare the results of the estimate with the actual value of the ionization energy shown in Figure 9-15.

■ An estimate of the total energy of a typical electron can be obtained from the one-electron atom energy formula

$$E = -\frac{\mu Z^2 e^4}{(4\pi\epsilon_0)^2\, 2\hbar^2 n^2} = -\frac{Z^2}{n^2} \times 13.6 \text{ eV}$$

If we set n = 1 and use an effective Z with the value $Z_1 = Z/2 = 92/2 = 46$, the absolute value of the result is the ionization energy. So we obtain

$$|E| = (46)^2 \times 13.6 \text{ eV} \simeq 3 \times 10^4 \text{ eV}$$

From Figure 9-15 we find that the actual ionization energy is

$$|E| \simeq 4 \text{ eV}$$

Without the exclusion principle the ionization energy of $^{92}$U would be something like four orders of magnitude larger than it actually is.

9-8 X-RAY LINE SPECTRA

In an x-ray tube such as the one shown in Figure 2-9, electrons are emitted from a heated cathode, accelerated in a beam to kinetic energies of the order of $10^4$ eV by a voltage applied between the cathode and anode, and then strike the anode. While traveling through the atoms of the anode, a beam electron occasionally passes near an electron in an inner subshell. By means of the Coulomb interaction between the energetic beam electron and the atomic electron, the latter can be given enough energy to remove it from its very negative energy level and eject it from the atom. This leaves the atom in a highly excited state because one of its electrons that had a very negative energy is missing. The atom will eventually return to its ground state by emitting a set of high-energy, and therefore high-frequency, photons which are members of its x-ray line spectrum. (The interaction between a beam electron and an outer subshell atomic electron leading to low-energy excited states, and the production of the optical spectrum, is discussed in the next chapter.)

The total spectrum of x radiation emitted by an x-ray tube consists of the discrete line spectrum, superimposed on a continuum, as is illustrated for a typical case in Figure 9-16. The continuum is due to the bremsstrahlung processes occurring when the beam electrons suffer accelerations in scattering from the nuclei of the atoms in the anode. As we saw in Section 2-6, the shape of the bremsstrahlung continuum depends mainly on the energy of the electron beam. But the shape of the x-ray line spectrum is characteristic of the particular atoms composing the anode.

Figure 9-16 A typical x-ray spectrum. The lines are characteristic of the atoms of the x-ray tube anode (tungsten for the case illustrated). The continuum arises from bremsstrahlung by electrons accelerated in scattering from the nuclei of these atoms. (Horizontal axis: wavelength in Å, from 0 to 1.5.)

X-ray line spectra are of practical interest because they are significant features of x rays, which have so many useful applications in technology and science. These spectra are of theoretical interest because they provide information about the energies of electrons in the inner subshells of atoms. We shall see that this information is in good agreement with the predictions of the Hartree theory.

As an example of the production of an x-ray line spectrum, assume that an electron is initially removed from the 1s subshell of an atom in the anode of the tube. In the first step of the deexcitation process an electron from one of the subshells of less negative energy drops into the hole in the 1s subshell; for instance, a 2p electron could drop into the hole. This would leave a hole in the 2p subshell, but the excitation energy of the atom would be considerably reduced. Energy is conserved by the emission of a photon of energy equal to the decrease in the excitation energy of the atom, that is, the difference between the energies associated with an electron missing from the 1s and 2p subshells. Typically there would be several subsequent steps in the deexcitation process.
For instance, the hole in the 2p subshell could be filled by a 3d electron, leaving a hole in the 3d subshell which is then filled by a 4p electron, etc. The net effect of each step is that a hole jumps to a subshell of less negative energy. When the hole works its way to the subshell of the atom of least negative energy, which is usually the outermost shell, it is filled by the electron initially ejected from the 1s subshell or, more typically, by some other electron in the anode. The atom is then neutral again, and in its ground state.

The energy levels of an atom which are involved in the emission of its x-ray line spectrum are most conveniently represented in terms of an energy-level diagram that is rather different from the standard type with which we have become familiar. Figure 9-17 shows such a diagram for the $^{92}$U atom, including all its x-ray energy levels through n = 4. Because of the wide range of energies involved, it is conventional to use a logarithmic energy scale. Because it simplifies the discussion, it is also conventional to define the total energy of the atom to be zero when the atom is in its ground state. Since the energy scale is logarithmic, the zero energy level representing the ground state cannot be displayed on the diagram, but this does no harm. The most important difference between an x-ray energy-level diagram and a standard energy-level diagram is that the x-ray diagram gives the energy of the atom when one electron of the indicated quantum numbers n, l, j is missing. That is, the diagram describes the energy levels of the hole, with quantum numbers n, l, j, that jumps from one subshell to the next when the atom emits its x-ray line spectrum.
As a hole represents the absence of an electron of negative energy, the energy associated with a hole is positive. So the energies of all the levels of an x-ray diagram are positive.

Figure 9-17 The higher energy x-ray levels for the uranium atom and the transitions between these levels allowed by the selection rules. (Logarithmic energy scale from about $10^2$ to $10^5$ eV, with the K, L, and M series of transitions marked. Levels and their quantum numbers n, l, j: K 1, 0, 1/2; $L_I$ 2, 0, 1/2; $L_{II}$ 2, 1, 1/2; $L_{III}$ 2, 1, 3/2; $M_I$ 3, 0, 1/2; $M_{II}$ 3, 1, 1/2; $M_{III}$ 3, 1, 3/2; $M_{IV}$ 3, 2, 3/2; $M_V$ 3, 2, 5/2; $N_I$ 4, 0, 1/2; $N_{II}$ 4, 1, 1/2; $N_{III}$ 4, 1, 3/2; $N_{IV}$ 4, 2, 3/2; $N_V$ 4, 2, 5/2; $N_{VI}$ 4, 3, 5/2; $N_{VII}$ 4, 3, 7/2.)

The energy levels in Figure 9-17 are also identified by a notation commonly used in discussing x-ray spectra. In this notation the value of the quantum number n is specified by capital letters, according to the scheme shown in Table 9-4. That is, an n = 1 level is called a K level, an n = 2 level is called an L level, etc. Similarly, the n = 1 shell is called the K shell, etc. Roman numeral subscripts are used to label levels of the same n, according to decreasing energy. That is, in order of decreasing energy the three L levels are called $L_I$, $L_{II}$, and $L_{III}$.

If the energy of an atom with an electron of quantum numbers n, l, j is particularly negative, the energy of an atom with a hole of the same quantum numbers is particularly positive, since more energy must be given to the atom to remove the electron. In other words, the lack of a large negative energy is equivalent to the presence of a large positive energy. Keeping this inversion in mind, we see from Figure 9-17, which was obtained from an analysis of the measured x-ray line spectrum of $^{92}$U, that the n, l, j dependences of the x-ray energy levels are as would be expected from the Hartree theory. The energies of these levels increase with decreasing values of n and of l, in agreement with an inversion of the rule describing the theoretical predictions that was stated in the preceding section.
The x-ray energy level for $j = l + 1/2$ has lower energy, and the level for the other possibility, $j = l - 1/2$, has higher energy. This is the expected inversion of the splitting of the energy levels according to j, discussed in connection with one-electron atoms in Section 8-6. In the L shell (n = 2) of $^{92}$U this splitting is more than 2000 eV, and it is larger than the dependence on l. So it is hardly appropriate to call the j dependence of x-ray energy levels "fine-structure splitting." The strong j dependence, which is characteristic of the inner shells of all atoms except those of very low Z, is partly due to the increase in the magnitude of the spin-orbit interaction because of the high value of the term $(1/r)\,dV(r)/dr$ in (8-35). It also involves the other relativistic effects that become very large for the high velocity electrons that populate the inner shells of these atoms.

Table 9-4 The Spectroscopic Notation for n

n                        1   2   3   4   5
Spectroscopic notation   K   L   M   N   O

As we have indicated, it is convenient to think of the production of the x-ray line spectra in terms of the creation of a hole in one of the higher-energy levels of the atom, and the subsequent jumping of the hole through its lower-energy levels. With each jump, an x-ray photon is emitted that carries off the excess energy. The frequency $\nu$ of the photon bears the usual relation to the energy E which it carries, $E = h\nu$. But not all transitions occur. There is the following set of selection rules for the change in quantum numbers of the hole:

$$\Delta l = \pm 1 \qquad (9\text{-}28)$$

$$\Delta j = 0, \pm 1 \qquad (9\text{-}29)$$

These are the same as the selection rules of (8-37) and (8-38), for an electron in a one-electron atom, and they have the same explanation as presented in Section 8-7. The x-ray energy-level diagram for $^{92}$U, of Figure 9-17, shows the transitions that obey these selection rules.
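The selection rules (9-28) and (9-29) can be applied mechanically to the levels of Figure 9-17. The following is a minimal sketch, assuming the labeling shown in that figure (for each n, levels numbered in order of increasing l and then increasing j); the helper names are hypothetical and are not part of the text.

```python
from fractions import Fraction

SHELL_LETTER = {1: "K", 2: "L", 3: "M", 4: "N", 5: "O"}  # Table 9-4
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
HALF = Fraction(1, 2)


def xray_levels(n):
    """Return [(label, l, j)] for shell n, in the order L_I, L_II, L_III, ..."""
    pairs = []
    for l in range(n):
        # For l = 0 only j = 1/2 exists; otherwise j = l - 1/2 and j = l + 1/2.
        j_values = [HALF] if l == 0 else [l - HALF, l + HALF]
        pairs.extend((l, j) for j in j_values)
    if n == 1:
        return [("K", 0, HALF)]
    letter = SHELL_LETTER[n]
    return [(f"{letter}_{ROMAN[i]}", l, j) for i, (l, j) in enumerate(pairs)]


def allowed(level_a, level_b):
    """Selection rules for the hole: delta l = +-1 and delta j = 0 or +-1."""
    _, l_a, j_a = level_a
    _, l_b, j_b = level_b
    return abs(l_a - l_b) == 1 and abs(j_a - j_b) <= 1


def allowed_transitions(n_from, n_to):
    """Pairs of level labels for hole jumps from shell n_from to shell n_to."""
    return [(a[0], b[0])
            for a in xray_levels(n_from)
            for b in xray_levels(n_to)
            if allowed(a, b)]


if __name__ == "__main__":
    print(allowed_transitions(1, 2))  # hole jumps from the K shell to the L shell
    print(allowed_transitions(2, 3))  # hole jumps from the L shell to the M shell
```

For a hole starting in the K shell this gives two allowed jumps to the L shell, to $L_{II}$ and to $L_{III}$; the jump to $L_I$ is excluded because it would have $\Delta l = 0$.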
The totality of x rays which are emitted in such transitions (plus a few which are observed to be emitted very infrequently in violation of the selection rules) constitute the x-ray line spectrum of the atom. All transitions from the K shell produce lines of the so-called K series, with $K_\alpha$ corresponding to a transition to the L shell, $K_\beta$ to the M shell, etc. All transitions from the L shell produce lines of the L series, and so forth.

Example 9-8. Estimate the minimum accelerating voltage required for an x-ray tube with a $^{26}$Fe anode to emit a $K_\alpha$ line of its spectrum. Also estimate the wavelength of a $K_\alpha$ photon.

■ We can use the crude description of the results of the Hartree theory to estimate the excitation energy of a $^{26}$Fe atom with a hole in its K shell. Equation (9-27) tells us that this energy is

$$E_K \simeq +\frac{\mu Z_n^2 e^4}{(4\pi\epsilon_0)^2\, 2\hbar^2 n^2} = +\frac{Z_n^2}{n^2}\, 13.6 \text{ eV} \simeq 13.6\,(Z - 2)^2 \text{ eV} = 13.6 \times (24)^2 \text{ eV} \simeq +7.8 \times 10^3 \text{ eV}$$

where we have set n = 1 and $Z_n = Z_1 = Z - 2$. A beam electron bombarding an atom in the anode must have this much energy to produce the hole. The voltage V required to accelerate the beam electron to this energy is just

$$V \simeq 7.8 \times 10^3 \text{ V}$$

After the atom emits a $K_\alpha$ photon, the hole is in its L shell. Then its energy is

$$E_L \simeq +13.6\, \frac{Z_n^2}{n^2} \text{ eV} \simeq 13.6\, \frac{(26 - 10)^2}{4} \text{ eV} \simeq +8.7 \times 10^2 \text{ eV}$$

where we have set n = 2 and, following the results of Example 9-5, set $Z_n = Z_2 = Z - 10$. The photon carries away energy

$$h\nu = E_K - E_L$$

But since the value of $E_L$ is only about 10% of the value of $E_K$, and since the crude approximation we have used to obtain $E_K$ is generally not accurate to 10%, we might as well take

$$h\nu \simeq E_K$$

The wavelength $\lambda$ of the photon is related to its frequency $\nu$ and its velocity c by the expression

$$\frac{1}{\lambda} = \frac{\nu}{c} = \frac{h\nu}{hc}$$

so

$$\frac{1}{\lambda} \simeq \frac{E_K}{hc} \simeq \frac{\mu e^4}{(4\pi\epsilon_0)^2\, 4\pi\hbar^3 c}\,(Z - 2)^2$$

The term multiplying $(Z - 2)^2$ is Rydberg's constant, $R_M$, defined in (4-22).
Therefore

$$\frac{1}{\lambda} \simeq R_M (Z - 2)^2 \simeq 1.1 \times 10^7 \times (24)^2 \text{ m}^{-1} = 6.3 \times 10^9 \text{ m}^{-1} \qquad (9\text{-}30)$$

and

$$\lambda \simeq 1.6 \times 10^{-10} \text{ m} = 1.6 \text{ Å}$$

This wavelength is about the size of a typical molecule, or the spacing of atoms or molecules in a crystal. Thus the $K_\alpha$ x rays from $^{26}$Fe can be used in diffraction experiments to study the structure of molecules or crystals.

A striking feature of x-ray line spectra is that the frequencies and wavelengths of the lines vary smoothly from element to element. There are none of the abrupt changes from one element to the next which occur in atomic spectra in the optical frequency range. The reason is that the characteristics of x-ray spectra depend on the binding energies of the electrons in the inner shells. With increasing atomic number Z, these binding energies simply increase uniformly, owing to the higher nuclear charge, and they are not affected by the periodic changes in the number of electrons in the outer shells of the atom that affect the optical spectra.

The regularity of x-ray spectra was first observed by Moseley. In 1913 he made a survey of x-ray spectra and obtained data for a number of elements on the wavelengths of the $K_\alpha$ line. (There are really two closely spaced $K_\alpha$ lines, as can be seen from Figure 9-17, but it was difficult for Moseley to resolve this structure.) The measured wavelengths could be fitted within experimental accuracy by the empirical formula

$$\frac{1}{\lambda} = C(Z - a)^2 \qquad (9\text{-}31)$$

where C is a constant with a value approximately equal to the Rydberg constant $R_M$, and a is a constant with a value of about 1 or 2. This formula, and some of the data, are plotted in Figure 9-18.

Moseley interpreted the empirical formula on the basis of the Bohr model, which had been proposed just before he made his measurements. He performed a calculation essentially the same as our calculation in Example 9-8 to obtain (9-30), which agrees well enough with (9-31), but he took the basic energy equation, (9-27), from the Bohr model instead of the Hartree theory.
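The arithmetic of Example 9-8, and the straight-line behavior implied by (9-31), can be checked with a few lines of code. This is only a sketch of the crude estimate based on (9-27): the screening values Z − 2 and Z − 10 are those used in the example, while the numerical values of hc and of the Rydberg constant, and all function names, are supplied here as assumptions.

```python
import math

RYDBERG_ENERGY_EV = 13.6   # magnitude of the hydrogen ground-state energy (from the text)
RYDBERG_PER_M = 1.097e7    # Rydberg constant in 1/m (the text rounds it to 1.1e7)
HC_EV_M = 1.2398e-6        # hc in eV*m (assumed standard value)


def hole_energy_ev(Z, n, screening):
    """Crude estimate in the form of (9-27): 13.6 (Z - screening)^2 / n^2 eV."""
    return RYDBERG_ENERGY_EV * (Z - screening) ** 2 / n ** 2


def inverse_wavelength_per_m(Z, a):
    """1/lambda = R (Z - a)^2, the common form of (9-30) and (9-31)."""
    return RYDBERG_PER_M * (Z - a) ** 2


# Example 9-8 for iron, Z = 26.
E_K = hole_energy_ev(26, n=1, screening=2)    # hole in the K shell
E_L = hole_energy_ev(26, n=2, screening=10)   # hole in the L shell
lam_crude = 1.0 / inverse_wavelength_per_m(26, a=2)   # neglects E_L, as the text does
lam_with_EL = HC_EV_M / (E_K - E_L)                   # keeps E_L in h*nu = E_K - E_L

# Moseley-type plot: sqrt(1/lambda) versus Z should be a straight line,
# so successive values differ by a constant step, sqrt(R).
roots = [math.sqrt(inverse_wavelength_per_m(Z, a=1)) for Z in range(20, 31)]
steps = [hi - lo for lo, hi in zip(roots, roots[1:])]

if __name__ == "__main__":
    print(f"E_K = {E_K:.0f} eV, E_L = {E_L:.0f} eV")
    print(f"lambda (E_L neglected) = {lam_crude:.2e} m")
    print(f"lambda (E_L kept)      = {lam_with_EL:.2e} m")
    print(f"step in sqrt(1/lambda) per unit Z = {steps[0]:.1f}")
```

With these inputs $E_K$ is about $7.8 \times 10^3$ eV, $E_L$ is about $8.7 \times 10^2$ eV, and the wavelength is close to $1.6 \times 10^{-10}$ m; keeping $E_L$ lengthens the estimate to roughly $1.8 \times 10^{-10}$ m, which illustrates the size of the error accepted by setting $h\nu \simeq E_K$.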
Moseley adapted the Bohr energy equation into (9-27) by replacing Z by $Z_n$, as a way of describing the shielding of the nuclear charge by electron charges in a multielectron atom. His arguments concerning shielding were similar to ours of Section 9-6, except that he thought the electrons travel in well-defined Bohr orbits and concluded that $Z_1 \simeq Z - 1$ instead of $Z_1 \simeq Z - 2$.

Figure 9-18 Points representing Moseley's data, and a curve representing his empirical formula. The curve is a straight line since the square root of the reciprocal of the wavelengths of the x-ray lines is plotted versus the atomic number of the atoms producing the lines. (Horizontal axis: Z from 5 to 40.)

Moseley's work, carried out when he was a graduate student, was an important step in the development of quantum physics. His simple and successful application of the Bohr model to x-ray line spectra provided one of its earliest confirmations. By using the empirical formula to determine Z, he established unambiguously the correlation between the nuclear charge of an atom and its ordering in the periodic table of the elements. For instance, he found that the atomic number of $^{27}$Co is one less than that of $^{28}$Ni, even though its atomic weight is greater. He also showed that there were gaps in the periodic table, as it was then known, at Z = 43, 61, 72, and 75. Elements of these atomic numbers have subsequently been discovered. Moseley's contributions were brought to a halt by service in World War I, from which he did not return.

Example 9-9. Measured values of the probability that a $^{82}$Pb atom will absorb by the photoelectric effect an x-ray photon from an incident beam of photons are displayed in Figure 9-19 by plotting the absorption cross section as a function of the energy $h\nu$ of the photon. The prominent discontinuity just below $10^5$ eV is called the K absorption edge.
Show that it occurs at an energy for which the incident photon can just produce a hole in the K shell of $^{82}$Pb. Then explain the origin of the discontinuities a little above $10^4$ eV.

■ According to (9-27), the energy required to produce a hole in the K shell of $^{82}$Pb is approximately

$$E_K \simeq +13.6\, \frac{Z_n^2}{n^2} \text{ eV} \simeq 13.6\,(Z - 2)^2 \text{ eV} = 13.6 \times (80)^2 \text{ eV} = 8.7 \times 10^4 \text{ eV}$$

This agrees within a few percent with the measured energy of the K absorption edge. A photon whose energy is slightly above this edge can be absorbed by the photoelectric effect on any electron of the atom. But a photon of energy slightly below the K absorption edge does not have enough energy to eject a K shell electron, so for it the photoelectric effect cannot occur on a K shell electron. Thus the photoelectric absorption cross section drops abruptly at the K absorption edge.

At energies a little above $10^4$ eV there are three L absorption edges. These occur at the energies required to produce holes in the L shell of the atom. There are three because "fine structure," due to spin-orbit and other relativistic effects, splits the L level into three levels, $L_I$, $L_{II}$, $L_{III}$, as can be seen in Figure 9-17.

Figure 9-19 The probability that a lead atom will absorb an x-ray photon by the photoelectric effect, as a function of the energy of the photon. The probability is expressed in terms of the absorption cross section. (Horizontal axis: photon energy $h\nu$ in eV, logarithmic, from $10^4$ to $10^6$.)

QUESTIONS

1. Why is there difficulty in distinguishing the two electrons in a helium atom from each other, but not the two electrons in separated hydrogen atoms? What about a diatomic hydrogen molecule?
2. Explain, without reference to the time-independent Schroedinger equation, why the product form of the eigenfunction of (9-3) immediately implies that the two particles it describes move independently.
3. Can you write a time-independent Schroedinger equation for two identical particles, without using particle labels?
4. Are particle labels themselves objectionable, in working with quantum mechanical systems containing identical particles? If not, explain precisely what care must be exercised in using them.
5. Since the value of an antisymmetric total eigenfunction changes when its particle labels are exchanged, why can such eigenfunctions be used to give an accurate description of a system of electrons?
6. Does the exchange degeneracy increase the number of degenerate states in an atom containing two electrons? Explain.
7. Do you think the sign of the charge of an elementary particle, like an electron or proton, is a more, or less, fundamental property than the "sign" of its symmetry?
8. Would atoms be affected more by reversing the signs of the charges of all their constituent particles, or by reversing all their symmetries?
9. Exactly what is meant by the statement that the spin variable is not continuous?
10. Would it be possible to measure effects of the exchange force acting between two electrons if there were no Coulomb interaction between them to produce an interaction energy of magnitude dependent on the sign of the exchange force?
11. Why would it be much more difficult to solve the time-independent Schroedinger equation for a system of interacting particles than for a system of independently moving particles?
12. Describe the steps in a cycle of the self-consistent Hartree treatment of a multielectron atom. Why is the estimate of the net potential V(r) obtained at the end of a cycle more accurate than the estimate used at the beginning?
13. Why is the angular dependence of multielectron atom eigenfunctions the same as for one-electron atom eigenfunctions? Why is the radial dependence different, except near the origin where it is the same?
14. Just what is the justification for using one-electron atom equations with an effective Z to discuss multielectron atoms?
15. What are the consequences of the fact that the sizes of all atoms are about the same? What are the reasons for this fact?
16. Devise a purely mechanical system in which a classical particle would exhibit the tendency, illustrated in Figure 9-12, to avoid the point about which it rotates.
17. Explain all aspects of the Z dependence of the subshell energies, plotted in Figure 9-14.
18. Why is it particularly difficult to separate mixtures of the rare earth elements by chemical techniques?
19. How can we be sure that if there were no molecules there would be no life?
20. What property of x rays makes them so useful in seeing otherwise invisible internal structures?
21. Give an example in the classical world where the concept of a hole might be used in a way comparable to the way it is used in discussing x-ray line spectra.
22. What argument might Moseley have used to conclude that the effective Z for the K shell is $Z_1 \simeq Z - 1$? Can Gauss's law of electrostatics be applied to evaluate the shielding produced by electrons moving in Bohr orbits?
23. What features of the periodic table of Figure 9-13 would Mendeleev fail to recognize?
24. Do the properties of the electrons in multielectron atoms provide any explanation of why the element of highest atomic number found in nature is $^{92}$U?
25. In your opinion, what is the most important consequence of the exclusion principle?

PROBLEMS

1. By going through the procedure indicated in the text, develop the time-independent Schroedinger equation for two noninteracting identical particles in a box, (9-1).
2. By applying the technique of separation of variables, show that, for a potential of the additive form of (9-2), there are solutions to the two-particle time-independent Schroedinger equation, (9-1), in the product form of (9-3).
3. Exchange the particle labels in the two probability density functions, obtained from the symmetric and antisymmetric eigenfunctions of (9-8) and (9-9), and show that neither is affected by the exchange.
4. Verify that the expanded form of the three-particle eigenfunction of Example 9-2 is antisymmetric with respect to an exchange of the labels of two particles.
5. Verify that the expanded form of the three-particle eigenfunction of Example 9-2 is identically equal to zero if two particles are in the same space and spin quantum state.
6. Verify that the $1/\sqrt{3!}$ normalization factor quoted in Example 9-2 is correct.
7. Verify that the expanded form of the three-particle eigenfunction of Example 9-3 is symmetric with respect to an exchange of the labels of two particles.
8. An $\alpha$ particle contains two protons and two neutrons. Show that if each of its constituents is antisymmetric then it must be symmetric, as stated in Table 9-1. (Hint: Consider a pair of $\alpha$ particles, and the effect of exchanging the labels of all the constituents in one with those of all the constituents in the other.)
9. Write an expression for the expectation value of the energy associated with the Coulomb interaction between the two electrons of a helium atom in its ground state. Use a space eigenfunction for the system composed of products of one-electron atom eigenfunctions, each of which describes an electron moving independently about the Z = 2 nucleus. Do not bother to evaluate the expectation value integral, but instead comment on its relation to the energy levels shown in Figure 9-7.
10. Prove that any two different nondegenerate bound eigenfunctions $\psi_i(x)$ and $\psi_j(x)$ that are solutions to the time-independent Schroedinger equation for the same potential V(x) obey the orthogonality relation

$$\int \psi_j^*(x)\,\psi_i(x)\,dx = 0 \qquad i \neq j$$

(Hint: (i) Write the equations to which $\psi_i$ and $\psi_j$ are solutions, and then take the complex conjugate of the second one to obtain the equation satisfied by $\psi_j^*$.
(ii) Multiply the equation in $\psi_i$ by $\psi_j^*$, the equation in $\psi_j^*$ by $\psi_i$, and then subtract. (iii) Integrate, using a relation such as $\psi_j^*\,d^2\psi_i/dx^2 - \psi_i\,d^2\psi_j^*/dx^2 = (d/dx)(\psi_j^*\,d\psi_i/dx - \psi_i\,d\psi_j^*/dx)$.) The proof can be extended to include degenerate eigenfunctions, and also unbound eigenfunctions that are properly normalized. Can you see how to do this?
11. (a) By going through the procedure indicated in Section 9-5, develop the time-independent Schroedinger equation for a system of Z electrons of an atom moving independently in a set of identical net potentials V(r). (b) Then separate it into a set of Z identical time-independent Schroedinger equations, one for each electron. (c) Verify that the form of a typical one is as stated in (9-22). (d) Compare this form with the time-independent Schroedinger equation for a one-electron atom, (7-12).
12. (a) Show that there are N! terms in the linear combination for an antisymmetric total eigenfunction describing a system of N independent electrons. (Hint: Consider Example 9-2, and use the mathematical technique of induction.) (b) Evaluate the number of such terms for the case of the argon atom with Z = 18. (Hint: Use a mathematical table to evaluate N!, or use Stirling's formula, found in most mathematical references, to approximate it.) (c) State briefly the connection between the results of (b) and the procedure used by Hartree to treat the argon atom.
13. (a) Use information from Figure 9-11 to make a sketch, on semilog paper, of the net potential V(r) for the argon atom. Be sure to determine several values for $r/a_0$ between 0 and 0.25, as this information will be used in Problem 18. (b) Also show the energy levels $E_1$ and $E_2$, using estimates from Example 9-5, and the energy level $E_3$, using measured data from Figure 9-15.
14. (a) Find the value of $Z_1$ for the helium atom which, when used in the energy equation, (9-27), leads to agreement with the ground state energy shown in Figure 9-6. (b) Compare $Z_1$ with Z. (c) Is $Z_1$ meaningful for an atom with as few electrons as helium? Explain briefly.
15. From Figure 9-6 estimate the average distance between the two electrons in a helium atom (a) in the ground state and (b) in the first excited state. Neglect the exchange energy.
16. (a) Use the $Z_n$ for the argon atom obtained in Example 9-5 in the one-electron atom equation for the radial coordinate expectation value, to estimate the radii of the n = 1, 2, and 3 shells of the atom. (b) Compare the results with Figure 9-10.
17. Develop a mathematical argument for the tendency, illustrated in Figure 9-12, of an atomic electron with angular momentum L to avoid the point about which it rotates. Treat the electron semiclassically by assuming that it moves around an orbit in a fixed plane passing through the nucleus. (a) Show that its total energy can be written
$$E = \frac{p_\parallel^2}{2m} + \left[V(r) + \frac{L^2}{2mr^2}\right] = \frac{p_\parallel^2}{2m} + V'(r)$$
where $p_\parallel$ is its component of linear momentum parallel to its radial coordinate vector of length r. (b) Explain why this indicates that its radial motion is as it would be in a one-dimensional system with potential V'(r). (c) Then show that V'(r) becomes repulsive for small r because of the dominant behavior of the term $L^2/2mr^2$, sometimes called the centrifugal potential.
18. (a) Sketch the potentials V'(r) for the argon atom with l = 0 and l = 1, defined in Problem 17, by adding the corresponding centrifugal potentials to the V(r) obtained in Problem 13. (b) Also sketch the energy level $E_2$. (c) Show the classical limits of motion, within which $E_2 > V'(r)$. (d) Compare these limits with the radial probability densities of Figure 9-10, for n = 2, l = 0, and n = 2, l = 1.
19. Write the configurations for the ground states of $^{28}$Ni, $^{29}$Cu, $^{30}$Zn, $^{31}$Ga.
20. Write the configurations for the ground states of all the lanthanides, making as much use as possible of ditto marks.
21. Recent work in nuclear physics has led to the prediction that nuclei of atomic number Z = 110 might be sufficiently stable to allow some of the element Z = 110 to have survived from the time the elements were created. (a) Predict a likely configuration for this element. (b) Make a prediction of the chemical properties of the element. (c) Where would be a likely place to start searching for traces of it?
22. (a) From information contained in Figures 9-6 and 9-15, determine the energy required to remove the remaining electron from the ground state of a singly ionized helium atom. (b) Compare this energy with the energy predicted by the quantum mechanics of one-electron atoms.
23. (a) Draw a schematic representation of a standard energy-level diagram for the $^{22}$Ti atom, showing the states populated by electrons for a case in which one electron is missing from the K shell. The diagram should be comparable to the one in Figure 9-9 in that it should not attempt to give the energies of the levels to an accurate scale, and no distinction should be made between $L_{I}$, $L_{II}$, and $L_{III}$ levels, etc. (b) Do the same for a case in which one electron is missing from the L shell. (c) Draw a schematic representation of an x-ray energy-level diagram showing the energies of the atom when a hole is in the K or L shells. (d) Compare the utility of the standard and x-ray energy-level diagrams for cases in which a hole is in an inner shell. (e) Also make such a comparison for cases in which a hole is in an outer shell.
24. The wavelengths of the lines of the K series of $^{74}$W are (ignoring fine structure): for $K_\alpha$, $\lambda$ = 0.210 Å; for $K_\beta$, $\lambda$ = 0.184 Å; for $K_\gamma$, $\lambda$ = 0.179 Å. The wavelength corresponding to the K absorption edge is $\lambda$ = 0.178 Å. Use this information to construct an x-ray energy-level diagram for $^{74}$W.
25. (a) Make a rough estimate of the minimum accelerating voltage required for an x-ray tube with a $^{26}$Fe anode to emit an $L_\alpha$ line of its spectrum. (Hint: As in Example 9-5, $Z_2 \simeq Z - 10$.) (b) Also estimate the wavelength of the $L_\alpha$ photon.
26. (a) Use Moseley's data of Figure 9-18 to determine the values of the constants C and a in his empirical formula, (9-31). (b) Compare these values with those of (9-30), which was derived from the results of the Hartree theory.
27. It is suspected that the cobalt is very poorly mixed with the iron in a block of alloy. To see regions of high cobalt concentration, an x-ray is taken of the block. (a) Predict the energies of the K absorption edges of its constituents. (b) Then determine an x-ray photon energy that would give good contrast. That is, determine an energy of the photon for which the probability of absorption by a cobalt atom would be very different from the probability of absorption by an iron atom.
28. The Lyman-alpha lifetime in hydrogen is about $10^{-8}$ sec. From this, find the lifetime for the $K_\alpha$ x-ray transition in lead. (Hint: For the inner electrons in lead the wave functions are hydrogenic with appropriate effective Z; lifetime = 1/R; see (8-43).)

10 MULTIELECTRON ATOMS-OPTICAL EXCITATIONS

10-1 INTRODUCTION
interactions experienced by atomic electrons; production of optical excitations

10-2 ALKALI ATOMS
optically active electron; hydrogen, lithium, and sodium energy levels; Hartree interpretation; fine structure; selection rules

10-3 ATOMS WITH SEVERAL OPTICALLY ACTIVE ELECTRONS
limitations of Hartree approximation; residual Coulomb and spin-orbit interactions; tendency for spins to couple together; tendency for orbital angular momenta to couple together; opposing tendency for each spin to couple to its orbital angular momentum; LS coupling; total spin angular momentum and quantum number s'; total orbital angular momentum and quantum number l'; total angular momentum and quantum number j'; JJ coupling

10-4 LS COUPLING
geometrical representation; quantum number $m_j'$; conditions satisfied by s', l', and j'; energy levels in typical LS coupling configuration; spectroscopic notation; multiplets; Landé interval rule; test for LS coupling; experimental assignment of quantum numbers

10-5 ENERGY LEVELS OF THE CARBON ATOM
treated as example of preceding discussion; hyperfine splitting; exclusion principle in LS coupling; properties of filled subshells; selection rules

10-6 THE ZEEMAN EFFECT
normal and anomalous effects; qualitative discussion; derivation of Landé g factor; selection rules; electron spin resonance; experimental assignment of quantum numbers; Paschen-Back effect; selection rules

10-7 SUMMARY
tabulated properties of interactions experienced by atomic electron in less than half-filled subshells; more than half-filled subshells

QUESTIONS

PROBLEMS

10-1 INTRODUCTION

A description of the behavior of electrons in multielectron atoms involves a succession of increasingly accurate approximations. In the first step only the strongest interactions felt by the atomic electrons are considered.
This is the Hartree approximation, discussed in the preceding chapter, in which each electron is treated as if it were moving independently in a spherically symmetrical net potential that describes the average of its Coulomb interactions with the nucleus and the other electrons. In the next steps the description is made more and more accurate by taking into account successively the weaker interactions which the electrons feel. In a typical multielectron atom these weaker interactions include two that involve departures of the actual Coulomb interactions experienced by an atomic electron from the average described by the net potential. One of these leads to couplings between the orbital angular momenta of the electrons, and the other leads to couplings between the spin angular momenta of the electrons through an interesting effect of the exchange force. A third weaker interaction involves the internal magnetic fields of the atom, and leads to couplings between the spin and orbital angular momenta. A fourth weaker interaction is present if the atom is placed in an external magnetic field, as in the so-called Zeeman effect. In this chapter we discuss qualitatively the steps in this succession of approximations, and we use the discussion to describe the behavior of the atomic electrons. That is, we shall consider the four weaker interactions experienced by these electrons, and we shall see that they provide a very satisfactory explanation of the important properties of the ground states and low-energy excited states of all atoms. An atom is raised from its ground state to one of its low-energy excited states when an electron in one of its outer subshells is given a small amount of energy. As an example, this can happen when an atom collides with another atom in a gas discharge tube. The Coulomb field of the incident atom can act on an electron in an outer subshell of the struck atom and give it a few electron volts of excitation energy. 
In the deexcitation process, the atom that has received energy goes from the state initially excited to its ground state by emitting a set of low-energy photons whose frequencies constitute its optical line spectrum. The initial excitation is therefore called an optical excitation. Note the contrast between an optical excitation, which involves giving a small amount of energy to an electron in an outer subshell, and an x-ray excitation, which involves giving a large amount of energy to an electron in an inner subshell.

The low-energy excited states of atoms that enter into the production of optical line spectra are certainly worth studying. One reason is that a study of these excited states of atoms leads to an extremely complete description of their ground states. Another reason is that the general ideas behind the successive approximation procedure used in the study are similar to those behind the procedures used throughout science and engineering to break down a complicated problem into a sequence of not too complicated steps. The details of the procedure are of particular interest to students who will continue in physics beyond the level of this book because they are closely related to those used in the theory of molecules, nuclei, and elementary particles. (Such students should read Appendix J, which provides a theoretical foundation for the procedure.)

Furthermore, optical line spectra are themselves of great practical interest because they are valuable experimental tools in many fields. Certainly the best example is astronomy. Much of what is known about the stars has come from measurements and analysis of optical line spectra.
The pattern of lines observed in emission spectra is used to identify the composition of stars; the intensity of lines observed in absorption spectra is used to measure the temperatures of stellar surfaces; the Doppler shift of the spectral lines is used to measure the velocities of stars; and the Zeeman effect is used to measure the magnetic fields produced by stars.

10-2 ALKALI ATOMS

Figure 10-1 Some of the energy levels of hydrogen, lithium, and sodium atoms.

We begin our study of the optical excitations of multielectron atoms with the simplest case, alkali atoms. In their ground states, these atoms contain a set of completely filled subshells, the highest energy one being a p subshell, plus a single additional electron in the next s subshell. As discussed in Section 9-7, the energy of the electrons in a filled p subshell is quite a bit more negative than the energy of an electron in the next s subshell. Consequently, the p subshell electrons are not excited in any of the low-energy processes which lead to the production of the optical spectra. In essence, an alkali atom consists of an inert noble gas core plus a single electron moving in an external subshell.

The analysis of the optical line spectrum of an alkali atom in terms of its excited states is fairly simple since the excited states can be described completely by describing the single so-called optically active electron, and the core of filled subshells can be ignored. The total energy of the core does not change, so the total energy of the atom is a constant plus the total energy of the optically active electron. It is convenient in discussing the excited states of an alkali atom to define the zero of total energy in such a way that the total energy of the atom is equal to that of the optically active electron.
Using this definition, we present in Figure 10-1 diagrams showing the energies of the ground state and the first few excited states of the alkali atoms $^{3}$Li and $^{11}$Na, obtained from an analysis of the optical line spectra of these elements, and also the energy levels of $^{1}$H for n = 2, 3, 4, 5, and 6. Each energy level is labeled by the quantum numbers n and l of the optically active electron, i.e., by its configuration. These diagrams do not show fine-structure splittings, which will be discussed shortly.

The Hartree theory works particularly well as a first step in calculating the energy levels of the optically active electron of an alkali element because the net potential V(r), due to the nucleus plus the electrons of the core, actually is spherically symmetrical as assumed in the theory. The energies predicted by the theory are in excellent agreement with those shown in Figure 10-1. Furthermore, the theory makes it easy to understand the structure of these energy-level diagrams and their relation to the diagram for $^{1}$H. The dependence of the energy of the optically active electron on its quantum numbers n and l is just as we have described in the previous chapter. For a given n, the energy is most negative for the smallest value of l because the electron spends more time near the center of the atom, where it feels the full nuclear charge. In the ground state of the $^{3}$Li atom, the optically active electron is in the 2s subshell and its energy is about 2 eV more negative than an n = 2 electron in a $^{1}$H atom. In the first excited state, the optically active electron is in the 2p subshell and its energy is only about 0.2 eV more negative than an n = 2 electron in $^{1}$H. For $^{11}$Na the l dependence makes the 4s level more negative than the 3d level.
However, for the large-radius subshells with large values of n, the l dependence becomes less important, and the energy levels of the optically active electron become very close to the energy levels of an electron in a $^{1}$H atom. The reason is that the shielding of the nuclear charge $+Ze$ by the charge $-(Z-1)e$ of the electrons in the core of the alkali atom becomes practically complete for an electron in a subshell of radius large compared to the radius of the core, so the electron experiences essentially the same Coulomb potential due to a single charge $+e$ as an electron in a $^{1}$H atom.

The lines of the optical spectra emitted by alkali elements show a fine-structure splitting which indicates that all energy levels are double, except those for l = 0. This is due to a spin-orbit interaction acting on the optically active electron, i.e., due to the coupling between the magnetic dipole moment of the electron and the internal magnetic field it feels because it moves through the electric field of the atom. Other relativistic effects, which are just as important as the spin-orbit interaction in the case of a one-electron atom, are generally quite negligible for the optically active electrons in all multielectron atoms. We can see this by using the Bohr model result of (4-17)
$$v = \frac{1}{4\pi\epsilon_0}\,\frac{Ze^2}{n\hbar}$$
to estimate the average velocity $v$ of an optically active electron, providing we replace $Z$ by $Z_n$. As $Z_n/n$ is about equal to one for the optically active electrons of all atoms, the equation shows that the average value of $v/c$ is about equal to its value in the ground state of the $^{1}$H atom; that is, $v/c \simeq 10^{-2}$. The associated relativistic effects for optically active electrons thus are of the same order of magnitude throughout the periodic table. In contrast, we shall see below that the spin-orbit interaction increases in magnitude rapidly in going from $^{1}$H to elements further up the periodic table, and so it dominates the other relativistic effects.
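The order-of-magnitude estimate of $v/c$ from (4-17) can be checked numerically. The following is a minimal sketch, assuming the rounded constants from the book's table of useful constants; the function name `bohr_speed_ratio` and its parameter names are illustrative and not from the text.

```python
# Constants rounded as in the book's table of useful constants (SI units).
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
HBAR = 1.055e-34       # Planck's constant divided by 2*pi, joule-sec
COULOMB_K = 8.988e9    # 1/(4*pi*eps0), nt-m^2/coul^2
C_LIGHT = 2.998e8      # speed of light, m/sec


def bohr_speed_ratio(z_eff: float, n: int) -> float:
    """Return v/c from the Bohr-model result v = Z e^2 / (4 pi eps0 n hbar).

    z_eff plays the role of Z_n, the effective charge seen by the electron.
    """
    v = COULOMB_K * z_eff * E_CHARGE**2 / (n * HBAR)
    return v / C_LIGHT


# For an optically active electron Z_n / n is about one, so use z_eff = n = 1.
ratio = bohr_speed_ratio(1.0, 1)
print(f"v/c = {ratio:.2e}")
```

With $Z_n/n = 1$ the ratio reduces to the fine-structure constant, about 1/137, which is the $10^{-2}$ order of magnitude quoted in the text.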
The splitting of the energy levels of an alkali element due to the spin-orbit interaction acting on the optically active electron can be understood by considering the interaction energy, (8-35)
$$\overline{\Delta E} = \frac{\hbar^2}{4m^2c^2}\,[j(j+1) - l(l+1) - s(s+1)]\;\overline{\frac{1}{r}\frac{dV(r)}{dr}}$$
The arguments leading to this equation apply as well to the optically active electron in an alkali atom as to the electron of a one-electron atom, providing that V(r) is equated to the Hartree net potential and the expectation value of $(1/r)\,dV(r)/dr$ is calculated using the probability density obtained from the Hartree eigenfunctions. As is true for a one-electron atom, when the spin-orbit interaction is included the eigenfunctions describing the optically active electron of an alkali atom are labeled by the quantum numbers $n$, $l$, $j$, $m_j$. These quantum numbers obey the same rules as before. Specifically
$$s = 1/2 \qquad (10\text{-}1)$$
$$j = \begin{cases} l - 1/2,\; l + 1/2 & l \neq 0 \\ 1/2 & l = 0 \end{cases} \qquad (10\text{-}2)$$
$$m_j = -j,\, -j+1,\, \ldots,\, +j-1,\, +j \qquad (10\text{-}3)$$
For l = 0, (8-35) shows that the spin-orbit interaction energy is $\overline{\Delta E} = 0$. For other values of l, it shows that $\overline{\Delta E}$ assumes two different values, one positive and the other negative, according to whether $j = l + 1/2$ or $j = l - 1/2$. Except for l = 0, each energy level is thus split into two components, one of slightly higher energy for the spin and orbital angular momenta "parallel," and one of slightly lower energy for these angular momenta "antiparallel." The energy difference is the work required to turn the electron magnetic dipole moment from one orientation to the other in the internal magnetic field of the atom. The magnitude of the energy splitting is proportional to the expectation value of $(1/r)\,dV(r)/dr$, which determines the strength of the magnetic field. Since both $1/r$ and the derivative of the net potential V(r) become large for small r, the expectation value is dependent primarily on the behavior of V(r) near r = 0.
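The sign pattern of (8-35) for the two allowed values of j can be verified directly. The sketch below evaluates only the bracketed factor $j(j+1) - l(l+1) - s(s+1)$, using exact fractions; the helper names `spin_orbit_factor` and `allowed_j` are illustrative assumptions, and the radial expectation value is not computed.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit_factor(l: int, j: Fraction, s: Fraction = HALF) -> Fraction:
    """Bracketed factor j(j+1) - l(l+1) - s(s+1) appearing in (8-35)."""
    return j * (j + 1) - l * (l + 1) - s * (s + 1)


def allowed_j(l: int) -> list:
    """Values of j permitted by (10-2) for a single electron with s = 1/2."""
    return [HALF] if l == 0 else [l - HALF, l + HALF]


for l in range(3):
    for j in allowed_j(l):
        print(f"l = {l}, j = {j}: factor = {spin_orbit_factor(l, j)}")
```

The factor vanishes for l = 0, equals $l$ for $j = l + 1/2$, and equals $-(l+1)$ for $j = l - 1/2$, which is the one-positive, one-negative pattern described above.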
According to (9-25) for the net potential V(r) of the Hartree theory, the larger the value of Z the more rapidly V(r) becomes negative as r becomes small. Thus the magnitude of $dV(r)/dr$ increases with increasing Z, near r = 0. Consequently $(1/r)\,dV(r)/dr$, and also the spin-orbit splitting, should increase in magnitude with increasing Z. This behavior can be found in the experimental data of Table 10-1, which lists the observed splittings of the energy levels of an electron excited to the first p subshell of various alkali atoms.

The spectral lines of an alkali atom are emitted in transitions between energy levels whose quantum numbers satisfy the selection rules:
$$\Delta l = \pm 1 \qquad (10\text{-}4)$$
$$\Delta j = 0, \pm 1 \qquad (10\text{-}5)$$
These selection rules for the transitions of the single optically active electron of an alkali atom are the same as those for the electron of a one-electron atom, and they have the same explanation. Of course, the frequencies of the spectral lines are the energy differences of the levels involved in the transition, divided by Planck's constant.

If an alkali atom is not placed in an external magnetic field, only one of the weaker interactions, mentioned in Section 10-1, acts on the optically active electron. This is the spin-orbit interaction that arises from the presence of the internal magnetic field of the atom. There are no weaker interactions arising from departures of the actual Coulomb interactions experienced by the optically active electron from the average described by the spherically symmetrical net potential V(r). The reason is that the potential experienced by the optically active electron really is spherically symmetrical since all the other electrons in the alkali atom are in the spherically symmetrical core. We shall soon see that this simplification does not hold for a typical atom.
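The selection rules (10-4) and (10-5), together with the statement that a line frequency is the level energy difference divided by Planck's constant, can be sketched as a short check. The function names below are illustrative assumptions; the value of h in eV-sec is obtained from the book's $\hbar = 0.6582 \times 10^{-15}$ eV-sec, and the level energies of about -3.0 eV and -5.1 eV are rough readings for the 3p and 3s levels of sodium from Figure 10-1.

```python
from fractions import Fraction

H_EV_SEC = 4.136e-15  # Planck's constant in eV-sec (2*pi times 0.6582e-15)


def is_allowed(l_i, j_i, l_f, j_f) -> bool:
    """Return True if a transition satisfies delta l = +-1 and delta j = 0, +-1."""
    delta_l = l_f - l_i
    delta_j = Fraction(j_f) - Fraction(j_i)
    return abs(delta_l) == 1 and abs(delta_j) in (0, 1)


def line_frequency(e_upper_ev: float, e_lower_ev: float) -> float:
    """Frequency in Hz of the photon emitted between two levels given in eV."""
    return (e_upper_ev - e_lower_ev) / H_EV_SEC


# Two fine-structure components of a p -> s transition, and a d -> s transition.
print(is_allowed(1, Fraction(3, 2), 0, Fraction(1, 2)))  # delta l = -1, delta j = -1
print(is_allowed(1, Fraction(1, 2), 0, Fraction(1, 2)))  # delta l = -1, delta j = 0
print(is_allowed(2, Fraction(5, 2), 0, Fraction(1, 2)))  # delta l = -2

nu = line_frequency(-3.0, -5.1)
print(f"frequency = {nu:.2e} Hz")
```

Using exact fractions for j avoids floating-point comparisons when testing whether $\Delta j$ is 0 or $\pm 1$.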
Table 10-1 Spin-Orbit Splittings in a Number of Alkali Atoms

Element                      $^{3}$Li                 $^{11}$Na              $^{19}$K               $^{37}$Rb               $^{55}$Cs
Subshell                     2p                       3p                     4p                     5p                      6p
Spin-orbit splitting (eV)    $0.42 \times 10^{-4}$    $21 \times 10^{-4}$    $72 \times 10^{-4}$    $295 \times 10^{-4}$    $687 \times 10^{-4}$

Example 10-1. The yellow light of sodium vapor lamps frequently employed in highway illumination is a spectral line arising from the 3p to 3s transitions in $^{11}$Na. (a) Evaluate the wavelength of this line by using information contained in Figure 10-1. (b) The line is split by the spin-orbit interaction. Evaluate the separation in wavelength of its two components from information contained in Table 10-1. (c) Also comment on the application of the selection rules to the transitions involved in emission of the two components of the line.

(a) Careful inspection of Figure 10-1 shows that the energy difference between the 3p and 3s levels of $^{11}$Na is
$$E_{3p} - E_{3s} \simeq (-3.0\ \text{eV}) - (-5.1\ \text{eV}) = 2.1\ \text{eV}$$
The photons emitted in transitions between these levels carry away energy $h\nu = E_{3p} - E_{3s}$, and have frequency $\nu$ and wavelength $\lambda$, where
$$\lambda = \frac{c}{\nu} = \frac{hc}{h\nu} = \frac{6.6 \times 10^{-34}\ \text{joule-sec} \times 3.0 \times 10^{8}\ \text{m/sec}}{2.1\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV}} = 5.9 \times 10^{-7}\ \text{m} = 5900\ \text{Å}$$
The value obtained directly from accurate measurements is $\lambda$ = 5893 Å.

(b) According to Table 10-1, the spin-orbit interaction splits the 3p level by an energy $dE = 2.1 \times 10^{-3}$ eV. Since $\lambda = c\nu^{-1}$ it follows that
$$d\lambda = -c\nu^{-2}\,d\nu$$
and that the magnitude of the separation in wavelength of the two components of the spectral line is
$$|d\lambda| = \frac{c}{\nu^2}\,d\nu = \frac{hc\,h\,d\nu}{(h\nu)^2} = \frac{hc\,dE}{(h\nu)^2} = \frac{6.6 \times 10^{-34}\ \text{joule-sec} \times 3 \times 10^{8}\ \text{m/sec} \times 2.1 \times 10^{-3}\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV}}{(2.1\ \text{eV} \times 1.6 \times 10^{-19}\ \text{joule/eV})^2} = 5.9 \times 10^{-10}\ \text{m} = 5.9\ \text{Å}$$

(c) The 3p level of higher energy corresponds to $j = l + 1/2 = 1 + 1/2 = 3/2$, and the 3p level of lower energy corresponds to $j = l - 1/2 = 1 - 1/2 = 1/2$. The 3s level is not split since l = 0, and j = 1/2 only.
For transitions from the higher 3p level to the 3s level, $\Delta l = -1$ and $\Delta j = -1$; for transitions from the lower 3p level to the 3s level, $\Delta l = -1$ and $\Delta j = 0$. So both of these transitions are allowed by the selection rules of (10-4) and (10-5).

10-3 ATOMS WITH SEVERAL OPTICALLY ACTIVE ELECTRONS

We turn now to the more typical case of an atom containing a core of completely filled subshells surrounding the nucleus, plus several electrons in a partially filled outer subshell. Since any of these electrons can participate in the excitations leading to the emission of the optical spectrum of the atom, all the electrons in the partially filled subshell are optically active. The excited states of such an atom are treated by first using the Hartree approximation, which accounts for the stronger interactions felt by its optically active electrons, and by then including the effects of other interactions which are weaker but still important.

It should be emphasized that we shall consider here, and in the remainder of the chapter, only atoms in which the outer subshell is less than half filled. If the subshell is more than half filled, the optical excitations of the atom are discussed in terms of the behavior of holes, not electrons, as in our discussion of x-ray line spectra. Since a hole is the absence of a negative charge, it is equivalent to the presence of a positive charge. Because of this sign reversal, certain effects that we shall deal with have a sign reversal in atoms with outer subshells that are more than half filled.

In the Hartree approximation, the energy of each independently moving optically active electron is determined by its quantum numbers n and l. The dependence of its energy $E_{nl}$ on these two quantum numbers is similar to that of a single optically active electron in an alkali atom with the same core, since its net potential is not very different from the net potential due to the core alone. The total energy of the atom is the constant total energy of the core, plus the sum of the total energies of the optically active electrons. Consequently, the energy of the atom is determined completely in the Hartree approximation by the configuration of the optically active electrons, which specifies the n and l quantum numbers of each of these electrons. Since there are $2l + 1$ possible values of $m_l$ for every l, and since there are also 2 possible values of $m_s$, every configuration has a number of different quantum states of the same energy. Thus, in the Hartree approximation there are a number of degenerate energy levels associated with each configuration. Many of these degeneracies are removed when weaker interactions, ignored in the Hartree approximation, are finally taken into account. This is just what happens when the spin-orbit interaction is applied to alkali atoms, removing some of the degeneracies of their energy levels.

The weaker interactions experienced by optically active electrons must be included in a treatment of the low-energy excited states of typical atoms. They can be thought of as corrections for effects ignored in the Hartree approximation. The two most important corrections are for:
1. The residual Coulomb interaction, an electric interaction that compensates for the fact that the Hartree net potential V(r) acting on each optically active electron describes only the average effect of the Coulomb interactions between that electron and all the other optically active electrons.
2. The spin-orbit interaction, a magnetic interaction that couples the spin angular momentum of each optically active electron with its own orbital angular momentum.

There are also relativistic corrections, corrections for interactions between the spin of one optically active electron and another because of magnetic interactions between the associated magnetic moments, etc.; but these are all very small and can usually be ignored.

We are by now quite familiar with the spin-orbit interaction since it is found in studying the optical excitations of one-electron atoms and alkali atoms. The residual Coulomb interaction is something new (except for our brief discussion of the $^{2}$He atom in Section 9-4) since it is found only in studying the optical excitations of atoms with two or more optically active electrons. In such atoms the Coulomb interactions felt by an optically active electron include those due to the presence of the other optically active electrons in the same subshell. Since the charge distribution of the other optically active electrons is not spherically symmetrical because the subshell is only partly filled, the effect of their Coulomb interactions is not spherically symmetrical. Therefore, the spherically symmetrical net Hartree potential V(r) cannot accurately describe the actual Coulomb interactions felt by an optically active electron, but only the best spherically symmetrical average of these interactions. For accuracy, we must consider the departures from this average of the actual Coulomb interactions. We must also take into account the requirement that an eigenfunction describing accurately the optically active electrons be antisymmetric in an exchange of the labels of any two of them, since this requirement alters their charge distribution.

A quantitative treatment can be given by adding, to the energies obtained from the Hartree theory, the expectation values of the energies of the residual Coulomb and spin-orbit interactions. This is rather like the treatment of the $^{1}$H atom energy levels described in Section 8-6, but in the present case antisymmetric eigenfunctions must be used for the optically active electrons. Since there are, at most, only a few optically active electrons, these antisymmetric eigenfunctions are not too complicated to be handled by a large computer. Of course, we cannot present the quantitative treatment here; we present instead a qualitative discussion of the excited states of typical atoms.

We have laid the groundwork for a qualitative discussion of one aspect of the residual Coulomb interaction in Section 9-4. The student will recall that the requirement that the total eigenfunction describing two electrons be antisymmetric, in an exchange of their labels, introduces a connection between the relative orientation of the spins of the electrons and their relative space coordinates (the exchange force). The average distance between the two electrons is larger in the triplet states where the spins are "parallel" than it is in the singlet state where they are "antiparallel". Consequently, the positive Coulomb repulsion energy acting between the two electrons is smaller in the triplet states, for which the magnitude of the total spin has the constant value of $S' = \sqrt{1(1+1)}\,\hbar$, than it is in the singlet state, for which it has the constant value $S' = 0$. We have seen an example of this in our consideration of the low-energy excited states of the $^{2}$He atom at the end of Section 9-4. In that atom the spin angular momenta of the two optically active electrons couple together so as to yield a total spin angular momentum with either the constant magnitude $S' = \sqrt{1(1+1)}\,\hbar$ or the constant magnitude $S' = 0$, while maintaining constant magnitudes for their individual spin angular momenta. Due to the connection between the spin orientation and space coordinates, and also to what we now call the residual Coulomb interaction, the energy of the atom is lowest for the state in which S' is largest and the electrons are furthest apart. It is found in analyses of the experimentally observed spectra, and it is also found in the quantitative theoretical treatment, that essentially the same effect is important in all atoms with two or more optically active electrons.
That is, for such atoms the residual Coulomb interaction produces a tendency for the spin angular momenta of the optically active electrons to couple in such a way that the magnitude of the total spin angular momentum S' is constant, and the energy is usually lowest for the state in which S' is largest.

It is easy to see that another aspect of the residual Coulomb interaction is to produce a tendency for the orbital angular momenta of the optically active electrons to couple in such a way that the magnitude of the total orbital angular momentum L' is constant. This happens simply because in most quantum states the charge distributions of the electrons are not spherically symmetrical, and so they exert torques on each other. Since the space orientation of the charge distribution of an electron is related to the space orientation of its orbital angular momentum vector, there are torques acting between the angular momentum vectors. The torques do not tend to change the magnitude of the individual orbital angular momentum vectors, but only tend to make them precess about the total orbital angular momentum vector in such a way that its magnitude L' remains constant.

Figure 10-2 Two optically active electrons moving in the same Bohr orbit tend to remain at opposite ends of a diameter so as to minimize their Coulomb repulsion. As a result, their orbital angular momenta tend to couple in such a way as to yield a maximum total orbital angular momentum.

The question then arises: Which of the possible values of L' corresponds to the state of lowest energy? There are opposing tendencies, but the basis of the one which usually dominates can be understood even from classical physics by considering two electrons in a Bohr atom, as illustrated in Figure 10-2. Because of the Coulomb repulsion between the electrons, the most stable arrangement is obtained when the electrons stay at the opposite ends of a diameter.
In this state of lowest energy, the electrons rotate together with individual orbital angular momentum vectors parallel, and therefore with the magnitude $L'$ of the total angular momentum vector a maximum. This conclusion is confirmed by an analysis of the spectra produced by atoms with several optically active electrons. That is, for such atoms the residual Coulomb interaction produces a tendency for the orbital angular momenta of the optically active electrons to couple in such a way that the magnitude of the total orbital angular momentum $L'$ is constant, and the energy is usually lowest for the state in which $L'$ is largest.

In contrast to the tendencies produced by the residual Coulomb interaction, the spin-orbit interaction produces a tendency for the spin angular momentum of each optically active electron to couple with its own orbital angular momentum, in such a way as to leave the magnitudes of these vectors constant, while they precess about their resultant total angular momentum vector that is of constant magnitude $J$. We are familiar with this tendency in one-electron atoms and in alkali atoms. We know that it is due to torques arising from the interaction of the magnetic dipole moment connected with the spin angular momentum and the magnetic field connected with the orbital angular momentum. We also know that the energy is lowest for the state in which $J$ is smallest (for a less than half-filled subshell).

The residual Coulomb and spin-orbit interactions tend to produce effects which are in opposition to each other. But for atoms of small and intermediate $Z$ the effects of the residual Coulomb interaction are much larger than the effects of the spin-orbit interaction. Except for atoms of large $Z$, the residual Coulomb interaction is treated first, since it is the most important, and the spin-orbit interaction is temporarily ignored. Then the individual spin angular momenta $\mathbf{S}_i$ of the optically active electrons are considered to couple to form a total spin angular momentum $\mathbf{S}'$, where

$$\mathbf{S}' = \mathbf{S}_1 + \mathbf{S}_2 + \cdots + \mathbf{S}_i + \cdots \tag{10-6}$$

and where $\mathbf{S}'$ has a constant magnitude satisfying the quantization condition

$$S' = \sqrt{s'(s'+1)}\,\hbar \tag{10-7}$$

Also, the individual orbital angular momenta $\mathbf{L}_i$ of the optically active electrons are considered to couple to form a total orbital angular momentum $\mathbf{L}'$, where

$$\mathbf{L}' = \mathbf{L}_1 + \mathbf{L}_2 + \cdots + \mathbf{L}_i + \cdots \tag{10-8}$$

and where $\mathbf{L}'$ has a constant magnitude satisfying the quantization condition

$$L' = \sqrt{l'(l'+1)}\,\hbar \tag{10-9}$$

These vectors couple in such a way that all their magnitudes $S_i$ and $L_i$ also remain constant. Because of the residual Coulomb interaction, the energy of the atom depends on $S'$ and $L'$, so quantum states of the same configuration, but associated with different values of $S'$ and $L'$, no longer have the same energy. The state with the maximum possible values of $S'$ and $L'$ usually has the minimum energy.

Having taken the dominant residual Coulomb interaction into account, the weaker spin-orbit interaction is then included. This is done by considering a spin-orbit interaction between the angular momentum vectors $\mathbf{S}'$ and $\mathbf{L}'$. The interaction couples these two vectors in such a way that the magnitude $J'$ of the total angular momentum

$$\mathbf{J}' = \mathbf{L}' + \mathbf{S}' \tag{10-10}$$

is constant, and $S'$ and $L'$ remain constant. The magnitude of $\mathbf{J}'$ is also quantized according to the usual condition

$$J' = \sqrt{j'(j'+1)}\,\hbar \tag{10-11}$$

As a result of the spin-orbit interaction, the energy of the atom depends also on $J'$. The state with the minimum possible value of $J'$ has the minimum energy.

The procedure described in the last two paragraphs is commonly named LS coupling. But sometimes it is named Russell-Saunders coupling after the two astronomers who first used it in studying atomic spectra emitted by stars. The procedure is valid except for atoms of large $Z$.
The student should be warned that the common name frequently causes confusion because it seems to imply that the coupling between the L and S vectors is the most important. In fact, just the opposite is true. In LS coupling the coupling of the individual L vectors to form the total L vector, and also the coupling of the individual S vectors to form the total S vector, are the most important because they have the largest effect on the energy. The coupling of the total L vector to the total S vector is less important because it has a smaller effect on the total energy.

If $Z$ is large, the spin-orbit interaction is too strong (see Table 10-1) to justify ignoring it even temporarily. This complicates the situation because both the residual Coulomb and the spin-orbit interactions must then be treated simultaneously. For atoms of the largest $Z$, the spin-orbit interaction begins to dominate the residual Coulomb interaction, and the treatment simplifies because a sequential procedure again becomes possible. This procedure, called JJ coupling, involves first treating the relatively strong coupling of the spin and orbital angular momenta of each optically active electron of the large $Z$ atom, to form its total angular momentum, and then treating the relatively weak coupling of these angular momenta to form the total angular momentum for all the electrons. Since most atoms are either good or fair examples of LS coupling, it is the only procedure we shall consider in this chapter. In Chapter 15, we shall consider JJ coupling in connection with the behavior of protons and neutrons in nuclei, since in all nuclei these particles move under the influence of a very strong spin-orbit interaction.

10-4 LS COUPLING

Figure 10-3 illustrates the way the various angular momentum vectors combine in LS coupling in the state which is normally the one of minimum energy for two optically active electrons with quantum numbers $l_1 = 1$, $s_1 = 1/2$, and $l_2 = 2$, $s_2 = 1/2$.
The spin angular momenta $\mathbf{S}_1$ and $\mathbf{S}_2$ precess about their sum $\mathbf{S}'$, and $\mathbf{S}'$ has its maximum possible magnitude (corresponding to $s' = 1$). The precession is rapid because their coupling is relatively strong. The orbital angular momenta $\mathbf{L}_1$ and $\mathbf{L}_2$ precess rapidly about their sum $\mathbf{L}'$ because their coupling is also relatively strong, and $\mathbf{L}'$ also has its maximum possible magnitude (corresponding to $l' = 3$). In addition, there is a slow precession of $\mathbf{S}'$ and $\mathbf{L}'$ about their sum $\mathbf{J}'$, with $\mathbf{J}'$ having its minimum possible magnitude (corresponding to $j' = 2$). This precession is slow because the coupling between $\mathbf{S}'$ and $\mathbf{L}'$ is relatively weak. Finally, $\mathbf{J}'$ can be found anywhere on a cone symmetrical about the z axis, with its component $J'_z$ along that axis a constant given by the quantization condition

$$J'_z = m'_j \hbar \tag{10-12}$$

where

$$m'_j = -j',\ -j'+1,\ \ldots,\ +j'-1,\ +j' \tag{10-13}$$

Figure 10-3 is drawn for $m'_j = j'$.

Figure 10-3. The coupling of various angular momentum vectors in a typical LS coupling state of minimum energy. Left: The orbital angular momenta $\mathbf{L}_1$ and $\mathbf{L}_2$ of the two electrons precess rapidly about their vector sum $\mathbf{L}'$. Similarly, their spins $\mathbf{S}_1$ and $\mathbf{S}_2$ precess rapidly about their sum $\mathbf{S}'$. Right: The total orbital angular momentum $\mathbf{L}'$ and the total spin angular momentum $\mathbf{S}'$ precess slowly about their sum $\mathbf{J}'$, the total angular momentum. Finally, $\mathbf{J}'$ can be found anywhere on a cone symmetrical about the z axis.

Figure 10-4. Vector addition diagrams for the quantum numbers $l_1 = 1$, $s_1 = 1/2$; $l_2 = 2$, $s_2 = 1/2$.
The quantization of the magnitude of the total angular momentum $J'$, and of its z component $J'_z$, is a necessary requirement of the absence of external torques acting on the atom.

Figure 10-3 shows only one of the quantum states that can be formed in LS coupling by two optically active electrons with quantum numbers $l_1 = 1$, $s_1 = 1/2$, and $l_2 = 2$, $s_2 = 1/2$. In fact, there are twelve different sets of states, with different quantum numbers $s'$, $l'$, $j'$, that can be formed by these two electrons; and each of these twelve sets contains states of $2j' + 1$ different possible values of $m'_j$. The rule specifying the possible values of $m'_j$ is expressed by (10-13). The rules specifying the possible values of $s'$, $l'$, $j'$ are conveniently expressed with reference to vector addition diagrams employing vectors whose lengths are proportional to the quantum numbers, just as we have done in Section 8-5. For the two electrons in question, these diagrams have the form indicated in Figure 10-4. The student may verify that the possible values of $s'$, $l'$, $j'$ shown in the vector diagrams agree with those obtained from the equations

$$\begin{aligned}
s' &= |s_1 - s_2|,\ |s_1 - s_2| + 1,\ \ldots,\ s_1 + s_2 \\
l' &= |l_1 - l_2|,\ |l_1 - l_2| + 1,\ \ldots,\ l_1 + l_2 \\
j' &= |s' - l'|,\ |s' - l'| + 1,\ \ldots,\ s' + l'
\end{aligned} \tag{10-14}$$

Since $s_1 = s_2 = 1/2$, the first equation gives

$$s' = 0, 1$$

This is the same as (9-21). The other two equations can be proved by the same type of vector inequality arguments we used to prove (8-33). Obvious generalizations of the vector diagrams can be used to find the possible quantum numbers for cases with more than two optically active electrons.

Figure 10-5. Vector addition diagrams for the maximum and minimum values of $s'$ and $l'$ in a configuration of three optically active electrons with $l_1 = 1$, $l_2 = 2$, $l_3 = 4$.
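The counting implied by (10-14) can be checked with a short sketch. The function names `couple` and `ls_sets` are illustrative choices, not notation from the text; the script simply enumerates the rules of (10-14) for $l_1 = 1$, $s_1 = 1/2$, $l_2 = 2$, $s_2 = 1/2$ and compares the number of states with the number available to two independent electrons.

```python
from fractions import Fraction


def couple(a, b):
    """Allowed totals |a - b|, |a - b| + 1, ..., a + b, as in (10-14)."""
    a, b = Fraction(a), Fraction(b)
    low, high = abs(a - b), a + b
    return [low + k for k in range(int(high - low) + 1)]


def ls_sets(l1, s1, l2, s2):
    """All (s', l', j') combinations for two optically active electrons."""
    return [(s, l, j)
            for s in couple(s1, s2)
            for l in couple(l1, l2)
            for j in couple(s, l)]


half = Fraction(1, 2)
sets = ls_sets(1, half, 2, half)
for s, l, j in sets:
    print(f"s' = {s}, l' = {l}, j' = {j}")

# Each set holds 2j' + 1 states (one per m'_j). The total should equal the
# number of independent-electron states (2*l1+1)(2*s1+1)(2*l2+1)(2*s2+1).
total_states = sum(2 * j + 1 for _, _, j in sets)
independent_states = (2 * 1 + 1) * 2 * (2 * 2 + 1) * 2
print(len(sets), total_states, independent_states)
```

The agreement of the two state counts is a useful sanity check: coupling the angular momenta relabels the states of the configuration but does not create or destroy any.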
Example 10-2. Find the possible values of $s'$, $l'$, and $j'$ for a configuration with three optically active electrons of quantum numbers $l_1 = 1$, $l_2 = 2$, and $l_3 = 4$.

With the aid of the constructions shown in Figure 10-5, we conclude that the minimum value of $s'$ is 1/2 and that the maximum value of $s'$ is 3/2. Therefore, the possible values are $s' = 1/2, 3/2$. The constructions also show that the minimum value of $l'$ is 1, and that the maximum value of $l'$ is 7. So the possible values are $l' = 1, 2, 3, 4, 5, 6, 7$. The possible values of $j'$ are then $j' = 1/2, 3/2, 5/2, 7/2, 9/2, 11/2, 13/2, 15/2, 17/2$. Not indicated in Figure 10-5, or in Figure 10-4, are the $2j' + 1$ possible values of $m'_j$ for each value of $j'$. In the absence of external fields, the energy of the atom does not depend on $m'_j$.

Figure 10-6 illustrates the splitting of the single degenerate level of a particular configuration of an atom with two optically active electrons, due to the residual Coulomb and spin-orbit interactions. The configuration is $3d^{1}4p^{1}$, or in abbreviated form $3d4p$, which involves the same quantum numbers, $l_1 = 1$, $s_1 = 1/2$; $l_2 = 2$, $s_2 = 1/2$, considered in Figures 10-3 and 10-4. Also illustrated in the figure is the notation used by spectroscopists to label the quantum numbers of the levels. For instance, the lowest energy level is identified by the symbol $3d4p\ {}^{3}F_{2}$. The first part of the symbol gives the configuration. The second part gives the values of $s'$, $l'$, $j'$. The letter specifies the value of $l'$ according to the scheme of Table 9-3 (except that it is conventional to use capitals); that is, F means $l' = 3$. The subscript gives the value of $j'$; that is, $j' = 2$. The superscript is equal to $2s' + 1$ (and, if $s' < l'$, is also equal to the number of components into which the levels are split by the spin-orbit interaction); that is, $2s' + 1 = 3$ so $s' = 1$. The second part of the symbol is read "triplet F 2."
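The "obvious generalization" to more than two electrons, and the spectroscopic labeling, can be sketched as follows. This is an illustrative script, not a procedure from the text: the helper names are hypothetical, the coupling is done one electron at a time, and the letter sequence S, P, D, F, G, H, I, K is the standard spectroscopic convention (which skips J) assumed here for $l' = 0, 1, 2, \ldots$.

```python
from fractions import Fraction

LETTERS = "SPDFGHIK"  # assumed standard letters for l' = 0, 1, 2, ...


def couple(a, b):
    """Allowed totals |a - b|, ..., a + b for two angular momenta."""
    a, b = Fraction(a), Fraction(b)
    low, high = abs(a - b), a + b
    return {low + k for k in range(int(high - low) + 1)}


def couple_many(values):
    """Couple momenta one at a time, collecting every reachable total."""
    totals = {Fraction(values[0])}
    for v in values[1:]:
        totals = {t for partial in totals for t in couple(partial, v)}
    return sorted(totals)


def term_symbol(s, l, j):
    """Spectroscopic symbol written as (2s'+1)(letter)_(j')."""
    return f"{2 * Fraction(s) + 1}{LETTERS[int(l)]}_{Fraction(j)}"


half = Fraction(1, 2)
s_values = couple_many([half, half, half])   # three spins of 1/2
l_values = couple_many([1, 2, 4])            # l1 = 1, l2 = 2, l3 = 4
j_values = sorted({j for s in s_values for l in l_values
                   for j in couple(s, l)})

print("s':", [str(s) for s in s_values])
print("l':", [str(l) for l in l_values])
print("j':", [str(j) for j in j_values])
print(term_symbol(1, 3, 2))   # the s' = 1, l' = 3, j' = 2 level of 3d4p
```

Run as written, the lists reproduce the values quoted in Example 10-2, and the last line prints the label of the lowest level of Figure 10-6 in the flattened form `3F_2`.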
Figure 10-6. The splitting of the energy levels in a typical LS coupling configuration.

We cannot present explicit equations from which the energies of all the levels in Figure 10-6 can be evaluated, but we can write an equation which gives the $j'$ dependence of the spin-orbit interaction energy. This dependence splits the levels for $s' = 1$, and a given $l'$, into triplets of levels. We consider again (8-35) for the spin-orbit interaction energy, writing it as

$$\Delta E = K\left[j'(j'+1) - l'(l'+1) - s'(s'+1)\right] \tag{10-15}$$

This equation predicts the expectation value of the interaction energy of the total spin and orbital angular momentum vectors $\mathbf{S}'$ and $\mathbf{L}'$, providing LS coupling is valid so that these vectors are meaningful. The quantity $K$ is not simply proportional to a term like $(1/r)\,dV(r)/dr$, as might be expected from earlier applications of (8-35), because the potential is more complicated in the present situation. However, $K$ does have the same value for all the energy levels of a so-called multiplet; i.e., for all the energy levels of a configuration with common values of $s'$ and $l'$. Therefore, we can calculate from (10-15) the separation in energy between the adjacent levels of a multiplet. If the quantum number associated with the level of lower energy is $j'$, the quantum number associated with the level of higher energy is $j' + 1$, and the separation $\mathcal{E}$ in the energy of the two levels is

$$\begin{aligned}
\mathcal{E} &= K\left[(j'+1)(j'+2) - l'(l'+1) - s'(s'+1)\right] - K\left[j'(j'+1) - l'(l'+1) - s'(s'+1)\right] \\
&= K\left[(j'+1)(j'+2) - j'(j'+1)\right]
\end{aligned}$$

This yields the simple result

$$\mathcal{E} = 2K(j'+1) \tag{10-16}$$

Thus we see that the separation $\mathcal{E}$ in the energy of adjacent levels of a multiplet is proportional to the total angular momentum quantum number of the level of higher energy. This prediction of (10-16) is called the Landé interval rule.
It is widely used in atomic physics, as we shall see in Examples 10-3 and 10-4. Essentially the same rule is used in molecular and nuclear physics.

Example 10-3. In the $3d3d$ configuration of the $^{20}$Ca atom there is a multiplet (in this case a triplet) of levels: $^{3}P_{0}$, $^{3}P_{1}$, $^{3}P_{2}$. The lowest energy level is observed to be $^{3}P_{0}$, the next is $^{3}P_{1}$, and the highest is $^{3}P_{2}$. The measured separation $\mathcal{E}$ in energy between the $^{3}P_{1}$ and $^{3}P_{0}$ levels is $16.7 \times 10^{-4}$ eV, and $\mathcal{E}$ between the $^{3}P_{2}$ and $^{3}P_{1}$ levels is measured to be $33.3 \times 10^{-4}$ eV. Compare these values of $\mathcal{E}$ with the predictions of the Landé interval rule, (10-16).

The theory does not predict an accurate value for the $K$ in (10-16), but it does predict that $K$ has the same value for all the levels of a multiplet. So we can obtain an accurate prediction for the ratio of the two values of $\mathcal{E}$. For the lowest energy level $j' = 0$; for the next $j' = 1$; and for the highest $j' = 2$. Thus the Landé interval rule predicts

$$\frac{\mathcal{E}(^{3}P_{2},\,^{3}P_{1})}{\mathcal{E}(^{3}P_{1},\,^{3}P_{0})} = \frac{2K(j'+1)\big|_{j'=1}}{2K(j'+1)\big|_{j'=0}} = \frac{2}{1}$$

The ratio of the measured values of $\mathcal{E}$ is

$$\frac{\mathcal{E}(^{3}P_{2},\,^{3}P_{1})}{\mathcal{E}(^{3}P_{1},\,^{3}P_{0})} = \frac{33.3 \times 10^{-4}\ \text{eV}}{16.7 \times 10^{-4}\ \text{eV}} = 1.99$$

This excellent agreement between the experimentally measured and theoretically predicted ratios of $\mathcal{E}$ provides evidence for LS coupling in the $^{20}$Ca atom. In other words, the Landé interval rule can be used as a test for the presence of LS coupling.

Table 10-2. Fine-Structure Splittings in the Calcium Atom

Configuration | Levels | Separation | Levels | Separation | Ratio (Exp.) | Ratio (Theo.)
3d3d | $^{3}P_{1}$, $^{3}P_{0}$ | $16.7 \times 10^{-4}$ eV | $^{3}P_{2}$, $^{3}P_{1}$ | $33.3 \times 10^{-4}$ eV | 1.99 | 2/1
4s4p | $^{3}P_{1}$, $^{3}P_{0}$ | $64.9 \times 10^{-4}$ eV | $^{3}P_{2}$, $^{3}P_{1}$ | $131.2 \times 10^{-4}$ eV | 2.02 | 2/1
4s3d | $^{3}D_{2}$, $^{3}D_{1}$ | $16.9 \times 10^{-4}$ eV | $^{3}D_{3}$, $^{3}D_{2}$ | $26.9 \times 10^{-4}$ eV | 1.59 | 3/2
3d4p | $^{3}D_{2}$, $^{3}D_{1}$ | $33.1 \times 10^{-4}$ eV | $^{3}D_{3}$, $^{3}D_{2}$ | $49.6 \times 10^{-4}$ eV | 1.50 | 3/2
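The comparison made in Example 10-3 can be repeated for every row of Table 10-2 with a few lines of code. This is a sketch for checking the arithmetic only: the separations are copied from the table, and the variable and function names are illustrative.

```python
# (configuration, j' of the top level of the higher pair,
#  lower-pair separation, higher-pair separation); units of 1e-4 eV
TABLE_10_2 = [
    ("3d3d", 2, 16.7, 33.3),
    ("4s4p", 2, 64.9, 131.2),
    ("4s3d", 3, 16.9, 26.9),
    ("3d4p", 3, 33.1, 49.6),
]


def lande_ratio(j_upper):
    """Predicted ratio of adjacent separations from (10-16).

    The higher pair has top level j_upper and the lower pair has top
    level j_upper - 1; each separation is proportional to the j' of its
    top level, so the unknown K cancels in the ratio.
    """
    return j_upper / (j_upper - 1)


for config, j_upper, lower, higher in TABLE_10_2:
    measured = higher / lower
    predicted = lande_ratio(j_upper)
    print(f"{config}: measured {measured:.2f}, predicted {predicted:.2f}")
```

Working with the ratio rather than the separations themselves is the design choice that makes the test possible at all, since the theory gives no accurate value for $K$.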
The first row in Table 10-2 summarizes the successful Landé interval rule test for the presence of LS coupling, carried out in Example 10-3, for a triplet in one of the configurations of the $^{20}$Ca atom. The other rows show the equally successful results of the same test applied to triplets in other configurations of that atom. All together, these tests provide convincing evidence for the presence of LS coupling in the $^{20}$Ca atom. When the same tests are applied to multiplets in various configurations of other atoms with more than one optically active electron, they show that LS coupling is present in all such atoms of small and intermediate $Z$.

Example 10-4. Measurements made on the line spectrum emitted by a certain atom of intermediate $Z$ show that the separations between adjacent energy levels of increasing energy, in a particular multiplet, are approximately in the ratio 3 to 5. Use the Landé interval rule to assign the quantum numbers $s'$, $l'$, $j'$ to these levels. This example gives some insight into the procedure used by the experimental spectroscopist in analyzing his measurements.

The experimental information is indicated in the energy-level diagram of Figure 10-7. If the separation between the lowest energy pair of levels is $\mathcal{E}$, then the separation between the higher energy pair is approximately $(5/3)\mathcal{E}$. Although the values of $j'$ for the levels are not initially known, it is known that the possible values differ by one, and that the lowest energy level is obtained for the lowest $j'$. So if that quantum number has the value $j'$ for the lowest level, it has the values $j' + 1$ and $j' + 2$ for the successively higher levels. Now the Landé interval rule says that the separation between adjacent levels is proportional to the $j'$ value of the upper level.
So the separation between the lower pair of levels should be

$$\mathcal{E} = 2K(j'+1)$$

and the separation between the higher pair of levels should be

$$(5/3)\,\mathcal{E} = 2K(j'+2)$$

Dividing the first equation by the second, to eliminate the unknown $K$, we obtain

$$\frac{3}{5} = \frac{2K(j'+1)}{2K(j'+2)} = \frac{j'+1}{j'+2}$$

which gives

$$5j' + 5 = 3j' + 6$$

or

$$2j' = 1, \qquad j' = 1/2$$

Thus the $j'$ values of the levels are, in order of increasing energy, $j' = 1/2, 3/2, 5/2$.

Figure 10-7. Illustrating the assignment of quantum numbers in a multiplet from the observed level separations. (The three levels are labeled, from lowest to highest, $j'$, $j' + 1$, $j' + 2$.)

To determine the values of $s'$ and $l'$ for the multiplet, we use the third of equations (10-14)

$$j' = |s' - l'|,\ |s' - l'| + 1,\ \ldots,\ s' + l'$$

Since the minimum value of $j'$ is 1/2 and the maximum is 5/2, we have

$$|s' - l'| = 1/2 \quad \text{and} \quad s' + l' = 5/2$$

To handle the absolute value, we consider two cases. In the first case $s' > l'$, and these two equations are

$$s' - l' = 1/2 \quad \text{and} \quad s' + l' = 5/2$$

Adding gives $2s' = 6/2$, or $s' = 3/2$. Subtracting gives $2l' = 4/2$, or $l' = 1$. In the second case $s' < l'$, and the equations we must solve are

$$-(s' - l') = 1/2 \quad \text{and} \quad s' + l' = 5/2$$

Adding gives $2l' = 6/2$, or $l' = 3/2$. But this is not possible, as the total orbital angular momentum quantum number $l'$ cannot have a half-integral value. Therefore, the first case, $s' > l'$, is the correct one, and we conclude that $s' = 3/2$ and $l' = 1$.

The spectroscopist carries out this procedure on all the multiplets of a particular configuration, the levels being grouped into configurations by the similarity of their energies. Having thereby obtained the $l'$ values for the multiplets of the configuration, the $l$ quantum numbers of the configuration are identified by using the second of (10-14) (or by using an obvious extension of the equation if he knows that there are more than two optically active electrons because some of the $s'$ values are larger than 1).
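The assignment carried out in Example 10-4 can be written as a small routine. This is a sketch under stated assumptions, not the spectroscopist's actual workflow: it assumes a three-level multiplet, exact rational arithmetic for the interval ratio, and a hypothetical function name `assign_multiplet`.

```python
from fractions import Fraction


def assign_multiplet(ratio_lower, ratio_higher):
    """Assign j' values and (s', l') from the ratio of two adjacent intervals.

    ratio_lower : ratio_higher is the ratio of the lower-pair separation
    to the higher-pair separation, e.g. 3 : 5. Three levels are assumed.
    """
    r = Fraction(ratio_lower, ratio_higher)
    # From (10-16): r = (j' + 1) / (j' + 2), so j' = (2r - 1) / (1 - r).
    j_min = (2 * r - 1) / (1 - r)
    j_max = j_min + 2
    # From (10-14): j_min = |s' - l'| and j_max = s' + l'.
    big = (j_max + j_min) / 2      # the larger of s' and l'
    small = (j_max - j_min) / 2    # the smaller of s' and l'
    # l' must be an integer, so any half-integral value has to be s'.
    candidates = [(s, l) for s, l in ((big, small), (small, big))
                  if l.denominator == 1]
    return [j_min, j_min + 1, j_max], candidates


levels, candidates = assign_multiplet(3, 5)
print("j' values:", [str(j) for j in levels])
print("(s', l') candidates:", [(str(s), str(l)) for s, l in candidates])
```

When both $s'$ and $l'$ come out integral, the routine returns two candidates, and additional information (for example the other multiplets of the configuration) would be needed to choose between them.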
Identification of the $n$ quantum numbers associated with the various $l$ quantum numbers is not difficult, if the $n$ quantum numbers of the ground state configuration are known, by making use of the fact that the energy of the subshells with common values of $l$ increases monotonically with increasing $n$. The identification of the $n$ quantum numbers of the ground state configuration of the atoms is based on the same fact.

10-5 ENERGY LEVELS OF THE CARBON ATOM

As yet another example of LS coupling, we consider in this section the energy-level diagram of the $^{6}$C atom, shown in Figure 10-8. The ground state of this atom has the configuration $1s^{2}2s^{2}2p^{2}$, so that there are two $p$ electrons which are optically active. The zero of the energy scale in the diagram is defined such that the magnitude of the total energy of the atom in its ground state is equal to the energy required to singly ionize the atom. Consequently, the diagram is directly comparable with energy-level diagrams for alkali atoms and $^{1}$H, in which the zero of energy is defined in the same way. The energy levels are labeled by the configuration of the two optically active electrons, and by the spectroscopic symbol specifying $s'$, $l'$, $j'$.

Figure 10-8. Some energy levels of the carbon atom.

Consider first the average energy of the levels of the various configurations. In the configuration of lowest energy, $2p^{2}$, both electrons remain in the same subshell that they occupy in the ground state of the atom. In other configurations, one electron remains in that subshell and one is in a subshell of higher energy.
Note that the average energies of the configurations depend on the $n$ and $l$ quantum numbers of the electron in the higher energy subshell in essentially the same way as if this electron were the single optically active electron in an alkali atom.

In the $2p^{2}$ configuration, the one of lowest average energy, the $^{3}P_{0,1,2}$ states are of lower energy than the $^{1}S_{0}$ and $^{1}D_{2}$ states because they correspond to a larger value of $s'$, and the $^{1}D_{2}$ states are of lower energy than the $^{1}S_{0}$ state because they correspond to a larger value of $l'$. Note that the $s'$ dependence is stronger than the $l'$ dependence. It is almost always found that the energy associated with the residual Coulomb interaction coupling of the spin angular momenta is somewhat larger than the energy associated with the residual Coulomb interaction coupling of the orbital angular momenta. Of the three closely spaced energy levels for the $^{3}P_{0,1,2}$ states that would be resolved on a larger diagram, the one for the $^{3}P_{0}$ state is of lowest energy because it corresponds to the smallest value of $j'$. Thus the ground state of the atom is the state $2p^{2}\ {}^{3}P_{0}$. That is, in the ground state of carbon there are two electrons in the partially filled third subshell (the $2p$ subshell), which are coupled so that they have one unit of total spin angular momentum, one unit of total orbital angular momentum, and zero total angular momentum. The study of the low-energy excited states of atoms leads to an extremely complete description of their ground states!

In the $2p3s$ configuration of $^{6}$C the level corresponding to maximum $s'$ is lowest in energy, just as in the $2p^{2}$ configuration. Deviations from this rule, and from the rule that the maximum $l'$ gives the lowest energy, are seen in the configurations of higher average energy, but in $^{6}$C there are no deviations from the rule that the minimum $j'$ gives the minimum energy.

Not shown in Figure 10-8 are a few energy levels of the configuration $2s2p^{3}$, which are not usually excited. Also not shown is the spin-orbit splitting of the energy levels, since it is much too small to be seen on the scale of the diagram.

Although not present in $^{6}$C, in many atoms there is a hyperfine splitting of the energy levels. It is smaller than the spin-orbit splitting by about three orders of magnitude. Hyperfine splitting is due to either or both of the following: (1) the interaction between an intrinsic magnetic dipole moment of the nucleus and a magnetic field produced by the atomic electrons, and/or (2) the interaction between a nonspherically symmetrical nuclear charge distribution and a nonspherically symmetrical electric field produced by the atomic electrons. These effects are of interest principally because they can provide very useful information about the nucleus, and they will be discussed in Chapter 15.

Note the absence, in the $^{6}$C energy-level diagram of Figure 10-8, of levels for the $^{1}P_{1}$ and $^{3}S_{1}$ states in the $2p^{2}$ configuration. This is an effect of the exclusion principle. In all other configurations of the diagram the exclusion principle is automatically satisfied by the fact that the $n$ quantum numbers of the optically active electrons differ. But in the $2p^{2}$ configuration both the $n$ and $l$ quantum numbers are the same, so the exclusion principle puts restrictions on the possible values of the remaining quantum numbers. In the Hartree approximation these are sets of the quantum numbers $m_l$, $m_s$, one set for each of the independent optically active electrons having common values of the quantum numbers $n$ and $l$. In this approximation the restrictions of the exclusion principle are simply that no two electrons can have the same set of all four quantum numbers. In LS coupling, where the $m_l$ and $m_s$ are not useful and the quantum numbers $l'$, $s'$, $j'$, $m'_j$ are used instead to specify the way the optically active electrons are interacting, the restrictions of the exclusion principle are more complicated. For the general situation the arguments used to work out the LS coupling exclusion principle restrictions are very involved, and even in simpler special situations they are somewhat involved. (Interested students will find a sample of these arguments, and a complete statement of their conclusions, in Appendix P.) Here we shall only mention two of the conclusions obtained from the arguments. One is that the absence of the $^{1}P_{1}$ and $^{3}S_{1}$ states in a $2p^{2}$ configuration, and of other states in other configurations in which the electrons have the same $n$ and $l$ quantum numbers, can be understood on the basis of the exclusion principle. Another conclusion is that when there are as many electrons having the same $n$ and $l$ quantum numbers as is allowed by the exclusion principle, then the only state that occurs is $^{1}S_{0}$. This restriction can be expressed by saying that when a subshell is completely filled, the only allowed state is one in which the total spin angular momentum, total orbital angular momentum, and total angular momentum are all zero. A consequence of the fact that there are no total angular momenta in a completely filled subshell is that it has no net magnetic dipole moment. Therefore, only the few electrons in an atom that are not in filled subshells are involved in its interaction with external magnetic fields, an important simplification.

This particular restriction of the exclusion principle applied to LS coupling is exactly what would be expected from the exclusion principle applied to the Hartree approximation. To see that this is so, assume that the electrons in a completely filled subshell are not interacting at all with each other. Then the behavior of each can be described by values of the quantum numbers $m_l$ and $m_s$. Since the subshell is filled, electrons would be found with all possible combinations of $m_l$ and $m_s$, but since all the electrons have the same $n$ and $l$, each combination of $m_l$ and $m_s$ would occur only once. The result is that for each electron having a certain positive z component of orbital angular momentum (because it has a certain positive $m_l$), there would be an electron having the corresponding negative z component (because it has the corresponding negative $m_l$). Thus the total orbital angular momentum of the electrons in the filled subshell would sum up to zero. The same would be true for their total spin angular momentum. Therefore, their total angular momentum would also have to be zero.

The optical line spectrum of the $^{6}$C atom, or of any other LS coupling atom, can be constructed from its energy-level diagram by evaluating the energy and frequency of photons emitted in all possible transitions that do not violate the following LS coupling selection rules:

1. Transitions can occur only between configurations which differ in the $n$ and $l$ quantum numbers of a single electron. This means that two or more electrons cannot simultaneously make transitions between subshells.

2. Transitions can occur only between configurations in which the change in the $l$ quantum number of that electron satisfies the same restriction that applies to one-electron atoms, (8-37)

$$\Delta l = \pm 1$$

3. Transitions can occur only between states in these configurations for which the changes in the $s'$, $l'$, $j'$ quantum numbers satisfy the restrictions

$$\begin{aligned}
\Delta s' &= 0 \\
\Delta l' &= 0, \pm 1 \\
\Delta j' &= 0, \pm 1 \quad (\text{but not } j' = 0 \text{ to } j' = 0)
\end{aligned} \tag{10-17}$$

The first of (10-17) prohibits transitions between singlet ($s' = 0$) and triplet ($s' = 1$) states, and vice versa. Nevertheless, transitions are observed between the $2p^{2}\ {}^{1}D_{2}$ and the $2p^{2}\ {}^{3}P_{0,1,2}$ states of $^{6}$C. The reason is that all excitations of that atom to singlet states eventually lead to the population of its $2p^{2}\ {}^{1}D_{2}$ states, since Figure 10-8 shows them to be the lowest energy singlet states.
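The third selection rule, (10-17), lends itself to a direct check in code. The sketch below is illustrative only: the function name is hypothetical, states are passed as plain $(s', l', j')$ tuples, and rules 1 and 2 (which concern the configurations rather than the LS quantum numbers) are deliberately not tested.

```python
from fractions import Fraction


def allowed_by_ls_rules(initial, final):
    """Check (10-17) for two states given as (s', l', j') tuples."""
    s_i, l_i, j_i = (Fraction(x) for x in initial)
    s_f, l_f, j_f = (Fraction(x) for x in final)
    if s_f != s_i:                 # requires delta s' = 0
        return False
    if abs(l_f - l_i) > 1:         # requires delta l' = 0, +1 or -1
        return False
    if abs(j_f - j_i) > 1:         # requires delta j' = 0, +1 or -1
        return False
    if j_i == 0 and j_f == 0:      # j' = 0 to j' = 0 is excluded
        return False
    return True


# 1D2 -> 3P1 changes s' from 0 to 1, so it violates the first of (10-17).
print(allowed_by_ls_rules((0, 2, 2), (1, 1, 1)))
# 3D1 -> 3P0 satisfies all three restrictions.
print(allowed_by_ls_rules((1, 2, 1), (1, 1, 0)))
# A j' = 0 to j' = 0 transition is excluded even though delta j' = 0.
print(allowed_by_ls_rules((1, 1, 0), (1, 1, 0)))
```

A transition flagged as forbidden here may still be observed at a very low rate, as the discussion of the $2p^{2}\ {}^{1}D_{2}$ states in the text makes clear; the function encodes the electric dipole rules only.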
When the $2p^{2}\ {}^{1}D_{2}$ states are highly populated, the total number of transitions per second to the $2p^{2}\ {}^{3}P_{0,1,2}$ states becomes appreciable, even though the probability is very small that any single atom will make this transition since it violates the $\Delta s' = 0$ selection rule. Physically, this rule says that if the coupling of the electron spins changes in an atomic transition, the atom cannot emit radiation of the type produced by oscillating electric dipole moments. If the spin coupling does change, radiation is emitted, but at a very low rate. The radiation is produced inefficiently by oscillating spin magnetic dipole moments, associated with the change in the spin coupling. The last two selection rules of (10-17) are similar to those of (8-37) and (8-38).

10-6 THE ZEEMAN EFFECT

In 1896 it was observed by Zeeman that, when an atom is placed in an external magnetic field, and then excited, the spectral lines it emits in the deexcitation process are split into several components. Examples of the Zeeman effect are illustrated in Figure 10-9. For fields less than several tenths of 1 tesla, the splitting is proportional to the strength of the field. The Zeeman splitting in such fields is smaller than the fine-structure splitting, which is proportional to the strength of the more intense internal fields of the atom. Clearly, the Zeeman effect indicates that the energy levels of the atom are split into several components in the presence of an external magnetic field. In certain special cases, which were called "normal," these energy-level splittings could be understood in terms of a classical theory developed by Lorentz. But in general cases, which were called "anomalous," even a qualitative explanation of the observed splittings could not be given until the development of quantum mechanics and the introduction of electron spin.

Figure 10-9. Representations of photographic plates showing the splitting of several spectral lines in the normal and anomalous Zeeman effect. The arrows show the splittings predicted by a classical theory of Lorentz. Normal: transitions between any singlet states in an atom with an even number of optically active electrons. Anomalous: transitions between the doublet first excited state and the doublet ground state in the sodium atom, $^{2}P_{1/2}$ to $^{2}S_{1/2}$ and $^{2}P_{3/2}$ to $^{2}S_{1/2}$. Rows: no field; weak field.

In terms of the modern theory, both the normal and the anomalous Zeeman splittings are easy to understand. Except when it is in a $^{1}S_{0}$ state, an atom will have a total magnetic dipole moment, $\boldsymbol{\mu}$, due to the orbital and spin magnetic dipole moments, $\boldsymbol{\mu}_l$ and $\boldsymbol{\mu}_s$, of its optically active electrons. (The other electrons are in completely filled subshells which have no net magnetic dipole moments.) When this magnetic dipole moment of the atom is in the external magnetic field $\mathbf{B}$ it will have the usual potential energy of orientation

$$\Delta E = -\boldsymbol{\mu} \cdot \mathbf{B} \tag{10-18}$$

Each of the atom's energy levels will be split into several discrete components corresponding to the various values of $\Delta E$ associated with the different quantized orientations of $\boldsymbol{\mu}$ relative to the direction of $\mathbf{B}$. In other words, because it has a magnetic dipole moment the energy of the atom depends upon which of the possible orientations it assumes in the external magnetic field.

To see qualitatively what is behind the distinction between normal and anomalous splittings, we evaluate $\boldsymbol{\mu}$ by using (8-9) and (8-19) to obtain $\boldsymbol{\mu}_l$ and $\boldsymbol{\mu}_s$ for each optically active electron in terms of its orbital and spin angular momenta, and then summing over all these electrons. That is, we take

$$\begin{aligned}
\boldsymbol{\mu} &= -\frac{g_l \mu_b}{\hbar}\mathbf{L}_1 - \frac{g_l \mu_b}{\hbar}\mathbf{L}_2 - \cdots - \frac{g_s \mu_b}{\hbar}\mathbf{S}_1 - \frac{g_s \mu_b}{\hbar}\mathbf{S}_2 - \cdots \\
&= -\frac{\mu_b}{\hbar}\left[(\mathbf{L}_1 + \mathbf{L}_2 + \cdots) + 2(\mathbf{S}_1 + \mathbf{S}_2 + \cdots)\right]
\end{aligned}$$

We have inserted the values $g_l = 1$ and $g_s = 2$ for the orbital and spin $g$ factors that determine the ratios of the magnetic dipole moments to the angular momenta.
Now, if the atom obeys LS coupling, the individual orbital angular momenta couple to give the total orbital angular momentum $\mathbf{L}'$, and the individual spin angular momenta couple to give the total spin angular momentum $\mathbf{S}'$. Then the expression for the total magnetic dipole moment of the atom immediately simplifies to

$$\boldsymbol{\mu} = -\frac{\mu_b}{\hbar}\left[\mathbf{L}' + 2\mathbf{S}'\right] \tag{10-19}$$

We see that the total magnetic dipole moment of the atom is not, in general, antiparallel to its total angular momentum

$$\mathbf{J}' = \mathbf{L}' + \mathbf{S}' \tag{10-20}$$

The basic reason is that the orbital and spin g factors have different values. The result is that the behavior of $\boldsymbol{\mu}$ is quite complicated because its orientation is not simply related to the orientation of $\mathbf{J}'$. But if $S' = 0$, i.e., if the spin angular momenta of the optically active electrons couple to zero, then $\boldsymbol{\mu}$ is antiparallel to $\mathbf{J}'$, and the behavior of $\boldsymbol{\mu}$, and thus of the term $\boldsymbol{\mu}\cdot\mathbf{B}$ that produces the energy-level splittings, is simpler. In fact, in this case, where the nonclassical phenomenon of spin is effectively not involved, the behavior of $\boldsymbol{\mu}\cdot\mathbf{B}$ can be explained satisfactorily by the old theory of Lorentz. This is the case of normal Zeeman splitting. In the general case, $S' \neq 0$ and the theory of Lorentz fails. This is the case of anomalous Zeeman splitting. The terminology was introduced long before quantum theory provided a complete understanding of all aspects of the Zeeman splittings and, from the modern point of view, it is not very appropriate because there is really nothing anomalous about any of the splittings. It is interesting to note that the anomalous splittings could have been used at a very early date to show that spin exists and to show that the spin g factor differs from the orbital g factor.

Now we shall evaluate quantitatively the Zeeman splittings for typical energy levels of LS coupling atoms by applying what we have learned about the behavior of the various angular momentum vectors in such atoms.
From (10-20) we see that $\mathbf{L}'$, $\mathbf{S}'$, and $\mathbf{J}'$ always lie in a common plane. But that plane precesses about $\mathbf{J}'$ because of the Larmor precession of $\mathbf{S}'$ in the internal atomic magnetic field associated with $\mathbf{L}'$ (i.e., because of the spin-orbit interaction). Equation (8-14) shows that this precessional frequency is proportional to the strength of the internal magnetic field of the atom. From (10-19) we see that $\boldsymbol{\mu}$ also lies in the precessing plane, and is typically not antiparallel to $\mathbf{J}'$. So $\boldsymbol{\mu}$ must also precess about $\mathbf{J}'$ with a precessional frequency proportional to the internal magnetic field of the atom. If an external magnetic field $\mathbf{B}$ is applied to the atom, there will in addition be a tendency for $\boldsymbol{\mu}$ to precess about the direction of this field, with a precessional frequency proportional to its strength. If the external field is weak compared to the atomic field, the precession of $\boldsymbol{\mu}$ about $\mathbf{B}$ will be slow compared to its precession about $\mathbf{J}'$. Then the motion of $\boldsymbol{\mu}$ is something like that illustrated in Figure 10-10.

Even in the case of a relatively weak external field the motion of $\boldsymbol{\mu}$ is complicated, but not so complicated as to prevent the evaluation of the orientational potential energy $\Delta E$. In Example 8-4 we saw that the strength of an internal magnetic field acting on an optically active electron is typically of the order of 1 tesla. So we assume that the external magnetic field $\mathbf{B}$ is weak compared to 1 tesla. To evaluate the potential energy $\Delta E$ of the orientation of $\boldsymbol{\mu}$ in the field $\mathbf{B}$, we must evaluate $-\boldsymbol{\mu}\cdot\mathbf{B} = -\mu_B B$, where $\mu_B$ is the component of $\boldsymbol{\mu}$ along the direction of $\mathbf{B}$. Since $\boldsymbol{\mu}$ precesses much more rapidly about $\mathbf{J}'$ than about $\mathbf{B}$, we may evaluate $\mu_B$ by first finding $\mu_J$, which is the average component of $\boldsymbol{\mu}$ in the direction of $\mathbf{J}'$. We do this by multiplying $\mu$ by the cosine of the angle between $\boldsymbol{\mu}$ and $\mathbf{J}'$. Then we find $\mu_B$ by multiplying $\mu_J$ by the cosine of the angle between $\mathbf{J}'$ and $\mathbf{B}$.
That is

$$\mu_J = \mu\,\frac{\boldsymbol{\mu}\cdot\mathbf{J}'}{\mu J'} = -\frac{\mu_b}{\hbar}\,\frac{(\mathbf{L}' + 2\mathbf{S}')\cdot(\mathbf{L}' + \mathbf{S}')}{J'}$$

and

$$\mu_B = \mu_J\,\frac{\mathbf{J}'\cdot\mathbf{B}}{J'B} = \mu_J\,\frac{J_z'}{J'} = -\frac{\mu_b}{\hbar}\,\frac{(\mathbf{L}' + 2\mathbf{S}')\cdot(\mathbf{L}' + \mathbf{S}')}{J'^2}\,J_z'$$

where we have chosen the z axis to be in the direction of $\mathbf{B}$. Evaluating the dot product gives

$$\mu_B = -\frac{\mu_b}{\hbar}\,\frac{L'^2 + 2S'^2 + 3\,\mathbf{L}'\cdot\mathbf{S}'}{J'^2}\,J_z'$$

Figure 10-10 Left: The total orbital angular momentum $\mathbf{L}'$ and total spin $\mathbf{S}'$ couple together to form the total angular momentum $\mathbf{J}'$ of a typical atom. The total orbital magnetic dipole moment $\boldsymbol{\mu}_{l'}$ and total spin magnetic dipole moment $\boldsymbol{\mu}_{s'}$ similarly couple together to form the total magnetic dipole moment $\boldsymbol{\mu}$. Since the proportionality constant connecting $\mathbf{L}'$ and $\boldsymbol{\mu}_{l'}$ is only half the magnitude of the constant connecting $\mathbf{S}'$ and $\boldsymbol{\mu}_{s'}$, the total dipole moment $\boldsymbol{\mu}$ will not be exactly antiparallel to $\mathbf{J}'$. And since $\mathbf{L}'$ and $\mathbf{S}'$ precess rapidly about $\mathbf{J}'$, $\boldsymbol{\mu}_{l'}$ and $\boldsymbol{\mu}_{s'}$ precess rapidly as well, causing $\boldsymbol{\mu}$ to precess about $\mathbf{J}'$ at the same rate. Thus the component of $\boldsymbol{\mu}$ perpendicular to $\mathbf{J}'$ averages to zero, and the component parallel to $\mathbf{J}'$ remains a constant of magnitude $\mu_J$. Right: In a weak applied magnetic field $\mathbf{B}$, a torque is exerted which causes the direction of $\mathbf{J}'$, on which $\boldsymbol{\mu}$ has the constant average component $\mu_J$, to precess about the direction of $-\mathbf{B}$. So the average component of $\boldsymbol{\mu}$ along the direction of the field has the magnitude $\mu_B$ indicated in the figure.

Writing (8-34) with primes, we have

$$3\,\mathbf{L}'\cdot\mathbf{S}' = 3\,(J'^2 - L'^2 - S'^2)/2$$

So

$$\mu_B = -\frac{\mu_b}{\hbar}\,\frac{L'^2 + 2S'^2 + 3\,(J'^2 - L'^2 - S'^2)/2}{J'^2}\,J_z' = -\frac{\mu_b}{\hbar}\,\frac{3J'^2 + S'^2 - L'^2}{2J'^2}\,J_z'$$

Then, according to (10-18), $\Delta E = -\boldsymbol{\mu}\cdot\mathbf{B} = -\mu_B B$, and the orientational potential energy is

$$\Delta E = \frac{\mu_b B}{\hbar}\,\frac{3J'^2 + S'^2 - L'^2}{2J'^2}\,J_z' \tag{10-21}$$

In the state specified by the quantum numbers $s'$, $l'$, $j'$, $m_j'$ the dynamical quantities $S'^2$, $L'^2$, $J'^2$, $J_z'$ have the precise values $s'(s'+1)\hbar^2$, $l'(l'+1)\hbar^2$, $j'(j'+1)\hbar^2$, $m_j'\hbar$, respectively.
Using these values in (10-21) we obtain an expression for the Zeeman effect energy splitting that is most conveniently written as

$$\Delta E = \mu_b B g\, m_j' \tag{10-22}$$

where

$$g = 1 + \frac{j'(j'+1) + s'(s'+1) - l'(l'+1)}{2j'(j'+1)} \tag{10-23}$$

The quantity $g$ is called the Landé g factor. Note that its value is $g = 1 = g_l$ when $s' = 0$, so that $j' = l'$. Its value is $g = 2 = g_s$ when $l' = 0$, so that $j' = s'$. These are just the values that would be expected, since if $s' = 0$ the angular momentum is purely orbital, and if $l' = 0$ it is purely spin. Thus the Landé g factor is a kind of variable g factor that determines the ratio of the total magnetic dipole moment to the total angular momentum in states where that angular momentum is partly spin and partly orbital. From (10-22) we see that in an external field of strength $B$ each energy level will split into $2j' + 1$ components, one for each value of $m_j'$. We also see that the magnitude of the splitting will be different for levels with different Landé g factors.

Example 10-5. Evaluate the Landé g factor for the ${}^3P_1$ level in the 2p3s configuration of the $^{6}$C atom, and use the result to predict the splitting of the level when the atom is in an external magnetic field of 0.1 tesla.

For the ${}^3P_1$ state $s' = l' = j' = 1$. So

$$g = 1 + \frac{1(1+1) + 1(1+1) - 1(1+1)}{2\times 1(1+1)} = 1 + \frac{2}{2\times 2} = \frac{3}{2}$$

For $j' = 1$ the possible values of $m_j'$ are $-1, 0, 1$, so the level is split into three components, one with the same energy and the others displaced in energy by

$$\Delta E = \mu_b B g\, m_j' = \pm\mu_b B g = \pm 9.3\times10^{-24}\ \text{amp-m}^2 \times 10^{-1}\ \text{tesla} \times 1.5 = \pm 1.4\times10^{-24}\ \text{joule} = \pm 8.7\times10^{-6}\ \text{eV}$$

Figure 10-11 shows, to scale, the splittings of the ${}^2S_{1/2}$ ground state energy level and the ${}^2P_{1/2}$ and ${}^2P_{3/2}$ lowest-excited-state energy levels of the $^{11}$Na atom, when it is placed in a weak external magnetic field.
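The Landé g factor of (10-23) and the weak-field shifts of (10-22) are easy to check numerically. The following is a minimal sketch, not part of the original text: the function names `lande_g` and `zeeman_shifts` are hypothetical, and the constants are the rounded values from the table of useful constants. It reproduces the result of Example 10-5 and the g factors of the three sodium levels just mentioned.

```python
from fractions import Fraction

MU_B = 9.27e-24  # Bohr magneton in joule/tesla (rounded value from the constants table)
EV = 1.602e-19   # joules per electron volt


def lande_g(s, l, j):
    """Lande g factor of (10-23) for quantum numbers s', l', j' (requires j' > 0)."""
    s, l, j = Fraction(s), Fraction(l), Fraction(j)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))


def zeeman_shifts(s, l, j, b_tesla):
    """Weak-field shifts of (10-22), in joule, keyed by each allowed m_j'."""
    g = lande_g(s, l, j)
    j = Fraction(j)
    m_values = [-j + k for k in range(int(2 * j) + 1)]  # -j', ..., +j'
    return {m: MU_B * b_tesla * float(g * m) for m in m_values}


if __name__ == "__main__":
    half = Fraction(1, 2)
    # Example 10-5: the 3P1 level in a field of 0.1 tesla
    print("g(3P1) =", lande_g(1, 1, 1))
    for m, shift in zeeman_shifts(1, 1, 1, 0.1).items():
        print(f"  m_j' = {m}: {shift / EV:+.2e} eV")
    # The three sodium levels: 2S1/2, 2P1/2, 2P3/2
    print("g(2S1/2) =", lande_g(half, 0, half))
    print("g(2P1/2) =", lande_g(half, 1, half))
    print("g(2P3/2) =", lande_g(half, 1, 3 * half))
```

Exact fractions are used for the quantum numbers so that half-integer values of $j'$ give g factors such as 2/3 and 4/3 without rounding error; only the final energy is converted to floating point.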
Note that the external magnetic field removes the last vestige of degeneracy of the levels, since the energy depends on $m_j'$. The figure also shows the transitions allowed by the selection rule for $m_j'$

$$\Delta m_j' = 0, \pm 1 \quad (\text{but not } m_j' = 0 \text{ to } m_j' = 0 \text{ if } \Delta j' = 0) \tag{10-24}$$

This selection rule is very closely related to the one we derived in Example 8-6. Even with its restrictions on the allowed transitions, the Zeeman effect splits each spectral line emitted by the atom into a pattern that generally contains a number of components. The student should compare the allowed transitions, indicated by arrows in Figure 10-11, with the anomalous pattern of lines emitted by $^{11}$Na in these transitions, shown in Figure 10-9.

Figure 10-11 The Zeeman splittings of the ${}^2P_{1/2,\,3/2}$ first excited state levels of sodium, and of its ${}^2S_{1/2}$ ground state level, shown with no external magnetic field and with a weak external magnetic field. The ${}^2P_{3/2}$ level ($g = 4/3$) splits into components with $m_j' = +3/2, +1/2, -1/2, -3/2$; the ${}^2P_{1/2}$ level ($g = 2/3$) and the ${}^2S_{1/2}$ level ($g = 2$) each split into components with $m_j' = +1/2, -1/2$. The transitions allowed by the selection rules are shown. Compare the resulting spectral lines with those shown in Figure 10-9.

All spectral lines arising from transitions between singlet states are split into a simple pattern of two components symmetrically disposed about a third component that has the same frequency as the single zero-field line, as can be seen in the normal pattern of lines shown in Figure 10-9. The reason is that $s' = 0$ for singlet states, so all the g factors have the same value $g = 1$. It is easy to show that this leads to spectral lines with only three components, by constructing a diagram similar to Figure 10-11.

Example 10-6. The most easily interpreted evidence for the splitting of atomic energy levels in an external magnetic field is electron spin resonance. If $^{11}$Na atoms in their ground state are placed in a region containing electromagnetic radiation of frequency $\nu$, and a magnetic field of strength $B$ is applied to the region, electromagnetic energy will be strongly absorbed when the photons have energy $h\nu$ which just equals the Zeeman splitting of the two components of the ground state energy level. The reason is that these photons are able to induce transitions between the components, indicated in Figure 10-12, in which they are absorbed. In a typical experiment $\nu = 1.0\times10^{10}$ Hz. Determine the value of $B$ at which the frequency defined by the Zeeman splitting is in resonance with this microwave frequency.

The ground state of $^{11}$Na is a ${}^2S_{1/2}$ state, for which $g = 2$ and $m_j' = \pm 1/2$. So (10-22) predicts that the displacement in energy of the components of the ground state level in an external field $B$ will be

$$\Delta E = \mu_b B g\, m_j' = \mu_b B\, 2(\pm 1/2) = \pm\mu_b B$$

Equating $h\nu$ to the separation in energy between these two components, we have

$$h\nu = 2\mu_b B$$

So

$$B = \frac{h\nu}{2\mu_b} = \frac{6.6\times10^{-34}\ \text{joule-sec} \times 1.0\times10^{10}/\text{sec}}{2\times 9.3\times10^{-24}\ \text{amp-m}^2} = 0.35\ \text{tesla}$$

This effect is widely used by chemists to measure the magnetic fields experienced by an optically active electron in an atom that is part of a molecule. The electromagnetic radiation is supplied by a microwave oscillator, and the power drawn from the oscillator is monitored while its frequency is varied until the resonance condition is observed.

Figure 10-12 Illustrating the transition observed in electron spin resonance involving the ground state energy levels of sodium, split by an external magnetic field. The two components of the ${}^2S_{1/2}$ level have $m_j' = +1/2$ and $m_j' = -1/2$.

The Zeeman effect is very useful in experimental spectroscopy. By analyzing the Zeeman splittings of the spectral lines of an atom, the spectroscopist determines the Zeeman splittings of the energy levels of the atom. These can conclusively confirm the assignment of the quantum number $j'$ of each level, because $2j' + 1$ is equal to the number of components into which the level is split. Furthermore, the magnitude of the splitting between any two adjacent components gives the value of $\mu_b B g$ and, $\mu_b$ and $B$ being known, this gives the value of $g$ for the energy level. Since the value of $g$ depends on $s'$, $l'$, $j'$ if the atom obeys LS coupling, it can be used to confirm the assignment of $s'$ and $l'$. The initial assignment of values to these three quantum numbers usually comes from application of the Landé interval rule to measured separations of the levels of a multiplet, as in Example 10-4.

An external magnetic field $\mathbf{B}$, which is weak compared to the internal atomic magnetic fields that couple $\mathbf{S}'$ and $\mathbf{L}'$ to form $\mathbf{J}'$, cannot disturb this coupling and only causes a relatively slow precession of $\mathbf{J}'$ about the direction of $\mathbf{B}$. However, if $\mathbf{B}$ is stronger than the atomic magnetic field, it overpowers that field and destroys the coupling of $\mathbf{S}'$ to $\mathbf{L}'$. In this case $\mathbf{S}'$ and $\mathbf{L}'$ precess independently about the direction of $\mathbf{B}$. This is the case of the Paschen-Back effect, which is observed for external fields somewhat larger than 1 tesla. If the atom obeys LS coupling, its total magnetic dipole moment is still given by (10-19)

$$\boldsymbol{\mu} = -\frac{\mu_b}{\hbar}\left[\mathbf{L}' + 2\mathbf{S}'\right]$$

since neither the coupling of the individual spin angular momenta to form $\mathbf{S}'$ nor the coupling of the individual orbital angular momenta to form $\mathbf{L}'$ is destroyed by such an external field. But in this case $\mu_B$ is simply

$$\mu_B = -\frac{\mu_b}{\hbar}\,(L_z' + 2S_z')$$

where we have chosen the z axis in the direction of $\mathbf{B}$. Then

$$\Delta E = -\boldsymbol{\mu}\cdot\mathbf{B} = -\mu_B B = \frac{\mu_b B}{\hbar}\,(L_z' + 2S_z')$$

and we obtain immediately

$$\Delta E = \mu_b B\,(m_l' + 2m_s') \tag{10-25}$$

The quantum numbers $m_l'$ and $m_s'$ are useful for an atom in an external magnetic field somewhat stronger than the internal magnetic field, because $L_z'$ and $S_z'$ have definite values in these circumstances.
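As an illustration of (10-25), the strong-field shifts of a level can be tabulated by running over all allowed pairs $(m_l', m_s')$. The sketch below is not from the original text; the function name `paschen_back_levels` is hypothetical, and the choice of $l' = 1$, $s' = 1/2$ (the quantum numbers of the sodium first excited state) is only an example. Shifts are expressed in units of $\mu_b B$.

```python
from fractions import Fraction
from itertools import product


def paschen_back_levels(l, s):
    """Group the (m_l', m_s') states by their shift m_l' + 2 m_s' of (10-25).

    Returns a dict mapping each shift, in units of mu_b * B, to the list of
    (m_l', m_s') pairs that produce it, ordered by increasing shift.
    """
    l, s = Fraction(l), Fraction(s)
    ml_values = [-l + k for k in range(int(2 * l) + 1)]  # -l', ..., +l'
    ms_values = [-s + k for k in range(int(2 * s) + 1)]  # -s', ..., +s'
    levels = {}
    for ml, ms in product(ml_values, ms_values):
        levels.setdefault(ml + 2 * ms, []).append((ml, ms))
    return dict(sorted(levels.items()))


if __name__ == "__main__":
    for shift, states in paschen_back_levels(1, Fraction(1, 2)).items():
        labels = ", ".join(f"(m_l'={ml}, m_s'={ms})" for ml, ms in states)
        print(f"shift = {shift} mu_b*B : {labels}")
```

For this example the six states fall into five equally spaced levels, because the pairs $(m_l', m_s') = (+1, -1/2)$ and $(-1, +1/2)$ both give zero shift. The factor of 2 multiplying $m_s'$ in (10-25) is what makes these two states coincide.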
It is observed that the selection rules for the two quantum numbers are:

$$\Delta m_s' = 0 \tag{10-26}$$

$$\Delta m_l' = 0, \pm 1 \tag{10-27}$$

The first selection rule says that the total spin angular momentum, and the spin magnetic dipole moment, do not change orientation in an atomic transition. Since such transitions involve the emission of electric dipole radiation, whereas a magnetic dipole moment of changing orientation would lead to the emission of magnetic dipole radiation, the origin of the selection rule is obvious. The second selection rule was derived in Example 8-6. All the spectral lines are split by the Paschen-Back effect into three components, just as in the normal Zeeman effect.

10-7 SUMMARY

This chapter is summarized in Table 10-3, which lists, in order of decreasing importance in determining the energy, all of the significant interactions experienced by the optically active electrons in a typical multielectron atom placed in a weak external magnetic field. By typical, we mean an atom with a less than half-filled outer subshell, whose atomic number Z is low enough that it obeys LS coupling. If Z is very high, the atom obeys JJ coupling and the most important weaker interaction is the spin-orbit interaction. If the external magnetic field is stronger than the internal magnetic field, the interaction it produces is called the Paschen-Back interaction, and it is more important than the spin-orbit interaction in LS coupling. External electric fields have effects similar to, but more complicated than, external magnetic fields.
If the optically active electrons are in a more than half-filled subshell, the sign of the spin-orbit interaction is reversed because the atom acts as if it had positively charged holes instead of negatively charged electrons, which reverses the relative orientation of the magnetic dipole and angular momentum vectors. This results in the energy level with maximum instead of minimum $j'$ lying lowest. But for such atoms maximum $s'$ and maximum $l'$ still give the lowest energy level because the sign of the residual Coulomb interaction is unchanged; it is repulsive between positive holes just as it is between negative electrons.

Table 10-3 Interactions in a Typical (LS Coupling; Less Than Half-Filled Subshell) Atom Placed in a Weak External Magnetic Field

| Importance in Determining Energy | Name | Nature of Interaction | Quantum Numbers Determining Energy | Energy Lowest For |
| Dominant interaction | Hartree | Electric; average potential | a set of n, l | Minimum n; minimum l |
| Most important weaker interaction | Residual Coulomb; spin coupling | Electric; departures from average potential | s' | Maximum s' |
| Slightly less important | Residual Coulomb; orbital coupling | Electric; departures from average potential | l' | Maximum l' |
| Appreciably less important | Spin-orbit | Magnetic; internal field | j' | Minimum j' |
| Least important | Zeeman | Magnetic; external field | m_j' | Most negative m_j' |

QUESTIONS

1. Give an example of a system studied in science or engineering, other than a multielectron atom, which is best treated by a succession of increasingly accurate approximations.
2. Why are astronomers so dependent on information obtained from optical spectra?
3. Why is it not possible to give a small amount of energy to an electron in an inner subshell of an atom? What happens if a large amount of energy is given to an electron in an outer subshell?
4. Where in the Hartree approximation is the assumption made that the net potential is spherically symmetrical?
5. Explain, in simple terms, why the spin-orbit interaction becomes stronger with increasing Z.
6. Do atoms of high Z generally have more optically active electrons than atoms of low Z?
7. Chemists usually speak of valence electrons. What is the corresponding term usually employed by physicists?
8. In studying the residual Coulomb interaction, eigenfunctions are used which are antisymmetric with respect to exchange of the labels of pairs of optically active electrons. What is the justification for not using eigenfunctions which are antisymmetric with respect to the exchange of labels for any pair of electrons in the atom?
9. Does the coupling of the spin angular momentum of one optically active electron in a typical atom to the spin angular momentum of another optically active electron involve a magnetic interaction between their spin magnetic dipole moments? If not, explain why not, and also explain in simple terms what the coupling is due to.
10. Explain the physical origin of the coupling between the orbital angular momenta of the optically active electrons in a typical atom.
11. Why is there a classical explanation for the coupling of orbital angular momenta of optically active electrons, but not for the coupling of their spin angular momenta?
12. In a multiplet with $s' > l'$, into how many components are the levels split by the spin-orbit interaction? Consider the multiplet discussed in Example 10-4.
13. What is the difference between LS coupling and JJ coupling?
14. What is the relation between the quantum states allowed by the LS coupling exclusion principle for a subshell with one hole (i.e., completely filled except for one electron) and the quantum states allowed for a subshell with one electron? Would there be a simple relation between the optical excitations of a halogen atom and the optical excitations of an alkali atom?
15. What would the exclusion principle be like for JJ coupling?
16. Is it possible for a Landé g factor to have a value smaller than 1? Larger than 2?
17. What would be the effect of placing an atom in an external magnetic field of strength very much larger than the strength of the internal magnetic field?
18. Is it possible to completely remove the degeneracy of atomic energy levels without using an external magnetic field?

PROBLEMS

1. (a) Calculate the wavelength of the 2p to 2s transition in $^{3}$Li. (b) Find the wavelength difference of the two components into which the line is split by the spin-orbit interaction.
2. Show that the spin-orbit energy splitting of an alkali atom is given by
$$\overline{\Delta E} = \frac{\hbar^2}{4m^2c^2}\,(2l+1)\,\overline{\frac{1}{r}\frac{dV}{dr}}$$
except for $l = 0$, in which case the splitting is zero.
3. (a) Construct an energy-level diagram for $^{11}$Na, similar to Figure 10-1, showing all levels lower in energy than the 5s level. (b) Devise a way of indicating the spin-orbit splitting of the levels. (Hint: See Figure 10-8.) (c) Indicate which transitions between these levels are allowed by the selection rules.
4. (a) Predict the values of $s'$, $l'$, $j'$ in the state of maximum energy of two optically active electrons with the quantum numbers $l_1 = 1$, $s_1 = 1/2$; $l_2 = 2$, $s_2 = 1/2$. (b) Make a sketch, similar to Figure 10-3, which shows the motion of the angular momentum vectors in this state.
5. Find the possible values of $s'$, $l'$, $j'$ for a configuration with two optically active electrons with quantum numbers $l_1 = 2$, $s_1 = 1/2$; $l_2 = 3$, $s_2 = 1/2$. Specify which $j'$ go with each $l'$ and $s'$ combination.
6. (a) Write down the quantum numbers for the states described in spectroscopic notation as ${}^2S_{3/2}$, ${}^3D_2$, and ${}^5P_3$. (b) Determine if any of these states are impossible, and if so explain why.
7. Make a sketch, similar to Figure 10-6, which illustrates the LS coupling splittings of the energy levels of a 4s3d configuration. Use the Landé interval rule to predict the ratios of the fine-structure splittings of each multiplet, so that they can be drawn to scale.
Label the levels with spectroscopic notation.
8. For an atomic state with quantum numbers $l' = 2$, $s' = 1$, $j' = 3$, find the angle between the total magnetic moment and the direction antiparallel to the total angular momentum. There is no external field present.
9. (a) Use the periodic table of Figure 9-13 to determine the ground state configurations for the atoms $^{12}$Mg, $^{13}$Al, and $^{14}$Si. (b) Then predict the LS coupling quantum numbers for the ground state of each atom. Express your result in spectroscopic notation.
10. Use the procedure of Example 10-3 to verify the theoretical prediction of Table 10-2 for the Landé interval rule test for the presence of LS coupling in the 4s3d configuration of the $^{20}$Ca atom.
11. In an atom which obeys LS coupling, the separations between adjacent energy levels of increasing energy in the five levels of a particular multiplet are in the ratios 1:2:3:4. Use the procedure of Example 10-4 to assign the quantum numbers $s'$, $l'$, $j'$ to these levels.
12. Consider a completely filled d subshell, i.e., one containing the ten electrons allowed by the exclusion principle. Ignore the interactions between the electrons, so that the Hartree approximation quantum numbers $n$, $l$, $m_l$, $m_s$ can be used to describe each electron. (a) Show that there is only one possible quantum state for the system that satisfies the exclusion principle. (b) Show that in this state the z components of the total spin angular momentum, the total orbital angular momentum, and the total angular momentum are all zero. (c) Give an argument showing that these conclusions imply that the magnitudes of the total spin angular momentum, the total orbital angular momentum, and the total angular momentum are also all zero. (Hint: If an angular momentum vector is not of zero magnitude, but has zero z component in one quantum state, then there are other quantum states in which it has a nonzero z component.) (d) Now consider the interactions between the electrons that are actually present. Can they change the conclusion about the total angular momentum of the subshell? What about the total spin angular momentum and total orbital angular momentum?
13. (a) Make a rough sketch of the $^{6}$C energy levels in the $2p^2$ and 2p3s configurations, using information from Figure 10-8. Indicate the fine-structure splittings of the levels by exaggerating their magnitude. (b) Show all the transitions allowed by the LS coupling selection rules.
14. (a) Find a state with $s'$, $l'$, $j'$ quantum numbers for which the value of the Landé g factor lies outside the range $g = 1$ to $g = 2$. (b) Make a sketch, similar to Figure 10-10, which illustrates the angular momentum and magnetic dipole moment vectors for this state.
15. Consider the 2p3s configuration of the $^{6}$C atom, in which the ordering of the energy levels according to $s'$, $l'$, $j'$, and the relative strengths of the dependences of the energy on these quantum numbers, are what is normal for LS coupling. (a) Draw a schematic energy-level diagram for this configuration, like Figure 10-6. Use the same (exaggerated) scale for the fine-structure splitting, given by the Landé interval rule, for all the levels within a given multiplet. (b) Label each level with the spectroscopic notation.
16. On the energy-level diagram of Problem 15, draw to the same (highly exaggerated) scale the Zeeman effect splitting, given by the Landé g factor, for each level under the influence of a weak external magnetic field.
17. (a) Count the total number of components obtained in Problem 16, i.e., the total number of different quantum states in the configuration. (b) Show that this equals the degeneracy of the configuration in the Hartree approximation, i.e., the product of degeneracy factors $2(2l + 1)$ for each of the two optically active electrons in the configuration.
18. Derive an expression for the Zeeman effect splitting of the levels of a singlet. (Hint: Start at the beginning, and take $s' = 0$ so that a simple expression is obtained for the total magnetic dipole moment.)
19. Give a classical explanation of the normal Zeeman effect based on Faraday's law applied to electrons revolving in circular orbits of constant radius. Show that the correct frequency interval between the three components can be obtained.
20. (a) Construct a diagram, similar to Figure 10-11, which shows transitions allowed by the selection rules between the singlet states 2p3s ${}^1P_1$ and $2p^2\ {}^1D_2$ of the $^{6}$C atom. (b) Verify that the normal Zeeman pattern of three spectral lines will be produced in these transitions. (c) Evaluate the differences in wavelength of these three spectral lines when the atom is in an external field of 0.1 tesla. (Hint: Use a formula for the difference in wavelength derived in Example 10-1.) (d) Evaluate the wavelength of the single line obtained when there is no external field, using information from Figure 10-8.
21. (a) Redraw the energy levels of Figure 10-11, for a case in which the strength of the external magnetic field is increased to the point where the splitting is described by the Paschen-Back effect. (Hint: Here $j'$ is no longer a useful quantum number.) (b) Redraw the transitions allowed by the $m_s'$ and $m_l'$ selection rules, as in Figure 10-11, and show that they then produce spectral lines which are split into only three components.
22. (a) Use the information contained in Figure 10-8 to estimate the magnitude of the energy associated with the coupling of the two spin angular momenta to form the total spin angular momentum, and with the coupling of the two orbital angular momenta to form the total orbital angular momentum, in the $2p^2$ configuration of the $^{6}$C atom. (b) Then estimate the strength of an external field which will produce an energy of orientation with the magnetic dipole moment of each optically active electron larger than the energy estimated in (a). In such a field the couplings of the angular momenta of the optically active electrons are completely destroyed. (c) Is such a field available in the laboratory?

11 QUANTUM STATISTICS

11-1 INTRODUCTION
utility of statistical considerations; Boltzmann distribution

11-2 INDISTINGUISHABILITY AND QUANTUM STATISTICS
inapplicability of Boltzmann distribution to quantum systems; review of indistinguishability; restatement of fermion inhibition factor; derivation of boson enhancement factor

11-3 THE QUANTUM DISTRIBUTION FUNCTIONS
thermal equilibrium; detailed balancing; Bose distribution derived by combining detailed balancing, Boltzmann distribution, and boson enhancement factor; Fermi distribution derived by same technique using fermion inhibition factor

11-4 COMPARISON OF THE DISTRIBUTION FUNCTIONS
normalization constants; Fermi energy; qualitative interpretation of low-temperature behavior of Fermi distribution; merging of classical and quantum distributions at high energies; classical distribution intermediate to quantum distributions at low energies; tabulated comparison of distributions

11-5 THE SPECIFIC HEAT OF A CRYSTALLINE SOLID
Dulong and Petit law; Einstein's treatment; Debye's treatment; elastic vibration modes; applicability of Boltzmann distribution; Debye temperature; Debye formula

11-6 THE BOLTZMANN DISTRIBUTION AS AN APPROXIMATION TO QUANTUM DISTRIBUTIONS
Boltzmann factor; applicability to gas molecules; nuclear magnetic resonance

11-7 THE LASER
relation between spontaneous emission, stimulated absorption, and stimulated emission; derivation of Einstein A and B coefficients; prediction of emission to absorption ratio; population inversion by optical pumping; coherence; energy levels of a ruby laser; design of laser; lasers as examples of boson enhancement factor

11-8 THE PHOTON GAS
Bose distribution for photons; derivation of Planck's spectrum

11-9 THE PHONON GAS
qualitative discussion of phonons
11-10 BOSE CONDENSATION AND LIQUID HELIUM
Bose distribution normalization factor evaluated by particle-in-box state count; average particle energy for ideal boson gas; degeneracy effect; degeneracy term related to ratio of interparticle distance to de Broglie wavelength; Bose condensation; degeneracy term estimate for helium; properties of liquid helium; explanation by boson enhancement factor

11-11 THE FREE ELECTRON GAS
average particle energy for ideal fermion gas; electron gas; conduction electron energy distribution; specific heat of conduction electrons

11-12 CONTACT POTENTIAL AND THERMIONIC EMISSION
observed properties and Fermi distribution explanation; work functions and Fermi energies

11-13 CLASSICAL AND QUANTUM DESCRIPTIONS OF THE STATE OF A SYSTEM
phase space; quantum limitations on minimum volume of phase space cell; relation to entropy

QUESTIONS

PROBLEMS

11-1 INTRODUCTION

As the number of constituents of a physical system increases, a detailed description of the behavior of the system becomes more complex. Thus as we proceed in our studies from one-electron atoms to multielectron atoms, and then to molecules, and finally to solids, we anticipate increasing complexity and difficulty in treating these systems in detail. For a familiar example, consider what would be involved in trying to describe the motion of one molecule of a gas in a system containing a liter of that gas under standard conditions (containing about $10^{22}$ molecules). Fortunately, it is generally unnecessary to have such detailed information to determine the most important properties of the system, that is, to determine the measurable properties, like the pressure and temperature of a gas. Furthermore, the very complexity of a system containing a large number of constituents is often responsible for many of the simple properties that we observe, as we now explain.
If we apply the general principles of mechanics (such as the conservation laws) to a system of many particles, we can ignore the detailed motion or interaction of each particle and deduce simple properties of the behavior of the system from statistical considerations alone. In fact, even an elementary statistical approach enables us to describe and explain a wide range of physical phenomena and gives us a good deal of insight into the behavior of real physical systems. The reason is that there is a relationship between the observed properties and the probable behavior of the system, if the system contains enough particles for statistical considerations to be valid.

Consider, for instance, an isolated system containing a large number of classical particles in thermal equilibrium with each other at temperature $T$. To achieve, and maintain, this equilibrium, the particles must be able to exchange energy with each other. In the exchanges, the energy of any one of the particles will fluctuate, sometimes having a larger value and sometimes a smaller value than the average value of the energy of a particle in the system. However, the classical theory of statistical mechanics demands that the energies successively assumed by the particle, or the energies of the various particles of the system at some particular time, be determined by a definite probability distribution function, called the Boltzmann distribution, which has a form that depends on the temperature $T$. Knowing the probabilities that the particles of the system will occupy the various energy states, we can then predict a variety of important properties of the entire system by using these occupation probabilities to calculate averages over the system of the corresponding properties of the particles when they are in those states.

A more specific example that the student has likely encountered earlier in his studies of physics is the relation between the properties of a classical gas and the Maxwell distribution of speeds of the molecules of the gas. The Maxwell distribution is a consequence of the Boltzmann distribution. It is described by a distribution function $N(v)$, where $N(v)\,dv$ gives the probability that a molecule has a speed in the interval between $v$ and $v + dv$. From it we can calculate quantities such as the average speed (which is related to the momentum carried by the molecules), the average squared speed (which is related to the energy they carry), etc., and from these average quantities we calculate observable properties such as the pressure (which is related to the momentum) and temperature (which is related to the energy), etc.

Statistical treatments are also applicable as an approximation in systems that contain only moderately large numbers of particles. For instance, we shall in Chapter 15 apply a statistical treatment to a nucleus (containing ~$10^{2}$ nucleons) in the so-called Fermi gas model of nuclei. But that treatment will not use the Boltzmann distribution, since it is not valid for quantum particles like those found in a nucleus.

In this chapter we seek distribution functions that are valid for quantum particles. We shall find that there are two: the Bose distribution, which applies to particles that must be described by eigenfunctions which are symmetric with respect to an exchange of any two particle labels (like α particles or photons); and the Fermi distribution, which applies to particles that must be described by eigenfunctions which are antisymmetric in such an exchange of labels (like electrons, protons, and neutrons). First we shall review the procedures of classical statistical mechanics, developed in Appendix C and used in Chapter 1, that lead to the Boltzmann distribution. Then we shall see how quantum considerations force significant changes in the classical procedures. Next we shall derive the quantum distribution functions in simple equilibrium arguments that start from the Boltzmann distribution. Then we shall obtain useful insights by comparing all the distribution functions with one another. Finally we shall give a variety of examples of the application of each of them, and compare their predictions with experiment. In this process we shall examine many important phenomena, such as superfluidity, electronic and lattice specific heats of solids, and light amplification by stimulated emission of radiation (the laser).

11-2 INDISTINGUISHABILITY AND QUANTUM STATISTICS

The Boltzmann distribution predicts the probable number of particles in each of their energy states for a classical system containing many identical particles in thermal equilibrium at a certain temperature. It is a fundamental result of classical physics, not quantum physics. Nevertheless, it is frequently used in discussing quantum physics, as we have seen before and shall see again. For these reasons, in this book we have included two quite different arguments that each lead to the Boltzmann distribution, but we have put these arguments in Appendix C. The student would be well advised to read, or reread, that appendix now.

Our first argument in Appendix C involves counting the number of distinguishable ways the identical entities of a system in thermal equilibrium can divide between them the fixed total energy of the system. The Boltzmann distribution follows from assuming that all possible divisions occur with the same probability. In this procedure, an energy division is counted as distinguishable from some other division if it differs from that division only by a rearrangement of identical entities between different energy states.
That is, identical entities are treated as if they are distinguishable in such rearrangements. In the second argument leading to the Boltzmann distribution, we assume that the presence of one entity in some particular energy state in no way inhibits or enhances the chance that another identical entity will be in that state and, again, that all possible divisions of the system's energy occur with the same probability. These assumptions are perfectly acceptable in classical physics. In quantum physics the assumption that all possible divisions occur with the same probability remains acceptable; but the other assumptions do not. As we saw in Section 9-2, if there is appreciable overlapping of the wave functions of two identical particles in a system, very important nonclassical effects arise from the indistinguishability of identical particles (i.e., identical entities). One is that measurable results cannot depend on the assignment of labels to identical particles. So the classical definition of distinguishable divisions of the energy of a system is in error because if there is no unambiguous way to label the identical particles of the system there is no way to distinguish between two divisions which differ only by rearranging them, even in rearrangements between different quantum states (i.e., energy states). Another effect of the indistinguishability of quantum particles is that the presence of one in a particular quantum state very definitely influences the chance that another identical particle will be in that state. We have seen that if two identical particles are described by an antisymmetric total eigenfunction, that is, if they are particles like electrons which obey the exclusion principle, then the presence of one in some quantum state totally inhibits the other from being in that state. 
We shall see soon that if the two identical particles are described by a symmetric total eigenfunction, that is, if they are like α particles in that they do not obey the exclusion principle, then the presence of one in some quantum state considerably enhances the chance that the other will be in the same state. Of course, if a system contains identical quantum particles, but the circumstances are such that there is negligible overlap of the wave functions of any two, the particles actually can be distinguished experimentally. In these circumstances the effects of indistinguishability become negligible, as we mentioned before in Sections 9-2 and 9-4, and the assumptions underlying the Boltzmann distribution become valid. An example of such a system is, again, a gas. In the range of density normally encountered in the laboratory, the wave functions of the molecules, which are certainly identical quantum particles, do not overlap appreciably, and so the Boltzmann distribution can be accurately applied to predict the properties of the system. In quantum statistics, particles which are described by antisymmetric eigenfunctions are called fermions, and particles which are described by symmetric eigenfunctions are called bosons. That is, the eigenfunction for a system of several identical fermions changes sign if the labels of any two of them are exchanged, while the eigenfunction for a system of several identical bosons does not change sign in such a label exchange. A partial list of fermions and bosons is found in Table 9-1. These names honor two physicists, Fermi and Bose, who were prominent in the development of quantum statistics. The fact that one fermion prevents another identical fermion from joining it in the same quantum state, i.e., the exclusion principle, and certain of its extremely important consequences, is something we are familiar with from our study of multielectron atoms.
This can be described, somewhat formally, by saying that if there are already $n$ fermions in a quantum state the probability of one more joining them is smaller by an inhibition factor of $(1 - n)$ than it would be if there were no quantum mechanical indistinguishability requirements. If $n = 0$, the factor has the value $(1 - 0) = 1$, and so there is no inhibition of the probability for the first fermion entering the state. But for $n = 1$, the factor has the value $(1 - 1) = 0$, and so a second fermion is strictly inhibited from entering the same state. Note that the factor automatically limits the number $n$ of fermions in any particular quantum state to the values $n = 0$ or $n = 1$, in agreement with the exclusion principle. The use of the plural in the preceding italicized statement may therefore seem somewhat inappropriate; it is used to make the statement analogous to one concerning bosons that will follow, and because otherwise the argument immediately below the statement would be circular.

We have not had occasion to show that the presence of one boson in a quantum state enhances the probability of a second identical boson being found in that state, because we have done little with bosons since developing the quantum mechanics of indistinguishable particles. Let us show this now. Consider the symmetric eigenfunction for a system of two identical bosons, (9-8)

$$\psi_S = \frac{1}{\sqrt{2}}\left[\psi_\alpha(1)\psi_\beta(2) + \psi_\beta(1)\psi_\alpha(2)\right]$$

Recall that $\psi_\alpha(1)$ means the particle labeled 1 is in the quantum state $\alpha$, $\psi_\beta(2)$ means particle 2 is in state $\beta$, etc., and that although particle labels are actually used, measurable quantities like the probability density $\psi_S^*\psi_S$ have values which are independent of the assignment of labels to particles. Recall also that $\psi_S$ is normalized, by the normalization factor $1/\sqrt{2}$, if we assume that $\psi_\alpha(1)\psi_\beta(2)$ and $\psi_\beta(1)\psi_\alpha(2)$ are normalized. Now we place both bosons in the same state, say the state $\beta$, by setting $\alpha = \beta$. Then the eigenfunction is

$$\psi_S = \frac{1}{\sqrt{2}}\left[\psi_\beta(1)\psi_\beta(2) + \psi_\beta(1)\psi_\beta(2)\right] = \frac{2}{\sqrt{2}}\,\psi_\beta(1)\psi_\beta(2) = \sqrt{2}\,\psi_\beta(1)\psi_\beta(2)$$

and the probability density is

$$\psi_S^*\psi_S = 2\,\psi_\beta^*(1)\psi_\beta^*(2)\psi_\beta(1)\psi_\beta(2) \tag{11-1}$$

What would the eigenfunction and probability density for this two identical particle system be like if we had not taken into account the quantum mechanical requirements of indistinguishability of identical particles? The eigenfunction would be in the form given by (9-4) or (9-5), since we obtained those directly from the Schroedinger equation before applying indistinguishability requirements. Let us take (9-4)

$$\psi = \psi_\alpha(1)\psi_\beta(2)$$

This eigenfunction $\psi$ is normalized since we have assumed that $\psi_\alpha(1)\psi_\beta(2)$ is normalized. For the case at hand, where $\alpha = \beta$, we have

$$\psi = \psi_\beta(1)\psi_\beta(2)$$

and the normalized probability density is

$$\psi^*\psi = \psi_\beta^*(1)\psi_\beta^*(2)\psi_\beta(1)\psi_\beta(2) \tag{11-2}$$

It is fair to compare the probability densities of (11-1) and (11-2), since both are properly normalized. Doing so, we see that the probability $\psi_S^*\psi_S$ of having two bosons in the same quantum state has twice the value of the probability $\psi^*\psi$ of this situation occurring if the system is described by an eigenfunction that does not satisfy the quantum mechanical requirements of indistinguishability. We can express this by saying that the probability of having two bosons in the same state is twice what it would be for classical particles. Thus the presence of one boson in a particular quantum state doubles the chance that the second boson will be in that state, compared to the case of classical particles where there is no particular correlation between the energy states occupied by the particles.

Example 11-1. Compare the probability for three bosons to be in a particular quantum state with the probability for three classical particles to be in the same state.
Inspection of the symmetric eigenfunction for a three boson system, found in Example 9-3, shows that it contains $3! = 3 \times 2 \times 1 = 6$ terms like $\psi_\alpha(1)\psi_\beta(2)\psi_\gamma(3)$, and that the normalization constant is $1/\sqrt{3!}$. After setting $\alpha = \beta = \gamma$ to put all the bosons in the same state, the probability density contains $(3!)^2$ equal terms, but it is multiplied by the square of the normalization constant, $(1/\sqrt{3!})^2$. So the probability is larger by a factor of $(3!)^2/3!$ than it would be if there were three identical classical particles in the state. The probability for the boson case consequently is larger by a factor of $3!$.

The results of Example 11-1 can obviously be extended to the case of $n$ identical bosons in the same quantum state, and show that the probability of this occurring is larger by a factor of $n! = n(n - 1)(n - 2)\cdots 1$, compared to the probability that it would occur in the case of $n$ identical classical particles.

These results can be looked at from a most useful point of view by answering the following question. If there are already $n$ bosons in a particular final quantum state of a system in which bosons are making transitions from various initial to various final states, what is the probability that one more boson will make a transition to that particular final state? Let $P_1$ represent the probability that the first boson is added to the originally empty state of particular interest. If the enhancement effect we are discussing did not exist, the probability that there be $n$ bosons in that state would be just the $n$th power of $P_1$, since the probability of adding each successive boson would be the same, the additions would take place independently, and independent probabilities are multiplicative. That is,

$$P_n = (P_1)^n$$

But the actual probability that there are $n$ bosons in the state is enhanced to the value

$$P_n^{\text{boson}} = n!\,P_n = n!\,(P_1)^n$$

The actual probability that there are $n + 1$ bosons in the state is

$$P_{n+1}^{\text{boson}} = (n + 1)!\,P_{n+1}$$

Since $(n + 1)! = (n + 1)\,n!$, and since $P_{n+1} = (P_1)^{n+1} = (P_1)^n P_1 = P_n P_1$, we have

$$P_{n+1}^{\text{boson}} = (n + 1)\,n!\,P_n P_1$$

or

$$P_{n+1}^{\text{boson}} = (1 + n)\,P_1\,P_n^{\text{boson}} \tag{11-3}$$

Now $P_n^{\text{boson}}$ is the probability that there actually are $n$ bosons in the state. So the answer to the question posed, "If there are already $n$ bosons in a particular final quantum state ... ?," is $(1 + n)P_1$. But $P_1$ is the probability of adding any one of the bosons if there were no enhancement. So we conclude that, if there are already $n$ bosons in a quantum state, the probability of one more joining them is larger by an enhancement factor of $(1 + n)$ than it would be if there were no quantum mechanical indistinguishability requirements.

11-3 THE QUANTUM DISTRIBUTION FUNCTIONS

The most frequently used procedure for obtaining distribution functions that are consistent with the requirements of the indistinguishability of quantum particles involves modifying the first argument of Appendix C so as to satisfy these requirements, and then extending the calculations to the case of a large number of particles and energy states. Here we shall use a much simpler procedure that is in the spirit of the second argument of Appendix C.

As a preliminary, consider a system of identical classical particles in thermal equilibrium. The particles exchange energy, but they act independently in that one does not influence the specific behavior of another. Focus attention on two particular energy states of these particles, $\mathcal{E}_1$ and $\mathcal{E}_2$, and let the average numbers of particles occupying them be $n_1$ and $n_2$. Also let the average rate at which a particle of the system that is in state 1 makes a transition to state 2 be $R_{1\to2}$, and the rate at which a particle that is in state 2 makes a transition to state 1 be $R_{2\to1}$. Both $R_{1\to2}$ and $R_{2\to1}$ are rates per particle, i.e., probabilities per second per particle. So the total rate at which particles of the system will be making $1 \to 2$ transitions is $n_1 R_{1\to2}$, since $n_1$ is the number of particles that have an opportunity to do so and $R_{1\to2}$ is the probability per second that each will take the opportunity. The total rate at which particles in the system will make $2 \to 1$ transitions is $n_2 R_{2\to1}$. If these total transition rates are equal, that is if

$$n_1 R_{1\to2} = n_2 R_{2\to1} \tag{11-4}$$

and if the same is true of "forward" and "backward" total transition rates between all pairs of particle energy states, then the average population of each of these states will obviously remain constant in time. But constant average state populations is the condition that characterizes thermal equilibrium. Equation (11-4) is a condition which ensures that the equilibrium we assume in all of our arguments is maintained. In principle, equilibrium can also be maintained by balancing interlocking sets of transition cycles, each involving several energy states, without balancing individual pairs of total transition rates as in (11-4); but there is no evidence that this situation arises in practice. To put the matter another way, (11-4) can be taken as a postulate, called detailed balancing, whose justification is found in the fact that it leads to results which agree with experiment. Note that (11-4) implies

$$\frac{n_1}{n_2} = \frac{R_{2\to1}}{R_{1\to2}} \tag{11-5}$$

Now in thermal equilibrium the average, or probable, number $n_1$ of particles in our classical system that will be found in state 1 is given by the Boltzmann distribution, derived in Appendix C, evaluated at the state energy $\mathcal{E}_1$. So

$$n_1 = n(\mathcal{E}_1) = A e^{-\mathcal{E}_1/kT} \tag{11-6}$$

and similarly for $n_2$. Thus the population ratio has the value

$$\frac{n_1}{n_2} = \frac{e^{-\mathcal{E}_1/kT}}{e^{-\mathcal{E}_2/kT}} \tag{11-7}$$

Hence, (11-5) and (11-7) show that the transition rates per particle must be in the ratio

$$\frac{R_{2\to1}}{R_{1\to2}} = \frac{e^{-\mathcal{E}_1/kT}}{e^{-\mathcal{E}_2/kT}} \tag{11-8}$$

for classical particles.

Now we shall apply the thermal equilibrium condition of (11-4) to a system of bosons. We write it as

$$n_1 R_{1\to2}^{\text{boson}} = n_2 R_{2\to1}^{\text{boson}} \tag{11-9}$$

where $n_1$ and $n_2$ are the average boson populations of two quantum states of interest, and $R_{1\to2}^{\text{boson}}$ and $R_{2\to1}^{\text{boson}}$ are the transition rates per boson between these states. These rates can be expressed in terms of the rates for the case of classical particles simply by multiplying the classical rates by the $(1 + n)$ enhancement factor derived at the end of Section 11-2. That is, since there are on the average $n_2$ bosons in quantum state 2 when the $1 \to 2$ transition takes place, the actual probability per second per particle, $R_{1\to2}^{\text{boson}}$, is larger by a factor of $(1 + n_2)$ than the value $R_{1\to2}$, the rate a classical particle that does not satisfy the indistinguishability requirements would have. As $n$ ranges from ~0 (for a state which almost never contains a boson) to larger and larger values (for a state which contains more and more bosons), the enhancement factor ranges from ~1 (almost no enhancement) to ever larger values (ever larger enhancement). To summarize, we have

$$R_{1\to2}^{\text{boson}} = (1 + n_2)\,R_{1\to2} \tag{11-10}$$

and, similarly

$$R_{2\to1}^{\text{boson}} = (1 + n_1)\,R_{2\to1} \tag{11-11}$$

Combining (11-9), (11-10), and (11-11), we obtain

$$n_1(1 + n_2)\,R_{1\to2} = n_2(1 + n_1)\,R_{2\to1}$$

or

$$\frac{n_1(1 + n_2)}{n_2(1 + n_1)} = \frac{R_{2\to1}}{R_{1\to2}} = \frac{e^{-\mathcal{E}_1/kT}}{e^{-\mathcal{E}_2/kT}} \tag{11-12}$$

where we have used (11-8) to evaluate the ratio of the classical transition rates per particle in terms of the Boltzmann distribution. Equation (11-12) can be expressed as

$$\frac{n_1}{1 + n_1}\,e^{\mathcal{E}_1/kT} = \frac{n_2}{1 + n_2}\,e^{\mathcal{E}_2/kT} \tag{11-13}$$

The left side of this equality does not involve properties of state 2, and the right side does not involve properties of state 1. So the common value of both sides cannot involve properties particular to either state, but only a property common to both. It obviously does, as the common equilibrium temperature $T$ is found on both sides.
Thus we conclude that both sides of (11-13) are equal to some function of $T$, which is most conveniently written as $e^{-\alpha}$, where $\alpha = \alpha(T)$. Equating the left side to the common value, we have

$$\frac{n_1}{1 + n_1}\,e^{\mathcal{E}_1/kT} = e^{-\alpha}$$

or

$$\frac{n_1}{1 + n_1} = e^{-(\alpha + \mathcal{E}_1/kT)}$$

so

$$n_1 = e^{-(\alpha + \mathcal{E}_1/kT)} + n_1 e^{-(\alpha + \mathcal{E}_1/kT)}$$

or

$$n_1\left[1 - e^{-(\alpha + \mathcal{E}_1/kT)}\right] = e^{-(\alpha + \mathcal{E}_1/kT)}$$

Thus

$$n_1 = \frac{e^{-(\alpha + \mathcal{E}_1/kT)}}{1 - e^{-(\alpha + \mathcal{E}_1/kT)}} = \frac{1}{e^{\alpha}e^{\mathcal{E}_1/kT} - 1}$$

If we use the right side of (11-13), we obtain a completely similar result for the dependence of $n_2$ on $\mathcal{E}_2$. In fact, this result is obtained for the average, or probable, number of bosons occupying a quantum state of any energy $\mathcal{E}$. So we have

$$n(\mathcal{E}) = \frac{1}{e^{\alpha}e^{\mathcal{E}/kT} - 1} \tag{11-14}$$

This is the Bose distribution, which specifies the probable number of bosons, of a system in equilibrium at temperature $T$, that will be in a quantum state of energy $\mathcal{E}$.

The same sort of argument can be applied to an equilibrium system of fermions. For these particles we write the thermal equilibrium condition, (11-4), as

$$n_1 R_{1\to2}^{\text{fermion}} = n_2 R_{2\to1}^{\text{fermion}} \tag{11-15}$$

Here $R_{1\to2}^{\text{fermion}}$ is the rate per fermion for transitions from quantum state 1 to quantum state 2, $R_{2\to1}^{\text{fermion}}$ is the same for $2 \to 1$ transitions, and $n_1$ and $n_2$ are the average fermion populations of these states. Because of the exclusion principle, the instantaneous populations of either state can be only zero or one. The populations fluctuate in time, due to the statistical nature of the processes that maintain thermal equilibrium, and they have average values given by $n_1$ and $n_2$. The fermion transition rates can be expressed in terms of the rates for classical particles simply by multiplying the classical rates by the $(1 - n)$ inhibition factor discussed in the middle of Section 11-2. With $n$ being interpreted as the average population of a quantum state, $(1 - n)$ is the average value of the inhibition factor, and this is just what is needed here. As $n$ ranges from ~0 (for a state which almost never contains a fermion) to ~1 (for a state which almost always contains a fermion), the inhibition factor ranges from ~1 (almost no inhibition) to ~0 (almost complete inhibition), in agreement with the exclusion principle. Thus we have

$$R_{1\to2}^{\text{fermion}} = (1 - n_2)\,R_{1\to2} \tag{11-16}$$

and

$$R_{2\to1}^{\text{fermion}} = (1 - n_1)\,R_{2\to1} \tag{11-17}$$

where $R_{1\to2}$ and $R_{2\to1}$ are the rates for a classical particle that does not satisfy the indistinguishability requirements leading to the exclusion principle for fermions. Combining (11-15), (11-16), and (11-17), we obtain

$$n_1(1 - n_2)\,R_{1\to2} = n_2(1 - n_1)\,R_{2\to1}$$

or

$$\frac{n_1(1 - n_2)}{n_2(1 - n_1)} = \frac{R_{2\to1}}{R_{1\to2}} = \frac{e^{-\mathcal{E}_1/kT}}{e^{-\mathcal{E}_2/kT}} \tag{11-18}$$

where we have used (11-8) to evaluate the ratio of the classical transition rates per particle in terms of the Boltzmann probabilities. Equation (11-18) can be expressed as

$$\frac{n_1}{1 - n_1}\,e^{\mathcal{E}_1/kT} = \frac{n_2}{1 - n_2}\,e^{\mathcal{E}_2/kT} \tag{11-19}$$

By the same reasoning that we used previously, we see that both sides of this equation are equal to some function of $T$, which we again write as $e^{-\alpha}$, where $\alpha = \alpha(T)$. Equating the left side to the common value, we have

$$\frac{n_1}{1 - n_1}\,e^{\mathcal{E}_1/kT} = e^{-\alpha}$$

or

$$\frac{n_1}{1 - n_1} = e^{-(\alpha + \mathcal{E}_1/kT)}$$

so

$$n_1 = e^{-(\alpha + \mathcal{E}_1/kT)} - n_1 e^{-(\alpha + \mathcal{E}_1/kT)}$$

or

$$n_1\left[1 + e^{-(\alpha + \mathcal{E}_1/kT)}\right] = e^{-(\alpha + \mathcal{E}_1/kT)}$$

Thus

$$n_1 = \frac{e^{-(\alpha + \mathcal{E}_1/kT)}}{1 + e^{-(\alpha + \mathcal{E}_1/kT)}} = \frac{1}{e^{\alpha}e^{\mathcal{E}_1/kT} + 1}$$

We write this as

$$n(\mathcal{E}) = \frac{1}{e^{\alpha}e^{\mathcal{E}/kT} + 1} \tag{11-20}$$

where we again drop the subscript 1 because the same results are obtained for all quantum states. This is the Fermi distribution which gives the average, or probable, number of fermions, of a system in equilibrium at temperature $T$, to be found in a quantum state of energy $\mathcal{E}$.
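The detailed-balancing argument above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the temperature, the value of $\alpha$, and the two state energies are illustrative assumptions. It evaluates the Bose and Fermi occupation numbers of (11-14) and (11-20) and confirms that the combinations on each side of (11-13) and (11-19) take the same value, $e^{-\alpha}$, for every state energy.

```python
import math

# Boltzmann's constant in eV/°K, as quoted in the book's table of constants.
K_B_EV = 8.617e-5


def n_bose(energy_ev, temp_k, alpha):
    """Bose occupation number, (11-14). Requires alpha + E/kT > 0."""
    return 1.0 / (math.exp(alpha) * math.exp(energy_ev / (K_B_EV * temp_k)) - 1.0)


def n_fermi(energy_ev, temp_k, alpha):
    """Fermi occupation number, (11-20). Always lies between 0 and 1."""
    return 1.0 / (math.exp(alpha) * math.exp(energy_ev / (K_B_EV * temp_k)) + 1.0)


def bose_balance(energy_ev, temp_k, alpha):
    """One side of (11-13): n/(1 + n) * exp(E/kT)."""
    n = n_bose(energy_ev, temp_k, alpha)
    return n / (1.0 + n) * math.exp(energy_ev / (K_B_EV * temp_k))


def fermi_balance(energy_ev, temp_k, alpha):
    """One side of (11-19): n/(1 - n) * exp(E/kT)."""
    n = n_fermi(energy_ev, temp_k, alpha)
    return n / (1.0 - n) * math.exp(energy_ev / (K_B_EV * temp_k))


if __name__ == "__main__":
    temp_k, alpha = 1000.0, 0.5  # hypothetical values chosen for illustration
    for energy_ev in (0.05, 0.20):  # two hypothetical state energies in eV
        print(energy_ev,
              bose_balance(energy_ev, temp_k, alpha),
              fermi_balance(energy_ev, temp_k, alpha),
              math.exp(-alpha))
```

Each printed row should show the same number three times (to rounding), which is the statement that the common value of both sides depends on the temperature only through $\alpha$ and not on the particular state.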
11-4 COMPARISON OF THE DISTRIBUTION FUNCTIONS

Consider first the Boltzmann distribution of (11-6)

$$n(\mathcal{E}) = A e^{-\mathcal{E}/kT}$$

If we set the multiplicative constant $A$ equal to $e^{-\alpha}$, the Boltzmann distribution is

$$n_{\text{Boltz}}(\mathcal{E}) = \frac{1}{e^{\alpha}e^{\mathcal{E}/kT}} \tag{11-21}$$

From (11-14), we know that the Bose distribution is

$$n_{\text{Bose}}(\mathcal{E}) = \frac{1}{e^{\alpha}e^{\mathcal{E}/kT} - 1} \tag{11-22}$$

and (11-20) tells us that the Fermi distribution is

$$n_{\text{Fermi}}(\mathcal{E}) = \frac{1}{e^{\alpha}e^{\mathcal{E}/kT} + 1} \tag{11-23}$$

In these relations, $k$ is Boltzmann's constant and $T$ is the equilibrium temperature of the system. The parameter $\alpha$, for a given temperature and system, is specified by the total number of particles it contains. For instance, at the end of Appendix C we evaluated $A = e^{-\alpha}$ for a special form of the Boltzmann distribution that applies to a system of simple harmonic oscillators where we defined $n_{\text{Boltz}}(\mathcal{E})$ to be a measure of the probability of finding a particular one of them in a state at energy $\mathcal{E}$. The result was $A = 1/kT$. If there we defined $n_{\text{Boltz}}(\mathcal{E})$ in terms of the probability of finding any one of the oscillators in the state, or the probable number in the state, we would obviously have found $A = \mathcal{N}/kT$, where $\mathcal{N}$ is the total number of oscillators in the system. This is essentially the way we define $n_{\text{Boltz}}(\mathcal{E})$ here, since it gives the probable number of classical particles in the state of energy $\mathcal{E}$. In other words, $A$ is a normalization constant whose value for a given $T$ is specified by the total number of particles in the system described by the Boltzmann distribution. So the same is true for the parameter $\alpha$ appearing in that distribution. It is also true that the $\alpha$ appearing in the Bose distribution for a given $T$ is specified by the total number of bosons in the system, and that the distribution gives the probable number of bosons in the state of energy $\mathcal{E}$. The corresponding statements apply as well for the Fermi distribution.

Figure 11-1 The Boltzmann distribution function versus energy $\mathcal{E}$ (eV) for three different values of $T$ and $\alpha$ ($T$ = 1000, 5000, and 10000 °K, with $\alpha$ = −2.84, −0.42, and 0.62). This function is a pure exponential, falling by a factor of $1/e$ with each increase $kT$ in energy. The energy $kT$ is shown for each temperature at the top of the figure. The figure is drawn for a system of particles with the same density as that used in Figure 11-3. Choosing the density fixes $\alpha$ for any temperature $T$.

Figure 11-2 The Bose distribution function versus energy $\mathcal{E}$ (eV) for three different values of $T$, all with $\alpha = 0$. At energies large compared to $kT$ this function approaches the exponential form of the Boltzmann distribution, but at energies small compared to $kT$ it exceeds the Boltzmann values, tending to infinity as the energy goes to zero. The energy $kT$ for each temperature is shown at the top of the figure.

In Figure 11-1 we plot the Boltzmann distribution function versus energy for three different values of $T$ and $\alpha$. Note that this distribution is a pure exponential which falls by a factor of $1/e$ for each increase of $kT$ in the energy $\mathcal{E}$, as we discussed at some length in Chapter 1. In Figure 11-2 we plot the Bose distribution function versus energy for three different values of $T$. We choose $\alpha = 0$ in each case, so that $e^{\alpha} = 1$, a case applicable to the photon gas to be discussed later. Notice that at energies small compared to $kT$ the number of particles per quantum state is greater for the Bose distribution than for the Boltzmann distribution. This is a result of the presence of the $-1$ term in the denominator of the Bose distribution law. At energies large compared to $kT$, however, the distribution approaches the exponential form characteristic of the Boltzmann distribution, for in this range the exponential factor in (11-22) overwhelms the term $-1$. This is the region in which the average number of particles per quantum state is much less than one. In Figure 11-3 we plot the Fermi distribution function versus energy for four different values of $T$ and $\alpha$.
Because the exclusion principle applies here we cannot have more than one particle per quantum state. This accounts for the distinctly different shape of the curves at low energies compared to the other two distributions in which there was no restriction against multiple occupancy of states. If we define the Fermi energy as $\mathcal{E}_F = -\alpha kT$, so that $\alpha = -\mathcal{E}_F/kT$, we can write (11-23) conveniently as

$$n_{\text{Fermi}}(\mathcal{E}) = \frac{1}{e^{(\mathcal{E} - \mathcal{E}_F)/kT} + 1} \tag{11-24}$$

This facilitates interpretation of the distribution function. For example, for states with $\mathcal{E} \ll \mathcal{E}_F$ the exponential term in the above equation is essentially zero at low temperatures and $n_{\text{Fermi}} = 1$. These states contain one fermion. For states with $\mathcal{E} \gg \mathcal{E}_F$, the exponential dominates the denominator at low temperatures and the Fermi distribution approaches the Boltzmann distribution. Note that in this region the average number of particles per quantum state is much less than one. At $\mathcal{E}_F$, the average number of particles per quantum state is exactly one-half because of the way $\mathcal{E}_F$ is defined.

Figure 11-3 The Fermi distribution function versus energy $\mathcal{E}$ (eV) for four different values of $T$ and $\alpha$ (curves a through d, at $T$ = 0, 1000, 5000, and 10000 °K). The exclusion principle sets the limit of one particle per quantum state. The Fermi energy $\mathcal{E}_F$ is shown for each curve at the bottom of the figure, and the energy $kT$ is shown at the top. The drop, occurring in a region of width about $kT$ centered on $\mathcal{E}_F$, becomes more gradual as the temperature increases. At high temperatures and energies, the function approaches the Boltzmann distribution function. The figure is drawn for a material with electron density similar to that of potassium, whose Fermi energy is about 2.1 eV. Choosing the density fixes the Fermi energy and, for any given $T$, fixes $\alpha$ as well.
When $T = 0$, the Fermi distribution gives $n_{\text{Fermi}} = 1$ for all states with energies below $\mathcal{E}_F$ and $n_{\text{Fermi}} = 0$ for all states with energies above $\mathcal{E}_F$. Thus at $T = 0$ the lowest energy states are filled, starting from the bottom and putting one fermion in each successively higher state, until the last fermion in the system goes into the highest energy filled state at $\mathcal{E}_F$. This obviously minimizes the total energy content of the system, as would be expected at absolute zero temperature. Note from Figure 11-3 that for $kT \ll \mathcal{E}_F$, $\mathcal{E}_F$ is at nearly the same energy as it is for $T = 0$. For these relatively low temperatures, the thermal energy of the system has gone into promoting fermions from states of energy somewhat below the zero-temperature $\mathcal{E}_F$ energy to states somewhat above that energy. The population changes are restricted to a region of width about equal to $kT$, since $kT$ is a measure of the thermal energy content per particle of the system. The depopulation below the zero-temperature $\mathcal{E}_F$ energy is quite symmetrical to the population above that energy for very small temperatures, and so $\mathcal{E}_F$, which is always the energy where $n_{\text{Fermi}} = 0.5$, hardly changes energy. For increasing temperatures, $\mathcal{E}_F$ begins to shift downward in energy as this symmetry begins to be lost.

Certain general features of the three distribution functions should be cited. At high energies ($\mathcal{E} \gg kT$) where the probable number of particles per quantum state for the classical distribution is much less than one, the quantum distributions merge with the classical distribution. That is, $n_{\text{Fermi}} \simeq n_{\text{Boltz}} \simeq n_{\text{Bose}}$, if $n_{\text{Boltz}} \ll 1$. At low energies ($\mathcal{E} \ll kT$) where this number is comparable to or larger than one, the quantum distributions fall on opposite sides of the classical distribution. That is, $n_{\text{Fermi}} < n_{\text{Boltz}} < n_{\text{Bose}}$, if $n_{\text{Boltz}} \gtrsim 1$. These features are most easily seen in Figure 11-4, which plots the three distribution functions against the energy ratio $\mathcal{E}/kT$ for the same value of $\alpha$.
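The general features just cited can be sketched numerically. The code below is an illustrative sketch under stated assumptions rather than anything given in the text: the function names are hypothetical, the value $\alpha = -0.1$ is borrowed from Figure 11-4, the Fermi energy of 2.1 eV is the potassium value quoted for Figure 11-3, and the sample energies are arbitrary. It evaluates (11-21) through (11-23) at a common $\alpha$ and also evaluates the Fermi distribution in the form (11-24).

```python
import math


def occupations(alpha, e_over_kt):
    """Return (n_Fermi, n_Boltz, n_Bose) from (11-21)-(11-23) at the same alpha.

    The Bose value is only meaningful when alpha + E/kT > 0.
    """
    x = math.exp(alpha + e_over_kt)  # e^alpha * e^(E/kT)
    return 1.0 / (x + 1.0), 1.0 / x, 1.0 / (x - 1.0)


def n_fermi(energy_ev, fermi_energy_ev, kt_ev):
    """Fermi distribution written in terms of the Fermi energy, (11-24)."""
    return 1.0 / (math.exp((energy_ev - fermi_energy_ev) / kt_ev) + 1.0)


if __name__ == "__main__":
    alpha = -0.1  # one of the two values used in Figure 11-4

    # Low energy: the quantum distributions straddle the classical one.
    print("E/kT = 0.5:", occupations(alpha, 0.5))

    # High energy: all three distributions nearly coincide.
    print("E/kT = 8.0:", occupations(alpha, 8.0))

    # Fermi form: exactly one-half at the Fermi energy, and close to a
    # step function when kT is much smaller than the Fermi energy.
    e_f = 2.1    # eV, the potassium value quoted for Figure 11-3
    kt = 0.001   # eV, a hypothetical low temperature
    print(n_fermi(e_f, e_f, kt), n_fermi(2.0, e_f, kt), n_fermi(2.2, e_f, kt))
```

At the low energy the three numbers should come out in the order Fermi, Boltzmann, Bose, and at the high energy they should agree to within a fraction of a percent, matching the behavior described for Figure 11-4.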
These features are just what would be expected from our considerations of Section 11-2. When $n_{\text{Boltz}} \ll 1$ the effects of the indistinguishability of two identical particles will have very little chance to manifest themselves because there is very little chance anyway that two particles will be in the same quantum state. So we expect the quantum distributions to join with the classical distribution for $n_{\text{Boltz}} \ll 1$. When the classical distribution predicts an appreciable probability of there being more than one particle per quantum state, i.e., when $n_{\text{Boltz}} \gtrsim 1$, then this probability will be inhibited for fermions and enhanced for bosons, and we expect the quantum distributions to diverge from the classical distribution in the manner indicated in Figure 11-4.

Figure 11-4 The Boltzmann, Bose, and Fermi distribution functions plotted versus $\mathcal{E}/kT$ for two different values of $\alpha$, −0.1 and −1.0. It should be noted that the dashed curves, if moved to the left $(-0.1) - (-1.0) = 0.9$ units, would coincide exactly with the solid curves. This observation may provide some further insight into the physical interpretation of $\alpha$.

Table 11-1 summarizes the most important attributes of the three distribution functions.
Table 11-1 Comparison of the Three Distribution Functions

| | Boltzmann | Bose | Fermi |
|---|---|---|---|
| Basic characteristic | Applies to distinguishable particles | Applies to indistinguishable particles not obeying the exclusion principle | Applies to indistinguishable particles obeying the exclusion principle |
| Example of system | Distinguishable particles, or approximation to quantum distributions at $\mathcal{E} \gg kT$ | Bosons: identical particles of zero or integral spin | Fermions: identical particles of odd half integral spin |
| Eigenfunctions of particles | No symmetry requirements | Symmetric under exchange of particle labels | Antisymmetric under exchange of particle labels |
| Distribution function | $A e^{-\mathcal{E}/kT}$ | $\dfrac{1}{e^{\alpha}e^{\mathcal{E}/kT} - 1}$ | $\dfrac{1}{e^{(\mathcal{E} - \mathcal{E}_F)/kT} + 1}$ |
| Behavior of distribution function versus $\mathcal{E}/kT$ | Exponential | For $\mathcal{E} \gg kT$, exponential. For $\mathcal{E} \ll kT$, lies above Boltzmann | For $\mathcal{E} \gg kT$, exponential where $\mathcal{E} \gg \mathcal{E}_F$. If $\mathcal{E}_F \gg kT$, decreases abruptly near $\mathcal{E}_F$ |
| Specific problems applied to in this chapter | Gases at essentially any temperature; modes of vibration in an isothermal enclosure | Photon gas (cavity radiation); phonon gas (heat capacity); liquid helium | Electron gas (electronic specific heat, contact potential, thermionic emission) |

11-5 THE SPECIFIC HEAT OF A CRYSTALLINE SOLID

In this section we present the first of several examples of applications of the Boltzmann distribution to quantum systems. The specific heat of a solid was found in the early (room temperature) experiments of Dulong and Petit to be very similar for all materials, about 6 cal/mole-°K. That is, the amount of heat energy required per molecule to raise the temperature of a solid by a given amount seemed to be about the same regardless of the chemical element of which it is composed. At the time this result could be understood on the basis of the following classical statistical ideas. There are Avogadro's number, $N_0$, atoms in a mole.
Each atom is regarded as executing simple harmonic oscillations about its lattice site in three dimensions, so one mole of the solid has $3N_0$ degrees of freedom. Each degree of freedom is assigned an average total energy $kT$, according to the classical law of equipartition of energy, so that

$$E = 3N_0 kT = 3RT$$

where $R$ is the universal gas constant. Then, the heat capacity at constant volume is

$$c_v = \frac{dE}{dT} = 3R \simeq 6\ \text{cal/mole-°K}$$

This is called the law of Dulong and Petit.

Later experiments showed conclusively, however, that as we lower the temperature the molar heat capacities vary. In fact, the specific heats of all solids tend to zero as the temperature decreases, and near absolute zero the specific heat varies as $T^3$. It was Einstein who saw that the $kT$ factor, from classical equipartition, had to be replaced by a factor that takes into account the energy quantization of a simple harmonic oscillator, much as Planck had done in the cavity radiation problem. He represented a solid body as a collection of $3N_0$ simple harmonic oscillators of the same fundamental frequency and replaced $kT$ with the result $h\nu/(e^{h\nu/kT} - 1)$ of (1-26), which was obtained by combining Planck's energy quantization and the Boltzmann distribution. He thus found

$$E = \frac{3N_0 h\nu}{e^{h\nu/kT} - 1} = 3RT\,\frac{h\nu/kT}{e^{h\nu/kT} - 1} \tag{11-25}$$

From this he calculated the specific heat as $c_v = dE/dT$ and found qualitative agreement with experiment at reasonably low temperatures. Although all substances do have curves of $c_v$ versus $T$ of the same form, we must choose a different characteristic frequency $\nu$ for each substance to match the experimental results. Furthermore, at very low temperatures the Einstein formula does not contain the $T^3$ temperature dependence required by experiment.

Peter Debye, in a general and simple way, found the theoretical approach that successfully yields the exact experimental results. Earlier treatments dealt with the individual atoms in a solid as if they vibrated independently of one another.
Actually, of course, the atoms are strongly coupled together. Rather than $N_0$ atoms vibrating in three dimensions independently at the same frequency, we should deal with a system of $3N_0$ coupled vibrations. Such a dynamical problem would not only be difficult to handle directly but, because the atoms do interact strongly, we could not use the statistics of noninteracting particles. Debye pointed out, however, that a superposition of elastic modes of longitudinal vibration of the solid as a whole (each mode independent of the others, like the independent modes of two coupled pendulums) gives the same individual atom motions as the actual coupling. The thermal vibrations of the atoms of a solid are equivalent to a large combination of standing elastic waves of a great range of frequencies. The atomic vibrations of a crystal lattice appear as macroscopic elastic vibrations of the whole crystal. The problem remains to determine the frequency spectrum of the elastic modes of longitudinal vibration. Thereafter each mode can be treated as an independent harmonic oscillator, whose quantized eigenvalues we already know. Then by summing we can obtain the total energy of the system.

Before carrying out the calculation, we should point out that the Boltzmann distribution is applicable here. The individual atoms, in the original formulation of the problem, may be treated as distinguishable particles; the atoms are distinguished from one another by their location in space at the lattice sites of the crystal. However, the assumption of the earlier formulations that these particles do not interact is clearly wrong. In the Debye model, the atoms are replaced by elastic modes of vibration of the solid as a whole. These are independent, noninteracting elements: independent harmonic oscillators. These elements, furthermore, are distinguishable from one another, for each mode of vibration (standing wave) is characterized by a different set of numbers $(n_x, n_y, n_z)$ which correspond essentially to the different number of nodes of each mode of vibration. No two modes of vibration can have identical sets of these numbers.

In order to get the frequency spectrum of the modes of vibration, Debye assumed that the solid behaved like a continuous, elastic, three-dimensional body, the allowed modes corresponding to longitudinal vibrations with nodes at the boundaries. This is identical in principle to the calculation of the modes of vibration of electromagnetic waves in a cavity, considered in Section 1-3. Thus the number of modes with frequencies between $\nu$ and $\nu + d\nu$ is

$$N(\nu)\,d\nu = \frac{4\pi V}{v^3}\,\nu^2\,d\nu \tag{11-26}$$

where $v$ is the speed of elastic waves and $V$ is the volume of the solid. This is identical to (1-12), except that $v$ replaces $c$, and that a factor of 2 is removed because, with longitudinal rather than transverse waves, we do not have two states of polarization. Debye further assumed that the number of modes is limited to $3N_0$ per mole, the number of translational degrees of freedom of $N_0$ atoms, to account for the actual atomic nature of a crystalline solid. The allowed modes varied in frequency then from zero to some maximum $\nu_m$. To get $\nu_m$ Debye set

$$\int_0^{\nu_m} N(\nu)\,d\nu = 3N_0$$

obtaining

$$\frac{4\pi V}{3v^3}\,\nu_m^3 = 3N_0 \tag{11-27a}$$

or

$$\nu_m = v\left(\frac{9N_0}{4\pi V}\right)^{1/3} \tag{11-27b}$$

If now each mode is treated as a one-dimensional oscillator of average energy $\bar{\mathcal{E}}$ given by Planck's quantization and the Boltzmann distribution,

$$\bar{\mathcal{E}} = \frac{h\nu}{e^{h\nu/kT} - 1}$$

the total energy of the solid is

$$E = \int_0^{\nu_m} \frac{h\nu}{e^{h\nu/kT} - 1}\,\frac{4\pi V}{v^3}\,\nu^2\,d\nu \tag{11-28}$$

This expression can be put in a more compact form if we change to a dimensionless variable of integration $x = h\nu/kT$, so that $x_m = h\nu_m/kT$. Then

$$E = \frac{4\pi V}{v^3}\int_0^{\nu_m} \frac{h\nu^3\,d\nu}{e^{h\nu/kT} - 1} = \frac{4\pi V}{v^3}\,h\left(\frac{kT}{h}\right)^4 \int_0^{x_m} \frac{x^3\,dx}{e^x - 1}$$

and, after substituting $4\pi V/v^3 = 9N_0/\nu_m^3$ and consolidating symbols, we obtain

$$E = 3RT\,\frac{3}{x_m^3}\int_0^{x_m} \frac{x^3}{e^x - 1}\,dx \tag{11-29}$$

which is Debye's formula.
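The limiting behavior of (11-29) can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the use of Simpson's rule, and the sample values of $x_m$ are all illustrative assumptions. It evaluates $E/RT$ as a function of $x_m = h\nu_m/kT$. For small $x_m$ (high temperature) the result should approach the classical value 3, and for large $x_m$ (low temperature) it should approach $3\pi^4/5x_m^3$, since the integral then tends to $\pi^4/15$.

```python
import math

def debye_integral(x_m, steps=2000):
    """Simpson's-rule estimate of the integral of x^3/(e^x - 1) from 0 to x_m.

    `steps` must be even; the default is an arbitrary illustrative choice.
    """
    def integrand(x):
        # The integrand behaves like x^2 near the origin, so its limit at 0 is 0.
        return 0.0 if x == 0.0 else x**3 / math.expm1(x)

    h = x_m / steps
    total = integrand(0.0) + integrand(x_m)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(i * h)
    return total * h / 3.0

def energy_over_RT(x_m):
    """E/(RT) from (11-29): 3 * (3 / x_m^3) * integral."""
    return 9.0 * debye_integral(x_m) / x_m**3

if __name__ == "__main__":
    for x_m in (0.01, 1.0, 10.0, 50.0):
        print(f"x_m = {x_m:6.2f}   E/(RT) = {energy_over_RT(x_m):.6f}")
```

The high-temperature rows recover the Dulong-Petit energy $E = 3RT$, while the low-temperature rows fall off as $1/x_m^3$, that is, $E \propto T^4$.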
Because $x$ is a dimensionless quantity, $h\nu_m/k$ has the dimensions of a temperature. It is often called the Debye characteristic temperature, $\Theta$, of the substance involved. Hence, with $x_m = \Theta/T$, (11-29) becomes

$$E = 9R\,\frac{T^4}{\Theta^3}\int_0^{\Theta/T} \frac{x^3}{e^x - 1}\,dx \tag{11-30}$$

and Debye's formula for the specific heat of a solid is

$$c_v = \frac{dE}{dT} = 9R\left[4\left(\frac{T}{\Theta}\right)^3 \int_0^{\Theta/T} \frac{x^3}{e^x - 1}\,dx - \frac{\Theta}{T}\,\frac{1}{e^{\Theta/T} - 1}\right] \tag{11-31}$$

Debye's theory involves a parameter $\Theta$ which, because of its connection to the elastic properties of the solid, can be determined independently of specific heat measurements, as we shall see in Example 11-2. Using these independently determined values in the theory, we obtain the excellent agreement with experimental measurements of specific heat illustrated in Figure 11-5. In particular, the theory agrees with the observed $T^3$ law at very low temperatures.

Figure 11-5 The measured specific heat at constant volume, as a function of temperature (plotted versus $T/\Theta$), for several materials. Horizontal line I represents the Dulong-Petit law, and curve II represents the predictions of the Debye theory.

Example 11-2. (a) Show how $\Theta$ can be obtained directly from the elastic properties of a solid.

Because $\Theta = h\nu_m/k$ we must find $\nu_m$ first. From (11-27b), $\nu_m = v(9N_0/4\pi V)^{1/3}$, so we have $\Theta = (hv/k)(9N_0/4\pi V)^{1/3}$. All quantities are measurable experimentally, so that $\Theta$ can be found from measurements of $V$ (the molar volume) and $v$ (the speed of elastic waves). Actually, since both longitudinal (compressional) and transverse (shear) waves can be transmitted by the solid, and since their speeds are different, we replace $v$ by a more general expression. In particular, if $v_l$ represents the speed of longitudinal waves and $v_t$ the speed of transverse waves in the solid, we require

$$3N_0 = \frac{4\pi V}{3v_l^3}\,\nu_m^3 + \frac{4\pi V}{3v_t^3}\,2\nu_m^3$$

instead of (11-27a), where allowance is now made for the two polarization states of transverse waves as well. Then we use in (11-27b)

$$\frac{1}{v^3} = \frac{1}{v_l^3} + \frac{2}{v_t^3}$$

From the measurements of $v_l$ and $v_t$, $v$ is computed. Some calculated results for $\Theta$ and $\nu_m$ are:

Iron: $\Theta = 465$°K, $\nu_m = 9.7 \times 10^{12}$ sec$^{-1}$
Aluminum: $\Theta = 395$°K, $\nu_m = 8.3 \times 10^{12}$ sec$^{-1}$
Silver: $\Theta = 210$°K, $\nu_m = 4.4 \times 10^{12}$ sec$^{-1}$

(b) Show that as $T \to 0$, $c_v \to \text{const} \times T^3$ in Debye's (11-31).

We have

$$c_v = 9R\left[4\left(\frac{T}{\Theta}\right)^3 \int_0^{\Theta/T} \frac{x^3}{e^x - 1}\,dx - \frac{\Theta}{T}\,\frac{1}{e^{\Theta/T} - 1}\right]$$

As $T$ decreases, $\Theta/T$ becomes very large. Indeed, as $T \to 0$, $\Theta/T \to \infty$, and the last term goes to zero. Hence

$$c_v \to 9R \cdot 4\left(\frac{T}{\Theta}\right)^3 \int_0^{\infty} \frac{x^3}{e^x - 1}\,dx$$

which, because $\int_0^{\infty} x^3/(e^x - 1)\,dx = \pi^4/15$, yields

$$c_v = \frac{12\pi^4}{5}\,\frac{R}{\Theta^3}\,T^3$$

the required $T^3$ law for very low temperatures.

(c) Show how $\Theta$ can be obtained from specific heat measurements.

If $T = \Theta$, then from (11-31) we have

$$c_v = 9R\left[4\int_0^1 \frac{x^3}{e^x - 1}\,dx - \frac{1}{e - 1}\right] = 2.856R = 5.67\ \text{cal/mole-°K}$$

so that the Debye temperature $\Theta$ can be defined as that temperature at which $c_v = 2.856R$. For comparison with part (a), the values so obtained are 455°K for iron, 420°K for aluminum, and 215°K for silver.

It is remarkable that so simple a model as Debye's yields such excellent results. The true frequency spectrum of the modes of vibration should depend on the actual lattice structure of the crystalline solid and may differ from the results of Debye's continuum model. Such differences as have been found between experiment and Debye's predictions can indeed be accounted for by expected differences between the actual spectrum and Debye's, so that the experimental facts of the specific heats of solids seem to be completely understood.

Here we have considered the contributions to the specific heat of a solid from the lattice vibrations alone. In Section 11-11 we shall consider the contribution made by free electrons to the specific heat of a solid conductor.

11-6 THE BOLTZMANN DISTRIBUTION AS AN APPROXIMATION TO QUANTUM DISTRIBUTIONS

We have seen that, where the average number of particles per quantum state is much less than one, the quantum distributions merge with the classical distribution. Particularly useful in this region is the Boltzmann factor

$$\frac{n_{\rm Boltz}(\mathcal{E}_2)}{n_{\rm Boltz}(\mathcal{E}_1)} = e^{-(\mathcal{E}_2 - \mathcal{E}_1)/kT} \tag{11-32}$$

giving the relative number of particles per quantum state at two different energies $\mathcal{E}_2$ and $\mathcal{E}_1$, for a system in equilibrium at temperature $T$. We have already made use of this approximation in Example 4-7. It can be applied in all quantum systems at energies more than several $kT$ above the ground state, where the states are sparsely occupied so that $n_{\rm Boltz}$ is very much less than one. For example, when we consider thermal collisions of atoms in a gas in equilibrium at temperature $T$, the excited states of the atoms are normally sparsely populated. Hence we can obtain the relative equilibrium populations of the various excited states as a function of temperature by using the Boltzmann factor. Since the intensities of the spectral lines depend on these populations, we can then predict the variation of spectral intensities with temperature. More often the procedure is reversed; that is, starting with the known relative intensities we can determine the temperature of the source, such as the star considered in Example 4-7. The same idea is applicable to molecular spectra, as we shall see in Chapter 12.
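The two directions in which the Boltzmann factor (11-32) is used, predicting a population ratio from a temperature and inferring a temperature from a measured ratio, can be sketched in a few lines. This is an illustration only: the function names are invented here, the 2.0 eV gap and 6000°K temperature are hypothetical sample values rather than data from the text, and degeneracies of the levels are ignored because (11-32) refers to populations per quantum state.

```python
import math

K_EV = 8.617e-5  # Boltzmann's constant in eV/°K

def population_ratio(delta_e_ev, temperature_k):
    """n(E2)/n(E1) from the Boltzmann factor, with E2 - E1 given in eV."""
    return math.exp(-delta_e_ev / (K_EV * temperature_k))

def temperature_from_ratio(delta_e_ev, ratio):
    """Invert the Boltzmann factor: the temperature that gives n(E2)/n(E1) = ratio."""
    return -delta_e_ev / (K_EV * math.log(ratio))

if __name__ == "__main__":
    gap_ev = 2.0          # hypothetical energy difference, for illustration only
    temperature = 6000.0  # hypothetical source temperature in °K
    ratio = population_ratio(gap_ev, temperature)
    print("n2/n1 =", ratio)
    print("recovered T =", temperature_from_ratio(gap_ev, ratio))
```

Because the ratio depends exponentially on $1/T$, a modest error in a measured population ratio produces only a logarithmic error in the inferred temperature.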
The Maxwell distribution of speeds of gas molecules moving freely inside a box is validly deduced from the Boltzmann distribution because $n_{\rm Boltz}$ for all the free particle states is very small under the conditions usually existing in nature for ordinary gases.

Example 11-3. The technique of nuclear magnetic resonance is used to obtain information about internal magnetic fields in solids. It is more sensitive than chemical techniques, for example, in identifying magnetic impurities in a crystal. Principally, however, it enables us to use the nucleus as a probe to get information about solids, much as radioactive tracers are used in biological systems. For nuclei of nonzero spin the degeneracy of the energy levels with respect to the orientation of the nuclear spin is lifted by the magnetic field. (This is analogous to the Zeeman effect.) A resonance absorption of electromagnetic power occurs when photons bombarding the solid have the proper energy to excite transitions between these levels. The strength of the absorption depends upon the difference in population of the levels involved. To illustrate the sensitivity of the technique, use the Boltzmann factor to compute the difference between the populations $n_1$ and $n_2$ of two levels at room temperature, if the resonant absorption is detected at a frequency of 10 MHz.

The Boltzmann factor is

$$\frac{n_{\rm Boltz}(\mathcal{E}_2)}{n_{\rm Boltz}(\mathcal{E}_1)} = \frac{n_2}{n_1} = e^{-(\mathcal{E}_2 - \mathcal{E}_1)/kT}$$

We have $T = 300$°K, $\mathcal{E}_2 - \mathcal{E}_1 = h\nu$, and $\nu = 10^7$ sec$^{-1}$. Hence

$$\frac{n_2}{n_1} = e^{-h\nu/kT} = e^{-(6.6 \times 10^{-34}\ \text{joule-sec} \times 10^7\ \text{sec}^{-1})/(1.4 \times 10^{-23}\ \text{joule-°K}^{-1} \times 300\text{°K})} = e^{-1.6 \times 10^{-6}} \simeq 1 - 1.6 \times 10^{-6}$$

Therefore

$$1 - \frac{n_2}{n_1} = 1.6 \times 10^{-6} \qquad \text{or} \qquad \frac{n_1 - n_2}{n_1} = 1.6 \times 10^{-6}$$

So a difference in populations of less than two parts in a million is detectable, a result which reveals the high sensitivity of the NMR technique. The Boltzmann factor is applicable here since the population is spread over several close levels, so both $n_1$ and $n_2$ are small.
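The arithmetic of Example 11-3 can be reproduced directly. The sketch below is illustrative (the function name is an assumption); it uses the four-figure constants from the table at the front of the book, so the result differs slightly from the two-figure arithmetic in the example. It uses `math.expm1` so that the tiny difference $1 - e^{-x}$ is not lost to rounding.

```python
import math

H = 6.626e-34  # Planck's constant, joule-sec
K = 1.381e-23  # Boltzmann's constant, joule/°K

def fractional_population_difference(frequency_hz, temperature_k):
    """(n1 - n2)/n1 = 1 - exp(-h*nu/kT) for two levels separated by h*nu."""
    x = H * frequency_hz / (K * temperature_k)
    return -math.expm1(-x)  # equals 1 - exp(-x), accurate for very small x

if __name__ == "__main__":
    diff = fractional_population_difference(1.0e7, 300.0)
    print(f"(n1 - n2)/n1 = {diff:.3e}")
```

For $h\nu \ll kT$ the difference is very nearly $h\nu/kT$ itself, so in this regime it scales linearly with the resonance frequency and inversely with the temperature.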
11-7 THE LASER

We saw in the previous section that the relative number of particles per quantum state at two different energies for a system in thermal equilibrium at temperature $T$ is given, in certain circumstances, by the Boltzmann factor, $e^{-(\mathcal{E}_2 - \mathcal{E}_1)/kT}$. We use this result now to explain the behavior of a very important device called a laser, an acronym for "light amplification by stimulated emission of radiation." A maser is the corresponding system operating in the microwave region of the electromagnetic spectrum.

Figure 11-6 Illustrating (a) the spontaneous emission process, (b) the stimulated absorption process, and (c) the stimulated emission process, for two energy states of an atom. Each process is shown before and after the transition.

Consider transitions between two energy states of an atom in the presence of an electromagnetic field. In Figure 11-6 we illustrate schematically the three transition processes, namely, spontaneous emission, stimulated absorption, and stimulated emission. In the spontaneous emission process, the atom is initially in the upper state of energy $\mathcal{E}_2$ and decays to the lower state of energy $\mathcal{E}_1$ by the emission of a photon of frequency $\nu = (\mathcal{E}_2 - \mathcal{E}_1)/h$. (The mean lifetime of an atom in most excited states is about $10^{-8}$ sec. But some decays may be much slower, the excited states then being called metastable; the mean lifetime in such cases may be as long as $10^{-3}$ sec.) In the stimulated absorption process, an incident photon of frequency $\nu$, from an electromagnetic field applied to the atom, stimulates the atom to make a transition from the lower to the higher energy state, the photon being absorbed by the atom.
In the stimulated emission process, an incident photon of frequency $\nu$ stimulates the atom to make a transition from the higher to the lower energy state; the atom is left in this lower state at the emergence of two photons of the same frequency, the incident one and the emitted one.

The processes of stimulated absorption and emission of electromagnetic energy in quantized systems can be regarded as analogous to the stimulated absorption or emission of mechanical energy in classical resonating systems upon which a periodic mechanical force of the same frequency as the natural frequency of the system is impressed. In such a mechanical system, energy can be put in or taken out depending on the relative phases of motion of the system and the impressed force. The spontaneous emission process, however, is a strictly quantum effect. As discussed in Section 8-7, quantum electrodynamics shows that there are fluctuations in the electromagnetic field. Because of the zero-point energy of the electromagnetic field, these fluctuations occur even when classically there is no field. It is these fluctuations that induce the so-called spontaneous emission of radiation from atoms in excited states. In all three processes, then, we deal with the interaction of radiation with the atom.

We wish to show now how these processes are related quantitatively. Let the spectral energy density of the electromagnetic radiation applied to the atoms be $\rho(\nu)$. Consider that there are $n_1$ atoms in energy state $\mathcal{E}_1$ and $n_2$ in state $\mathcal{E}_2$, where $\mathcal{E}_2 > \mathcal{E}_1$. The probability per atom per unit time, or transition rate per atom, that an atom in state 1 will undergo a transition to state 2 (stimulated absorption) clearly will be proportional to the energy density $\rho(\nu)$ of the applied radiation at frequency $\nu = (\mathcal{E}_2 - \mathcal{E}_1)/h$. In Section 8-7 we argued that the transition rate for stimulated emission is also proportional to $\rho(\nu)$.
But as we explained in Section 8-7, the transition rate for spontaneous emission does not contain $\rho(\nu)$ because that process does not involve the applied electromagnetic field. The transition rates also depend on the detailed properties of the atomic states 1 and 2 through the electric dipole moment matrix element of (8-42). Hence, the probability per unit time for a transition from state 1 to state 2 can be written as

$$R_{1 \to 2} = B_{12}\,\rho(\nu) \tag{11-33}$$

in which $B_{12}$ is a coefficient that includes the dependence on properties of the states 1 and 2. The total probability per unit time that an atom in state 2 will undergo a transition to state 1 is the sum of two terms, the probability per unit time $A_{21}$ of spontaneous emission and the probability per unit time $B_{21}\rho(\nu)$ of stimulated emission. Again, $A_{21}$ and $B_{21}$ are coefficients whose values depend on the properties of states 1 and 2, through the appropriate matrix elements. Hence

$$R_{2 \to 1} = A_{21} + B_{21}\,\rho(\nu) \tag{11-34}$$

Note again that spontaneous emission occurs at a rate independent of $\rho(\nu)$, whereas stimulated emission occurs at a rate proportional to $\rho(\nu)$.

If now we consider that the $n_1$ atoms in state 1 and the $n_2$ atoms in state 2 of the system are in thermal equilibrium at temperature $T$ with the radiation field of energy density $\rho(\nu)$, then the total absorption rate for the system $n_1 R_{1 \to 2}$ and the total emission rate $n_2 R_{2 \to 1}$ must be equal, as in (11-4).
That is

$$n_1 R_{1 \to 2} = n_2 R_{2 \to 1} \tag{11-35}$$

Thus we have

$$n_1 B_{12}\,\rho(\nu) = n_2\left[A_{21} + B_{21}\,\rho(\nu)\right]$$

If we solve this equation for $\rho(\nu)$ we obtain

$$\rho(\nu) = \frac{A_{21}/B_{21}}{\dfrac{n_1}{n_2}\,\dfrac{B_{12}}{B_{21}} - 1} \tag{11-36}$$

We now assume we can use the Boltzmann factor, (11-32), with $h\nu = \mathcal{E}_2 - \mathcal{E}_1$, to obtain

$$\frac{n_1}{n_2} = e^{(\mathcal{E}_2 - \mathcal{E}_1)/kT} = e^{h\nu/kT}$$

so that (11-36) becomes

$$\rho(\nu) = \frac{A_{21}/B_{21}}{\dfrac{B_{12}}{B_{21}}\,e^{h\nu/kT} - 1} \tag{11-37}$$

This equation, giving the spectral energy density of radiation of frequency $\nu$ that is in thermal equilibrium at temperature $T$ with atoms of energies $\mathcal{E}_1$ and $\mathcal{E}_2$, must be consistent with Planck's blackbody spectrum, (1-27)

$$\rho_T(\nu) = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT} - 1}$$

Hence, we conclude that

$$\frac{B_{12}}{B_{21}} = 1 \tag{11-38}$$

and

$$\frac{A_{21}}{B_{21}} = \frac{8\pi h\nu^3}{c^3} \tag{11-39}$$

These results were first obtained by Einstein in 1917, and therefore the coefficients are called the Einstein A and B coefficients. Note that the argument does not give us values of the coefficients, but only their ratios. However, if we compute the spontaneous emission coefficient $A_{21}$ from quantum mechanics, using the techniques of Section 8-7, we then can obtain the other coefficients from these formulas.

There is much of physical interest here. For one thing, we find from (11-38) that the coefficients of stimulated emission and stimulated absorption are equal. For another, we see from (11-39) that the ratio of the spontaneous emission coefficient to the stimulated emission coefficient varies with frequency as $\nu^3$. This means, for example, that the larger the energy difference between the two states, the more likely spontaneous emission is compared to stimulated emission. Equation (8-43) shows that the $\nu^3$ is present in this ratio because $A_{21}$ itself is proportional to $\nu^3$.
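As a consistency check on (11-38) and (11-39), the sketch below (illustrative only; the function names, the arbitrary value $B_{21} = 1$, and the sample frequencies are assumptions, not values from the text) builds the total absorption and emission rates from the Planck density and Boltzmann populations and confirms that they balance, as (11-35) requires.

```python
import math

H = 6.626e-34  # Planck's constant, joule-sec
K = 1.381e-23  # Boltzmann's constant, joule/°K
C = 2.998e8    # speed of light, m/sec

def planck_density(nu, temperature_k):
    """Planck's spectral energy density, (1-27)."""
    return (8 * math.pi * H * nu**3 / C**3) / math.expm1(H * nu / (K * temperature_k))

def equilibrium_rates(nu, temperature_k, b21=1.0):
    """Return (absorption rate, emission rate) for n1 = 1 atom in the lower state.

    b21 is an arbitrary illustrative value; only the coefficient ratios
    (11-38) and (11-39) are fixed by the argument in the text.
    """
    b12 = b21                                    # (11-38)
    a21 = (8 * math.pi * H * nu**3 / C**3) * b21  # (11-39)
    rho = planck_density(nu, temperature_k)
    n1 = 1.0
    n2 = n1 * math.exp(-H * nu / (K * temperature_k))  # Boltzmann factor (11-32)
    absorption = n1 * b12 * rho                  # n1 * R_{1->2}, from (11-33)
    emission = n2 * (a21 + b21 * rho)            # n2 * R_{2->1}, from (11-34)
    return absorption, emission

if __name__ == "__main__":
    for nu in (1.0e10, 4.3e14):  # a microwave and an optical frequency (sample values)
        absorbed, emitted = equilibrium_rates(nu, 300.0)
        print(f"nu = {nu:.2e}  absorption = {absorbed:.6e}  emission = {emitted:.6e}")
```

The two rates agree at both frequencies even though the split between spontaneous and stimulated emission is very different in the two cases.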
Still another result is that we can obtain the ratio of the probability $A_{21}$ of spontaneous emission to the probability $B_{21}\rho(\nu)$ of stimulated emission, namely

$$\frac{A_{21}}{B_{21}\,\rho(\nu)} = e^{h\nu/kT} - 1 \tag{11-40}$$

This shows that, for atoms in thermal equilibrium with the radiation, spontaneous emission is far more probable than stimulated emission if $h\nu \gg kT$. Since this condition applies to electronic transitions in both atoms and molecules, stimulated emission can be ignored in such transitions. Stimulated emission can become significant, however, if $h\nu \simeq kT$, and it may be dominant if $h\nu \ll kT$, a condition that applies at room temperature to atomic transitions in the microwave region of the spectrum where $\nu$ is relatively small.

We are now in a position to understand the concept behind lasers and masers. In general, the ratio of the emission rate to the absorption rate can be written as $n_2 R_{2 \to 1}/n_1 R_{1 \to 2}$, or

$$\frac{\text{rate of emission}}{\text{rate of absorption}} = \frac{n_2 A_{21} + n_2 B_{21}\,\rho(\nu)}{n_1 B_{12}\,\rho(\nu)} = \left[1 + \frac{A_{21}}{B_{21}\,\rho(\nu)}\right]\frac{n_2}{n_1} \tag{11-41}$$

If we have energy states such that $\mathcal{E}_2 - \mathcal{E}_1 \ll kT$, or $h\nu \ll kT$, then (11-40) shows that we can ignore the second term in the bracket as very much smaller than one, and obtain

$$\frac{\text{rate of emission}}{\text{rate of absorption}} \simeq \frac{n_2}{n_1} \tag{11-42}$$

This result is general in the sense that we have not assumed an equilibrium situation. In situations of thermal equilibrium, where the Boltzmann factor applies, we expect $n_2 < n_1$. But in nonequilibrium situations any ratio is possible in principle. If now we have a means of inverting the normal population of states so that $n_2 > n_1$, then the emission rate would exceed the absorption rate. This means that the applied radiation of frequency $\nu = (\mathcal{E}_2 - \mathcal{E}_1)/h$ will be amplified in intensity by the interaction process, more such radiation emerging than entering. Of course, such a process will reduce the population of the upper state until equilibrium is reestablished.
In order to sustain the process, therefore, we must use some method to maintain the population inversion of the states. Devices that do this are called lasers or masers, depending upon the portion of the electromagnetic spectrum in which they operate. Energy must be injected into the system, most commonly by a method described later called optical pumping, and the output is an intense, coherent, monochromatic beam of radiation, as we now explain.

In the ordinary atomic light sources there is a random relationship between the phases of the photons emitted by different atoms, so that the resulting radiation is incoherent. The reason is that there is no correlation in the times that the atoms make their transitions. In laser light sources, on the other hand, atoms radiate in phase with the inducing radiation because their charge oscillations are in phase with that radiation. Since in a laser the inducing radiation is a coherent parallel beam formed by reflection between the ends of a resonant cell, the emitted photons are all in phase and act coherently. The resulting intensity, which is the square of the constructively combined amplitudes, is correspondingly high. The states between which transitions are made are an upper metastable state, whose relatively long lifetime allows it to be highly populated, and the lower ground state of infinitely long lifetime. From the uncertainty relation $\Delta E\,\Delta t \simeq \hbar$, with $\Delta t$ equal to the long lifetime of the upper state, we conclude that the uncertainty in the energy difference of the states is small and the emitted transition frequency is sharp, giving a highly monochromatic beam. In practical devices the beam is also unidirectional, the coherence property making it possible to obtain essentially perfect collimation, or focusing. This further enhances the concentration of energy density.
Some indication of the concentration of energy in a laser beam is given by the fact that a laser with less power than a typical light bulb can burn a hole in a metal plate.

In the solid state laser that operates with a ruby crystal, some Al atoms in the Al$_2$O$_3$ molecules are replaced by Cr atoms. These "impurity" chromium atoms account for the laser action. In Figure 11-7 we show a simplified version of the appropriate energy-level scheme of chromium. (The uppermost level is really a multiplet.) The level of energy $\mathcal{E}_1$ is the ground state and the level of energy $\mathcal{E}_3$ is the unstable upper state with a short lifetime ($\simeq 10^{-8}$ sec), the energy difference $\mathcal{E}_3 - \mathcal{E}_1$ corresponding to a wavelength of about 5500 Å. Level $\mathcal{E}_2$ is an intermediate excited state which is metastable, its lifetime against spontaneous decay being about $3 \times 10^{-3}$ sec. If the chromium atoms are in thermal equilibrium, the population numbers of the states are such that $n_3 < n_2 < n_1$. By pumping in radiation of wavelength 5500 Å, however, we stimulate absorption of incoming photons by Cr atoms in the ground state, thereby raising the population number of energy state $\mathcal{E}_3$ and depleting energy state $\mathcal{E}_1$ of occupants. Spontaneous emission, bringing atoms from state 3 to state 2, then enhances the occupancy of state 2, which is relatively long-lived. The result of this optical pumping is to decrease $n_1$ and increase $n_2$, so that $n_2 > n_1$ and population inversion exists. Now, when an atom does make a transition from state 2 to state 1, the emitted photon of wavelength 6943 Å will stimulate further transitions. Stimulated emission will dominate stimulated absorption (because $n_2 > n_1$) and the output of photons of wavelength 6943 Å is much enhanced. We obtain an intensified coherent monochromatic beam.

In practice, the ruby laser is a cylindrical rod with parallel, optically flat reflecting ends, one of which is only partly reflecting as shown in Figure 11-7.
The emitted photons that do not travel along the axis escape through the sides before they are able to cause much stimulated emission. But those photons that move exactly in the direction of the axis are reflected several times, and they are capable of stimulating emission repeatedly. Thus the number of photons is built up rapidly, those escaping from the partially reflecting end giving a unidirectional beam of great intensity and sharply defined wavelength.

Figure 11-7 Top: The relevant energy levels of chromium atoms in a ruby laser (short-lived state 3, metastable state 2, and ground state 1, with pumping radiation at 5500 Å and stimulated emission at 6943 Å). State 3 is very broad (large $\Delta E$) because it is short lived (small $\Delta t$). State 2 is very sharp (small $\Delta E$) because it is long lived (large $\Delta t$). Optical pumping raises the atom from ground state 1 to excited state 3, the latter's breadth facilitating the process. Then spontaneous decay occurs to state 2, the energy released usually going into mechanical energy in the ruby crystal rather than into photon radiation. Finally, state 2 decays to the ground state, either through spontaneous emission or through stimulated emission due to photons from other such transitions. Since state 2 is very sharply defined and the ground state is infinitely sharply defined, this radiation will be very monochromatic. Bottom: A schematic of the ruby laser, showing the coiled optical pumping lamp, the escape of photons not moving axially, the buildup of repeatedly reflected axially moving photons which stimulate further emission, and the escape of a fraction of the axial photons through the partially transparent mirror at one end to form the external beam.

Note that this is reminiscent of the conclusion of Section 11-2 that $n$ bosons already in a quantum state will enhance the probability of one more joining them by a factor of $(1 + n)$.
The conclusion is applicable to the photons in the quantum states of the cylindrical rod, since photons are bosons. It is possible to develop the basic theory of the laser by applying the Bose distribution to the quantum states of the photons, instead of by applying the Boltzmann distribution to the quantum states of the atoms as we have done here. But the treatments are very closely related (as they should be since they lead to the same results) because the energy density $\rho(\nu)$ of (11-34) is proportional to the number $n$ of photons in a state at energy $h\nu$, so that equation is very similar to the enhancement equations, (11-10) or (11-11), that we used in deriving the Bose distribution. Furthermore, (11-35) is identical to the thermal equilibrium condition of (11-4) that was also used in the Bose distribution derivation.

Generally speaking, a laser is a device in which a material is prepared so that the higher of two energy levels is more highly populated than the lower energy level, the material being enclosed in an appropriate resonator of sharp response. The system produces coherent radiation at those frequencies common to the resonator and the difference in energy of the levels. There is now a wide variety of lasers (gas lasers, liquid lasers, and solid state lasers) covering various regions of the electromagnetic spectrum. The intense coherent nature of the radiation they provide has led to increasing application of lasers in fields such as radio astronomy, microwave spectroscopy, photography, biophysics, and communications.

11-8 THE PHOTON GAS

We begin in this section to study applications of the Bose distribution. The first will be a derivation of Planck's blackbody cavity radiation spectrum, in which the photons in thermal equilibrium at temperature $T$ with the walls of the cavity are treated as a gas that is governed by the Bose distribution.
According to (11-22), that distribution is

$$n(\mathcal{E}) = \frac{1}{e^{\alpha} e^{\mathcal{E}/kT} - 1}$$

The discussion following (11-22) indicated that the value of the parameter $\alpha$ is specified by the total number of particles the system governed by the distribution contains. But for the case at hand the total number of particles in the system is not constant. A photon can be completely absorbed when it strikes a wall of the cavity, or the hot wall may at some other time emit a new photon. Thus for a system of photons the distribution cannot contain the term $e^{\alpha}$. That is, the Bose distribution for photons (or other bosons that can be created or destroyed within the system) must have the form

$$n(\mathcal{E}) = \frac{1}{e^{\mathcal{E}/kT} - 1} \tag{11-43}$$

The number of particles in the system has indeed specified the value of $\alpha$; because that number varies it is necessary that $\alpha = 0$ so that $e^{\alpha} = 1$. Confirmation of the validity of this argument will be obtained soon.

Let $N(\mathcal{E})$ represent the number of quantum states per unit energy interval at energy $\mathcal{E}$ (called the density of states) for photons in the cavity. Then $N(\mathcal{E})\,d\mathcal{E}$ is the number of quantum states for photons in the cavity within the energy interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$. Since $n(\mathcal{E})$ is the probable number of photons per quantum state, the product $n(\mathcal{E})N(\mathcal{E})\,d\mathcal{E}$ gives the number of photons in the energy interval. However, $N(\mathcal{E})\,d\mathcal{E}$ for radiation confined to a cavity has already been evaluated by geometrical arguments in Example 1-3, except that the language used there is different from that which we are currently using; there we spoke of the radiation as waves and here we speak of it as particles (photons). We found there that

$$N(\nu)\,d\nu = \frac{8\pi V}{c^3}\,\nu^2\,d\nu$$

where $V$ is the volume of the cavity and $\nu$ is the frequency of a wave contained in the cavity.
Using the familiar relation $\mathcal{E} = h\nu$ to evaluate the energy of the associated photon, we find, after multiplying and dividing the term $\nu^2\,d\nu$ by $h^3$, that

$$N(\mathcal{E})\,d\mathcal{E} = \frac{8\pi V}{c^3 h^3}\,\mathcal{E}^2\,d\mathcal{E} \qquad (11\text{-}44)$$

Taking the product of this expression and $n(\mathcal{E})$, multiplying by the energy $\mathcal{E}$ carried by each photon, and then dividing by the volume $V$ of the cavity, we have

$$\rho_T(\mathcal{E})\,d\mathcal{E} = \frac{\mathcal{E}\,n(\mathcal{E})\,N(\mathcal{E})\,d\mathcal{E}}{V} = \frac{8\pi\,\mathcal{E}^3\,d\mathcal{E}}{c^3 h^3 \left(e^{\mathcal{E}/kT} - 1\right)}$$

where $\rho_T(\mathcal{E})\,d\mathcal{E}$ is the energy per unit volume in the energy interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$. Planck's spectrum follows at once by using the relation $\mathcal{E} = h\nu$ to convert from $\mathcal{E}$ to $\nu$. Thus

$$\rho_T(\nu)\,d\nu = \frac{8\pi\nu^2}{c^3}\,\frac{h\nu}{e^{h\nu/kT} - 1}\,d\nu \qquad (11\text{-}45)$$

Equation (11-45) is identical to (1-27), obtained in Chapter 1 and verified there by comparison with experiment. Note that this agreement confirms the validity of the Bose distribution for photons, (11-43). In the Planck derivation the radiation is a set of waves confined to the cavity. Each of these standing waves is a mode of vibration that is distinguishable from all the others, just as for the lattice vibration modes in the Debye model, so it is valid to apply the Boltzmann distribution to them. In the present derivation the cavity radiation is a set of indistinguishable particles (photons), to which the Bose distribution must be applied.

11-9 THE PHONON GAS

We were able to use the wave-particle duality for electromagnetic radiation to derive the thermally excited distribution of radiation in a cavity either on a wave picture or a particle picture. Similarly, the thermally excited distribution of elastic vibrations in a solid can be deduced by applying a wave-particle duality for acoustic radiation. Just as photons are the quanta of electromagnetic radiation, so phonons are the quanta of acoustic radiation. Just as photons are emitted and absorbed by vibrations of the atoms in a cavity wall, so phonons are emitted and absorbed by vibrating atoms at the lattice points in the solid. The sources of each type of radiation are quantized so that the energy gain or loss is discrete; the discrete energy transferred through the system is $h\nu$, where $\nu$ is the frequency of the acoustic vibration for phonons and of the electromagnetic vibration for photons. Just as the number of photons is not fixed or conserved, so the number of phonons is not fixed or conserved. The Bose distribution with $\alpha = 0$, i.e., (11-43), applies to phonons as well as to photons.

There are differences, of course, between photons and phonons. For example, the photon propagates through vacuum whereas the phonon propagates through a crystal lattice. This leads to different energy-momentum relations, a matter we return to in a subsequent chapter.

It should be clear that the Debye specific heat formula can be deduced on the phonon picture from the Bose distribution in a way analogous to the photon deduction of the Planck spectrum formula using the Bose distribution. That is, the wave-particle duality for acoustic radiation is used just as before we used the wave-particle duality for electromagnetic radiation. The phonon calculation will not be reproduced here because it is completely analogous to the photon calculation and leads to no new results. The solid contains a gas of phonons just as the cavity contained a gas of photons.

11-10 BOSE CONDENSATION AND LIQUID HELIUM

Here we sketch an application of the Bose distribution to an ideal gas in order to compare quantum and classical gas behavior. As a practical application we shall then consider the remarkable properties of liquid helium. The general form of the Bose distribution is

$$n(\mathcal{E}) = \frac{1}{e^{\alpha} e^{\mathcal{E}/kT} - 1} \qquad (11\text{-}46)$$

To apply this to bosons whose total number $\mathcal{N}$ in a system remains fixed, like helium atoms, we must first determine the parameter $\alpha$. This is done by setting

$$\mathcal{N} = \int_0^{\infty} n(\mathcal{E})\,N(\mathcal{E})\,d\mathcal{E}$$

where $N(\mathcal{E})\,d\mathcal{E}$ is the number of quantum states of the system in the energy interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$, and $n(\mathcal{E})$ is the number of bosons per quantum state, so that the integral is just the total number $\mathcal{N}$. Using (11-46), we have

$$\mathcal{N} = \int_0^{\infty} \frac{N(\mathcal{E})\,d\mathcal{E}}{e^{\alpha} e^{\mathcal{E}/kT} - 1} \qquad (11\text{-}47)$$

To proceed we must determine for an ideal gas the number of states in the energy interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$, which is the product of the density of states $N(\mathcal{E})$ and the size $d\mathcal{E}$ of the energy interval. Consider the gas particles to be in a cubical box of side $a$. The potential energy for a particle in such a three-dimensional box is that of a three-dimensional infinite square well. The Schroedinger equation for a one-dimensional infinite square well was solved in Section 6-8, giving allowed energies $\mathcal{E}_n = (h^2/8ma^2)\,n^2$. By a simple extension of the calculation we find the allowed energies $\mathcal{E}$ for a three-dimensional well to be

$$\mathcal{E} = \frac{h^2}{8ma^2}\left(n_x^2 + n_y^2 + n_z^2\right) \qquad (11\text{-}48)$$

in which the quantum numbers $n_x$, $n_y$, $n_z$ are positive integers. The number of states in an energy interval can be obtained by plotting, in a space formed by axes $n_x$, $n_y$, $n_z$, the allowed states (which are points where $n_x$, $n_y$, $n_z$ take on positive integral values) and counting them. We have done this, in a different context, for the calculation of Example 1-3. There we defined $r = \sqrt{n_x^2 + n_y^2 + n_z^2}$, and we found in (1-15) that the number of states for $r$ lying between $r$ and $r + dr$ is

$$N(r)\,dr = \frac{\pi r^2\,dr}{2}$$

The same is true here. We convert this into the desired form, $N(\mathcal{E})\,d\mathcal{E}$, by using (11-48) to write

$$\mathcal{E} = \frac{h^2}{8ma^2}\,r^2$$

and then taking this equation, and its differential, to evaluate

$$\frac{\pi r^2\,dr}{2} = \frac{\pi}{4}\left(\frac{8ma^2}{h^2}\right)^{3/2} \mathcal{E}^{1/2}\,d\mathcal{E} = \frac{4\pi a^3}{h^3}\,(2m^3)^{1/2}\,\mathcal{E}^{1/2}\,d\mathcal{E}$$

So the number of states for $\mathcal{E}$ lying between $\mathcal{E}$ and $\mathcal{E} + d\mathcal{E}$ is

$$N(\mathcal{E})\,d\mathcal{E} = \frac{4\pi V}{h^3}\,(2m^3)^{1/2}\,\mathcal{E}^{1/2}\,d\mathcal{E} \qquad (11\text{-}49)$$

where $V = a^3$, the volume of the box.
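The state counting that leads to (11-49) can be checked numerically. The following is a minimal sketch, not part of the original text: it counts the positive-integer triples $(n_x, n_y, n_z)$ with $n_x^2 + n_y^2 + n_z^2 \le R^2$ and compares the count with $\pi R^3/6$, which is the integral of $\pi r^2/2$ from 0 to $R$. The function names and the choice $R = 50$ are illustrative assumptions.

```python
import math

def count_states(R):
    """Count positive-integer triples (nx, ny, nz) with nx^2 + ny^2 + nz^2 <= R^2."""
    count = 0
    R2 = R * R
    for nx in range(1, R + 1):
        for ny in range(1, R + 1):
            rest = R2 - nx * nx - ny * ny
            if rest < 1:
                break  # larger ny can only make `rest` smaller
            # number of integers nz >= 1 with nz^2 <= rest
            count += math.isqrt(rest)
    return count

def octant_volume(R):
    """Integral of (pi r^2 / 2) dr from 0 to R: one eighth of a sphere of radius R."""
    return math.pi * R**3 / 6

R = 50  # illustrative radius in quantum-number space
ratio = count_states(R) / octant_volume(R)
print(f"exact count / continuum estimate = {ratio:.4f}")
```

The ratio falls slightly below 1 because points lying on the coordinate planes are excluded from the count; the shortfall shrinks as $R$ grows, which is why the continuum estimate is adequate for a macroscopic box.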
If now we combine this result with (11-47) and carry out the integration we obtain

$$\mathcal{N} = \frac{(2\pi mkT)^{3/2}}{h^3}\,V e^{-\alpha}\left(1 + \frac{1}{2^{3/2}}\,e^{-\alpha} + \frac{1}{3^{3/2}}\,e^{-2\alpha} + \cdots\right)$$

To simplify the appearance of this equation, let $e^{-\alpha} = A$ so that we can write

$$\mathcal{N} = \frac{(2\pi mkT)^{3/2}}{h^3}\,V A\left(1 + \frac{1}{2^{3/2}}\,A + \frac{1}{3^{3/2}}\,A^2 + \cdots\right) \qquad (11\text{-}50)$$

For large mass $m$ and high temperature $T$, $A$ must be very small since $\mathcal{N}$ is fixed. In these circumstances, terms beyond the first power in $A$ can be dropped. But large $m$ and high $T$ should be the classical region. Indeed, we find that the first term gives the classical Boltzmann result

$$\mathcal{N} = \frac{(2\pi mkT)^{3/2}\,V}{h^3}\,A \qquad \text{or} \qquad A = \frac{\mathcal{N}h^3}{(2\pi mkT)^{3/2}\,V} = e^{-\alpha} \qquad (11\text{-}51)$$

Note that $A = e^{-\alpha}$ is proportional to $\mathcal{N}$, as in the Boltzmann result for a system of classical oscillators discussed after (11-23). Also note that here we conclude that, since $\mathcal{N}$ is fixed, $\alpha$ must be very large (as $A$ is very small), in contrast to our conclusion that $\alpha$ is zero for a system of bosons in which $\mathcal{N}$ varies.

If we now compute the total energy $E$ of the ideal gas from

$$E = \int_0^{\infty} \mathcal{E}\,n(\mathcal{E})\,N(\mathcal{E})\,d\mathcal{E}$$

we obtain

$$E = \frac{(2\pi mkT)^{3/2}}{h^3}\,V\,\frac{3}{2}\,kT\,A\left(1 + \frac{1}{2^{5/2}}\,A + \frac{1}{3^{5/2}}\,A^2 + \cdots\right) \qquad (11\text{-}52)$$

Once again the classical result follows for very small values of $A$. Neglecting terms beyond the first power in $A$, and using (11-51), we have $E = (3/2)\mathcal{N}kT$. This corresponds to an average energy per particle $E/\mathcal{N}$ equal to $(3/2)kT$, which is the classical equipartition of energy result for three-dimensional translational motion. The general Bose result for the average energy per particle, obtained by dividing (11-52) by (11-50), is, including terms up to $A^2$,

$$\bar{\mathcal{E}} = \frac{E}{\mathcal{N}} = \frac{3}{2}\,kT\left[1 - \frac{1}{2^{5/2}}\,\frac{\mathcal{N}h^3}{V(2\pi mkT)^{3/2}}\right] \qquad (11\text{-}53)$$

The term beyond 1 in the bracketed expression of (11-53) represents the deviation of the Bose gas from the classical gas. This is sometimes called the degeneracy effect. (This degeneracy effect, or gas degeneration, is not related to the degeneracy that describes different quantum states having the same energy.) Equation (11-53), which neglects higher order terms, pertains to the case of weak degeneracy.
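As a hedged numerical illustration of the weak-degeneracy limit, the sketch below evaluates the ratio of the series in (11-52) to the series in (11-50) and compares it with the bracket of (11-53), using $A$ in place of $\mathcal{N}h^3/V(2\pi mkT)^{3/2}$ (the two agree to first order, by (11-51)). The function names, the truncation at 200 terms, and the sample values of $A$ are assumptions made for illustration only.

```python
def bose_sum(A, power, terms=200):
    """Partial sum of A^j / j^power for j = 1..terms (converges quickly for A < 1)."""
    return sum(A**j / j**power for j in range(1, terms + 1))

def energy_ratio_series(A):
    """E / ((3/2) N k T) from the full series in (11-52) and (11-50)."""
    return bose_sum(A, 2.5) / bose_sum(A, 1.5)

def energy_ratio_weak(A):
    """First-order bracket of (11-53), with A standing in for the degeneracy term."""
    return 1.0 - A / 2**2.5

for A in (0.001, 0.01, 0.1):
    print(A, energy_ratio_series(A), energy_ratio_weak(A))
```

For small $A$ the two expressions differ only at order $A^2$, and both lie below 1, in line with the statement that the Bose gas has a lower average energy per particle than the classical gas.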
Note that the degeneracy term is negative, so that the average particle energy is less for a Bose gas than for a classical gas. This corresponds to previous results in which we found a greater probability of two particles occupying the same energy state for the Bose distribution than for the Boltzmann distribution, the lower energy states being relatively fuller in the Bose gas than in the classical gas as a consequence. Physically, this manifests itself, for example, as a lower gas pressure (lower average momentum) at the same temperature for a Bose gas than for a classical gas.

Example 11-4. Whenever the mean interparticle distance is comparable to or smaller than the de Broglie wavelength assigned to particles on the basis of their temperature, we should expect to observe wave effects, that is, quantum effects, in the system of particles. Show that this criterion leads to the requirement that the degeneracy term $\mathcal{N}h^3/V(2\pi mkT)^{3/2}$ not be negligible compared to 1 if deviations from classical behavior are to be detected.

The de Broglie wavelength of a particle is $\lambda = h/p$. In a gas in equilibrium at temperature $T$ the mean kinetic energy is $K = (3/2)kT$, so that $p = \sqrt{2mK} = \sqrt{3mkT}$. Hence

$$\lambda = \frac{h}{(3mkT)^{1/2}}$$

If the volume of gas is $V$ and there are $\mathcal{N}$ atoms of gas, the volume per particle $V/\mathcal{N}$ can be set equal to $d^3$, where $d$ is the mean interatomic separation. Hence

$$d = \left(\frac{V}{\mathcal{N}}\right)^{1/3}$$

Now, if $\lambda \ge d$ we expect wave effects to be important. This requires

$$\frac{h}{(3mkT)^{1/2}} \ge \left(\frac{V}{\mathcal{N}}\right)^{1/3}$$

or, cubing each side,

$$\frac{h^3}{(3mkT)^{3/2}} \ge \frac{V}{\mathcal{N}}$$

which is the same as

$$\frac{\mathcal{N}h^3}{V(3mkT)^{3/2}} \ge 1$$

Since $(3/2\pi)^{3/2} \simeq 1/3$, the quantity $\mathcal{N}h^3/V(2\pi mkT)^{3/2}$ should exceed about 1/3, and so the term beyond 1 in the bracketed part of (11-53) should exceed about 1/16 to meet our criterion.

Under what circumstances might we detect the degeneracy effect experimentally?
The degeneracy term is negligible in practice for most gases, having a value of about $10^{-5}$, so that the Boltzmann distribution applies almost universally to them. Note that the degeneracy term, $\mathcal{N}h^3/V(2\pi mkT)^{3/2}$, becomes more important the smaller the mass $m$, the lower the temperature $T$, and the higher the density $\mathcal{N}/V$. The smallest mass gases obeying the Bose distribution (zero or integral spin angular momentum) are H$_2$ and He. If we prepare such a gas to be at high density and low temperature we bring it near its condensation point. For this reason, and another to be mentioned shortly, the degeneracy effect is sometimes called the Bose condensation. For H$_2$ the degeneracy term at its normal condensation point is less than 1/100, whereas for He near its normal condensation point (4.2°K) the degeneracy term is about 1/7. Hence, we should get observable effects more easily for helium. The theory would be approximate in this case, for at such high densities the behavior is like that of a real gas of interacting particles rather than an ideal gas of noninteracting particles. Indeed, it is in the liquid, or condensed, phase that we observe the most striking nonclassical effects in the behavior of helium. Let us now describe these effects.

Ordinary helium gas is composed almost wholly of neutral atoms of the isotope He$^4$. The spin angular momentum of such an atom is zero, so that the Bose distribution must be used to treat the behavior of this gas. At normal atmospheric pressure helium gas condenses to a liquid at 4.18°K. It remains a liquid, i.e., it does not freeze into a solid, down to the absolute zero of temperature if it is cooled at a pressure equal to its own vapor pressure. (To obtain solid helium it is necessary to pressurize the liquid, about 26 atm of pressure being needed near absolute zero.) If, by pumping off the vapor, the temperature of liquid helium is reduced to 2.18°K, a dramatic change in its properties is observed.
The temperature 2.18°K is called the $\lambda$ point because the shape of the graph of specific heat versus temperature resembles the letter $\lambda$, with the anomaly at 2.18°K. Liquid helium is called He I when it is above this temperature and He II when below. He I is essentially a classical fluid, its behavior not being unusual, but He II contains a superfluid component which causes it to show spectacular large scale quantum effects, including the following:

1. As the temperature of liquid helium is lowered by evaporation and the vapor is pumped away, the liquid boils in the usual manner. But as the $\lambda$ point is reached and passed the boiling suddenly stops throughout the liquid. Though evaporation continues, and the temperature and vapor pressure fall, the liquid is completely calm (see Figure 11-8). This is explained by the fact that heat can be conducted out of the liquid with practically no resistance, since the heat conductivity is measured to increase by a factor of about one million below the $\lambda$ point.

2. We can determine the viscosity of liquid helium by measuring its rate of flow through a fine capillary tube. At the $\lambda$ point, the measured value of the viscosity drops by a factor of about one million.

3. Most unusual and spectacular is the ability of liquid helium, below the $\lambda$ point, to creep as a thin film along the walls of its container, as shown in Figure 11-9. The speed of this ordered mass motion may be 30 cm or more per second. The effect involves helium first adsorbing on the entire surface of the cold container to form a thin film. The film then acts like a siphon through which the liquid flows with almost no viscosity.

Figure 11-8 The $\lambda$-point transition in liquid helium. As liquid helium is cooled from its normal boiling point at 4.2°K by evaporation, with the use of a vacuum pump, it boils normally with small bubbles. As it undergoes the phase transition from He I to He II at the $\lambda$ point, 2.18°K, it suddenly and briefly boils up violently (see top and middle pictures), and equally suddenly stops boiling altogether (see bottom picture). Below this transition point liquid helium cannot boil, even when pumping, evaporation, and cooling continue. (Courtesy of A. Leitner, Rensselaer Polytechnic Institute)

Figure 11-9 The creeping motion of a film of liquid He$^4$ below the transition temperature demonstrates the superfluidity of He II. The film behavior, suggestive of liquid flow through a siphon, is shown schematically for liquid levels in the container (a) below and (b) above the level of the liquid helium reservoir. In (c) is a photograph of a glass vessel partially filled with liquid He II and suspended by threads above the surface of the same liquid seen at the bottom of the picture. He II creeps up along the inside wall, over the rim, and down along the outside wall as a thin film, collecting as a drop on the bottom. When this drop falls another will form, and so on, until the vessel is empty. (Courtesy of A. Leitner, Rensselaer Polytechnic Institute)

K. Mendelssohn has written of the film flow as follows: "If the beaker is withdrawn from the bath, the level will drop until it has reached the level of the bath. If the beaker is pulled out completely, the level will still drop, and one can see little drops of helium forming at the bottom of the beaker and falling back into the bath. This is the sort of thing that makes one look twice and rub his eyes and wonder whether it is quite true. I remember well the night when we first observed this film transfer. It was well after dinner, and we looked around the building and finally found two nuclear physicists still at work. When they, too, saw the drops, we were happier."

All of the properties of He II indicate that it has a very high degree of order.
For instance, the almost complete absence of viscosity means that, when flowing, He II does not develop the small scale turbulences that cause the frictional energy loss responsible for the viscosity of ordinary fluids. The order is imposed by the $(1 + n)$ enhancement factor that we often find when studying the low-energy behavior of a system of bosons. When the temperature becomes low enough to allow it, all the helium atoms in a system tend to condense into the same lowest energy quantum state. This is the Bose condensation. The superfluid component, whose concentration rapidly approaches 100% as the temperature decreases below the $\lambda$ point, is composed of those atoms which are in that quantum state. To the extent that all the atoms do get into the same microscopic state, it becomes the state of the entire macroscopic system, and the system can only behave in a completely ordered way in which the action of any atom is correlated with the action of all the others. This tendency is extremely pronounced because the factor $(1 + n)$ has an extremely large value if $n$ is anything like the total number of atoms in a beaker of liquid helium.

11-11 THE FREE ELECTRON GAS

In this and the following section we apply the Fermi distribution to quantum systems. In a manner analogous to that used for a boson gas, we could deduce the behavior of an ideal gas of fermions. To the same degree of approximation we would find, for example, that the average energy per particle is

$$\bar{\mathcal{E}} = \frac{E}{\mathcal{N}} = \frac{3}{2}\,kT\left[1 + \frac{1}{2^{5/2}}\,\frac{\mathcal{N}h^3}{V(2\pi mkT)^{3/2}}\right] \qquad (11\text{-}54)$$

which is the Fermi result corresponding to the Bose result of (11-53). The degeneracy term here (second term in brackets) is positive, so that the average particle energy is greater for a Fermi gas than for a classical gas.
This corresponds to a lower probability (strictly zero) of finding two particles in the same quantum state for the Fermi distribution than for the Boltzmann distribution, the lower energy states being relatively fuller in the classical gas than in the Fermi gas as a result. Physically, this manifests itself as a higher gas pressure (higher average momentum), at the same temperature, for a Fermi gas than for a classical gas. Notice again how the Bose and Fermi results fall on opposite sides of the classical result.

It is natural to ask for an example of a Fermi gas whose degeneracy effect we can detect. In Chapter 15 we shall find an example in the neutrons, and the protons, confined to a nucleus. Helium gas containing only the isotope He$^3$ also obeys the Fermi distribution, as do all particles with odd half-integral spin angular momentum, and it remains a gas without condensing to a low enough temperature that the degeneracy term of (11-54) is detectable. This isotope is rare and more difficult to get in large quantities, but the behavior of He$^3$ atoms has been shown to be markedly different from that of He$^4$ atoms in ways predicted by the different distribution functions applicable to them. For example, the vapor pressure of liquid He$^3$ at a given temperature is much higher than that of liquid He$^4$. Indeed, this is the basis for a practical method of cooling to 0.02°K.

It would be quite easy to detect the effect of the degeneracy term for fermions, however, if we could obtain a gas of electrons. The degeneracy term can be written as $nh^3/(2\pi mkT)^{3/2}$, in which $n = \mathcal{N}/V$ is the number density of the particles. Notice that a small mass $m$ and a high density $n$ can increase the importance of this term, as can a low temperature $T$. Because the electronic mass is several thousand times smaller than that of atoms, the degeneracy effect for electrons should actually be detectable even at high temperatures. For electrons in a metal the number density $n$ of conduction electrons is also very high, so that conduction electrons in a metal show quantum degeneracy effects.

The question remains whether we can regard such electrons, even approximately, as a gas of free electrons, i.e., an ideal gas. In a crystalline solid most of the atomic electrons are bound to the nuclei at the lattice points, but if it is a metallic conductor, electrons from outer subshells of the atoms are relatively free to move through the solid. These are the conduction electrons. Because their mutual repulsion is cancelled, on the average, by the attractions of the atomic cores, we may regard the conduction electrons as approximately free particles and can treat them to good approximation as an ideal electron gas (see Figure 6-24). Indeed, we can regard the interior of the solid as a region of approximately constant potential for these electrons, with the metal boundaries acting as high potential walls. The electron then behaves as a particle in a box whose quantum states we already know (see Section 6-8).

To get the number $N(\mathcal{E})\,d\mathcal{E}$ of states in an energy interval $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$ we simply count the number of standing waves, each representing a definite state of the motion, in that energy interval. We have made this calculation before for an ideal gas in a box, with results described in (11-49). The results here are the same, after taking into account the two possible spin orientations for an electron having a given space eigenfunction. That is

$$N(\mathcal{E})\,d\mathcal{E} = \frac{8\pi V(2m^3)^{1/2}}{h^3}\,\mathcal{E}^{1/2}\,d\mathcal{E} \qquad (11\text{-}55)$$

Multiplying by $n(\mathcal{E})$, the probable number of electrons per quantum state, we obtain

$$n(\mathcal{E})\,N(\mathcal{E})\,d\mathcal{E} = \frac{8\pi V(2m^3)^{1/2}}{h^3}\,\frac{\mathcal{E}^{1/2}\,d\mathcal{E}}{e^{(\mathcal{E} - \mathcal{E}_F)/kT} + 1} \qquad (11\text{-}56)$$

This is the electron gas energy distribution of conduction electrons in a metal.

If now we assume that the temperature is very low (strictly speaking, $T = 0$), we know that all the quantum states up to the Fermi energy $\mathcal{E}_F$ are occupied and that none of the higher states are occupied. In that case the total number of free electrons equals the total number of distinct states up to energy $\mathcal{E}_F$, and we have a way of calculating the Fermi energy. That is

$$\mathcal{N} = \int_0^{\mathcal{E}_F} N(\mathcal{E})\,d\mathcal{E} = \frac{8\pi V(2m^3)^{1/2}}{h^3}\int_0^{\mathcal{E}_F} \mathcal{E}^{1/2}\,d\mathcal{E} = \frac{16\pi V(2m^3)^{1/2}}{3h^3}\,\mathcal{E}_F^{3/2}$$

or

$$\mathcal{E}_F = \frac{h^2}{8m}\left(\frac{3\mathcal{N}}{\pi V}\right)^{2/3} \qquad (11\text{-}57)$$

For temperatures such that $kT \ll \mathcal{E}_F$ this result is an excellent approximation. For ordinary metals we need temperatures of the order of several thousand degrees before the approximation breaks down.

Example 11-5. Consider silver in the metallic state, with one free (conduction) electron per atom. (a) Calculate the Fermi energy from (11-57).

The density of silver is 10.5 g/cm$^3$ and its atomic weight is 108. Hence

$$n = \frac{\mathcal{N}}{V} = \frac{6.02 \times 10^{23}\ \text{atom/mole} \times 10.5\ \text{g/cm}^3}{108\ \text{g/mole}} \times 1\ \text{free electron/atom} = 5.9 \times 10^{22}\ \text{free electron/cm}^3 = 5.9 \times 10^{28}/\text{m}^3$$

Therefore

$$\mathcal{E}_F = \frac{h^2}{8m}\left(\frac{3n}{\pi}\right)^{2/3} = \frac{(6.6 \times 10^{-34}\ \text{joule-sec})^2}{8 \times 9.1 \times 10^{-31}\ \text{kg}}\left(\frac{3 \times 5.9 \times 10^{28}/\text{m}^3}{\pi}\right)^{2/3} = 8.8 \times 10^{-19}\ \text{joule} = 5.5\ \text{eV}$$

(b) Calculate the degeneracy term for the conduction electrons in metallic silver at 300°K.

We have

$$\frac{nh^3}{(2\pi mkT)^{3/2}} = \frac{5.9 \times 10^{28}/\text{m}^3 \times (6.6 \times 10^{-34}\ \text{joule-sec})^3}{(2\pi \times 9.1 \times 10^{-31}\ \text{kg} \times 1.38 \times 10^{-23}\ \text{joule/°K} \times 300\text{°K})^{3/2}} \simeq 4700$$

so that the second term in the brackets of (11-54) has the value

$$\frac{1}{2^{5/2}}\,\frac{nh^3}{(2\pi mkT)^{3/2}} \simeq 820$$

Hence, the degeneracy term is extremely large and completely overwhelms the leading (classical) term of (11-54). The electron gas is said to be a completely degenerate Fermi gas; that is, it behaves as if $T \simeq 0$°K with the electrons in the configuration of lowest energy. Such a gas shows quantum behavior (i.e., is nonclassical) up to the highest attainable metallic temperature, the electron gas in silver remaining almost completely degenerate until the temperature is of the order of $10^5$ °K. At those temperatures and higher the degeneracy term becomes small compared to one.
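The arithmetic of Example 11-5 is easy to reproduce. The sketch below is illustrative rather than part of the original text: it evaluates (11-57) and the degeneracy term for silver using the constants quoted in the table at the front of the book. The function and variable names are assumptions, and the results differ slightly from the rounded values in the example because more significant figures are carried.

```python
import math

# Constants as quoted in the book's table of useful constants (SI units)
H = 6.626e-34        # Planck's constant, joule-sec
M_E = 9.109e-31      # electron rest mass, kg
K_B = 1.381e-23      # Boltzmann's constant, joule/K
EV = 1.602e-19       # joule per eV
N_AVOGADRO = 6.023e23

def number_density(density_g_cm3, atomic_weight, electrons_per_atom=1):
    """Free-electron number density in m^-3."""
    per_cm3 = N_AVOGADRO * density_g_cm3 / atomic_weight * electrons_per_atom
    return per_cm3 * 1e6  # convert cm^-3 to m^-3

def fermi_energy(n):
    """Fermi energy in joules from (11-57), with n = N/V."""
    return H**2 / (8 * M_E) * (3 * n / math.pi) ** (2 / 3)

def degeneracy_term(n, T, m=M_E):
    """n h^3 / (2 pi m k T)^(3/2), dimensionless."""
    return n * H**3 / (2 * math.pi * m * K_B * T) ** 1.5

n_silver = number_density(10.5, 108)           # one conduction electron per atom
ef_silver_ev = fermi_energy(n_silver) / EV
term_300 = degeneracy_term(n_silver, 300.0)
print(f"n = {n_silver:.2e} m^-3, E_F = {ef_silver_ev:.2f} eV")
print(f"degeneracy term = {term_300:.0f}, bracket term = {term_300 / 2**2.5:.0f}")
```

Calling `degeneracy_term` with a much larger temperature shows how slowly the electron gas approaches classical behavior, since the term falls off only as $T^{-3/2}$.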
We can now understand a result that classical physics was unable to explain, namely the experimental observation that the conduction electrons do not contribute to the specific heat of metals at ordinary temperatures. According to the classical view the free electrons take part in the thermal motion in a metal, each free electron having a mean energy $(3/2)kT$. Therefore, the specific heat for a metal should be not simply $3R$, due to the vibrations of the atoms at the lattice sites, but $(3 + 3/2)R$ instead, in which the $(3/2)R$ term is the contribution per mole of the electron gas. The origin of this term is seen by noting that if $E = (3/2)kTN_0 = (3/2)RT$, then $c_v = dE/dT = (3/2)R$, where $N_0$ is Avogadro's number. According to the Fermi model of an electron gas, the electrons do not exhibit this classical behavior until the temperature reaches about $10^5$ °K. That is, there is no equipartition of energy between electrons and lattice contributions, the electron gas in this sense not being anywhere near thermal equilibrium with the atoms of the metal confining it. As the temperature is raised, the Fermi distribution of electrons among available energy levels is affected only slightly at the high-energy end (see Figure 11-3), so that the average electron energy is hardly changed at all. This means that at ordinary temperatures the electron gas does not contribute to the specific heat of the metal in an appreciable way. That is, $E \neq (3/2)kTN_0$, but instead it is approximately independent of temperature, so that $c_v \simeq 0$. Hence, the Fermi distribution is in accord with experimental facts concerning electrons at ordinary temperatures.

At ordinary temperatures, and even at temperatures high enough to make the $c_v = 3R$ law of Dulong and Petit a good approximation to the specific heat contribution of the lattice vibrations of a solid, the electronic specific heat term is too small relative to the atomic specific heat term to be detected. At temperatures near absolute zero, where the atomic specific heat is very small, the electronic contribution will exceed the atomic contribution. It is in the region of a few degrees Kelvin that the electronic specific heat dependence is observed experimentally, again in agreement with the Fermi distribution predictions.

11-12 CONTACT POTENTIAL AND THERMIONIC EMISSION

Up to now we have treated the electron in a metal as a particle in a box; that is, we have implicitly assumed the electron does not escape the metal, the potential box having very high walls. We know, however, that electrons can escape from metals, as in the photoelectric effect, thermionic emission, etc., so that we should modify the potential function somewhat. Inside the metal the potential function is approximately constant, and near the metal boundary it increases rapidly to reach its higher constant value outside the metal. If we take the zero of potential energy to correspond to the electron being far outside the metal, then we can let $-V_0$ represent the depth of the resulting potential energy well illustrated in Figure 11-10.

Figure 11-10 The average potential energy for a conduction electron in a metal. The potential is a well of depth $V_0$ that rises rapidly near the metal boundaries to zero. The energy levels increase in density in proportion to $\mathcal{E}^{1/2}$, and are filled up to the Fermi energy $\mathcal{E}_F$. The work function is $w_0$, and $V_0 = w_0 + \mathcal{E}_F$. (Diagram labels: empty energy levels, filled energy levels, $w_0$; regions: vacuum, metal, vacuum.)

We can determine $V_0$ from photoelectric experiments, specifically from the fact that there is a cutoff frequency $\nu_0$ below which photons cannot eject electrons from the metal (see Section 2-2). This suggests that the most energetic electrons in the metal are an energy interval $h\nu_0$ below the top of the potential well. The fact that the photoelectric current rises rapidly as the photon energy rises above the threshold value suggests an abrupt rise in the number of electrons with decreasing kinetic energy inside the metal. This corresponds to the features of the Fermi distribution, the most energetic electrons having kinetic energy $\mathcal{E}_F$ and many electrons having nearby smaller kinetic energies. Therefore, we can retain the energy distribution of quantum states that we found for the particle in a box. (See Section 6-8 for a discussion of the similarity in energy levels of an infinite and a finite square well potential.) At $T = 0$ all states are filled up to an energy $\mathcal{E}_F$ above the bottom of the well, this highest state having a total energy $-h\nu_0$. That is, $-V_0 + \mathcal{E}_F = -h\nu_0$. Recall now that $h\nu_0 = w_0$, the work function of the metal, so that $-V_0 + \mathcal{E}_F = -w_0$ or

$$V_0 = w_0 + \mathcal{E}_F \qquad (11\text{-}58)$$

For silver the work function is 4.7 eV and $\mathcal{E}_F$ is 5.5 eV, so that $V_0$ is 10.2 eV. For most metals $V_0$ lies between 5 and 15 eV, as can be seen in Table 11-2.

Of course, at ordinary temperatures the Fermi distribution does not give a sharp cutoff at $\mathcal{E}_F$ but is spread out continuously over a narrow energy region near $\mathcal{E}_F$. In a region of the order of $kT$ on each side of the Fermi energy, i.e., in a transition region of width $2kT$, the number of particles per quantum state goes from a value near one to a value near zero. In the limit when $T \to 0$ this transition region becomes infinitesimally narrow.

With this model for the behavior of electrons in a metal we can explain the contact potential difference of two metals and understand the thermionic emission process. First, consider the thermionic emission process, which is of great practical importance because it is responsible for the emission of electrons from the heated filament of a vacuum tube. At high temperatures (i.e., for large values of $kT$) the distribution of electrons among available energy states in a metal extends to energies well above $\mathcal{E}_F$.
At sufficiently high temperature some electrons may acquire a kinetic energy greater than $V_0$ (i.e., greater than $\mathcal{E}_F + w_0$) and thereby escape from the metal. We can calculate the thermionic current density emitted from a metal surface as a function of temperature from the Fermi distribution and from the Boltzmann distribution. The calculation involves determining how many electrons will arrive at the metal surface moving in the required direction and with enough kinetic energy to escape. The two distributions give a different temperature dependence for the current density, and experiment rules in favor of the Fermi distribution for electrons.

As for the contact potential difference between metals, consider two metals A and B which at first are not in contact, as is indicated schematically in the left part of Figure 11-11. Outside the metals the potential energy of an electron is zero. Inside the metals the Fermi level of metal A is $w_A$ below zero and the Fermi level of metal B is $w_B$ below zero. Let $w_B > w_A$, so that the Fermi level of metal A is higher than that of B. Now let the metals be connected electrically, as illustrated in the right part of Figure 11-11. Then the most energetic electrons in metal A will flow into metal B, filling the energy levels in B just above its Fermi energy and depleting the upper levels in A. The process continues until equilibrium is reached; that is, until the highest filled levels in A and B are at the same energy, because the total energy of the system is minimized when this situation is achieved. The result is that metal A becomes positively charged in the process and metal B becomes negatively charged. Consequently there is a potential difference of $(w_B - w_A)/e$ between the metals when they are connected electrically, a result in essential agreement with experimental values.

Table 11-2 Work Function and Fermi Level Energy for Some Metals

Metal    $w_0$ (eV)    $\mathcal{E}_F$ (eV)
Ag       4.7           5.5
Au       4.8           5.5
Ca       3.2           4.7
Cu       4.1           7.1
K        2.1           2.1
Li       2.3           4.7
Na       2.3           3.1

Figure 11-11 Left: Showing the potential energy for an electron in two separated metals A and B with different work functions. Right: The metals are now connected electrically by a wire, becoming oppositely charged and exhibiting a contact potential difference.

11-13 CLASSICAL AND QUANTUM DESCRIPTIONS OF THE STATE OF A SYSTEM

We saw in Section 4-9 an example of how the instantaneous state of the motion of a classical particle can be represented by a point in phase space. For the one-dimensional motion considered there, the phase space was a two-dimensional space whose abscissa was the position $x$ and whose ordinate was the momentum $p_x$. For a three-dimensional motion, phase space is a six-dimensional space of coordinates $x, y, z, p_x, p_y, p_z$. As the particle moves, the point representing it in phase space traces out a path, the path being an ellipse in our earlier example of a one-dimensional harmonic oscillator. If we had a large number of such oscillators we would have a large number of representative points in phase space corresponding to the instantaneous distribution of oscillators. For most systems of interest we can write the total energy of each member as

$$E = K + V = \frac{p_x^2 + p_y^2 + p_z^2}{2m} + V(x, y, z)$$

so that the location of a point $(x, y, z, p_x, p_y, p_z)$ in phase space gives the total energy of that member of the system which the point represents. The distribution of points gives the distribution in energy of all members of the system. Thus, in classical statistics we can characterize the energy distribution of a system by giving the number of points in each small volume of phase space, say $\Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z$. We call such a small volume element a cell in phase space, and points in that cell have total energy between $E$ and $E + dE$, corresponding to momentum values between $p_x$ and $p_x + \Delta p_x$, etc., and position values between $x$ and $x + \Delta x$, etc. The cell is chosen to be small enough that the average total energy of its representative points differs little from the energy of any one of them; it is chosen large enough so that there are many points in a cell, thereby permitting the application of statistical ideas. Hence, the size of a cell is somewhat arbitrary and indefinite, but once it is chosen the cell is characterized by an average total energy and a population number. The cell then is the classical statistical analogue to the quantum state of quantum statistics. In Figure 11-12 we illustrate the situation for a one-dimensional system.

Figure 11-12 Phase space and representative points for a one-dimensional system. (Axes: $x$ horizontal, $p_x$ vertical; a cell of height $\Delta p_x$ is indicated.)

In quantum mechanics we must modify the preceding picture because of the uncertainty principle. For one thing we cannot describe the trajectory of a particle by giving the path of a representative point in two-dimensional phase space because we cannot simultaneously know the exact values of $x$ and $p_x$ for the particle. The best we can do is locate the representative point at any time between $x$ and $x + \Delta x$ and between $p_x$ and $p_x + \Delta p_x$, where $\Delta x\,\Delta p_x \simeq h$, so that instead of a representative point tracing out a line we have a small area tracing out a ribbonlike path in two-dimensional phase space. More important, however, is the fact that there is a definite smallest size to any cell in the quantum description. A cell in which $\Delta x\,\Delta p_x$ is less than $h$ is meaningless, such a specification being more precise than allowed by the uncertainty principle. For the general six-dimensional phase space, therefore, the smallest cell has a "volume" of $h^3$. It is therefore possible in the quantum description to remove the arbitrariness and indefiniteness of the volume element in phase space.
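One consequence of a smallest cell of volume $h^3$ can be checked directly. The sketch below is an illustrative consistency check rather than anything given in the text: for a spinless particle in a box of volume $V$, the number of cells of size $h^3$ with momentum magnitude below $p = \sqrt{2m\mathcal{E}}$ should equal the integral of the density of states (11-49) from 0 to $\mathcal{E}$. The function names and the sample mass, volume, and energy (roughly a helium atom in 1 cm$^3$ at room temperature) are assumptions chosen for illustration.

```python
import math

H = 6.626e-34  # Planck's constant, joule-sec

def states_below(E, m, V):
    """Integral of (11-49) from 0 to E: (4 pi V / h^3) (2 m^3)^(1/2) (2/3) E^(3/2)."""
    return (4 * math.pi * V / H**3) * math.sqrt(2 * m**3) * (2 / 3) * E**1.5

def cells_below(E, m, V):
    """Phase-space volume V * (4/3) pi p^3 with p = sqrt(2 m E), in units of h^3."""
    p = math.sqrt(2 * m * E)
    return V * (4 / 3) * math.pi * p**3 / H**3

m_he = 6.64e-27                  # approximate mass of a He-4 atom, kg (assumed value)
V_box = 1.0e-6                   # 1 cm^3 expressed in m^3
E_max = 1.5 * 1.381e-23 * 300.0  # (3/2) k T at 300 K, joules

print(states_below(E_max, m_he, V_box), cells_below(E_max, m_he, V_box))
```

The two numbers agree, which is one way of seeing where the factor $h^3$ in the density-of-states formulas comes from.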
Because the size of the cell obviously affects the counting of distinguishable divisions of the total energy of the system, there is a certain indefiniteness in the results of classical statistics. For example, the entropy of a system can be written as $S = k \ln P$ where $P$ is the number of distinguishable divisions of its energy content (i.e., $P$ is a measure of the probability that it has the particular energy). However, the classical entropy has an arbitrary constant in it basically because of the indefiniteness of the cell size. The quantum value is exact, because of the definiteness of the cell size, and it gives an absolute entropy constant in agreement with experiment and the laws of thermodynamics. Indeed, it was this result, and not the results concerning the cavity radiation, that convinced Max Planck of the correctness of his ideas concerning energy quantization and the constant $h$. And it is this smallest size of a cell in phase space in quantum statistics that is the origin of the factor $h^3$ displayed in many of the equations in this chapter.
From considerations discussed here we can also understand the applicability of the classical Boltzmann distribution to so many quantum problems. If there is no definite smallest size to a cell in phase space then we can always get a situation in which there is not more than one particle per state. But this is just the high temperature case wherein classical and quantum statistics agree. The classical distribution function is valid in this case, regardless of the indistinguishability of particles. Of course, the real quantum world does set a limit to the smallness of a cell so that the classical distribution will not apply when the number of particles per cell is more than one.
QUESTIONS
1. Exactly what do the inhibition and enhancement factors describe? What are their origins?
2.
Can you devise a cycle of transitions between three states which would maintain an equilibrium in the populations of these states, with transitions that violate detailed balancing? Does it seem reasonable to extend this to a system with many states?
3. What is the basic reason why the quantum distributions merge with the classical distribution at energies much larger than $kT$?
4. Explain why the behavior of the Boltzmann distribution is intermediate to that of the Bose and Fermi distributions.
5. Give examples of systems to which the Boltzmann distribution is applicable in principle. As a good approximation.
6. What factors determine the value of $\alpha$ for the three distributions?
7. Interpret physically the Fermi energy $\mathcal{E}_F$.
8. Thermal expansion is related to the anharmonic nature of the vibrations of atoms in a solid. Would the Debye model be appropriate to studying thermal expansion of solids?
9. In Debye's model of a solid, the maximum frequency $\nu_m$ corresponds to a minimum wavelength. Because of the discrete nature of a solid this minimum wavelength corresponds to a vibration in which adjacent atoms move 180° out of phase with one another; that is, the interatomic spacing is half a wavelength. Is this plausible? Explain.
10. Interpret the Debye characteristic temperature $\Theta$ physically.
11. In our analysis of emission and absorption processes of an atom in an electromagnetic field we neglected recoil effects. How does this affect our results? Are we justified in ignoring recoils?
12. What are the dimensions of the Einstein A and B coefficients?
13. It is said that a laser is not a source of energy but a converter of energy. Explain.
14. We have ignored the possible degeneracy of the states involved in laser action. How would you take this into account? What effect does it have?
15. Make a step-by-step comparison of the deduction of the Planck radiation law on the basis of the Maxwell distribution and the Bose distribution.
16. List similarities and differences between phonons and photons.
17. At low densities and high temperatures the Bose gas behaves like a classical ideal gas. Make this result plausible physically.
18. In writing about experiments on the scattering of $\alpha$ particles in helium Rutherford said, "On account of the impossibility of distinguishing between the scattered alpha particles and the projected He nuclei, the results are subject to a certain ambiguity." Explain how an awareness of quantum statistics could have removed the ambiguity. What determines whether a gas obeys Bose or Fermi distributions?
19. How can the ordered state of the He II explain its lack of resistance to heat conduction?
20. What examples of a Fermi gas are there other than an electron gas and a gas of He$^3$ atoms?
21. In the ideal gas equations we use the rest mass of particles. Should we ever use the relativistic mass instead? Consider the effect of temperature and the nature of the particle.
22. Give a plausibility argument for the relation, (11-57), between the Fermi energy $\mathcal{E}_F$ and the density of free electrons in a metal.
23. In the Fermi distribution we obtain the result that at the Fermi energy $\mathcal{E}_F$ the average number of particles per quantum state is exactly one-half. This is definitely not the same as saying that 50% of the particles are at energies above the Fermi energy and 50% below. Explain.
24. Justify the assumption that conduction electrons behave approximately as a system of free noninteracting particles.
25. Is there a connection between $V_0$, the depth of the potential well for conduction electrons in a metal, and electron diffraction experiments of the Davisson-Germer type? Can we determine $V_0$ from such experiments?
26. Explain physically the effect of letting $h \to 0$ in expressions for the density of states, such as (11-49). Explain physically the effect of letting $h \to 0$ in equations involving the quantum degeneracy term, such as (11-53).
PROBLEMS
1.
The equilibrium state is one of maximum entropy $S$ in thermodynamics and one of maximum probability $P$ in statistics. Assuming then that $S$ is a function of $P$, show that we should expect $S = k \ln P$, where $k$ is a universal constant. This relation is sometimes called the Boltzmann postulate. (Hint: Consider the effect on $S$ and $P$ of combining two systems.)
2. The Maxwell distribution can be developed by looking at elastic collisions between two particles. If initially these particles have energies $\mathcal{E}_1$ and $\mathcal{E}_2$, and finally $\mathcal{E}_3$ and $\mathcal{E}_4$, then $(\mathcal{E}_1 - \delta) + (\mathcal{E}_2 + \delta) = \mathcal{E}_3 + \mathcal{E}_4$. If all possible states are equally probable, the number of collisions per second $P$ is proportional to the number of particles in each initial state, i.e.
$$P_{1,2} = C\,P(\mathcal{E}_1)\,P(\mathcal{E}_2)$$
where $P(\mathcal{E})$ is the probability of a state being occupied, and $C$ is a constant. Similarly $P_{3,4} = C\,P(\mathcal{E}_3)\,P(\mathcal{E}_4)$. In equilibrium, for each collision $(1,2) \to (3,4)$ there must be a collision $(3,4) \to (1,2)$. Thus $P_{1,2} = P_{3,4}$. (a) Show that $P(\mathcal{E}) = e^{-\mathcal{E}/kT}$ solves this equation. (b) Use similar reasoning to derive the Fermi distribution. Here, however, the initial states must be filled and the final states must be empty, and the number of collisions becomes
$$P_{1,2} = C\,P(\mathcal{E}_1)\,P(\mathcal{E}_2)\,[1 - P(\mathcal{E}_3)]\,[1 - P(\mathcal{E}_4)]$$
Then show that the equation $P_{1,2} = P_{3,4}$ can be solved by
$$\frac{P(\mathcal{E})}{1 - P(\mathcal{E})} = C e^{-\mathcal{E}/kT}$$
which yields (11-23).
3. (a) Show that at $T = 0$, in the Fermi distribution, $n(\mathcal{E}) = 1$ for all energy states in which $\mathcal{E} < \mathcal{E}_F$ and $n(\mathcal{E}) = 0$ for all energy states in which $\mathcal{E} > \mathcal{E}_F$. (b) Show that $n(\mathcal{E}) = 1/2$ for $\mathcal{E} = \mathcal{E}_F$.
4. Consider the Fermi distribution of (11-24), $n(\mathcal{E}) = 1/[e^{(\mathcal{E} - \mathcal{E}_F)/kT} + 1]$. (a) Show that $n(\mathcal{E}) = 1 - n(2\mathcal{E}_F - \mathcal{E})$; that is, with $\mathcal{E} - \mathcal{E}_F = \delta$, show that $n(\mathcal{E}_F + \delta) = 1 - n(\mathcal{E}_F - \delta)$. This proves that the distribution has a symmetry about $n(\mathcal{E}_F) = 1/2$. (b) Find $n(\mathcal{E})$ for $\delta = \mathcal{E} - \mathcal{E}_F = kT$, or $2kT$, or $4kT$, or $10kT$. Make a rough sketch of $n(\mathcal{E})$ versus $\mathcal{E}$ for any $T > 0$. (c) What percent error is made by approximating the Fermi distribution by the Boltzmann distribution when $\delta/kT = 1, 2, 4, 10$?
5. (a) At what energy is the Bose distribution function (for $\alpha = 0$) equal to one for a temperature of 7000°K? (b) What is the temperature of the Bose function (for $\alpha = 0$) with a value of 0.500 at this same energy?
6. For the Fermi distribution function (a) show that
$$\int_0^{\mathcal{E}_F} n(\mathcal{E})\,d\mathcal{E} = kT \ln\left[(1 + e^{\mathcal{E}_F/kT})/2\right]$$
(b) Show that this reduces to $\mathcal{E}_F$ for $T = 0$. (c) Show that
$$\int_0^{\infty} n(\mathcal{E})\,d\mathcal{E} = \int_0^{\mathcal{E}_F} n(\mathcal{E})\,d\mathcal{E} + kT(\ln 2)$$
7. (a) From (11-25), show that the Einstein model of a solid gives the specific heat as
$$c_v = 3R \left(\frac{h\nu}{kT}\right)^2 \frac{e^{h\nu/kT}}{(e^{h\nu/kT} - 1)^2}$$
(b) Show that $c_v \to 0$ as $T \to 0$ but that at low $T$, $c_v$ increases as $e^{-h\nu/kT}$ rather than as the required $T^3$ law.
8. Show that the Debye specific heat result, (11-31), reduces to the classical law of Dulong and Petit at high temperatures. (Hint: First expand both exponentials and retain only first order terms. Justify.)
9. Imagine a cavity at temperature $T$. Show that $c_v$, the specific heat of the enclosed radiation, is given by $(32\pi^5 kV/15)(kT/hc)^3$. Explain why $c_v$ does not have an upper limit in this case whereas it does for solids.
10. In some temperature region graphite can be considered a two-dimensional Debye solid, but there are still $3N_0$ modes per mole. (a) Show that $N(\nu)\,d\nu = (2\pi A/v^2)\,\nu\,d\nu$ where $A$ is the area of the sample. (b) Find an expression for $\nu_m$ and $\Theta$ for graphite. (c) Show that at low temperatures the heat capacity is proportional to $T^2$.
11. $\mathcal{N}$ distinguishable atoms are distributed over two energy levels $\mathcal{E}_1 = 0$ and $\mathcal{E}_2 = \mathcal{E}$. (a) Show that the energy of the system is given by
$$E = \frac{\mathcal{N}\mathcal{E}\,e^{-\mathcal{E}/kT}}{1 + e^{-\mathcal{E}/kT}}$$
(b) Show that $c_v$ is given by
$$c_v = \mathcal{N}k \left(\frac{\mathcal{E}}{kT}\right)^2 \frac{e^{-\mathcal{E}/kT}}{(1 + e^{-\mathcal{E}/kT})^2}$$
(This is the Schottky specific heat and is observed for paramagnetic solids at low temperatures. The energy levels correspond to the magnetic moments being aligned parallel or antiparallel to the magnetic field.) (c) Sketch the heat capacity as a function of temperature, being careful to have the correct temperature dependence at high and low temperatures.
12. The variation of density $\rho$ with altitude $y$ of the gaseous atmosphere of the earth can be written as $\rho = \rho_0 e^{-g(\rho_0/P_0)y}$, where $\rho_0$ and $P_0$ are sea level density and pressure, provided the temperature is assumed to be uniform. (a) From the ideal gas laws show that this can be put into the form $\rho = \rho_0 e^{-mgy/kT}$. (b) Show that this has the form of the Boltzmann distribution.
13. (a) By combining $n(\mathcal{E})$ of (11-21) and $N(\mathcal{E})$ of (11-49) for an ideal gas of classical particles, with
$$A = e^{-\alpha} = \frac{\mathcal{N}h^3}{(2\pi mkT)^{3/2}\,V}$$
show that
$$n(\mathcal{E})\,N(\mathcal{E})\,d\mathcal{E} = \frac{2\mathcal{N}}{(kT)^{3/2}\pi^{1/2}}\,\mathcal{E}^{1/2} e^{-\mathcal{E}/kT}\,d\mathcal{E}$$
is the energy distribution of particles in an ideal gas. (b) Show that Maxwell's speed distribution of molecules in a gas, which has the form $n(v)\,dv = Cv^2 e^{-mv^2/2kT}\,dv$, where $C$ is a constant, follows directly from this.
14. Assume that the thermal neutrons emerging from a nuclear reactor have an energy distribution corresponding to a classical ideal gas at a temperature of 300°K. Calculate the density of neutrons in a beam of flux $10^{13}$/m$^2$-sec. (Hint: Consider the average velocity, and justify its use.)
15. In a certain nucleus the magnetic moment is $1.4 \times 10^{-26}$ joule-m$^2$/weber. Calculate the fractional difference in population of the nuclear Zeeman levels in a magnetic field of 1 weber/m$^2$, (a) at room temperature and (b) at 4°K.
16. Electron spin resonance is much like nuclear magnetic resonance except that electronic transitions are excited between atomic Zeeman levels. These experiments are done at microwave frequencies. If the electromagnetic wave has a frequency of 32 KMHz (K band) calculate the fractional difference in population between two atomic Zeeman levels (a) at room temperature and (b) at 4°K.
17. (a) Determine the order of magnitude of the fraction of hydrogen atoms in a state with principal quantum number $n = 2$ to those in state $n = 1$ in a gas at 300°K. (b) Take into account the degeneracy of the states corresponding to quantum numbers $n = 1$ and 2 of atomic hydrogen and determine at what temperature approximately one atom in a hundred is in a state with $n = 2$.
18. Consider the relation $n_1/n_2 = e^{(\mathcal{E}_2 - \mathcal{E}_1)/kT}$, the Boltzmann factor for nondegenerate states for systems in equilibrium, where $\mathcal{E}_2 > \mathcal{E}_1$. (a) Show that $n_2 = 0$ at $T = 0$. (b) Show that $n_1 = n_2$ at $T = \infty$ or $T = -\infty$. (c) Show that $n_2 > n_1$ at finite negative temperature $T$. (d) Show that $n_1 \to 0$ as $T \to -0$. (e) Hence, explain the statements, "Negative absolute temperatures are not colder than absolute zero but hotter than infinite temperature," and "One approaches negative temperatures through infinity, not through zero." (f) Can you suggest a change in temperature scale that would avoid temperatures that are negative in this sense?
19. Determine approximately the ratio of the probability of spontaneous emission to the probability of stimulated emission at room temperature in (a) the x-ray region of the electromagnetic spectrum, (b) the visible region, (c) the microwave region.
20. An atom has two energy levels with a transition wavelength of 5800 Å. At room temperature $4 \times 10^{20}$ atoms are in the lower state. (a) How many occupy the upper state, under conditions of thermal equilibrium? (b) Suppose instead that $7 \times 10^{20}$ atoms are pumped into the upper state, with $4 \times 10^{20}$ in the lower state. How much energy in joules could be released in a single pulse?
21. The energy levels in a two-level atom are separated by 2.00 eV. There are $3 \times 10^{18}$ atoms in the upper level and $1.7 \times 10^{18}$ atoms in the ground level. The coefficient of stimulated emission is $3.2 \times 10^{5}$ m$^3$/W-sec$^3$, and the spectral radiancy is 4 W/m$^2$-Hz. Calculate the stimulated emission rate.
22. If $B_{10} = 2.7 \times 10^{19}$ m$^3$/W-sec$^3$ for a particular atom, find the lifetime of the 1 to 0 transition at (a) 5500 Å (visible) and (b) 550 Å (ultraviolet).
23. Combine (11-49) and (11-47) to obtain (11-50), as follows. Let $x = \mathcal{E}/kT$ and obtain
$$\mathcal{N} = \frac{2\pi V(2mkT)^{3/2}}{h^3} \int_0^{\infty} \frac{x^{1/2}\,dx}{e^{\alpha + x} - 1}$$
Then, with $\alpha$ positive, use the relation
$$(e^{\alpha + x} - 1)^{-1} = e^{-\alpha - x}(1 - e^{-\alpha - x})^{-1} = e^{-\alpha}(e^{-x} + e^{-\alpha - 2x} + \cdots)$$
to obtain (11-50).
24. Obtain (11-52) as follows. Let $x = \mathcal{E}/kT$ and show that
$$E = \frac{2\pi kTV(2mkT)^{3/2}}{h^3} \int_0^{\infty} \frac{x^{3/2}\,dx}{e^{\alpha + x} - 1} = \frac{3}{2}\,kT\,\frac{V(2\pi mkT)^{3/2}}{h^3}\,e^{-\alpha}\left(1 + \frac{1}{2^{5/2}}\,e^{-\alpha} + \cdots\right)$$
25. Show that the quantum degeneracy in a Fermi gas occurs if $kT \ll \mathcal{E}_F$. (Hint: See Example 11-4 and use (11-57).)
26. Show from the Fermi distribution that in a metal at $T = 0$°K the average energy of an electron is $3\mathcal{E}_F/5$.
27. Using 23 as the atomic weight and $9.7 \times 10^{2}$ kg/m$^3$ as the density of metallic sodium, compute the Fermi energy on the assumption that each sodium atom gives one electron to the conduction band. (Hint: See Example 11-5.)
28. Using 197 as the atomic weight and $19.3 \times 10^{3}$ kg/m$^3$ as the density of gold, compute the depth of the potential well for free electrons in gold. The work function is 4.8 eV and there is one free electron per atom.
29. In a one-dimensional system the number of energy states per unit energy is $(l/h)\sqrt{2m/\mathcal{E}}$, where $l$ is the length of the sample and $m$ is the mass of the electron. There are $\mathcal{N}$ electrons in the sample and each state can be occupied by two electrons. (a) Determine the Fermi energy at 0°K. (b) Find the average energy per electron at 0°K.
30. Show that about one conduction electron in a thousand in metallic silver has an energy greater than the Fermi energy at room temperature.
12 MOLECULES
12-1 INTRODUCTION
relevance of molecular physics
12-2 IONIC BONDS
electromagnetic origin of molecular binding; energy budget in ionic binding of sodium chloride; polar molecules; nondirectionality of ionic bonds; likely candidates for ionic binding
12-3 COVALENT BONDS
role of hydrogen molecular ion; preferred eigenfunction symmetry for ion; energy budget in ion; energy budget in hydrogen molecule; paired electron sharing in covalent bond; saturation; directionality of covalent bonds; homopolar molecules
12-4 MOLECULAR SPECTRA
comparison to atomic spectra; decomposition of level structure and spectra into electronic, vibrational, and rotational
12-5 ROTATIONAL SPECTRA
quantization of rotational energy; quantum number r; selection rule; spectra
12-6 VIBRATION-ROTATION SPECTRA
quantization of vibrational energy, quantum number v; selection rule; vibration-rotation bands; isotope effects; vibrational and rotational constants
12-7 ELECTRONIC SPECTRA
band spectra; Franck-Condon principle
12-8 THE RAMAN EFFECT
description; role of intermediate state; relation to Rayleigh scattering; use in study of molecules with identical nuclei
12-9 DETERMINATION OF NUCLEAR SPIN AND SYMMETRY CHARACTER
symmetries of vibrational, rotational, and nuclear spin factors of molecular eigenfunction; nuclear spin quantum number i, ortho and para molecules; alternation of intensities; missing lines; application to several nuclei
QUESTIONS
PROBLEMS
12-1 INTRODUCTION
The subject matter of the previous chapters is considered to be common to all of quantum physics. The concepts and techniques we have developed in these chapters for the purpose of studying atoms prove to be necessary, or at least useful, in studying most of the areas to which quantum physics is applied. But from atoms the applications of quantum physics branch into two well-defined, and fairly well-separated, channels.
One of these leads to the systems larger than atoms; i.e., it goes from atoms to molecules and then to solids. The other channel leads from atoms to the smaller systems; i.e., to nuclei and then to their constituents, the elementary particles. In the next three chapters we shall follow the first channel, and in the last four chapters of this book we shall explore the second.
We know that two or more atoms can combine to form a stable molecule. Here we seek a description of the interatomic forces which bind atoms into molecules, and also an understanding of the nature of energy levels and spectra of molecules. Since a very large number of atoms may join together to make a solid, in much the same way as a few do to form a molecule, the phenomenon of molecular binding is very relevant to the properties of solids. The motivation for studying molecular spectra, in addition to its intrinsic interest, is found in practical considerations. For example, a new but rapidly expanding field of science is molecular astronomy, which involves the measurement of molecular spectra originating in interstellar, or intergalactic, matter, for the purpose of determining its composition and condition. And as we shall see, measurements of molecular spectra have for a long time provided the primary source of information about important properties of the nuclei contained in the molecule.
12-2 IONIC BONDS
From one point of view a molecule is a stable arrangement of a group of nuclei and electrons. The exact arrangement is determined by electromagnetic forces and the laws of quantum mechanics. This concept of a molecule is a natural extension of the concept of an atom. Another view regards a molecule as a stable structure formed by the association of two or more atoms. In this view the atoms retain their identity whereas in the first-mentioned view they do not. Of course, both views are useful and there are situations wherein each is directly applicable.
In general, however, the structure and properties of molecules are best described by a combination of both views.
When a molecule is formed from two atoms, the inner shell electrons of each atom remain tightly bound to the original nucleus and are barely disturbed at all. The outermost loosely bound electrons, known as the valence electrons, are influenced by all the particles (ions + electrons) of the system. Their wave functions are significantly modified when the atoms are brought together. Indeed, it is this very interaction that leads to binding, i.e., to a lower total energy, when the nuclei or ions are close together. This interaction, called the interatomic force, is of electromagnetic origin. Hence, we see that valence electrons play the central role in molecular binding.
There are two principal types of molecular binding, the ionic bond and the covalent bond. The NaCl molecule is an example of ionic binding and the H2 molecule an example of covalent binding.
Consider the formation of a NaCl molecule from an atom of Na and an atom of Cl which are far apart initially. Figure 9-15 shows that to remove the outermost 3s electron from Na and form the Na$^+$ ion requires an ionization energy of 5.1 eV. The atomic binding in the alkali Na is relatively weak because its filled inner subshells are effective in shielding the valence electron electrically from the nucleus so that it moves in a weakened field at an outlying position. If now we attach this electron to the halogen Cl atom it will complete a previously unfilled 3p shell in Cl to form a Cl$^-$ ion. The halogen has a relatively high electron affinity; that is, the closed shell ion is more stable than the neutral atom, its energy being lower by 3.8 eV. Hence, at the cost of 1.3 eV of energy (5.1 eV - 3.8 eV), we have formed two distinct separate ions, Na$^+$ and Cl$^-$; but these ions exert attractive Coulomb forces on one another, and the energy of attraction is greater than 1.3 eV.
Now, since the mutual Coulomb potential energy of the ions is negative, the potential energy of the combined system initially decreases as the separation of the ions is steadily reduced. As the ions are brought still closer together the electron charge distributions begin to overlap. This has two effects, each of which increases the potential energy: (1) the nuclei are not as well shielded from one another as before and they begin to repel one another and (2) at small internuclear separation we effectively have a single system to which the exclusion principle applies, and some electrons must be in higher energy states than before to avoid violating this principle. The potential energy curve therefore yields a repulsive force at small interatomic separations and an attractive force at large separations. There is a separation at which this energy is a minimum, the energy being 4.9 eV lower at this proximity than for distantly separated ions. Hence, compared to two neutral atoms, Na + Cl, the combined system NaCl is lower in energy by 3.6 eV (that is, E = 1.3 eV - 4.9 eV = -3.6 eV) so that a bound state is energetically favored, as illustrated in Figure 12-1. The equilibrium nuclear separation in NaCl is 2.4 Å.
Figure 12-1 The energy for the neutral atoms Na and Cl, and for the ions Na$^+$ and Cl$^-$, as functions of the internuclear separation R. The ionic combination has lower energy at small separation, while the neutral atom combination has lower energy at large separation. Thus, as the two neutral atoms are brought together, they go over to ionic form when their separation becomes less than a certain value.
Example 12-1. Evaluate approximately the depth of the minimum in Figure 12-1 by assuming that at the 2.4 Å equilibrium nuclear separation R of NaCl the Na$^+$ and Cl$^-$ ions have spherically symmetrical charge distributions that do not yet overlap.
With this assumption, Gauss's law of electrostatics allows us to evaluate the Coulomb binding energy of the unit charge ions from the simple expression
$$V = -\frac{1}{4\pi\epsilon_0}\frac{e^2}{R}$$
where R = 2.4 Å. We obtain
$$V = -\frac{9.0 \times 10^{9}\ \text{nt-m}^2/\text{coul}^2 \times (1.6 \times 10^{-19}\ \text{coul})^2}{2.4 \times 10^{-10}\ \text{m}} = -9.6 \times 10^{-19}\ \text{joule} \times \frac{1\ \text{eV}}{1.6 \times 10^{-19}\ \text{joule}} = -6.0\ \text{eV}$$
If the student extrapolates slightly the 1/R behavior in Figure 12-1 to R = 2.4 Å, he will see that the results of this evaluation are consistent with its assumptions.
NaCl is a molecule held together by ionic binding. Because the region of positive charge (Na$^+$) and the region of negative charge (Cl$^-$) are separated, there is a permanent electric dipole moment. An ionic molecule is thus said to be a polar molecule. Ionic binding is also called heteropolar binding. Ionic bonds are not directional, for each ion has a closed shell configuration which is spherically symmetrical. Ionic bonds can be formed with more than one valence electron, as in the case of the MgCl2 molecule, when the molecular state is energetically lower than the state of separated atoms. The number of ionic bonds that an atom can form depends on the shell structure of the atom, i.e., on the ionization potentials for successively removing electrons. It will be energetically favorable to form ionic bonds only for those (few) outer subshell electrons that have ionization potentials in certain ranges. Compounds of elements from the first column, and the second from last column, of the periodic table (the alkali halides, such as KCl, LiBr, etc.) are ionic, as are many of those from the second column and the third from last column (the alkaline-earth oxides, sulfides, etc.).
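The energy budget for NaCl and the point-charge estimate of Example 12-1 can be checked numerically. The Python sketch below uses the constants from the table at the front of the book and the energies quoted in this section. The function and variable names are illustrative. The crossover separation at the end is not a number given in the text: it is an estimate that assumes the ions attract purely as point charges all the way out.

```python
K_COULOMB = 8.988e9       # 1/(4 pi eps0), nt-m^2/coul^2
E_CHARGE = 1.602e-19      # electron charge magnitude, coul
JOULE_PER_EV = 1.602e-19  # 1 eV in joules


def coulomb_energy_ev(r_angstrom):
    """Potential energy (eV) of two opposite unit point charges r_angstrom apart."""
    r_m = r_angstrom * 1.0e-10
    return -K_COULOMB * E_CHARGE ** 2 / r_m / JOULE_PER_EV


# Energies quoted in the text, in eV.
ionization_na = 5.1   # cost of removing the 3s electron from Na
affinity_cl = 3.8     # energy gained when Cl captures the electron
well_depth = 4.9      # minimum of the ionic curve below separated ions

ion_cost = ionization_na - affinity_cl   # cost of forming Na+ and Cl- far apart
binding = ion_cost - well_depth          # NaCl energy relative to neutral Na + Cl
v_point = coulomb_energy_ev(2.4)         # Example 12-1 point-charge estimate

# Assumption (not from the text): pure 1/R attraction at large separation.
# The ionic form becomes favorable where the Coulomb energy offsets ion_cost.
r_cross = 2.4 * (-v_point) / ion_cost

print(f"cost of forming the ions:       {ion_cost:.1f} eV")
print(f"NaCl relative to Na + Cl:       {binding:.1f} eV")
print(f"point-charge energy at 2.4 A:   {v_point:.1f} eV")
print(f"estimated crossover separation: {r_cross:.1f} A")
```

The point-charge estimate comes out about 1 eV deeper than the 4.9 eV minimum quoted in the text, which fits the statement that overlap of the charge distributions raises the energy at small separations.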
12-3 COVALENT BONDS
Let us consider now the formation of the H2 molecule. If in the case of H2 we were to calculate the energy required to form positive and negative hydrogen ions by moving an electron from one hydrogen atom to the other, and then added to this the energy of the Coulomb interaction of the ions, we would find that there is no distance of separation at which the total energy is negative. That is, ionic bonding does not result in a bound H2 molecule. The fact that H2 is bound is explained quantum mechanically by the behavior of the electronic eigenfunction describing the charge distribution of the system, as two hydrogen atoms approach one another. As we shall see soon, the resulting charge distribution does lead to electrostatic attraction, but it is a charge distribution that can be interpreted as a sharing of electrons by both atoms. The binding is called covalent.
We can best understand the covalent bond by treating first the simpler case of $\mathrm{H}_2^+$, the hydrogen molecular ion. In this case we have two nuclei each exerting a Coulomb repulsion on the other, and both exerting a Coulomb attraction on the single electron. Since the electron motion is very rapid compared to the nuclear motions, the procedure is to assume that the nuclei are at rest a distance R apart, with the single electron moving in their Coulomb fields, and then determine the electron energy from the Schroedinger equation. We next treat R as a variable and consider both the electron energy, and the internuclear Coulomb repulsion energy, as a function of the internuclear separation. The total energy of the system is the sum of these two energies, and the system will be bound if the total energy exhibits a minimum at some value of internuclear separation.
The top of Figure 12-2 indicates the potential energy in which the electron moves by plotting its value along an x axis passing through the two nuclei, for an internuclear separation R = 1.1 Å.
Figure 12-2 Top: The potential function, and the two lowest energy levels, for an electron in a $\mathrm{H}_2^+$ molecule with internuclear separation R = 1.1 Å. The potential function is evaluated along the line passing through the two nuclei. Bottom: The even and odd eigenfunctions corresponding to the two energy levels, evaluated along the internuclear line. Near each nucleus, both eigenfunctions have magnitudes that are decreasing exponentials of the distance from the nucleus, as in the ground state of the hydrogen atom.
The potential energy is symmetrical with respect to a plane perpendicular to the line connecting the two nuclei and passing through its middle, since the potential is just the sum of a Coulomb potential centered on one end of that line and an equal Coulomb potential centered on the other end. Because the motion of the electron in a bound state of this potential will have the same symmetry, the electron's bound state probability densities $\psi^*\psi$ will have equal values at two points on either side of the plane and equidistant from it. But this requires each of its eigenfunctions $\psi$ to have either precisely the same value at the two points, or else to have at one point a value precisely the negative of its value at the other point. That is, the eigenfunctions must be either even or odd with respect to reflection in the plane. The situation is shown schematically in the bottom of Figure 12-2 by plotting the lowest energy even and odd normalized eigenfunctions along a line passing through the two nuclei. The important idea is that the odd eigenfunction must necessarily have zero value at the center of this line since it obeys the equation $\psi(-x) = -\psi(x)$, which would otherwise be internally inconsistent at the center where $x = 0$. But the even eigenfunction is not so constrained, and thus it has an appreciable value at $x = 0$.
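The symmetry argument can be made concrete with a rough numerical sketch in Python. It assumes, purely for illustration, that along the internuclear line the even and odd eigenfunctions can be approximated by the sum and the difference of two hydrogen ground-state exponentials centered on the nuclei. This is not the exact solution of the Schroedinger equation, and the functions are left unnormalized. The function names are hypothetical.

```python
import math

A0 = 0.529  # Bohr radius, angstroms
R = 1.1     # internuclear separation, angstroms


def atomic(x, center):
    """Hydrogen ground-state exponential along the line, centered on one nucleus."""
    return math.exp(-abs(x - center) / A0)


def psi_even(x):
    """Assumed even combination (unnormalized): sum of the two atomic functions."""
    return atomic(x, -R / 2) + atomic(x, +R / 2)


def psi_odd(x):
    """Assumed odd combination (unnormalized): difference of the two atomic functions."""
    return atomic(x, -R / 2) - atomic(x, +R / 2)


for x in (0.0, 0.3, 0.55, 1.0):
    print(f"x = {x:4.2f} A   even: {psi_even(x):6.3f} / {psi_even(-x):6.3f}"
          f"   odd: {psi_odd(x):7.3f} / {psi_odd(-x):7.3f}")
```

Each row prints the value at $+x$ and at $-x$. The even combination repeats its value, the odd one reverses sign and vanishes at $x = 0$, and in both cases the squared magnitudes at $\pm x$ are equal, as the symmetry of the potential requires.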
Because an electron with probability density $\psi^*\psi$ for the odd eigenfunction must avoid the center of the molecule, to a certain extent it avoids the central region. And since the integral over all space of $\psi^*\psi$ equals one, if that quantity is relatively small in the region between the nuclei, it must be relatively large in the regions outside the nuclei. These outside regions are where the potential is least binding, however, so such an electron is relatively loosely bound. The odd eigenfunction could be more tightly concentrated in the regions near the nuclei, while still being zero at the center, but only if its curvature were higher. Since higher curvature requires higher kinetic energy, this would not decrease the total energy of the electron. An electron whose behavior is described by the probability density for the even eigenfunction has a relatively high probability of being found in the region where the potential is most binding, that is, in the region from near one nucleus, through the center of the molecule, to near the other nucleus. Thus such an electron is relatively tightly bound. The two lowest energy levels for an electron in the potential are shown in Figure 12-2. We can now understand why the lowest of these is for the quantum state in which the eigenfunction is even.
Figure 12-3 shows the sum of the electron energy and the internuclear Coulomb repulsion energy for the two lowest energy states of the $\mathrm{H}_2^+$ molecule, as a function of the internuclear separation distance R. For very large R, the electron will bind to one nucleus or the other in the lowest energy state of an H atom, and the repulsion energy will be negligible, so the energy of the system will have the familiar value -13.6 eV. For the quantum state with the even eigenfunction, the energy of the system at first decreases with decreasing R.
The reason is that the binding energy exerted on the electron already near one nucleus becomes negative more rapidly, as the other nucleus moves into proximity, than the repulsion energy between the two nuclei becomes positive. (The electron in the even eigenfunction state at moderate internuclear separation tends to be between the nuclei, so its distance to either nucleus is smaller than the distance separating the nuclei.) As the internuclear separation continues to decrease, the energy of the system passes through a minimum and then begins to increase rapidly. This happens because the electron binding energy when the nuclei overlap can become no more negative than $-(2)^2 \times 13.6$ eV $= -54.4$ eV, the ground state energy of a singly ionized helium atom, whereas the internuclear repulsion energy increases without limit as the internuclear separation decreases. For the even eigenfunction case the molecule is stably bound by a rudimentary covalent bond. At equilibrium it has R ≈ 1.1 Å, which is where the energy as a function of R has a minimum that is about 2.7 eV deep. The measured binding energy, i.e., the energy required to dissociate $\mathrm{H}_2^+$ into H and H$^+$, is in good agreement with this value. Because of the significantly weaker binding of the electron in the odd eigenfunction state, the corresponding total molecular energy curve does not have a minimum at any value of R. Thus the molecule will not bind if the eigenfunction of the electron is odd since its energy always decreases as the nuclear separation increases.
If we now add a second electron to $\mathrm{H}_2^+$ to form H2, the energy of the system is decreased further, the two additional attractive forces acting between this electron and the nuclei more than counteracting the electron-electron repulsion. For H2 the binding energy is about 4.7 eV, and the equilibrium internuclear separation is about 0.7 Å. So H2 is more compact, and more tightly bound, than $\mathrm{H}_2^+$.
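The numbers quoted for the hydrogen molecular ion can be tied together with a short bookkeeping sketch in Python. It takes the 2.7 eV depth and the 1.1 Å separation from the text, computes the internuclear repulsion as that of two point charges, and infers the electron energy at the minimum. That inferred value is an illustration, not a number stated in the text, and the function names are hypothetical.

```python
K_COULOMB = 8.988e9       # 1/(4 pi eps0), nt-m^2/coul^2
E_CHARGE = 1.602e-19      # electron charge magnitude, coul
JOULE_PER_EV = 1.602e-19  # 1 eV in joules
E_HYDROGEN = -13.6        # ground state energy of the H atom, eV


def hydrogenlike_ground_ev(z):
    """Ground state energy (eV) of one electron bound to a nucleus of charge z."""
    return z ** 2 * E_HYDROGEN


def nuclear_repulsion_ev(r_angstrom):
    """Coulomb repulsion energy (eV) of two unit positive charges r_angstrom apart."""
    r_m = r_angstrom * 1.0e-10
    return K_COULOMB * E_CHARGE ** 2 / r_m / JOULE_PER_EV


united_atom = hydrogenlike_ground_ev(2)          # limit when the nuclei overlap
total_at_minimum = E_HYDROGEN - 2.7              # minimum lies 2.7 eV below H + H+
repulsion = nuclear_repulsion_ev(1.1)            # at the equilibrium separation
electron_energy = total_at_minimum - repulsion   # inferred electron energy

print(f"united-atom limit:           {united_atom:.1f} eV")
print(f"total energy at the minimum: {total_at_minimum:.1f} eV")
print(f"nuclear repulsion at 1.1 A:  {repulsion:.1f} eV")
print(f"inferred electron energy:    {electron_energy:.1f} eV")
```

The inferred electron energy lies between the separated-atom value of -13.6 eV and the united-atom limit of -54.4 eV. This matches the picture in the text: the electron energy falls as the nuclei approach but is bounded below, while the repulsion grows without limit.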
Figure 12-3 The total energy of the $\mathrm{H}_2^+$ molecule for the two lowest electron energy levels, as a function of the internuclear separation. The molecule binds only in the state where the electron eigenfunction is even.

Figure 12-4 The total energy of the $\mathrm{H}_2$ molecule for "parallel" and "antiparallel" electron spins, as a function of the internuclear separation $R$ (in Å). The upper curve is for "parallel" spins and an antisymmetric space eigenfunction; the lower curve is for "antiparallel" spins and a symmetric space eigenfunction. The molecule binds only in the state where the electron spins are "antiparallel".

The second electron in $\mathrm{H}_2$ goes into a quantum state whose eigenfunction has the same space properties as the eigenfunction for the first electron. That is, in the lowest energy state of $\mathrm{H}_2$ both electrons are in a state with the same space eigenfunction, and that eigenfunction is even with respect to reflection in the plane halfway between the two nuclei. So for both electrons the probability density shows some concentration in the region between the two nuclei. Of course the exclusion principle demands that the two electrons have different spin eigenfunctions; thus they have spins with opposite z components. Using the more precise terms of Section 9-3, the eigenfunction describing the system of two indistinguishable electrons is a product of a symmetric space eigenfunction and the antisymmetric (i.e., singlet) spin eigenfunction. In that section we found that the two electrons may be relatively close together when the system is described by such an eigenfunction. Of course this is consistent with the idea that both have a reasonable chance of being located near the point halfway between the nuclei.

Because of the complete space overlap of the wave functions of the indistinguishable electrons in $\mathrm{H}_2$, it is definitely not possible to associate a particular electron with a particular atom of the molecule.
Instead, the two electrons, which are responsible for the bond that holds the atoms together as a molecule, are shared by the molecule, or shared by the bond itself. This is the idea of the shared pair of electrons, with "antiparallel" spins, that form a covalent bond. Note that if the two electrons had essentially parallel spins they could not both be in the region between the two nuclei. Then they could not both be where they optimize the attraction exerted on them by both nuclei.

If we imagined trying to form $\mathrm{H}_2$ by bringing two separated H atoms together, it would make a decisive difference whether the electrons' spins were "parallel" or "antiparallel." In Figure 12-4 we show the prediction of quantum mechanics for the total energy of the system as a function of internuclear separation in the two possibilities; binding is obtained only for "antiparallel" spins. The calculations that produced the curves in Figure 12-4 take into account the electron-electron repulsion. This has a quantitative effect in reducing the binding, but it does not make a qualitative change in the description we have presented of the origin of the covalent bond.

No more than two electrons can form one covalent bond. We say an electron from one atom pairs up with an electron of "antiparallel" spin from another atom. If an atom has several electrons in an uncompleted outer subshell, i.e., if it has several valence electrons, each may try to form a covalent bond with a valence electron in a nearby atom. However, if there are two valence electrons with "antiparallel" spins in one atom, an additional valence electron from another atom will not succeed in forming a bond with either of them since they are already paired with each other. That is, if the spin of the additional electron is "antiparallel" to the spin of one of these electrons, it is "parallel" to the spin of the other.
Since the exclusion principle acts in the molecule in such a way as to prevent two electrons with "parallel" spins from having the same space eigenfunction, the additional electron may not occupy the same energetically favorable molecular region as the electrons of the preexisting pair. Therefore the valence electrons of an atom that are effective in forming covalent bonds are those which the action of the exclusion principle in the atom has not already forced into pairs with "antiparallel" spins. For instance, in the Hartree theory all of the three 2p electrons in N can have "parallel" spins because there are three possible values of the quantum number $m_l$ for $l = 1$, so none of them are forced to pair in that atom. (In the residual Coulomb interaction theory the three electrons do have "parallel" spins in the ground state of the LS coupling atom N.) The result is that the molecule $\mathrm{N}_2$ has three covalent bonds. But O has a fourth electron in the 2p subshell, and the exclusion principle forces it to have its spin "antiparallel" to the spin of one of the other three. So there are only two unpaired valence electrons in O, and the molecule $\mathrm{O}_2$ has only two covalent bonds. In general, the number of unpaired valence electrons equals the number of electrons in the subshell up to the point where it is half filled, and it equals the number of vacancies, or holes, in the subshell beyond that point.

As in ionic binding, the forces saturate in covalent binding. That is, a given atom strongly interacts with only a limited number of other atoms. Saturation is due to the limited number of electrons or vacancies in the outermost occupied subshell of the atom. As distinguished from the ionic bond, the covalent bond is directional.
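The counting rule for unpaired valence electrons stated above (electrons up to half filling, holes beyond it) can be written as a small function. This is only an illustrative sketch: the function name and its arguments are hypothetical, and it assumes a single partly filled subshell of orbital quantum number $l$ holding at most $2(2l + 1)$ electrons.

```python
def unpaired_valence_electrons(n_electrons, l):
    """Count unpaired electrons in one subshell of orbital quantum number l.

    Rule from the text: the count equals the number of electrons up to
    half filling, and the number of vacancies (holes) beyond that point.
    """
    capacity = 2 * (2 * l + 1)          # 2 spin states for each of the 2l+1 values of m_l
    if not 0 <= n_electrons <= capacity:
        raise ValueError("subshell cannot hold that many electrons")
    half_filled = capacity // 2
    if n_electrons <= half_filled:
        return n_electrons              # every electron can keep a "parallel" spin
    return capacity - n_electrons       # beyond half filling, count the holes


# N has three 2p electrons, O has four (l = 1 for a p subshell).
print("N:", unpaired_valence_electrons(3, 1))
print("O:", unpaired_valence_electrons(4, 1))
```

For the two cases discussed in the text this gives three unpaired electrons for N and two for O, matching the three bonds of $\mathrm{N}_2$ and the two bonds of $\mathrm{O}_2$.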
The directional property is not present in $\mathrm{H}_2$ since the probability density of the valence electron in each separated H atom is spherically symmetrical, so the only defined direction in the $\mathrm{H}_2$ molecule is the one connecting the two nuclei, and the covalent bond acts along that direction, whatever it may be. In a more typical case the probability density of a valence electron has its own directional dependence and certain preferred directions for forming covalent bonds. The directional properties of covalent bonds are manifested in the structural properties of covalently bonded molecules, and so form the basis of organic chemistry.

The charge distribution of the paired electrons in a covalent bond has a symmetry about the center of the molecule, as we discussed in the case of $\mathrm{H}_2$, so there is no permanent electric dipole moment associated with the covalent bond. The bond is therefore sometimes called homopolar. Because the binding in molecules other than those containing two identical nuclei may be partly ionic, even though principally covalent, only molecules such as $\mathrm{O}_2$ or $\mathrm{N}_2$ are strictly homopolar.

12-4 MOLECULAR SPECTRA

Molecules can remain bound in excited states as well as in the ground state. The emission and absorption spectra of molecules are due to transitions between allowed energy states. The energy-level scheme is relatively complicated and differs in many respects from the atomic case. For one thing, we can no longer classify states according to the electronic orbital angular momentum. Because the force on an electron is not a central force (in a diatomic molecule, e.g., there are two separated nuclear attracting centers), the magnitude of its orbital angular momentum $L$ is not conserved. In the words of Section 7-9, the energy eigenfunctions are not eigenfunctions of the operator $L^2_{\mathrm{op}}$.
However, in a diatomic molecule the total charge distribution is symmetrical about an axis connecting the nuclei, say the z axis, so that the component of angular momentum about this axis, $L_z$, is conserved. We find then that the molecular energy eigenfunctions are eigenfunctions of $L_{z,\mathrm{op}}$ and that $L_z$ has allowed values which are integral multiples of $\hbar$, in analogy to the values $m_l\hbar$ of atomic states.

Another difference between the molecular and atomic cases is that we could neglect the nuclear motion in an atom, or else we could take it into account easily by using the reduced electron mass. Of course, in a molecule, as well as in an atom, we do not need to consider the translational motion because that motion, being free particle motion, is not quantized. However, the nuclei in a molecule can move relative to one another. In a diatomic molecule, for example, the nuclei can vibrate about the equilibrium separation, and in addition the whole system can rotate about its center of mass. The energy in each of these motions, vibrational and rotational, is quantized so that we expect many more energy levels in a molecule than in an atom. Indeed, these motions interact or couple with one another and an exact analysis would have to take this into account.

Of course, the solution of the Schroedinger equation for any but the simplest molecules is very difficult. However, empirical results of molecular spectroscopy show that we can consider the energy of a molecule to be made up of three principal parts: electronic, vibrational, and rotational. The molecular energy levels fall into widely separated groups, each group being said to correspond to a different electronic state of the molecule. For a given electronic state the levels again fall into groups separated by nearly equal energy intervals; these are said to correspond to successive states of vibration of the nuclei. Within a vibrational state is a fine structure of levels ascribed to different states of rotation of the molecules. This level structure (which will be discussed later in connection with Figure 12-9) suggests that we can obtain an approximate solution to the Schroedinger equation by separating it into three equations, one describing the motion of the electrons, one the vibration of the nuclei, and one the rotation of the nuclei. In the next approximation we can take into account the coupling between the electronic and the nuclear motions, such as that between the electronic angular momentum and the rotation of the molecule, and the coupling between the nuclear vibrational and rotational motions.

The spectrum emitted by a molecule can be divided into three spectral ranges corresponding to the different types of transitions between molecular quantum states. In the far infrared we observe the rotation spectra, corresponding to radiation emitted in transitions between rotational states of a molecule having an electric dipole moment. In the near infrared we observe the vibration-rotation spectra, corresponding to radiation emitted in vibrational transitions of molecules having electric dipole moments, within which there are changes in rotational states as well. In the visible and ultraviolet part of the spectrum we observe electronic spectra, corresponding to radiation emitted in electronic transitions. The electronic vibrations undergo many cycles in the time required for the nuclear configuration to change (this being the physical reason that permits us to separate the eigenfunction into an electronic and nuclear factor to begin with), so that the electronic spectra have a fine structure determined by the rotational and vibrational state of the nuclei during electronic transitions. In the succeeding sections we shall examine the motion and spectra of diatomic molecules and from this extract valuable information about their properties.

12-5 ROTATIONAL SPECTRA

The rotational motion of a diatomic molecule can be visualized as the rotation of a rigid body about its center of mass, illustrated in Figure 12-5. The center of mass lies on the axis connecting the nuclei, and the angular momentum associated with the rotation is a vector passing through the center of mass on the axis of rotation perpendicular to the internuclear axis. Rotation about the internuclear axis itself is negligible. The rotational inertia, or moment of inertia, about the axis of rotation due to the nuclei is $I = \mu R_0^2$, where $R_0$ is the (equilibrium) separation of the nuclei and $\mu$ is the reduced mass of the system.

Figure 12-5 Top: A simplified picture of a diatomic molecule consisting of two masses $m_1$ and $m_2$ rotating about their common center of mass (CM) with separation $R_0$. Bottom: A dynamically equivalent model consisting of a reduced mass $\mu = m_1 m_2/(m_1 + m_2)$ rotating at distance $R_0$ about a fixed point. If $v$ is the speed of the reduced mass $\mu$, then its kinetic energy of rotation is $E_r = \mu v^2/2$ and its angular momentum is $L = \mu v R_0$. So $E_r = \mu L^2/2\mu^2 R_0^2 = L^2/2\mu R_0^2 = L^2/2I$, where $I = \mu R_0^2$ is its rotational inertia, or moment of inertia.

As is proven in the caption to Figure 12-5, the rotational energy is, classically, $E_r = L^2/2I$, where $L$ is the angular momentum of the system about the axis of rotation. Quantization of the magnitude of the angular momentum gives $L^2 = r(r + 1)\hbar^2$ with the rotational quantum number $r = 0, 1, 2, \ldots$
, so that

$$E_r = \frac{\hbar^2}{2I}\, r(r + 1) \tag{12-1}$$

Successive rotational levels will be separated in energy by

$$\Delta E_r = E_r - E_{r-1} = \frac{\hbar^2}{2I}\,[r(r + 1) - (r - 1)r] = \frac{\hbar^2}{I}\, r \tag{12-2}$$

The quantity $\hbar^2/I$ for the typical molecule has a value of about $10^{-4}$ eV to $10^{-3}$ eV, so little energy is needed to raise a molecule to an excited rotational state. At room temperature, for example, the translational thermal energy of molecules is about $2.5 \times 10^{-2}$ eV, so that ordinary collisions can transfer the necessary energy of excitation. At any given temperature the rotational state populations obey the Boltzmann distribution, since they are spread over many states so each population is small.

If the molecule has a permanent electric dipole moment, as do all diatomic molecules that do not have identical nuclei, rotational emission and absorption spectra may be observed. The emission of radiation is due to the rotation of the electric dipole, and the absorption of radiation is due to the interaction of this dipole with the electric field of the incident radiation. For electric dipole radiation, the allowed transitions between states are given by the selection rule analogous to that for orbital angular momentum in atomic transitions, namely $\Delta r = \pm 1$. The spectral wavelengths $\lambda$ follow from (12-2), and $\Delta E_r = h\nu$. That is

$$\frac{\hbar^2}{I}\, r = \frac{hc}{\lambda} \qquad \text{or} \qquad \frac{1}{\lambda} = \frac{\hbar}{2\pi I c}\, r \tag{12-3}$$

in which $r$ is the quantum number of the upper rotational state. With $\Delta r = \pm 1$, the separation between spectral lines (in terms of reciprocal wavelength) then is $\Delta(1/\lambda) = \hbar/2\pi I c$, a constant. This is illustrated in Figure 12-6. Measurement of the separation gives the value of $I$, the rotational inertia of the molecule, and from this we can estimate the value of the equilibrium internuclear separation $R_0$. In the case of HCl, for example, we find $\hbar/2\pi I c = 2079.4\ \mathrm{m}^{-1}$, which gives $I = 2.66 \times 10^{-47}$ kg-m$^2$; from the known masses of H and Cl we then obtain $R_0 = 1.27 \times 10^{-10}$ m as a measure of the separation of the atoms in the molecule.

Figure 12-6 Top: Schematic energy-level diagram for the rotational energy states of a diatomic molecule, and the corresponding frequency emission spectrum for allowed transitions. Bottom: The rotational absorption spectrum for gaseous HCl, giving the percent absorption versus a measure of the reciprocal wavelength (the grating setting, in degrees).

Pure rotational spectra fall in the extreme infrared or the microwave regions, the corresponding wavelengths $\lambda$ being about 1 mm to 1 cm. An example is shown in Figure 12-6. Diatomic molecules with identical nuclei, like $\mathrm{O}_2$, having no permanent electric dipole moment, do not exhibit pure rotational spectra.

Example 12-2.

(a) Find the ratio of $n_r$, the number of molecules in rotational level $r$, to $n_0$, the number in the $r = 0$ level, in a sample in equilibrium at temperature $T$.

From the Boltzmann factor we have

$$\frac{n_r}{n_0} = \frac{\mathcal{N}_r}{\mathcal{N}_0}\, e^{-(E_r - E_0)/kT}$$

in which the $\mathcal{N}$'s are the degeneracy factors, or number of degenerate quantum states for each energy level. For energy $E_r$ there are $2r + 1$ states, corresponding to the number of possible values of the z component quantum number $m_r$ associated with each value of $r$. Hence, $\mathcal{N}_r = 2r + 1$ and $\mathcal{N}_0 = 1$, so that

$$\frac{n_r}{n_0} = (2r + 1)\, e^{-(E_r - E_0)/kT}$$

(b) Show that the population of rotational energy levels first increases with $r$ and then decreases as $r$ continues to increase.

From (12-1) we have $E_r = (\hbar^2/2I)\, r(r + 1)$ and $E_0 = 0$, so that

$$n_r = n_0\, (2r + 1)\, e^{-(\hbar^2/2IkT)\, r(r + 1)}$$

Now as $r$ increases the factor $2r + 1$ increases whereas the exponential factor decreases. For small $r$ the factor $2r + 1$ dominates, so that at first $n_r$ increases with $r$; for large $r$ the exponential term dominates, so that it soon suppresses the increase and $n_r$ decreases for larger $r$.
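As a numerical illustration of the population formula in part (b), the following sketch evaluates $n_r/n_0$ for the first few rotational levels. The inputs are assumptions chosen for illustration: the HBr value $\hbar^2/2I = 1.06 \times 10^{-3}$ eV is taken from Table 12-1, $kT = 0.0258$ eV corresponds to about 300 °K, and the function and variable names are hypothetical.

```python
import math


def relative_population(r, rot_const_ev, kt_ev):
    """Return n_r / n_0 = (2r + 1) * exp(-B r (r + 1) / kT), with B = hbar^2 / 2I."""
    return (2 * r + 1) * math.exp(-rot_const_ev * r * (r + 1) / kt_ev)


# Assumed inputs: B for HBr (Table 12-1) and kT at roughly room temperature.
B_HBR_EV = 1.06e-3
KT_ROOM_EV = 0.0258

ratios = [relative_population(r, B_HBR_EV, KT_ROOM_EV) for r in range(12)]
r_peak = max(range(len(ratios)), key=lambda r: ratios[r])

for r, ratio in enumerate(ratios):
    print(f"r = {r:2d}   n_r/n_0 = {ratio:.2f}")
print("most populated level: r =", r_peak)
```

The printed ratios rise while the degeneracy factor $2r + 1$ wins and fall once the Boltzmann factor takes over, which is the behavior argued for in part (b).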
For example, for HBr at room temperature $n_r$ is a maximum at $r = 3$ with $n_3/n_0 \simeq 4$, whereas by $r = 9$ we have $n_9/n_0 \simeq 1/2$.

(c) Relate these populations to the intensities of the rotational lines.

Consider the absorption spectrum. The probability that a particular frequency will be absorbed is proportional to the number of molecules in the initial rotational energy level. Hence the intensity variations of the absorption lines ($\Delta r = +1$) are proportional to the populations of the initial rotational energy levels (see Figure 12-6). The student should construct a similar argument for the emission spectrum.

12-6 VIBRATION-ROTATION SPECTRA

The nuclei do not maintain a fixed separation, of course, as we assumed previously, so that the molecule is not like a rotating rigid body except in approximation. Indeed, the rotational inertia $I$ changes from the value assumed previously when the molecule rotates because of the stretching of the internuclear distance. Also the nuclei vibrate about some equilibrium separation and this vibrational motion is quantized.

Let us now consider the vibrational motion. For a given electronic configuration, we have a potential energy curve whose minimum is at an equilibrium separation $R_0$. Near $R_0$ the curve is nearly a parabola so that small oscillations are simple harmonic. According to (6-89) the energy of such oscillations is quantized to satisfy

$$E_v = (v + 1/2)\, h\nu_0 \tag{12-4}$$

with the vibrational quantum number $v = 0, 1, 2, 3, \ldots$, and where the classical vibration frequency is $\nu_0 = (1/2\pi)\sqrt{C/\mu}$. Note that the energy levels here are equally spaced and that there is a zero-point energy $(1/2)h\nu_0$. The separation $h\nu_0$ equals 0.04 eV for NaCl and, because the dissociation energy is about 1 eV, there are approximately 20 vibrational levels in the potential well. Actually as the energy rises the potential energy curve becomes anharmonic so that the levels are not equally separated but get somewhat closer to one another. The rotational levels are spaced much closer still, as we saw earlier, there being about 40 rotational levels of NaCl, and about 50 of HCl, between each pair of vibrational levels.

Example 12-3.

(a) Given that the equivalent force constant $C$ of a vibrating HCl molecule is about 470 nt/m, estimate the energy difference between the lowest and the first vibrational state of HCl.

We have for HCl

$$\mu = \frac{35}{36}\, m_H \qquad \text{and} \qquad C = 470\ \text{nt/m}$$

and also

$$m_H = \frac{1}{6.02 \times 10^{23}}\ \text{g} = \frac{1}{6.02 \times 10^{26}}\ \text{kg}$$

From (12-4) we have that $\Delta E = h\nu_0$, where $\nu_0 = (1/2\pi)\sqrt{C/\mu}$. Hence, using these data, we get the energy difference to be $h\nu_0 = (h/2\pi)\sqrt{C/\mu} = 0.59 \times 10^{-19}$ joule $= 0.37$ eV.

(b) Given that the rotational inertia of HCl has the value $I = 2.66 \times 10^{-47}$ kg-m$^2$, estimate the energy difference between the lowest and first excited rotational state of HCl.

Since $E_r = (\hbar^2/2I)\, r(r + 1)$, the lowest rotational state has an energy $E_0 = 0$ and the first excited rotational state has an energy $E_1 = (\hbar^2/2I) \cdot 2 = \hbar^2/I$. The required energy difference then is $\Delta E = \hbar^2/I$. Hence

$$\frac{\hbar^2}{I} = \frac{(6.63 \times 10^{-34}\ \text{joule-sec})^2}{(2\pi)^2 \times 2.66 \times 10^{-47}\ \text{kg-m}^2} = 4.2 \times 10^{-22}\ \text{joule} = 2.6 \times 10^{-3}\ \text{eV}$$

Thus the energy difference between the two lowest vibrational levels is greater by a factor of about 142 (i.e., $0.37/2.6 \times 10^{-3}$) than that between the two lowest rotational levels in HCl.

(c) At room temperature, collisions of HCl molecules in a gas can transfer sufficient kinetic energy to internal energy to excite many rotational states. At what temperature would the number of molecules in the first excited vibrational state be equal to $1/e$ (about 37%) of the number in the ground vibrational state?

We have

$$\frac{n_1}{n_0} = \frac{\mathcal{N}_1}{\mathcal{N}_0}\, e^{-(E_1 - E_0)/kT}$$

where the subscripts refer to $v = 1$ or $v = 0$. The vibrational states are not degenerate so that $\mathcal{N}_1 = 1 = \mathcal{N}_0$. Also $(E_1 - E_0) = h\nu_0$ so that

$$\frac{n_1}{n_0} = e^{-h\nu_0/kT}$$

and if $kT = h\nu_0$, then $n_1 = n_0 e^{-1}$. Hence

$$T = \frac{h\nu_0}{k} = \frac{0.59 \times 10^{-19}\ \text{joule}}{1.38 \times 10^{-23}\ \text{joule/°K}} \simeq 4300\ \text{°K}$$

is the temperature at which the number of HCl molecules in the first excited vibrational state is about 37% of the number in the ground state. Clearly the number of HCl molecules in the $v = 1$ state at room temperature is negligible compared to the number in the ground state.

If the molecule, like HCl or NaCl, has a permanent electric dipole moment at the equilibrium internuclear separation, it will exhibit vibrational emission and absorption spectra due to the oscillations in the electric dipole moment arising from oscillations in the nuclear separation. The selection rule for electric dipole transitions is $\Delta v = \pm 1$ so that $\Delta E_v = h\nu_0$. The resulting spectral lines lie in the infrared, between 8000 Å and 50,000 Å for most molecules. Diatomic molecules with identical nuclei do not have vibrational spectra because they have no electric dipole moment at any nuclear separation.

In a vibrational transition the molecule may also change its rotational state so that vibrational changes really result in a combined vibration-rotation spectrum. The vibrational transition determines the wavelength region of the spectrum and the rotational transitions determine the separation of the lines. The spectrum consists of a band of lines, as in Figure 12-7.

Among the interesting results that can be obtained from analysis of vibrational states and spectra are the relative abundances of nuclear isotopes. The frequency of vibration, $\nu_0 = (1/2\pi)\sqrt{C/\mu}$, depends on the masses of the atoms in the molecule through the reduced mass $\mu$. If in a sample of HCl molecules, for example, the isotopes $\mathrm{Cl}^{35}$ and $\mathrm{Cl}^{37}$ are each present, then the vibrational frequencies and resulting energy levels will be slightly different for the two types of molecule (see Figure 12-7).
Their spectral lines, consequently, will be shifted with respect to one another, and from a measurement of spectral intensities we can obtain the relative abundance of the isotopes $\mathrm{Cl}^{35}$ and $\mathrm{Cl}^{37}$.

Figure 12-7 Top: Energy-level diagram for vibrational and rotational states of a diatomic molecule, showing allowed transitions and the formation of a band of equally spaced lines, as indicated in the spectrum below. Note that all $\Delta r = 0$ transitions would yield photons of the same frequency $\nu_0$, but being forbidden, that line is missing in the spectrum. Bottom: A recorder trace of the vibration-rotation absorption spectrum in HCl, plotted against reciprocal wavelength (2700 to 3000 cm$^{-1}$). Again note that the central transition is missing. The slightly different frequencies at each absorption line are due to the presence of two isotopes of chlorine.

Figure 12-8 The energy for $\mathrm{H}_2$, HD, and $\mathrm{D}_2$ is the same function of the internuclear separation $R$. But the ground state vibrational energy $\delta$ differs for each molecule.

In a somewhat related way we obtain experimental evidence for the finite zero-point energy of an oscillator. Consider the molecules $\mathrm{H}_2$, HD, and $\mathrm{D}_2$ in which D stands for a deuterium atom. Because the electric forces are identical in all cases we obtain for all the same potential energy curve $V(R)$, illustrated in Figure 12-8. The energy required to dissociate the molecule is $E_d = V_0 - \delta$. If the ground state energy $\delta$ were zero, then the dissociation energies would be the same, $E_d = V_0$, for each type of molecule. Quantum theory gives a finite zero-point energy, namely $\delta = (1/2)h\nu_0$. However, because the reduced mass $\mu$ enters the formula for $\nu_0$, $\delta$ has a different value for each type of molecule so that their dissociation energies should differ.
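The size of the isotope effect on the zero-point energy can be estimated with a short sketch. It assumes that the force constant is the same for all three molecules (as the text argues), that the deuteron has twice the proton mass so that $\mu_{HD} = (4/3)\mu_{H_2}$ and $\mu_{D_2} = 2\mu_{H_2}$, and it takes the $\mathrm{H}_2$ vibrational wavenumber 4395 cm$^{-1}$ from Table 12-1. The constant and function names are hypothetical.

```python
import math

HC_EV_CM = 1.240e-4   # h*c in eV*cm, so E(eV) = HC_EV_CM * wavenumber(cm^-1)
NU_H2_CM = 4395.0     # vibrational wavenumber of H2 from Table 12-1, in cm^-1

# Assumed reduced masses relative to H2 (deuteron mass taken as twice the proton mass).
REDUCED_MASS_RATIO = {"H2": 1.0, "HD": 4.0 / 3.0, "D2": 2.0}


def scaled_wavenumber(mass_ratio, reference_cm=NU_H2_CM):
    """nu_0 is proportional to 1/sqrt(mu) when the force constant C is unchanged."""
    return reference_cm / math.sqrt(mass_ratio)


def zero_point_energy_ev(wavenumber_cm):
    """Zero-point energy delta = (1/2) h nu_0, expressed in eV."""
    return 0.5 * HC_EV_CM * wavenumber_cm


zero_point = {
    name: zero_point_energy_ev(scaled_wavenumber(ratio))
    for name, ratio in REDUCED_MASS_RATIO.items()
}

for name, ratio in REDUCED_MASS_RATIO.items():
    print(f"{name}: nu_0/c = {scaled_wavenumber(ratio):7.1f} cm^-1, "
          f"delta = {zero_point[name]:.4f} eV")
```

The scaled wavenumbers for HD and $\mathrm{D}_2$ come out within about 10 cm$^{-1}$ of the tabulated values, and the differences between the zero-point energies are the predicted differences in dissociation energy.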
In fact, with $\mu_{D_2} = 2\mu_{H_2}$ and $\mu_{HD} = (4/3)\mu_{H_2}$ we can predict the difference, and we find that the observed dissociation energies differ exactly as predicted, thereby verifying the existence of a zero-point energy in agreement with the requirements of the uncertainty principle.

In Table 12-1 we list the rotational and vibrational constants of some diatomic molecules.

Table 12-1 Rotational and Vibrational Constants of Some Diatomic Molecules

Molecule    R0 (Å)    ν0/c (cm^-1)    ħ²/2I (eV)
H2          0.74      4395            7.56 x 10^-3
HD          0.74      3817            5.69 x 10^-3
D2          0.74      3118            3.79 x 10^-3
Li2         2.67      351             8.39 x 10^-5
N2          1.09      2360            2.48 x 10^-4
O2          1.21      1580            1.78 x 10^-4
LiH         1.60      1406            9.27 x 10^-4
HCl^35      1.27      2990            1.32 x 10^-3
NaCl^35     2.51      380             2.36 x 10^-5
KCl^35      2.79      280             1.43 x 10^-5
KBr^79      2.94      231             9.1 x 10^-6
HBr^79      1.41      2650            1.06 x 10^-3

12-7 ELECTRONIC SPECTRA

The rotational and vibrational states in molecules are due to the motion of the nuclei. There can also be electronic excited states, of course. For each of the electronic states, corresponding to different electron configurations, there is a different dependence of the molecule's energy on its internuclear separation. Because the atoms are more loosely bound in the excited states, the curves representing the molecule's potential energy as a function of nuclear separation become shallower and broader, and the equilibrium separation $R_0$ increases, with increasing electronic excitation, as illustrated in Figure 12-9.

Figure 12-9 Illustrating the molecular energy versus internuclear separation curves for two electronic states. Each electronic state has its own set of vibrational levels, and each vibrational level has its own set of rotational levels.
The energy separation between different electronic states is from 1 to 10 eV, so that transitions between electronic states give radiation in the visible or ultraviolet portion of the electromagnetic spectrum. To each electronic state $E_e$ there are many bound vibrational states of energy $E_v$, and to each vibrational state there are many bound rotational states of energy $E_r$. Neglecting interactions between these modes, we can write the total energy as $E = E_e + E_v + E_r$. The energies of all three modes may change in an electronic transition so that in general we can write

$$\Delta E = \Delta E_e + (E_v' - E_v'') + (E_r' - E_r'') \tag{12-5}$$

The initial (primed) and final (double-primed) vibrational and rotational states differ in their binding so that the equilibrium spacing, the rotational inertia, and the fundamental vibrational frequency change. A great many transitions are possible and they produce a complex spectrum of lines, which appear in a series of bands as illustrated in Figure 12-10. Hence the term band spectra. The term $\Delta E_e$ is the energy difference of the minima of the two electronic states. The vibrational term is $E_v' - E_v'' = (v' + 1/2)h\nu_0' - (v'' + 1/2)h\nu_0''$, and the rotational term is $E_r' - E_r'' = (\hbar^2/2I')\, r'(r' + 1) - (\hbar^2/2I'')\, r''(r'' + 1)$. For a given electronic transition the spectrum consists of bands, where each band corresponds to given values of $v'$ and $v''$ and all possible values of $r'$ and $r''$.

Figure 12-10 Top: Energy-level diagram and transitions leading to the formation of an electronic band. Unlike Figure 12-7, the band spectrum indicated folds back on itself, giving rise to a band head at the right end of the spectrum. Again note that the transition of frequency $\nu_0$ is missing. Bottom: Bands of the CN and $\mathrm{C}_2$ molecules in a carbon arc in air; the photograph is labeled with the $\mathrm{C}_2$ Swan bands, the CN (Red) bands, and the CN (Violet) bands. (From Herzberg, Spectra of Diatomic Molecules, 1950. D. Van Nostrand Co., Inc., New York)

The selection rules determine the possible combinations of values of $v'$, $v''$, and $r'$, $r''$. The rotational selection rule here is $\Delta r = 0, \pm 1$ for electric dipole radiation. This rule is broader than for pure rotation in that $\Delta r = 0$ is now allowed. The reason is that the change in the electronic configuration accompanying the rotational change eliminates the parity considerations which earlier excluded $\Delta r = 0$ (see Section 8-7). The vibrational selection rule for electric dipole radiation is $\Delta v = \pm 1$ for a simple harmonic oscillator. If, however, the potential deviates from the simple harmonic, i.e., if it is anharmonic, then $\Delta v = \pm 2, \pm 3, \ldots$, etc., are also allowed. These vibrational rules apply only if the electronic state does not change and they apply to pure vibration-rotation bands. If there is a change in electronic state then the selection rules are determined from the so-called Franck-Condon principle, which we explain next.

We have seen that there is little interaction between the electronic motion and the nuclear motion in a molecule. Furthermore, the characteristic time for an electronic transition is $\Delta t \simeq 10^{-16}$ sec, whereas for a nuclear vibration the time has the much longer value $\Delta t \simeq 10^{-13}$ sec. As a result the internuclear distance stays about the same during an electronic transition, and a vertical line (a line of constant $R$) in Figure 12-9 accurately represents such a transition. If the upper state corresponds to $v' = 0$, then the probability distribution function for the oscillator is large only near the equilibrium separation, and an electronic transition to the lower state leaves the molecule at about the point P on the potential curve in that figure.
This corresponds to $v'' = 7$ for the lower state. Notice that classically the nuclei have small kinetic energy in each case, because $v' = 0$ initially, and because P corresponds to the end point of the vibrational motion for $v'' = 7$. This meets the requirement that the relative nuclear velocity be about the same in both states at the time of a transition in order that the nuclear motion be able to adjust quickly to the new electronic conditions. Transitions are most favorable under these conditions. Quantum mechanically we get the same result because in the ground state of an oscillator, as in $v' = 0$, the maximum amplitude of the eigenfunction occurs at the center of the motion, whereas for the upper states, such as in $v'' = 7$, the eigenfunction has maximum amplitude near the ends of the oscillation. Since the integral in the electric dipole matrix element, (8-42), that determines the relative intensities, or selection rules, involves a product of the eigenfunctions of the upper and lower states, the intensities will be large only where both these eigenfunctions have significant space overlap. In general, the most favored transitions are those which, from a classical point of view, can occur with the internuclear distance for both initial and final states the same and the nuclei at end points of their oscillations. Examples in Figure 12-9 are shown by vertical lines from $v' = 5$ to $v'' = 2$ or $v'' = 11$. These rules were deduced by Franck from classical considerations and put on a firm quantum mechanical basis by Condon.

If the excited electronic state is not bound, the molecule dissociates. Because such unbound states have a continuum of possible energies, the corresponding spectrum gives a continuous band. The appearance of a continuum in the absorption spectrum of a molecule is therefore experimental evidence for photochemical dissociation.
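The overlap argument behind the Franck-Condon principle can be made concrete with a toy calculation. The sketch below is not the book's calculation; it assumes two harmonic potential curves with the same vibrational frequency, a dimensionless coordinate $q$ in which the classical turning point of level $n$ is at $q = \pm\sqrt{2n + 1}$, and an equilibrium shift of 3.9 between the two electronic states, chosen so that the turning point of $v'' = 7$ ($\sqrt{15} \approx 3.87$) lies near the upper-state equilibrium. All names and parameter values are hypothetical.

```python
import math


def oscillator_states(q, n_max):
    """Normalized harmonic-oscillator eigenfunctions psi_0 .. psi_n_max at the point q."""
    psi = [math.pi ** -0.25 * math.exp(-0.5 * q * q)]
    if n_max >= 1:
        psi.append(math.sqrt(2.0) * q * psi[0])
    for n in range(1, n_max):
        # Standard upward recursion in n for the normalized eigenfunctions.
        psi.append(math.sqrt(2.0 / (n + 1)) * q * psi[n]
                   - math.sqrt(n / (n + 1)) * psi[n - 1])
    return psi


def franck_condon_factors(shift, n_max=14, q_min=-12.0, q_max=16.0, step=0.01):
    """Squared overlaps between the v' = 0 state of the shifted (upper) well
    and the states v'' = 0 .. n_max of the unshifted (lower) well."""
    overlaps = [0.0] * (n_max + 1)
    n_steps = int(round((q_max - q_min) / step))
    for i in range(n_steps + 1):
        q = q_min + i * step
        upper_ground = math.pi ** -0.25 * math.exp(-0.5 * (q - shift) ** 2)
        lower = oscillator_states(q, n_max)
        for n in range(n_max + 1):
            overlaps[n] += upper_ground * lower[n] * step   # simple Riemann sum
    return [s * s for s in overlaps]


SHIFT = 3.9   # assumed displacement between the two equilibrium positions
factors = franck_condon_factors(SHIFT)
favored = max(range(len(factors)), key=lambda n: factors[n])

for n, f in enumerate(factors):
    print(f"v'' = {n:2d}   overlap^2 = {f:.4f}")
print("most favored final level: v'' =", favored)
```

Under these assumptions the largest squared overlap falls at $v'' = 7$, the level whose classical end point sits under the upper-state equilibrium, which is the situation described for the point P in Figure 12-9. In a real molecule the two curves also differ in frequency and shape, so the numbers here are only qualitative.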
12-8 THE RAMAN EFFECT

An interesting effect which gives much information about molecular quantum states was discovered experimentally in 1928 by Raman. This is the scattering of light by molecules with a frequency change. The student may be familiar with other light scattering processes. In ordinary Rayleigh scattering by molecules, the scattered frequency is the same as the incident frequency. In the fluorescence process, the frequency of the incident light coincides with an absorption frequency of the scattering gas molecules; this is a resonance phenomenon in which the molecule is raised to an excited state and, after a short lifetime there, reemits light at a different frequency. In the Raman effect, the scattered frequency is different from the incident frequency, and the incident frequency is not related to a characteristic frequency of the scattering molecule.

Figure 12-11 Schematic diagram showing the origin of rotational Raman lines on each side of the Rayleigh scattering line.

If the incident radiation is intense and monochromatic with a frequency $\nu$, it is found that the light scattered at right angles to the incident direction contains not only radiation of frequency $\nu$ (Rayleigh scattering), but also weaker radiation of frequency $\nu \pm \nu'$ (Raman scattering). The scattered spectrum therefore has weak Raman lines on each side of the Rayleigh line. If we change the incident frequency, we again find weak lines on each side of the Rayleigh line in the scattered spectrum with the same frequency difference as before. The frequency difference $\nu'$ between the incident and scattered light in the Raman effect is characteristic of transitions in the scattering molecule. During the scattering process the molecule may have its state changed from one allowed energy to another.
To conserve energy in the process the scattered photon must then have an energy different from the incident photon by an amount equal but opposite to the molecular energy change.

Consider a scattering molecule in a rotational state r. In the ordinary rotational spectrum, lines will be found corresponding to transitions with $\Delta r = \pm 1$. In the scattered Raman spectrum, however, we find frequency shifts from the incident frequency that correspond to rotational transitions in the scattering molecule with $\Delta r = \pm 2$. Hence, transitions that are not allowed in the ordinary emission or absorption spectrum are allowed in the Raman process. A quantum mechanical analysis of the Raman process leads to the conclusion that a Raman transition between states $\alpha$ and $\beta$ can occur only if there is a state $\gamma$ such that ordinary transitions are allowed between $\alpha$ and $\gamma$ and between $\beta$ and $\gamma$. It is as though we get from $\alpha$ to $\beta$ by going through $\gamma$. In this case, if $\alpha$ has quantum number r then $\gamma$ has $r \pm 1$. An ordinary transition from $\gamma$ to $\beta$, however, requires another change $\Delta r = \pm 1$, so that the total change in r from $\alpha$ to $\beta$ is $\Delta r = 0, \pm 2$. The $\Delta r = 0$ selection rule gives Rayleigh scattering, and the $\Delta r = \pm 2$ selection rule gives Raman scattering. Hence in the scattered spectrum we have lines on each side of the incident line which are spaced about twice as far apart in frequency as the lines in the ordinary rotational spectrum. This is shown schematically in Figure 12-11.

There is a Raman effect with vibrational states as well. In the process of scattering a photon of frequency $\nu$ a molecule may change its vibrational state. Because $\Delta v = \pm 1$, the final vibrational level of the molecule may be one just above or just below the initial level. Therefore the Raman scattering frequency will be $\nu \pm \nu'$, where the frequency change $\nu'$ is a characteristic vibrational frequency of the molecule.
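The factor of two in the rotational line spacing can be checked with a short sketch. It assumes the rigid-rotor level formula $E_r = B\,r(r+1)$ with $B = \hbar^2/2I$, and works in units of $B$ so that no molecular data are needed; the function names are illustrative only.

```python
def rotational_energy(r, B=1.0):
    """Rigid-rotor level E_r = B r (r + 1), with B = hbar^2 / (2 I) (assumed model)."""
    return B * r * (r + 1)

def ordinary_line(r, B=1.0):
    """Photon energy of the ordinary transition r -> r + 1 (Delta r = +1)."""
    return rotational_energy(r + 1, B) - rotational_energy(r, B)

def raman_shift(r, B=1.0):
    """Magnitude of the Raman energy shift for r -> r + 2 (Delta r = +2)."""
    return rotational_energy(r + 2, B) - rotational_energy(r, B)

ordinary = [ordinary_line(r) for r in range(5)]  # energies 2B(r + 1)
raman = [raman_shift(r) for r in range(5)]       # shifts B(4r + 6)
ordinary_spacing = ordinary[1] - ordinary[0]
raman_spacing = raman[1] - raman[0]
print(ordinary, raman, raman_spacing / ordinary_spacing)
```

In this model adjacent ordinary lines are separated by $2B$ and adjacent Raman lines by $4B$, and the first Raman line is displaced from the Rayleigh line by $6B$.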
At ordinary temperatures, however, most molecules are in the ground vibrational state, $v = 0$, so that the molecule absorbs energy in changing to state $v = 1$. Hence, only the lower frequency line $\nu - \nu'$ appears in the Raman spectrum. However, the higher frequency line $\nu + \nu'$ may be observed if the $v = 1$ level is sufficiently populated so that enough transitions from $v = 1$ to $v = 0$ occur to give detectable intensities. This is more likely the lower the energy of the $v = 1$ state and the higher the temperature of the scattering gas.

As an example of the utility of Raman scattering, consider molecules with two identical nuclei, such as O$_2$ and N$_2$. We cannot directly observe rotational spectra or vibration-rotation spectra for such molecules because they have no electric dipole moment. We can, however, obtain a spectrum corresponding to vibration and rotation of such molecules in the Raman scattering. It is as though the incident radiation polarizes the molecule, thereby inducing an electric dipole moment; this permits absorption and emission of radiation corresponding to rotational and vibrational motions of the molecule. Of course, in an electronic transition in O$_2$ or N$_2$ the fine structure of the spectrum reveals the vibrational and rotational structure, but such a spectrum lies in the ultraviolet and the fine structure is very difficult to resolve. Historically, Rasetti used the Raman spectrum to make the first determination of the rotational inertia, or moment of inertia, of the N$_2$ molecule.

12-9 DETERMINATION OF NUCLEAR SPIN AND SYMMETRY CHARACTER

We have ignored the weaker interactions that enter in the detailed structure of molecular spectra, such as the effect of nuclear spin on the energy states of a molecule. But we cannot ignore a very important effect that nuclear spin has on the spectrum of a molecule even when the spin interaction itself is negligible.
For a diatomic molecule with identical nuclei, the states that can be occupied and the transitions that are allowed are restricted by symmetry requirements. If the nuclear spins are integral (0, 1, 2, ...) then the complete eigenfunction of the molecule must be symmetric with respect to exchange of the labels of the two identical boson nuclei. If the nuclear spins are half-integral (1/2, 3/2, ...) then this eigenfunction must be antisymmetric in an exchange of the labels of the two nuclei because they are identical fermions.

If we neglect the small interactions between the modes associated with the electronic, vibrational, rotational, and nuclear spin behavior of the molecule, we can write the molecular eigenfunction as a product of four factors. Since it is usually the case, we henceforth assume the electronic factor is symmetric in an exchange of the labels of the two nuclei because it is even in a reflection in the plane halfway between them (as in H$_2$). The vibrational factor is always symmetric since it can be written $\psi_v = \psi_v(|x_1 - x_2|)$, where $x_1$ and $x_2$ are the coordinates of the nuclei labeled 1 and 2, measured along their center to center line. That is, the independent variable in the vibrational eigenfunction is the magnitude of the distance between the two identical nuclei. Since this does not change when the nuclear labels are exchanged, $\psi_v$ itself does not change and so is symmetric with respect to the exchange. Thus the symmetry of the molecular eigenfunction is governed by the symmetry of the product of its rotational factor and its nuclear spin factor.

The question of what happens to the sign of the rotational factor $\psi_r$ when we exchange the labels of the identical nuclei is intimately related to the question of what happens to the sign when we change the signs of all the coordinates, providing we are wise enough to choose the origin of coordinates at the center of the molecule (i.e., at its center of mass, halfway between the nuclei).
With this choice, the parity questioning operation of (8-44) $(x \to -x,\ y \to -y,\ z \to -z)$ obviously accomplishes the same thing as the symmetry questioning operation $(1 \to 2,\ 2 \to 1)$, and the symmetry of $\psi_r$ becomes the same as its parity. Furthermore, we can immediately apply the interpretation of (8-47) to determine the parity of $\psi_r$, if we change from the orbital angular momentum quantum number l used there to the rotational quantum number r used here, and conclude that the parity of $\psi_r$ is even if r is even and the parity of $\psi_r$ is odd if r is odd. The justification is that if the rotational angular momentum of the molecule is quantized then there can be no external torques acting on it, so the potential energy function describing the external environment (if any) in which the molecular rotation takes place must be spherically symmetrical about our origin of coordinates; this is the only requirement for the validity of (8-47). Putting it all together, we see that the rotational eigenfunction $\psi_r$ is symmetric if r is even, and antisymmetric if r is odd.

Figure 12-12 Illustrating the relation between the rotational and spin states that can be populated in molecules having symmetric electronic factors with identical half-integral, and integral, spin nuclei. The dots indicate the possible states and the arrows indicate transitions between these states. [Panels: ortho (symmetric spin eigenfunction) and para (antisymmetric spin eigenfunction) level diagrams, one pair for half-integral nuclear spin and one pair for integral nuclear spin.]

Now let us consider a situation in which the nuclear spin angular momentum quantum number i has one of the values i = 1/2, 3/2, 5/2, .... Then the complete molecular eigenfunction must be antisymmetric in a nuclear label exchange.
There are two ways this can come about: (1) either the nuclear spin eigenfunction is antisymmetric and the rotational eigenfunction is symmetric, or (2) the nuclear spin eigenfunction is symmetric and the rotational eigenfunction is antisymmetric. Both possibilities will occur, but not in the same molecule. The reasons are: (1) the symmetry of the nuclear spin eigenfunction factor is determined by the relative orientation of the two nuclear spins (e.g., for i = 1/2, the symmetric case corresponds to the two spins being essentially parallel while the antisymmetric case corresponds to them being essentially antiparallel, exactly as for two electrons with spin quantum number s = 1/2), and (2) the interaction between the nuclear spins is very small so that if the spins have a particular relative orientation, they will maintain it for a very long time (as long as years). Practically, it is as though there are two distinctly different species of molecules. The species with symmetric nuclear spin eigenfunctions is called ortho and the species with antisymmetric nuclear spin eigenfunctions is called para as, for example, orthohydrogen and parahydrogen. The same terminology is used in the same way, whether i is half-integral or integral. But if i is half-integral, the ortho species has only antisymmetric rotational eigenfunctions and the para species only symmetric rotational eigenfunctions, as we have been considering; while if i is integral, the symmetry of the complete molecular eigenfunction is reversed so the ortho species has only symmetric rotational eigenfunctions and the para species has only antisymmetric rotational eigenfunctions.

These relations are summarized in the rotational energy-level diagrams of Figure 12-12. The pair on the left is for molecules whose nuclei have half-integral spin. For the ortho species of such molecules only odd-r rotational states can be populated because the rotational eigenfunction must be antisymmetric, and it is antisymmetric only for odd r.
In the para species only the symmetric rotational states can be populated, and these are the ones for even r. The relations are reversed for molecules with integral spin nuclei, as is indicated in the pair of energy-level diagrams on the right side of Figure 12-12. The dots in the figure show the energy levels that can be populated, and the arrows show the possible transitions between these levels.

Since molecules with two identical nuclei have no electric dipole moments, we cannot directly observe the rotational spectra emitted in such transitions; but we can indirectly observe transitions between rotational states in Raman scattering, or in band spectra, as explained in earlier sections. Measurements of the number of transitions made by the para species of such molecules, relative to the number of transitions made by the ortho species, constitute a quite frequently used procedure for determining the value of the spin quantum number i of the nuclei forming the molecules. These numbers are in proportion to the relative amounts of the two species present in the sample and, at ordinary temperatures where many rotational states are excited, the relative amounts are in proportion to the numbers of nuclear spin states for the two species. We shall show in Example 12-6 that the ratio of the number of antisymmetric spin states, $\mathcal{N}_{\text{para}}$, to the number of symmetric spin states, $\mathcal{N}_{\text{ortho}}$, is

$$\frac{\mathcal{N}_{\text{para}}}{\mathcal{N}_{\text{ortho}}} = \frac{i}{i+1} \qquad \text{(12-6)}$$

The number of transitions should be in this ratio, so that we get an alternation of intensities in the Raman spectra, or band spectra, of diatomic molecules with identical nuclei. This can be seen in the photograph of the N$_2$ rotational Raman spectrum, shown in Figure 12-13, for which the intensities of alternate lines are measured to be quite accurately in the ratio 1/2. Even more dramatic is the spectrum of C$_2$, for which the ratio is 0/1 because alternate lines are completely missing! We do not show that spectrum because the drama is not apparent until a careful comparison between the measured and predicted frequencies of the lines demonstrates that half are absent.

Figure 12-13 Alternating intensities in a rotational Raman spectrum of N$_2$, excited by the Hg line 2536.5 Å.

Example 12-4. Determine the values of the nuclear spin quantum number i for the nuclei in N$_2$ and C$_2$, by using the measured intensity ratios 1/2 and 0/1 in (12-6).

Since the possible values of i are restricted to i = 0, 1/2, 1, 3/2, 2, ..., inspection immediately demonstrates that the solution to

$$\frac{1}{2} = \frac{i}{i+1}$$

is i = 1. This is the spin of the N nucleus (i.e., of its overwhelmingly abundant isotope N$^{14}$). For

$$\frac{0}{1} = \frac{i}{i+1}$$

the solution is obviously i = 0. This is the spin of the C nucleus (actually, of its most abundant isotope C$^{12}$, since the other isotopes, C$^{13}$ and C$^{14}$, are so rare that the abundant one completely dominates the spectrum).

Example 12-5. In N$_2$ it is observed that transitions involving even-r rotational states yield the most intense lines. Determine the symmetry character of the nuclei in that molecule.

Since (12-6) shows that the highest population is for nuclear spin states that are symmetric (ortho), and since even-r rotational states are also symmetric, the symmetric nuclear spin states are associated with the symmetric rotational states. Therefore the N$^{14}$ nucleus must be a boson.

Symmetry character determinations made in this manner on a number of nuclei provided some of the earliest evidence for the correlation, seen in Table 9-1, between symmetry character and spin. Furthermore, we shall see in Chapter 15 how the fact that the particular nucleus N$^{14}$ is an i = 1 boson was used at an early date to show that nuclei must contain protons and neutrons, instead of protons and electrons.

Example 12-6. Show that the ratio of the number of antisymmetric spin states to the number of symmetric spin states is $i/(i + 1)$, in agreement with (12-6).

The number of possible individual states of spin for a particle of a given spin quantum number i is equal to the number of possible values of its z component quantum number $m_i$. Since, as usual, the values of $m_i$ differ by integers and range from $-i$ to $+i$, this number is the familiar $(2i + 1)$. So the total number of possible independent combinations of spin states for two identical particles of spin i is $(2i + 1)(2i + 1) = (2i + 1)^2$. In $(2i + 1)$ of these states both particles will have the same $m_i$, and so are in identical spin states. For these the spin eigenfunction of the two particle system is symmetric with respect to particle label exchange (like the top and bottom members of (9-18) in the case of i = 1/2). Of the $(2i + 1)^2 - (2i + 1) = 2i(2i + 1)$ remaining states, half will be symmetric and half will be antisymmetric in such an exchange, since half will involve the sums of products of individual spin eigenfunctions and the other half will involve the differences of the same products (like the center member of (9-18), and (9-17), in the case of i = 1/2). So the total number of symmetric eigenfunctions is

$$\mathcal{N}_{\text{symmetric}} = \mathcal{N}_{\text{ortho}} = (2i + 1) + \tfrac{1}{2}\,2i(2i + 1) = (i + 1)(2i + 1)$$

and the total number of antisymmetric eigenfunctions is

$$\mathcal{N}_{\text{antisymmetric}} = \mathcal{N}_{\text{para}} = \tfrac{1}{2}\,2i(2i + 1) = i(2i + 1)$$

The ratio of the number of eigenfunctions, or spin states, is

$$\frac{\mathcal{N}_{\text{para}}}{\mathcal{N}_{\text{ortho}}} = \frac{i}{i+1}$$

in agreement with (12-6).

The reason for the complete absence of half of the transitions involving rotational levels of molecules having symmetric electronic factors and two identical i = 0 nuclei is simply that i = 0 means the nuclei are bosons that have no spin, so the molecular eigenfunction is necessarily symmetric and has no spin factor in it. Therefore its rotational factor must always be symmetric, which requires that the molecule only be in even-r rotational levels.
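The counting argument of Example 12-6 can be verified by brute-force enumeration. The sketch below lists every pair $(m_1, m_2)$ of z component quantum numbers, treats equal-m pairs as symmetric, and splits the remaining pairs evenly between sum and difference combinations, as in the example. The function names are illustrative, and spins are passed as the integer 2i so that half-integral values need no floating point.

```python
from fractions import Fraction
from itertools import product

def count_spin_states(two_i):
    """Return (symmetric, antisymmetric) two-nucleus spin state counts for spin i = two_i / 2."""
    m_values = range(-two_i, two_i + 1, 2)  # values of 2*m_i, from -2i to +2i
    pairs = list(product(m_values, repeat=2))
    same = sum(1 for m1, m2 in pairs if m1 == m2)
    different = len(pairs) - same
    symmetric = same + different // 2   # equal-m states plus the "sum" combinations
    antisymmetric = different // 2      # the "difference" combinations
    return symmetric, antisymmetric

def para_to_ortho_ratio(two_i):
    """Ratio N_para / N_ortho, to be compared with i / (i + 1) from (12-6)."""
    symmetric, antisymmetric = count_spin_states(two_i)
    return Fraction(antisymmetric, symmetric)

for two_i in range(0, 5):
    i = Fraction(two_i, 2)
    print(i, count_spin_states(two_i), para_to_ortho_ratio(two_i), i / (i + 1))
```

For i = 1/2, 1, and 0 the enumeration gives the ratios 1/3, 1/2, and 0, the values quoted in the text for H$_2$, N$_2$, and C$_2$.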
Proof that these symmetry considerations are very real indeed comes from the fact that if in C$_2$ the nuclei are not identical (e.g., if we have C$^{12}$-C$^{13}$), then half the transitions are not missing. This experimental fact actually led to the discovery of the isotope C$^{13}$.

As we have said, the procedure of Example 12-4 has been widely applied. It was used in the first determination of the spin i = 1/2 of the proton, from the measured intensity ratio of 1/3 in the spectrum of H$_2$. The measurements are difficult to make only when i becomes very large. The determination of the symmetry character of the identical nuclei in molecules like N$_2$ is a matter of keeping track of which lines of the spectrum are the more intense.

QUESTIONS

1. Discuss the statement that the interatomic force law must be attractive to permit condensed phases and must be repulsive to avoid zero volume.
2. Would you expect H$_3$ to exist in a bound state? He$_2$? Explain.
3. Of the so-called inert gases, which might most easily form molecules with other elements? Explain.
4. How would you explain the existence of bound states of XeF$_4$, in view of the absence of valence electrons in a Xe atom?
5. Do the even, or odd, H$_2$ eigenfunctions have even, or odd, parity?
6. Explain why only two electrons can form a covalent bond.
7. Would you predict ionic binding or covalent binding in H$_2$O? In NH$_3$? In CH$_4$? Does experiment decide the issue or can you rule out one or the other types of binding independently?
8. From the fact that CO$_2$ does not have a permanent electric dipole moment, what can you conclude about the binding and the arrangements of the atoms in the molecule?
9. Of the molecules H$_2$, D$_2$, and HD, which has the greatest binding energy? The least?
10. What does it mean to say that a molecule is in an excited state?
11. Explain how the existence of a finite zero-point vibrational energy is related to the uncertainty principle.
12. The fundamental vibrational energy for HCl is about ten times that for NaCl. Considering the factors determining this quantity, make this plausible.
13. What effect, if any, does the increasing angular momentum of higher rotational states of a diatomic molecule have on the vibrational energy of the molecule?
14. What effect does the change in internuclear separation in a diatomic molecule due to its vibration (the binding energy curve is asymmetric) have on the rotational energy levels of the molecule?
15. The asymmetry in the binding energy curve accounts for thermal expansion of solids. How can information from molecular spectra be used to determine the shape of this curve?
16. Explain why the separation between vibrational levels is somewhat smaller in an excited electronic state than in the ground electronic state (see Figure 12-9). Explain the same effect for rotational states.
17. If Raman rotational lines arise from an induced electric dipole moment how can we explain that the selection rule is $\Delta r = \pm 2$ rather than $\Delta r = \pm 1$?
18. Since it is known to take a very long time for the para and ortho species of a molecule to convert themselves into each other, the interaction between the two nuclear spins in a molecule must be very small. Why would you expect this to be the case?
19. What changes must be made in the result developed in Section 12-9 if the electronic factor of the molecular eigenfunction is antisymmetric in an exchange of the labels of the two nuclei?

PROBLEMS

1. From the following data, find the energy required to dissociate a KCl molecule into a K atom and a Cl atom. The first ionization potential of K is 4.34 eV; the electron affinity of Cl is 3.82 eV; the equilibrium separation of KCl is 2.79 Å. (Hint: Show that the mutual potential energy of K$^+$ and Cl$^-$ is $-(14.40/R)$ eV if R is given in Angstroms.)
2. The first ionization potential for K is 4.3 eV; the ion Br$^-$ is lower in energy by 3.5 eV than the neutral bromine atom. Compute the largest separation of K$^+$ and Br$^-$ ions that gives a bound KBr molecule.
3. For a system which executes simple harmonic motion about a position of stable equilibrium, the force, F, is given by
$$F = -\left(\frac{\partial^2 V}{\partial R^2}\right)_{R_0}(R - R_0)$$
where V is the potential energy and $R - R_0$ is the deviation from equilibrium. Show that the zero-point vibration of a molecule is given by
$$\frac{1}{2}h\nu_0 = \frac{h}{4\pi\mu^{1/2}}\left(\frac{\partial^2 V}{\partial R^2}\right)^{1/2}_{R_0}$$
4. The potential energy V of NaCl can be described empirically by
$$V = -\frac{e^2}{4\pi\epsilon_0 R} + Ae^{-R/\rho}$$
where R is the internuclear separation. The equilibrium separation of the nuclei $R_0$ is 2.4 Å and the dissociation energy is 3.6 eV. (a) Calculate A and $\rho/R_0$, neglecting zero-point vibrations. (b) Sketch V and each of the terms in V on one graph. (c) Give the physical significance of A and $\rho$.
5. (a) Show that the ratio of the number of molecules in rotational level r to the number in the r = 0 level, in a sample at thermal equilibrium, is a maximum for the level specified by
$$r = (kTI/\hbar^2)^{1/2} - 1/2$$
(b) For HCl, what is the most populated level at 600°K?
6. Taking the rotational inertia of H$_2$ from Table 12-1, find the temperature at which the average translational kinetic energy of an H$_2$ molecule equals the energy between the ground rotational state and first excited rotational state. What can you conclude about the occupation of rotational excited states in H$_2$ at room temperature?
7. Determine the zero-point vibrational energy for a NaCl molecule, given that its fundamental vibrational frequency is $1.14 \times 10^{13}$ vib/sec.
8. (a) Show that, if $E_d$ is the dissociation energy of a molecule, the fraction of the molecules that dissociate at a temperature T is $e^{-E_d/kT}$. (b) It is found (from electron diffraction studies) that as T increases, the internuclear separation increases. Explain what effect this has on the potential energy curve and on the result of part (a).
9. For NaCl, the separation of two vibrational levels is about $4 \times 10^{-2}$ eV. Using Table 12-1, and noting that the rotational levels are not equally spaced, show that there are about 40 rotational levels between a pair of vibrational levels.
10. The potential energies of two diatomic molecules of the same reduced mass are shown in Figure 12-14. From the graph determine which molecule has the larger (a) internuclear distance, (b) rotational inertia (moment of inertia), (c) separation between rotational energy levels of the same r and v, (d) binding energy, (e) zero-point energy (Hint: See Problem 3), (f) separation between low-lying vibrational states.

Figure 12-14 Potential energy curves considered in Problem 10.

11. (a) What fraction of HCl molecules at 1000°K will be found in the first excited vibrational state? (Hint: Use the Boltzmann factor.) (b) Find the ratio of HCl molecules in the first excited rotational state to those in the first excited vibrational state at 1000°K. (Hint: Remember the degeneracy factors.)
12. (a) Derive an expression giving the ratio of the energy of a transition from the lowest to the first excited vibrational level to the energy of a transition from the lowest to the first excited rotational level for a diatomic molecule. (b) What is this ratio for NaCl? For H$_2$? (Hint: See Example 12-3.)
13. (a) Show that the relative frequency shift of a spectral line in a rotational band arising from a mixture of two isotopic diatomic molecules is given by $\Delta\nu/\nu = -\Delta\mu/\mu$, where $\mu$ is the reduced mass of the molecule. (b) What is this ratio for a mixture of HCl$^{35}$ and HCl$^{37}$?
14. Show that the ratio R of the total number of molecules in all excited vibrational states to the number in the ground vibrational state is $R = (e^{h\nu_0/kT} - 1)^{-1}$ provided that the levels are assumed to be equally spaced.
15. What is the amplitude of vibration of HCl in the first excited vibrational state?
16. (a) Use data from Example 12-3 to predict the reciprocal wavelength of the zero-point vibration of HCl given in Table 12-1. (b) What must be the force constant to give exact agreement?
17. From the value 2940.8 cm$^{-1}$ for the reciprocal wavelength equivalent to the fundamental vibration of a molecule Cl$_2$, each of whose atoms has an atomic weight 35, determine the corresponding reciprocal wavelength for Cl$_2$ in which one atom has atomic weight 35 and the other 37. What is the separation of spectral lines, in reciprocal wavelengths, due to this isotope effect?
18. (a) Specify the resolution, $\Delta\lambda/\lambda$, of a spectrometer which can just resolve the rotational spectra of Na$^{23}$Cl$^{35}$ and Na$^{23}$Cl$^{37}$ assuming $R_0$ to be the same for both molecules. (b) Would this spectrometer also resolve the vibrational spectra of the two molecules, assuming the force constants are the same?
19. Calculate the difference in dissociation energies of H$_2$ and D$_2$ from the value 4395.2 cm$^{-1}$ for the reciprocal wavelength equivalent to the fundamental vibration of the H$_2$ molecule.
20. The zero-point vibrational energy for H$_2$ is 0.265 eV. Compare the vibrational energy levels of H$_2$, D$_2$, and HD numerically for the low-lying states.
21. From the fact that the lowest electronic excited state in O$_2$ and N$_2$ molecules is over 3 eV above the ground state, explain why air is transparent in the visible.
22. In the vibrational Raman spectrum of HF are adjacent Raman lines of wavelength 2670 Å and 3430 Å. (a) What is the fundamental vibrational frequency of the molecule? (b) What is the equivalent force constant for HF?
23. A ruby laser ($\lambda$ = 6943 Å) is used to excite the Raman spectrum of N$_2$. (a) What are the wavelengths of the lines which result from the lowest energy allowed transitions in the pure rotational spectrum of N$_2$? (b) What is the ratio of the intensities of the lines of part (a) at room temperature? (c) What are the wavelengths of the lines which result from the allowed transitions to and from the ground state vibrational level? (d) What is the ratio of the intensities of the lines of part (c) at room temperature? (e) How do the answers to parts (a) and (c) change if the laser is used to excite the Raman spectrum of diatomic molecules with nonidentical nuclei having the same rotational inertia and force constant as N$_2$?
24. The energy-level diagram for the rotational levels in each of the two lowest vibrational states of the electronic ground state is given in Figure 12-15 for a diatomic molecule. Find the energies of the transitions that give rise to the allowed spectral lines in the infrared and Raman spectra, (a) for molecules containing two identical i = 0 nuclei, (b) for molecules containing two identical i = 1/2 nuclei, and (c) for molecules containing two nonidentical nuclei.

Figure 12-15 Energy levels considered in Problems 24, 25, and 26. [Diagram: rotational levels r = 0, 1, 2 in each of the vibrational states $v'' = 0$ and $v' = 1$. Energy labels as extracted, in order: $8 \times 10^{-4}$ eV (near $v' = 1$), "3 2 x 10 -1 eV" (possibly $3.2 \times 10^{-1}$ eV), and $1 \times 10^{-3}$ eV (near $v'' = 0$).]

25. Calculate the relative intensities at room temperature for the lines found in parts (a) and (b) of Problem 24.
26. Using the information in Figure 12-15, (a) calculate the rotational inertia, or moment of inertia, of the molecule in each vibrational level, and (b) calculate the zero-point energy.
27. (a) How many rotational degrees of freedom do you expect in a polyatomic molecule? Translational degrees? If the molecule has N atoms (N > 2) there should be 3N - 6 vibrational degrees of freedom, i.e., independent modes of vibration. Explain. (b) How many vibrational degrees of freedom are there in an H$_2$O molecule? A CH$_4$ molecule?
28. Consider the relative intensities of the spectra of H$_2$ and D$_2$ to determine which Raman rotation spectrum will yield lines alternating in intensity and having a relative intensity of 1/2.
29. Band spectrum measurements of diatomic molecules containing Cl$^{35}$ nuclei yield an alternating intensity ratio of 3/5. What is the spin of the Cl$^{35}$ nucleus?
13 SOLIDS - CONDUCTORS AND SEMICONDUCTORS

13-1 INTRODUCTION
subjects included in solid state physics

13-2 TYPES OF SOLIDS
crystal lattices; qualitative characteristics of molecular, ionic, covalent, and metallic solids

13-3 BAND THEORY OF SOLIDS
exchange degeneracy in a lattice of identical atoms; comparison to hydrogen molecule; formation of energy bands; allowed and forbidden bands; overlapping bands; occupation of bands; unit cells; insulators; conductors; electron momenta in insulators and conductors; valence and conduction bands; semiconductors

13-4 ELECTRICAL CONDUCTION IN METALS
electron-lattice imperfection collisions; classical expressions for resistivity, conductivity, and mobility; Hall effect; Hall coefficient; hole conduction

13-5 THE QUANTUM FREE-ELECTRON MODEL
free-electron energy distribution and density of states; estimate of Fermi energy and relative number of conduction electrons for metal; evaluation of energy width of band; density of states for band in two-dimensional metal

13-6 THE MOTION OF ELECTRONS IN A PERIODIC LATTICE
Bloch eigenfunctions; Kronig-Penney model; Bragg reflection; relation of Kronig-Penney results to Bragg conditions; eigenfunction symmetry and origin of band gaps; Brillouin zones

13-7 EFFECTIVE MASS
properties of wave groups recapitulated; equation of motion of electron in lattice under applied electric field; interpretation of effective mass; effective mass in various regions of a Brillouin zone; relation to Bragg reflection; comparison of level densities by means of effective mass; use of effective mass in classical expressions for conductivity and resistivity; lattice imperfections and resistivity; effective mass of holes

13-8 ELECTRON-POSITRON ANNIHILATION IN SOLIDS
energy-momentum conservation; correlation measurements; electron momentum distributions; defects; lifetime measurements; positronium

13-9 SEMICONDUCTORS
energy gaps in silicon and germanium; temperature dependence of conductivity; intrinsic and extrinsic conductivity; photoconductivity; donor impurities and n-type semiconductors; estimate of donor electron binding energy; acceptor impurities and p-type semiconductors; Fermi energy in an intrinsic semiconductor; temperature dependence of Fermi energy in impurity semiconductors

13-10 SEMICONDUCTOR DEVICES
p-n junctions; thermal current; recombination current; application of reverse or forward bias; rectifier action; advantages over vacuum tube rectifier; junction transistors; operation explained in terms of junctions; power amplifier action; tunnel diodes; negative resistance characteristic and fast response time

QUESTIONS

PROBLEMS

13-1 INTRODUCTION

Solid state physics is a vast area of quantum physics in which we are concerned with understanding the mechanical, thermal, electrical, magnetic, and optical properties of solid matter. Some aspects have been discussed in earlier chapters, such as the lattice and electronic contributions to the specific heats of solids, radiation from a blackbody, thermionic emission, and contact potentials. Here we shall focus on the origin of the forces that hold atoms together in a solid and on the allowed energy levels of the electrons in the solid. This will lead us to the band theory of solids. That theory will then be applied to phenomena of much practical and theoretical interest, including semiconductors and semiconductor devices. Many electrical, thermal, and optical properties of solids will thereby become more clearly understood. In the next chapter we extend the theory to the phenomenon of superconductivity and consider magnetic properties of solids as well.

13-2 TYPES OF SOLIDS

In the gaseous state the average distance between molecules is large compared to the size of a molecule, so the molecules may be regarded as isolated from one another. Many substances, however, are in the solid state at ordinary temperatures and pressures.
In that state molecules (or atoms) can no longer be regarded as isolated. Their separation is comparable to the molecular size, and the strength of the forces holding them together is of the same order of magnitude as the forces binding the atoms into a molecule. Hence, the properties of a molecule are altered by the presence of neighboring molecules.

Characteristic of crystalline solids is the regular arrangement of atoms, a recurrent or periodic pattern called a crystal lattice. The solid can be regarded as a large molecule, the forces between atoms being due to interaction between atomic electrons, and the structure of the solid being determined as that arrangement of nuclei and electrons which yields a quantum mechanically stable system. Although the number of atoms involved is very large, they are arranged in a regular pattern. In noncrystalline solids, such as concrete and plastic, the perfectly regular pattern does not hold over long distances, but there is an orderly pattern in the neighborhood of any one atom. We shall discuss only crystalline solids in this book. Such solids are classified according to the predominant type of binding, the principal types being molecular, ionic, covalent, and metallic.

Molecular solids consist of molecules which are so stable that they retain much of their individuality when brought in close proximity. The electrons in the molecule are all paired so that atoms in different molecules cannot form covalent bonds with one another. The intermolecular binding force is the weak van der Waals attraction that is present between such molecules in the gaseous phase. The physical mechanism involved in the van der Waals attraction is an interaction between electric dipoles.
Because of the fluctuating quantum mechanical behavior of the electrons in a molecule, all molecules have a fluctuating electric dipole moment, even though for many of them symmetry considerations require that it fluctuate about an average value of zero. At a time when a molecule has a certain instantaneous electric dipole moment, the external electric field that it produces will induce in the charge distribution of a nearby molecule a dipole moment. By drawing rudimentary sketches of the charges and field in various cases, the student can immediately convince himself that the force exerted between the inducing and the induced electric dipole is always attractive. The interaction energy is proportional to the mean square of the inducing electric dipole moment. The resulting attraction is weak, the binding energies being of the order of $10^{-2}$ eV and the force varying with the inverse seventh power of the intermolecular separation. In the solid, successive molecules have electric dipole moments which alternate in orientation so as to produce successive attractions.

Many organic compounds, inert gases, and ordinary gases such as oxygen, nitrogen, and hydrogen form molecular solids in the solid state. Because the binding is weak, solidification takes place only at very low temperatures where the disruptive effects of thermal agitation are very small. (The melting point of solid hydrogen is 14°K, for example.) The weak binding makes molecular solids easy to deform and compress, and the absence of free electrons makes them very poor conductors of heat or electricity.

Ionic solids, such as sodium chloride, consist of a close regular three-dimensional array of alternating positive and negative ions having a lower energy than the separated ions. The structure is stable because the binding energy due to the net electrostatic attraction exceeds the energy spent in transferring electrons to create the isolated ions from neutral atoms, just as for ionic binding in molecules.
Ionic binding in solids is not directional because spherically symmetrical closed shell ions are involved. Hence the ions are arranged like close-packed spheres. The actual crystal geometry depends on which arrangement minimizes the energy, and this in turn depends principally on the relative sizes of the ions involved. Because there are no free electrons to carry energy or charge from one part of the solid to another, such solids are poor conductors of heat or electricity. Because of the strong electrostatic forces between the ions, ionic solids are usually hard and have high melting points. Lattice vibrations can be excited by energies corresponding to radiation in the far infrared, so that ionic solids show strong optical absorption properties in that region. But optical absorption by excitation of electrons requires energies in the ultraviolet, so that ionic crystals are transparent to visible radiation.

Covalent solids contain atoms that are bound by shared valence electrons, as in covalent binding of molecules. The bonds are directional and determine the geometrical arrangement of atoms in the crystal structure. The rigidity of their electronic structure makes covalent solids hard and difficult to deform, and it accounts for their high melting points. Because there are no free electrons, covalent solids are not good heat or electrical conductors. Sometimes, as for silicon and germanium, they are semiconductors. At room temperature some covalent solids, such as diamond, are transparent; the energy required to excite their electronic states exceeds that of photons in the visible region of the spectrum so that such photons are not absorbed. But most covalent solids absorb in the visible and are therefore opaque.

Metallic solids exhibit a binding that can be thought of as a limiting case of covalent binding in which electrons are shared by all the ions in the crystal. When a crystal is formed of atoms having a few weakly bound electrons in the outermost subshells, electrons can be freed from the individual atoms by the energy released in binding. These electrons move in the combined potential of all the positive ions and are shared by all the atoms in the crystal. We speak of an electron gas interspersed between the positive ions and exerting attractive forces on each ion that exceed the repulsive forces of other ions, hence the binding. The atoms have vacancies in their outermost electron subshell, and there are not enough valence electrons per atom to form tight covalent bonds. The electrons are shared by all the atoms and are free to wander through the crystal from atom to atom, there being many unoccupied electronic states. In this sense they behave like a gas, an "electron gas." A metallic solid is a regular lattice of spherically symmetrical positive ions, arranged like close-packed spheres, through which the electrons move. Metallic solids are excellent conductors of electricity, or heat, the electrons easily absorbing energy from incident radiation, or lattice vibrations, and moving under the influence of an applied electric field, or thermal gradient. Because radiation in the visible portion of the electromagnetic spectrum is easily absorbed, such solids are opaque. All the alkalies form metallic solids.

The type of binding that a particular solid has is determined experimentally by studies of x-ray diffraction, dielectric properties, optical emissions, and so forth. There are some solids whose binding must be interpreted as a mixture of the principal types we have described. In addition, not all solids have the ideal structure implied by the discussion so far. Indeed, the so-called lattice imperfections, or deviations from ideal crystal structure, lead to many properties of solids which have practical consequences.

13-3 BAND THEORY OF SOLIDS

To understand the effect of putting a great many atoms close together in a solid, consider first two atoms only that are initially far apart. All of the energy levels of this two-atom system have a twofold exchange degeneracy. That is, for the combined system the space part of the eigenfunction for the electrons can contain either a combination of the individual atom space eigenfunctions which is symmetric in an exchange of pairs of electron labels, or one which is antisymmetric in such a label exchange. (The total eigenfunction of the system of electrons is, of course, antisymmetric, since the symmetric space eigenfunction is associated with an antisymmetric spin eigenfunction, and vice versa.) When the atoms are widely separated, the two different types of eigenfunctions lead to the same energy, and so each of the energy levels is said to have a twofold exchange degeneracy. But when the atoms are brought together, the exchange degeneracy is removed. Because the electron charge density in the important region between the atoms depends on whether the space eigenfunction is symmetric or antisymmetric, when the atoms are close enough together that the wave functions of the individual atoms overlap, the energy of the system depends on the symmetry of the space eigenfunction. Hence, a given energy level of the system is split into two distinct energy levels as overlap commences, and the splitting increases as the separation of the atoms decreases. Of course a famous example of this phenomenon is found in the ground state energy level of the system containing two hydrogen atoms, as we saw in Section 12-3. Figure 12-4 shows this splitting for the ground state level only, but each of the higher levels of the system splits in the same way, and for the same reason, as the atoms are brought together.

Figure 13-1 Schematic drawing of the splitting of an energy level in a system of six atoms, as a function of the separation distance R between adjacent atoms. The space eigenfunction of the level at the top of the band is antisymmetric with respect to two-at-a-time label exchange, and the one at the bottom is symmetric with respect to such an exchange. The total eigenfunction is antisymmetric for all the levels in the band. But the space eigenfunction for the intermediate levels is neither symmetric nor antisymmetric. Instead, the space eigenfunction of each of these levels has what might be called a mixed symmetry, there being a different mixed symmetry for each intermediate level. The net result is a gradual transition of the electron charge distribution from one that leads to a minimum energy to one that leads to a maximum energy in going from the bottom to the top of the band. The reason why only two levels in a band can have a space eigenfunction with a well defined symmetry (that is, either symmetric or antisymmetric) is that the label exchanges are carried out two at a time.

If we had started with three isolated atoms, we would have had a threefold exchange degeneracy of the energy levels. When the atoms are brought together in a uniform linear lattice, each of the levels splits into three distinct levels. Figure 13-1 illustrates this schematically for a typical energy level of a system of six atoms. The splitting commences when the center-to-center atomic separation R becomes small enough for the atoms to begin overlapping. As R decreases from this value there is a decrease in the energy of the levels for which the symmetry of the space eigenfunction leads to a favorable electron charge distribution (i.e., one which puts electron charge where the ions exert the strongest binding), and an increase in the energy of the levels associated with space eigenfunctions whose symmetry leads to an unfavorable charge distribution.
The more favorable, or unfavorable, the charge distribution is, the greater is the decrease, or increase, in the energy. So the levels are spread, by the quantum mechanical requirements of indistinguishability, about an average energy equal to the energy the system would have at a given R if there were no such requirements. Note that this average energy begins to increase rapidly for sufficiently small R. This is due to the Coulomb repulsion that the ions exert on each other.

As we go to a system containing N atoms of a given species, each level of one of these atoms leads to an N-fold degenerate level of the system when the atoms are well separated. With decreasing separation, each of these splits into a set of N levels. The spread in energy between the lowest and highest level of a particular set depends on the separation distance R, since R specifies the amount of overlap that causes the splitting. But it does not depend significantly on the number of atoms in the system if the same separation distance is maintained. Thus, as more and more atoms are added to the system each set of split levels contains more and more levels spread over about the same energy range at a particular R. At the values of R found in a solid, a few angstroms, the energy spread is of the order of a few electron volts (see Figure 12-4). If we then consider that a solid contains something like $10^{23}$ atoms per mole, we see that the levels of each set in a solid are so extremely closely spaced in energy that they form a practically continuous energy band.

Figure 13-2 Top: Energy-level scheme for two isolated atoms. Middle: Energy-level scheme for the same two atoms in a diatomic molecule. Bottom: Energy-level scheme for four of the same atoms in a rudimentary one-dimensional crystal. Note that the lowest lying levels are not split appreciably because the atomic eigenfunctions for these levels do not overlap significantly.
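The statement that the energy spread of a set of split levels depends on the overlap but hardly on the number of atoms can be illustrated with a small numerical sketch. The model used here is not the one developed in the text: it is a nearest-neighbor coupled chain (often called a tight-binding chain), in which a hypothetical coupling energy `t` stands in for the overlap at a fixed separation R, and the levels are given by the standard closed-form eigenvalues of the corresponding tridiagonal matrix. The value `t = 1.0` eV is an arbitrary assumption chosen only to give a width of a few electron volts.

```python
import math

def chain_levels(n_atoms, e0=0.0, t=1.0):
    """Energy levels (eV) of a linear chain of n_atoms identical atoms.

    Assumed model (not from the text): each atom contributes one level e0,
    coupled to its nearest neighbors by a hypothetical overlap energy t.
    The eigenvalues of that N x N tridiagonal matrix with open ends are
    e0 - 2 t cos(k pi / (N + 1)), k = 1 ... N.
    """
    return sorted(
        e0 - 2.0 * t * math.cos(k * math.pi / (n_atoms + 1))
        for k in range(1, n_atoms + 1)
    )

def band_summary(n_atoms, t=1.0):
    """Return (number of levels, total spread, mean spacing) for the chain."""
    levels = chain_levels(n_atoms, t=t)
    width = levels[-1] - levels[0]
    mean_spacing = width / (n_atoms - 1) if n_atoms > 1 else 0.0
    return len(levels), width, mean_spacing

for n in (2, 6, 100, 10000):
    count, width, spacing = band_summary(n)
    print(f"N = {n:6d}: {count} levels, spread = {width:.4f} eV, "
          f"mean spacing = {spacing:.2e} eV")
```

In this sketch the spread grows from 2t for two atoms toward the limiting value 4t and then stops changing, while the mean spacing between adjacent levels keeps shrinking roughly as 1/N, which is the behavior described above for a practically continuous band.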
The process we have just described is indicated in Figure 13-2. We see from this figure that the lower-lying energy levels are spread less than those that lie higher. The reason is that the electrons in lower levels are electrons in inner subshells of the atoms, which are not significantly influenced by the presence of nearby atoms. These electrons are localized on particular atoms, even when R is small, because the potential barriers between the atoms are for them relatively high and wide. The valence electrons, on the other hand, are not localized at all for small R, but they become part of the whole system. The overlapping of their wave functions results in a spreading of their energy levels.

It should be pointed out that the 1s level of an individual atom becomes a band of N levels, as does the 2s level, if we count in such a way that each of these can accommodate two electrons of opposite spin. But the 2p level is triply degenerate in the space quantum number $m_l$ in the isolated atom, since $m_l$ can assume any of the values $-1, 0, +1$. Thus the 2p level in the atom leads to 3N levels in the solid. As we shall discuss soon, these can be thought of as forming three bands of N levels, whose energy ranges may or may not coincide.

In Figure 13-3 we show the band formation for the higher levels of sodium, whose ground state atomic configuration is $1s^2 2s^2 2p^6 3s^1$. Several general features of allowed bands (the continuous bands of energy levels for electrons) and forbidden bands (the regions where there are no electron energy levels) are illustrated in this figure. Allowed bands corresponding to inner subshells, such as 2p in sodium, are extremely narrow until the interatomic spacing becomes smaller than the value actually found in the crystal. As we go through the outer occupied subshells and into the unoccupied subshells of the atom in its ground state, however, the bands become progressively wider at a given interatomic separation.
The reason is, again, that the greater the energy of the electrons the larger the regions in which they can move and the more they are affected by nearby ions. As the energy increases, therefore, the successive allowed bands widen and overlap each other in energy.

Direct experimental verification of energy bands comes from observations of x-ray spectra in solids. For example, the $3s \rightarrow 2p$ transition in sodium gives the L series x-ray lines. A very sharp line spectrum is observed for gaseous sodium in which the 3s and 2p levels are narrow. But the same x-ray lines from solid sodium are broadened because, although the low-lying 2p level remains narrow, the 3s level has now become an energy band. The observed shape of x-ray lines from solids agrees with the energy band picture.

Figure 13-3 The formation of energy bands from the energy levels of isolated sodium atoms as the interatomic separation decreases. The dashed line indicates the observed interatomic separation in solid sodium. The several overlapping bands that constitute each p or d band are not indicated.

Consider now the occupation of the energy levels. Those bands which originated in levels of closed subshell electrons of an isolated atom have all their levels occupied. The bands that originated from valence electrons may or may not be fully occupied. If an electric field is applied to the solid the electrons will acquire extra energy only if there are available empty levels within the range of energy that the strength of the applied field allows the electron to gain. If there are no nearby empty levels, then the electron will not be able to gain any energy at all and the solid behaves like an insulator. What counts in determining the emptiness, or fullness, of the bands containing valence electrons is the valence of the atoms forming the solid, and the geometry of the crystal lattice into which they solidify.
An isolated band will be full if a unit cell of the crystal lattice contains two valence electrons, one for each of the two possible values of the spin quantum number $m_s$.

It is worthwhile putting the distinction between conductors and insulators into momentum, instead of energy, language. Without an applied electric field there are as many electrons in the solid with momentum vectors in one direction as there are with momentum vectors in the opposite direction, since there is no net current. When an electric field is applied, this equilibrium can be upset, causing a current to flow, if some of the electrons can go into quantum states with changed momentum vectors. This is quite possible for electrons in a partially filled band, but it cannot be done by electrons in a completely filled band.

Crystal structure geometry, or crystallography, is a complex subject that is very important in any detailed study of solid state physics. It is treated briefly in Appendix Q. We avoid it in the text by restricting ourselves to particularly simple (usually one-dimensional) crystal lattices. We shall, however, define a unit cell as the smallest geometrical arrangement of atoms that by periodic repetition along the coordinate axes can fully describe the geometrical arrangement of the atoms in the complete crystal. We shall also say that in a crystal lattice some or all of the degeneracy of the atomic valence electron levels with respect to the quantum number $m_l$ is removed because these electrons are not in the spherically symmetrical potential of an atom in free space, but in a potential whose more complicated symmetry depends on the crystal geometry. For this reason, the three degenerate levels from a p subshell of a single atom lead to three bands of N levels, each capable of holding two electrons of opposite spin, in a crystal containing N of these atoms.
These bands may be completely nonoverlapping, partly overlapping, or completely overlapping in energy, depending on the crystal geometry. The term isolated band, used in expressing the condition for a full band, refers to a case in which these bands do not overlap each other or bands from other subshells. Then if there are two valence electrons per unit cell, each of the N levels in the lowest lying band will have its full complement of two electrons.

Note that the quantity determining occupation is the number of valence electrons per unit cell, and not per atom. In a uniform one-dimensional lattice of identical atoms, such as we considered in the argument from which we concluded that a band contains N levels if the crystal contains N atoms, a unit cell contains one atom and there is no distinction to be made. When that argument is extended to three-dimensional crystals containing atoms of different species, it is found that the conclusion remains the same, provided N is the number of unit cells in the crystal. Thus if there are two valence electrons per unit cell there will be two in each of the levels of the band, and the band will be fully occupied.

The problem in predicting whether or not a solid is an insulator is that the question of band overlap is all important, and this depends on details of the geometry of the crystal structure (and of the geometry of the atomic eigenfunctions). If what, as far as valence is concerned, might have been a completely filled band actually overlaps what might have been a completely empty band, then there will be two partly filled bands. The result is that a solid that might have been an insulator will actually be a conductor. But it is at least possible to say that a solid can certainly not be an insulator unless one of its unit cells contains an even number of valence electrons, because an odd valence electron can never be in a filled band.
Most covalent solids like diamond, or ionic solids like sodium chloride, are insulators; they all have an even number of valence electrons per unit cell. In diamond each carbon atom has four valence electrons, and there are two atoms in each unit cell. The eight valence electrons per unit cell fully occupy the 4N levels of four bands, one originating from the 2s subshell of the atom and three originating from the three 2p subshells. These bands overlap each other, but they are well separated from empty higher energy bands. Sodium chloride contains one sodium ion and one chlorine ion per unit cell, and the valence band consists of a set of completely filled bands that overlap each other but do not overlap unfilled bands. Alkali-earth atoms like beryllium are divalent and form crystals with an even number of valence electrons per unit cell, but these solids are metals, not insulators, because overlapping bands make slightly higher unfilled levels energetically available to the electrons. In solids formed from the monovalent alkali atoms like sodium, the band containing the valence electrons cannot be filled, and so the solid behaves like a conductor. Only half of the levels of the isolated 3s allowed band of sodium are filled because a sodium atom has a single electron in the 3s level, whereas the exclusion principle allows such a level to accommodate two electrons. Hence electrons in the solid can easily acquire a small amount of additional energy. Thus any applied electric field will be effective in giving electrons energy, and the solid will be a conductor. As we mentioned in the previous paragraph, conductors are also found in cases where bands containing valence electrons overlap. 
At temperatures above absolute zero it is, of course, possible for some electrons to gain enough thermal energy to jump over the energy gap of a forbidden band of energy into a higher allowed band, thereby creating vacancies in the lower allowed band and making a new allowed band available. We speak of the nearly filled band as a valence band and the nearly empty band as a conduction band. The probability of this happening increases with temperature, and it depends strongly on the width of the forbidden band. Substances in which the width of the energy gap is small are called semiconductors. An example is silicon, a covalent solid with a diamondlike structure, but with a forbidden band only about 1 eV wide. It becomes reasonably conducting at room temperature though at low temperatures it is an insulator. On the other hand the gap between the filled and empty allowed bands in diamond is about 7 eV. Thus diamond is an insulator even at relatively high temperatures.

13-4 ELECTRICAL CONDUCTION IN METALS

Some useful results concerning conduction electrons in metals can be obtained from classical ideas. In the absence of an applied electric field, the directions in which these electrons move are random. The reason is that the electrons frequently collide with imperfections in the crystal lattice of the metal, which arise from thermal motion of the ions about their equilibrium positions in the lattice or from the presence of impurity ions in the lattice. In colliding with these imperfections, the electrons suffer changes in speed and direction, and this makes their motion random. As in the case of molecular collisions in a classical gas, we can describe the frequency of electron-lattice imperfection collisions by a mean free path $\lambda$, where $\lambda$ is the average distance that an electron travels between collisions.
When an electric field is applied to a metal, the electrons modify their random motion in such a way that, on the average, they drift slowly in the direction opposite to that of the field, because their charge is negative, with a drift speed $v_d$. This drift speed is very much less than the effective instantaneous speed $v$ of the random motion. In copper $v_d$ is of the order of $10^{-2}$ cm/sec, whereas $v$ is of the order of $10^{8}$ cm/sec.

The drift speed can be calculated in terms of the applied electric field $E$ and of $v$ and $\lambda$. When a field is applied to an electron in the metal, it will experience a force of magnitude $eE$ which will give it an acceleration of magnitude $a$ given by $a = eE/m$. Consider now an electron that has just collided with a lattice imperfection. In general, the collision will momentarily destroy the tendency to drift and the electron will move in a truly random direction after the collision. Just before its next collision the electron will have changed its velocity, on the average, by $a\lambda/v$, where $\lambda/v$ is the mean time between collisions. We call this the drift speed $v_d$, so that

$$v_d = \frac{a\lambda}{v} = \frac{eE\lambda}{mv}$$

If $n$ is the number of conduction electrons per unit volume and $j$ is the current density, we have $v_d = j/ne = eE\lambda/mv$. Combining this with the definition of resistivity, $\rho = E/j$, gives us

$$\rho = \frac{mv}{ne^2\lambda} \qquad (13\text{-}1a)$$

Equation (13-1a) can be taken as a statement that metals obey Ohm's law, for the quantities $v$ and $\lambda$ that determine the resistivity $\rho$ do not depend on the applied electric field, which is the criterion that the law is obeyed. Often we deal with the conductivity

$$\sigma = \frac{1}{\rho} = \frac{ne^2\lambda}{mv} \qquad (13\text{-}1b)$$

This can be put in a more useful form by defining a measurable quantity, the mobility $\mu$, of magnitude given by the ratio of the drift speed to the applied electric field, i.e.
$$\mu = \frac{v_d}{E} = \frac{e\lambda}{mv} \qquad (13\text{-}1c)$$

Then since $\sigma = ne^2\lambda/mv$, we have $\mu = \sigma/ne$ or

$$\sigma = ne\mu \qquad (13\text{-}2)$$

If we have conduction by positive carriers as well as negative carriers, the conductivity is given by

$$\sigma = n q_n \mu_n + p q_p \mu_p$$

in which $\mu_n$ and $\mu_p$ are the mobilities of negative and positive carriers, $q_n$ and $q_p$ are their charges, and $n$ and $p$ are the numbers of these carriers per unit volume. If conduction is by negative charge carriers the charge $q$ of the carrier is negative, whereas $q$ is positive if conduction is by positive carriers. Since the sign of $\mu$ also depends on the sign of $q$, each term in the expression for $\sigma$ is always positive.

The sign of the charge carrier of electric current in a metal can be determined from measurements of the Hall effect. That is, when a current carrying conducting sheet is placed perpendicular to a magnetic field, an electric field is set up perpendicular both to the magnetic field and the flow of current. By measuring the potential difference between the two surfaces of the conductor, it is possible to deduce the sign and value of the quantity $1/nq$, called the Hall coefficient. Here $n$ is the number of charge carriers per unit volume and $q$ is the charge of the carrier. The electric field arises from an accumulation of charge carriers on one surface due to the $\mathbf{v}_d \times \mathbf{B}$ force exerted on them when they move with velocity $\mathbf{v}_d$ through the magnetic field $\mathbf{B}$.

In some metals, zinc and beryllium for example, the Hall effect indicates net positive charge carriers. This is interpreted as being due to transitions of electrons from the filled valence band to the conduction band leaving holes (unoccupied energy levels) in the valence band. Such holes correspond to the absence of an electron and behave much like positive charges. As these vacancies are filled by electrons, moving under the influence of an electric field, the holes move in a direction opposite to the electrons just as though positive charge carriers were moving in the field direction.
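As a rough illustration of how a measured Hall coefficient $1/nq$ is converted into a number of carriers per atom, the sketch below works through the case of sodium. The Hall coefficient is the value quoted for sodium in Table 13-1. The mass density (0.97 g/cm³) and atomic weight (22.99 g/mole) are handbook-type values assumed here for illustration; they are not given in the text. The function names are hypothetical.

```python
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
AVOGADRO = 6.023e23    # atoms per mole

def carrier_density(hall_coefficient):
    """Carriers per m^3 from the Hall coefficient 1/nq (m^3/coul).

    Assumes a single type of carrier with charge of magnitude e.
    """
    return 1.0 / (abs(hall_coefficient) * E_CHARGE)

def atoms_per_m3(density_g_cm3, atomic_weight_g):
    """Atoms per m^3 from mass density (g/cm^3) and atomic weight (g/mole)."""
    return density_g_cm3 / atomic_weight_g * AVOGADRO * 1.0e6

def carriers_per_atom(hall_coefficient, density_g_cm3, atomic_weight_g):
    """Return (carriers per atom, sign of the carrier charge)."""
    ratio = carrier_density(hall_coefficient) / atoms_per_m3(
        density_g_cm3, atomic_weight_g
    )
    sign = 1 if hall_coefficient > 0 else -1
    return ratio, sign

# Sodium: Hall coefficient from Table 13-1; density and atomic weight assumed.
ratio, sign = carriers_per_atom(-2.5e-10, 0.97, 22.99)
print(f"carriers per atom = {ratio:.2f}, carrier sign = {sign:+d}")
```

With these inputs the result is close to one negative carrier per atom, in line with the statement that Hall measurements on the monovalent metals agree with one conduction electron per atom.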
In the case of metals with an $s^2$ atomic configuration, such as zinc and beryllium, the mobility of the s-band holes is much greater than that of the p-band electrons. Since the sign of the Hall coefficient depends on which type of carrier has the higher mobility, the Hall coefficient is positive for these metals.

In Table 13-1 we list the Hall coefficients of some metals and also the number of free electrons per atom. The latter is computed from the value of the Hall coefficient, $1/nq$, and the density of the metal. For the alkalis and other monovalent metals, Hall measurements agree with one conduction electron per atom. Of course, the free-electron model on which the simple Hall effect analysis is based is not expected to be valid for all metals.

Table 13-1 Observed Hall Coefficient and Calculated Number of Free Electrons per Atom

Metal    1/nq ($10^{-10}$ m$^3$/coul)    No./atom
Na       -2.5                            0.99
K        -4.2                            1.1
Cu       -0.55                           1.3
Ag       -0.84                           1.3
Al       -0.30                           3.5
Li       -1.70                           1.0
Be       +2.4                            -2.2
Zn       +0.33                           -2.9
Cd       +0.60                           -2.5
As       +40                             -0.04
Sb       -20                             0.09
Bi       -5000                           0.0005

Figure 13-4 Left: The distribution with energy of conduction electrons in an unfilled band of width $\mathcal{E}_{\max}$ in a solid at T = 0, according to the free electron model. Right: The same at a higher temperature.

13-5 THE QUANTUM FREE-ELECTRON MODEL

Let us now recall our application in Section 11-11 of quantum theory and the Fermi distribution to conduction electrons in a metal. There we saw that the potential in which the electron moves can be approximated by a rectangular potential well. This constant potential smooths out the actual periodic variation due to the ion cores and includes the average effect of all the remaining electrons.
It is equivalent to treating the electrons as an ideal gas of fermions inside the solid. This approximation, which greatly simplifies quantum mechanical calculations, turns out to be surprisingly good in determining many of the observed properties of solids, as we saw in Section 11-12 when we used it in describing phenomena such as contact potential and electronic specific heats. In connection with our present discussion we can use the result, (11-56), for the distribution with energy of free conduction electrons in a metal, namely 8zc V ( m3 )1/2 1/2 de (13-3) 1 where n(g)N(g) dg is the number of electrons with energy from e to e + de in a metal at temperature T. The justification is that the distribution of energy states in a band is nearly the same as that for free electrons if the Fermi energy is not close to the top of the band. This condition applies to the alkali metals, for example, and accounts for the success that the free-electron model has in describing their electrical properties. On the left side of Figure 13-4 we show the prediction of (13-3) for the absolute zero temperature energy distribution of electrons in a partly filled band, with energy being measured from the lowest energy in the band. The maximum energy allowed in the band is fi and ^ F < max) as shown in that figure. At a temperature greater than zero, the uppermost electrons are excited to occupy nearby available higher states, and the distribution function takes the form shown on the right side of Figure 13-4. The number of quantum states in an energy interval e to e + de is the factor n(g)N(g) dg = e( ' eF)' + N(e)de of (13-3), namely w ^ (13-4) In Figure 13-4 N(s) is shown by a dashed curve and, for unit volume, is the density of states. The dash-dot curve is n(s), the Fermi distribution for the number of electrons per state. The solid curve gives the product n(e)N(e), the energy distribution of electrons, or number of electrons per unit energy interval. Example 13-1. 
The Fermi energy, $\mathcal{E}_F$, for lithium is 4.72 eV at T = 0. Calculate the number of conduction electrons per unit volume in lithium.

From (11-57) we have

$$\mathcal{E}_F = \frac{h^2}{8m}\left(\frac{3\mathcal{N}}{\pi V}\right)^{2/3} \quad \text{for } kT \ll \mathcal{E}_F \qquad (13\text{-}5)$$

so that the number of free electrons per unit volume is

$$n = \frac{\mathcal{N}}{V} = \frac{\pi}{3}\left(\frac{8m}{h^2}\right)^{3/2}\mathcal{E}_F^{3/2}$$

in which m is the mass of the electron. Then, with $\mathcal{E}_F = 4.72$ eV, we have

$$n = \frac{\pi}{3}\left[\frac{8 \times 9.11 \times 10^{-31}\ \text{kg}}{(6.63 \times 10^{-34}\ \text{joule-sec})^2}\right]^{3/2}(4.72 \times 1.60 \times 10^{-19}\ \text{joule})^{3/2} = 4.64 \times 10^{28}/\text{m}^3 = 4.64 \times 10^{22}/\text{cm}^3$$

as the number of conduction electrons per unit volume in lithium. This corresponds exactly to one free electron per lithium atom, since the number of lithium atoms per unit volume, in solid lithium of density 0.534 g/cm$^3$, is

$$\frac{0.534\ \text{g}}{\text{cm}^3} \times \frac{1\ \text{mole}}{6.94\ \text{g}} \times \frac{6.02 \times 10^{23}\ \text{atom}}{\text{mole}} = 4.64 \times 10^{22}\ \text{atom/cm}^3$$

Example 13-2. Make an estimate of the relative number of conduction electrons in a metal which are thermally excited to higher energy states.

Figure 13-4 shows that most of the excited electrons are in a range $\Delta\mathcal{E}$ above the Fermi energy $\mathcal{E}_F$, where $\Delta\mathcal{E} \simeq 2kT$. Assuming that $kT \ll \mathcal{E}_F$, the number $\Delta\mathcal{N}$ of excited electrons can be calculated from

$$\Delta\mathcal{N} \simeq N(\mathcal{E}_F)\,n(\mathcal{E}_F)\,\Delta\mathcal{E} \simeq N(\mathcal{E}_F)(1/2)\,2kT \simeq N(\mathcal{E}_F)\,kT$$

Equation (13-5) shows that, for $kT \ll \mathcal{E}_F$,

$$\mathcal{N} = \frac{\pi}{3}\left(\frac{8m}{h^2}\right)^{3/2} V\,\mathcal{E}_F^{3/2}$$

and (13-4) shows that

$$N(\mathcal{E}_F) = \frac{\pi}{2}\left(\frac{8m}{h^2}\right)^{3/2} V\,\mathcal{E}_F^{1/2}$$

Hence

$$\frac{\Delta\mathcal{N}}{\mathcal{N}} \simeq \frac{N(\mathcal{E}_F)\,kT}{\mathcal{N}} = \frac{3}{2}\,\frac{kT}{\mathcal{E}_F}$$

The fraction of conduction electrons that is thermally excited is small. At room temperature $kT \simeq 0.025$ eV and typically $\mathcal{E}_F \simeq 4$ eV, so that $\Delta\mathcal{N}/\mathcal{N}$ is of order 1/100. The absolute number of excited conduction electrons is large, however, because $\mathcal{N}$ itself is so large.

Now we shall use the free-electron model to evaluate the width in energy of a band for the simple case of a one-dimensional metal.
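The arithmetic of Examples 13-1 and 13-2 can be checked with a short script. This is only a sketch: the function names are hypothetical, and the constants are the rounded values used in the examples, so the last digit of each result may differ slightly from the quoted figures.

```python
import math

# Rounded constants as used in Examples 13-1 and 13-2 (SI units).
H = 6.63e-34          # Planck's constant, joule-sec
M_E = 9.11e-31        # electron rest mass, kg
EV = 1.60e-19         # joules per eV
N_AVOGADRO = 6.02e23  # atoms per mole


def electron_density(fermi_energy_ev):
    """Conduction electrons per m^3, from (13-5) solved for N/V."""
    e_f = fermi_energy_ev * EV
    return (math.pi / 3.0) * (8.0 * M_E / H**2) ** 1.5 * e_f ** 1.5


def atom_density(density_g_cm3, molar_mass_g):
    """Atoms per m^3 from the mass density and the molar mass."""
    return density_g_cm3 / molar_mass_g * N_AVOGADRO * 1.0e6


def excited_fraction(kt_ev, fermi_energy_ev):
    """Estimate of Example 13-2: (3/2) kT / E_F, valid for kT << E_F."""
    return 1.5 * kt_ev / fermi_energy_ev


n_li = electron_density(4.72)
atoms_li = atom_density(0.534, 6.94)
print(f"n (lithium)        = {n_li:.3e} per m^3")
print(f"atoms (lithium)    = {atoms_li:.3e} per m^3")
print(f"electrons per atom = {n_li / atoms_li:.3f}")
print(f"excited fraction   = {excited_fraction(0.025, 4.0):.4f}")
```

The ratio printed on the third line is the number of free electrons per atom implied by the Fermi energy, which is the comparison made at the end of Example 13-1.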
The eigenfunctions for an electron in the deep square well, representing the smoothed out attraction of the ion cores distributed uniformly along the x axis plus the average repulsion of the remaining electrons, are essentially sinusoidal standing waves like

$$\psi \propto \cos\frac{2\pi x}{\lambda} = \cos kx \quad \text{and} \quad \psi \propto \sin\frac{2\pi x}{\lambda} = \sin kx \qquad (13\text{-}6)$$

where $\lambda$ is the wavelength and $k = 2\pi/\lambda$ is the wave number. The eigenfunctions have nodes at each end of the well since their values go to zero outside the well. These boundary conditions lead immediately to the requirement that $n\lambda/2 = L$, where L is the length of the well. Each value of the integer n = 1, 2, 3, ..., corresponds to a different eigenfunction, or energy level if we allow two electrons of opposite spin per level. Since for free electrons the energy is $\mathcal{E} = p^2/2m = h^2/2m\lambda^2 = h^2n^2/8mL^2$, the minimum value of n corresponds to the level of essentially zero energy at the bottom of the band, and the maximum value of n corresponds to the level of maximum energy at the top, the width of the band being approximately equal to that maximum energy. If there are N ions each separated by distance a in the one-dimensional metal of length L, then N = L/a. As we have explained before, the number of levels in the band is just equal to N, so the maximum value of n will also be equal to N. Thus the maximum energy, or energy width of the band in our one-dimensional metal, is

$$\mathcal{E}_{max} = \frac{h^2N^2}{8mL^2} = \frac{h^2L^2}{8mL^2a^2}$$

or

$$\mathcal{E}_{max} = \frac{\hbar^2\pi^2}{2ma^2} \qquad (13\text{-}7)$$

This result, which depends on a but is independent of N, confirms the statement made earlier that the width of a band depends on the separation of the ions and not on the number of ions in the lattice.

The free-electron model gives very good results for many metals. It is especially good for the alkali metals where the overlap of bands (as in Figure 13-3 for sodium) is so complete that the density of states $N(\mathcal{E})$ behaves like the curves of Figure 13-4.
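To get a feeling for the size of the band width in (13-7), the following sketch evaluates it numerically. The ion spacing of 3 x 10^-10 m is an assumed, illustrative value (it is not taken from the text), and the function names are hypothetical. The second function evaluates the energy of the top level, n = N, of a well of length L = Na, to show that the result does not depend on N.

```python
import math

HBAR = 1.055e-34  # joule-sec
M_E = 9.109e-31   # electron rest mass, kg
EV = 1.602e-19    # joules per eV


def band_width_ev(a_m):
    """Band width (13-7), hbar^2 pi^2 / (2 m a^2), in eV."""
    return (HBAR * math.pi) ** 2 / (2.0 * M_E * a_m**2) / EV


def top_level_ev(n_ions, a_m):
    """Energy h^2 n^2 / (8 m L^2) of the level n = N in a well of length L = N a."""
    h = 2.0 * math.pi * HBAR
    length = n_ions * a_m
    return h**2 * n_ions**2 / (8.0 * M_E * length**2) / EV


a_assumed = 3.0e-10  # assumed ion spacing in meters (illustrative only)
print(f"band width from (13-7): {band_width_ev(a_assumed):.2f} eV")
for n_ions in (10, 1000, 100000):
    print(f"N = {n_ions:6d}: top level = {top_level_ev(n_ions, a_assumed):.2f} eV")
```

For a spacing of a few angstroms the width comes out at a few electron volts, and the loop shows the same value for every N.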
The $\mathcal{E}^{1/2}$ dependence of $N(\mathcal{E})$ on $\mathcal{E}$ is not correct, however, in the case of an isolated band. Although the actual shape of the curve of density of states depends on the position of the band and the structure of the lattice, its shape is roughly symmetric, as shown in the upper part of Figure 13-5, in that it decreases to zero at the top of the band. To understand how this comes about, we consider a one-dimensional crystal which is so long that we first ignore the boundary conditions at its end. Then the most convenient eigenfunctions for a free electron are sinusoidal traveling waves like

$$\psi \propto e^{ikx} \quad \text{and} \quad \psi \propto e^{-ikx} \qquad (13\text{-}8)$$

where the forms with positive, or negative, exponents describe an electron moving in the positive, or negative, direction of the x axis. It is even more convenient to take only the form $\psi \propto e^{ikx}$, and let k be either positive or negative. Now we write the energy $\mathcal{E}$ of a free electron in terms of its wave number $k = p/\hbar$, where p is its momentum. That is

$$\mathcal{E} = \frac{p^2}{2m} = \frac{\hbar^2k^2}{2m} \qquad (13\text{-}9)$$

Figure 13-5 Top: A qualitative representation of the density of states as a function of energy in an unfilled isolated band. Bottom: The same for the case of two barely overlapping bands.

This relation is plotted in Figure 13-6, over a range of k including both positive and negative values. A positive value of k corresponds to an electron moving in the positive x direction, and a negative k corresponds to motion in the opposite direction. The energy depends on $k^2$, so the curve is symmetrical about k = 0. It can be seen immediately by comparing (13-7) and (13-9) that

$$-\pi/a \le k \le +\pi/a \qquad (13\text{-}10)$$

That is, the values of k corresponding to the maximum value of $\mathcal{E}$ found in the band are $-\pi/a$ and $+\pi/a$, and the value of k corresponding to the minimum value $\mathcal{E} = 0$ is the value k = 0 in the middle of this range. Since $k \propto 1/\lambda \propto n$ and n = 1, 2, 3, ..., the values of k allowed by the boundary conditions are evenly spread throughout this range.
Each of them is associated with a different quantum state for the electron.

Figure 13-6 The energy of a free electron plotted as a function of its wave number k. The points indicate schematically the uniformly spaced allowed values of k. For the first band of the crystal they fall within the range $-\pi/a < k < +\pi/a$, where a is the ion separation of the one-dimensional lattice in which the electron moves freely.

Figure 13-7 Illustrating the uniformly distributed allowed values of the x and y component wave numbers for a free electron in the first band of a two-dimensional square lattice with ion separation a.

Next consider a two-dimensional metal with ions spaced by the same distance a in both the x and y directions. In a band the allowed values of both the x and y component wave numbers, $k_x$ and $k_y$, are uniformly distributed over ranges extending from $-\pi/a$ to $+\pi/a$, as shown in Figure 13-7. Each pair of $k_x$ and $k_y$ values defines a point that specifies a quantum state for a free electron of the metal; these points are uniformly distributed within the square. A circle surrounding the origin of radius k, where $k^2 = k_x^2 + k_y^2$, passes through all states having the same energy since in two dimensions (13-9) reads

$$\mathcal{E} = \frac{\hbar^2k^2}{2m} = \frac{\hbar^2(k_x^2 + k_y^2)}{2m}$$

The number of states dN, for values of k ranging from k to k + dk, is equal to the number of points contained within the area limited by k and k + dk. As the points are uniformly distributed, this number will be proportional to the area. The figure shows that as long as $k < \pi/a$, dN increases with increasing k; specifically $dN \propto 2\pi k\,dk$, the area of the ring between k and k + dk. When k begins to exceed $\pi/a$, the ring is cut off by the sides of the square, and further increase in k causes dN to decrease.
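The counting argument for the two-dimensional lattice can be tried out directly. The sketch below lays down a uniform grid of allowed $(k_x, k_y)$ points inside the square, with k measured in units of $\pi/a$, and counts how many fall in rings of equal width dk. The grid size and the number of rings are arbitrary choices made for illustration, and the function name is hypothetical.

```python
import math


def ring_counts(points_per_axis=201, n_rings=14):
    """Count uniformly spaced k points in rings of equal width.

    The points fill the square -1 <= kx, ky <= 1, with k in units of pi/a.
    Returns the ring width dk and the list of counts, innermost ring first.
    """
    k_corner = math.sqrt(2.0)  # largest |k| in the square, at its corners
    dk = k_corner / n_rings
    counts = [0] * n_rings
    step = 2.0 / (points_per_axis - 1)
    for i in range(points_per_axis):
        kx = -1.0 + i * step
        for j in range(points_per_axis):
            ky = -1.0 + j * step
            ring = min(int(math.hypot(kx, ky) / dk), n_rings - 1)
            counts[ring] += 1
    return dk, counts


dk, counts = ring_counts()
for index, count in enumerate(counts):
    k_mid = (index + 0.5) * dk
    print(f"k = {k_mid:5.3f} pi/a   dN = {count}")
```

The printed counts grow roughly in proportion to k while the rings lie inside the square, peak near $k = \pi/a$, and then fall toward zero at the corners.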
Thus dN/dk = N(k), the number of states per unit range of wave number, increases from zero for small k, reaches a maximum, and then decreases back to zero when k reaches the largest allowed value for the band of our two-dimensional metal. The same general behavior is found when these results are converted from N(k) to $N(\mathcal{E})$, the number of states per unit energy. In a real three-dimensional metal it is also true. That is, the density of states $N(\mathcal{E})$ increases from zero for small $\mathcal{E}$ (the bottom of the band), reaches a maximum, and then decreases back to zero at the largest allowed value $\mathcal{E}_{max}$ found in the band (the top of the band). The detailed behavior of $N(\mathcal{E})$ depends on the geometrical details of the arrangements of ions in the crystalline metal, as does the exact value of $\mathcal{E}_{max}$. But the general behavior is always about as we have indicated, and the approximate value of $\mathcal{E}_{max}$ is given by (13-7) if a is interpreted as the characteristic ion spacing in the crystal.

13-6 THE MOTION OF ELECTRONS IN A PERIODIC LATTICE

The free-electron model that we have used ignores the effects of electrons interacting with the crystal lattice. Let us begin to consider this by making some general remarks about the effect of the periodic variation in the potential. For one thing, the lattice periodicity has the effect that the wave functions for an infinitely long lattice are no longer sinusoidal traveling waves of constant amplitude, but they exhibit the lattice periodicity in their amplitudes. In addition, electrons may be scattered by the lattice.
Just as an electromagnetic wave suffers a Bragg "reflection" when the Bragg condition is satisfied, so also when the de Broglie wavelength of the electron corresponds to a periodicity in the spacing of the ions the electron interacts particularly strongly with the lattice. We shall see that these modifications result, among other things, in changing the resistance of the crystal to the conduction of electricity.

Figure 13-8 Illustrating how the potential for an electron moving in a periodic lattice can be approximated by the Kronig-Penney model of an array of rectangular potential wells and barriers. In the model potential the wells have width l, the barriers have thickness b = a - l, and a is the period of the lattice.

Our approach in finding the allowed energies of electrons in solids has been to consider the effect of forming a solid as the individual constituent atoms are brought together. If, instead, we had begun by modelling the periodic potential seen by an electron in the crystal lattice by a succession of rectangular wells and barriers, and had then solved the Schroedinger equation for such a potential, we would have found sinusoidal wave solutions in certain energy ranges (the allowed bands) and real decaying exponential wave solutions in the other energy ranges (the forbidden bands). This approach permits detailed quantitative calculations, but we present it here only qualitatively.

Although the electrons tend to smooth out the variations in the potential due to the ions, the potential is not constant but varies in a periodic way. The actual shape of the potential determines the exact solution to the Schroedinger equation for an electron in a crystal lattice, but the most important feature of the potential is its periodicity. The effect of periodicity is to change the free particle traveling wave eigenfunction in such a way that instead of constant amplitude it has a varying amplitude which changes with the period of the lattice.
If the space periodicity of the lattice is a, then, according to Bloch, the eigenfunctions for a one-dimensional system do not have the free particle traveling wave form $\psi(x) = Ae^{ikx}$ of (13-8), but instead they have the form

$$\psi(x) = u_k(x)e^{ikx} \qquad (13\text{-}11a)$$

where the periodicity of the lattice requires that

$$u_k(x) = u_k(x + a) = u_k(x + na) \qquad (13\text{-}11b)$$

n being an integer. Hence, the effect of the periodicity is to modulate periodically the free-electron solution amplitude. The wave function is

$$\Psi(x,t) = u_k(x)e^{i(kx - \omega t)} \qquad (13\text{-}12)$$

where the second (exponential) factor describes a wave of wavelength $\lambda = 2\pi/k$ that travels toward +x if k > 0 and toward -x if k < 0, and the first factor $u_k(x)$ describes the modulation. The function $u_k(x)$ resembles the eigenfunction for an isolated atom. Its exact form depends on the particular potential assumed and the value of k.

A very good approximation to V(x) for a crystal is an array of rectangular potential wells and barriers having the lattice periodicity, as in Figure 13-8. Each well represents an approximation of the potential produced by one ion. This is the Kronig-Penney model which is, of course, easier to treat mathematically than the real case, but which retains all of its important features. Let us now examine the model in more detail.

For wells that are deep and widely spaced, the electron of not too high energy is practically bound within one of the wells, so that the lower energy eigenvalues are those of a single well. For wells that are closer together the eigenfunctions can penetrate the potential barriers more easily. This results in the spreading of a previously single energy level into a band of energy levels. As the separation of the wells is reduced the band becomes wider. Indeed, in the limit of zero barrier thickness we obtain an infinitely wide single well in which all energies are allowed, i.e., we obtain the free-electron model.

Figure 13-9 Left: Allowed energies for an electron in a single potential well (levels at $\mathcal{E}/V_0$ = 0.058, 0.23, and 0.51 are marked). Right: Allowed energies in an array of periodically spaced wells and barriers. The levels shown are for a well strength given by $2mV_0l^2/\hbar^2 = (11)^2$, and a barrier thickness $b = l/16$. Note the appearance of forbidden bands even for energies $\mathcal{E}$ greater than $V_0$.

In Figure 13-9 we compare the allowed energies of a single well with those of the Kronig-Penney model of an array of wells and barriers. Notice that each allowed band corresponds to a discrete level of the single well, and that forbidden bands appear even for energies $\mathcal{E}$ greater than the well depth $V_0$. The band widths can be made to approach the level width as a increases (the width of the individual wells, l, remaining fixed) and to approach a continuum as a decreases.

In solving the Schroedinger equation for the Kronig-Penney model, we must satisfy the conditions on the continuity of $\psi$ and $d\psi/dx$, just as we had to do for the single rectangular well. This restricts the validity of the Bloch solution, (13-11a) and (13-11b), to certain ranges of energy and gives the allowed bands. For energy values in the forbidden bands, the eigenfunctions are rapidly damped by a real decaying exponential factor. The expression $\mathcal{E}(k)$ for the allowed energies in terms of the wave number k of the electron is more complicated than that for the free electron, but the gaps or discontinuities in energy occur at values of k given simply by

$$k = \pm\frac{\pi}{a},\ \pm\frac{2\pi}{a},\ \pm\frac{3\pi}{a},\ \ldots \qquad (13\text{-}13)$$

in which a is the space periodicity of the lattice. In Figure 13-10 we plot the function $\mathcal{E}(k)$. At values of k equal to the values specified in (13-13) we get energy gaps, whereas for values of k not near those values the energies are much like that of a free electron shown by the dashed curve in the figure.
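The text treats the Kronig-Penney model only qualitatively, but the band structure it describes can be explored numerically. The sketch below relies on the standard textbook form of the Kronig-Penney matching condition for energies below the barrier height, which is not derived in this section and is brought in here as an assumption: $\cos ka = \cos\alpha l\,\cosh\beta b + [(\beta^2 - \alpha^2)/2\alpha\beta]\,\sin\alpha l\,\sinh\beta b$, with $\alpha = (2m\mathcal{E})^{1/2}/\hbar$ and $\beta = [2m(V_0 - \mathcal{E})]^{1/2}/\hbar$. An energy is allowed when the right side lies between -1 and +1. The units ($\hbar^2/2m = 1$, $l = 1$) and the function names are also choices made for this illustration; the parameters are those quoted for Figure 13-9.

```python
import math


def kp_rhs(energy, v0=121.0, well=1.0, barrier=1.0 / 16.0):
    """Right side of the assumed Kronig-Penney condition, for 0 < energy < v0.

    Units: hbar^2 / 2m = 1 and well width l = 1, so v0 = 121 corresponds to
    the well strength 2 m V0 l^2 / hbar^2 = (11)^2, with b = l / 16.
    """
    alpha = math.sqrt(energy)       # wave number inside a well
    beta = math.sqrt(v0 - energy)   # decay constant inside a barrier
    return (math.cos(alpha * well) * math.cosh(beta * barrier)
            + (beta**2 - alpha**2) / (2.0 * alpha * beta)
            * math.sin(alpha * well) * math.sinh(beta * barrier))


def allowed_bands(v0=121.0, steps=20000):
    """Scan 0 < E < V0; return (E_low / V0, E_high / V0) for each allowed band."""
    bands = []
    start = None
    previous = None
    for i in range(1, steps):
        energy = v0 * i / steps
        allowed = abs(kp_rhs(energy, v0)) <= 1.0
        if allowed and start is None:
            start = energy
        if not allowed and start is not None:
            bands.append((start / v0, previous / v0))
            start = None
        previous = energy
    if start is not None:
        bands.append((start / v0, previous / v0))
    return bands


for low, high in allowed_bands():
    print(f"allowed band: {low:.3f} V0 to {high:.3f} V0")
```

Only energies below $V_0$ are scanned, so a band that extends above the barrier top is reported as ending there. Each band found can be compared with the corresponding single-well level marked in Figure 13-9.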
The origin of the allowed and forbidden bands is apparent from the figure. Each allowed band corresponds to solutions to the Schroedinger equation in which the wave number k has positive values in a range of width $\pi/a$, and also negative values in a range of the same width. Note that this agrees with a conclusion obtained from a very different point of view in the last section, and expressed in (13-10).

Figure 13-10 Allowed energies in a one-dimensional lattice of periodicity a, as a function of the wave number k. The dashed curve gives the free electron model result, for comparison. The allowed and forbidden energy bands that result are shown on the right. The Brillouin zone numbers are indicated below the k axis.

From the present point of view, the gaps between the top of an allowed band and the bottom of the next one up can be understood as a result of Bragg reflection of the traveling wave describing an electron propagating down the lattice. If a wave traveling to the right is incident on a set of barriers representing the regions between the ions of the lattice, spaced by the uniform distance a, it will be partly reflected by each of these barriers. Generally, the reflected waves traveling to the left will not be exactly in phase with each other, and so they will not combine constructively to produce a net reflected wave of large amplitude. But they will be in phase if the wavelength $\lambda$ of the incident and reflected waves is related to the spacing a by the one-dimensional version of (3-3), the Bragg condition

$$2a = \lambda,\ 2\lambda,\ 3\lambda,\ \ldots \qquad (13\text{-}14)$$

Here 2a is the extra distance traveled in reflections from successive barriers, so if it equals an integral number of wavelengths $\lambda$ the reflected waves will all be precisely in phase and there will be a net reflected wave whose amplitude equals the amplitude of the incident wave. Since $\lambda = 2\pi/k$, the Bragg condition is $2a = 2\pi/k,\ 2(2\pi/k),\ 3(2\pi/k),\ \ldots$, or $k = \pm\pi/a,\ \pm 2\pi/a,\ \pm 3\pi/a,\ \ldots$, where we have inserted $\pm$ signs to account for the fact that the incident wave could as well be moving to the left (to -x) as moving to the right (to +x). Comparing with (13-13), we see that the values of k at which energy gaps in the function $\mathcal{E}(k)$ occur are just those values of the wave number for which the wavelength $\lambda$ satisfies the Bragg condition for constructive reflection.

The gaps themselves arise because there are two distinctly different ways for the amplitude of the reflected wave to equal the amplitude of the incident wave, at each critical value of k where these amplitudes are equal. Consider, for instance, a unit amplitude incident wave moving to the right along the x axis with $k = \pi/a$. The traveling wave eigenfunction describing this is $e^{ikx} = e^{i\pi x/a}$. The reflected wave, which also has unit amplitude for this value of k, is $e^{-ikx} = e^{-i\pi x/a}$. The total eigenfunction is obtained by adding these two or, equally well, by subtracting them. The first possibility gives

$$\psi = e^{i(\pi/a)x} + e^{-i(\pi/a)x} \propto \cos\frac{\pi}{a}x \qquad (13\text{-}15)$$

and the second gives

$$\psi = e^{i(\pi/a)x} - e^{-i(\pi/a)x} \propto \sin\frac{\pi}{a}x \qquad (13\text{-}16)$$

In both, the reflected wave has the same amplitude as the incident wave, and so it combines with it to form a standing wave; but the two cases differ very significantly in regard to the locations of the nodes of the standing wave, and therefore in the locations of the maxima and minima of the probability density $\psi^*\psi$. In the case where $\psi \propto \cos \pi x/a$, the probability density will maximize at x = 0, as well as at $x = \pm a,\ \pm 2a,\ \pm 3a,\ \ldots$, while for $\psi \propto \sin \pi x/a$ the probability density will be zero at all these points. If they are the locations of the barriers between ions, the electron described by $\psi$ will feel a larger repulsion, and therefore have a higher energy, in the cosine case than in the sine case.
If these points are the locations of the ions, the situation will be reversed. But the basic conclusion, that there are two different energies $\mathcal{E}$ corresponding to the same value of the wave number k when k is any one of the values given by (13-13), is independent of how the origin of the x axis is defined. Looking again at the function $\mathcal{E}(k)$ plotted in Figure 13-10, we see the two different values of $\mathcal{E}$ at each of the critical values of k where Bragg reflection will occur. We also see how this circumstance causes the $\mathcal{E}(k)$ curve to have an S-shaped deviation from the parabolic curve for a free electron in each region between the critical values of k. The range of k values between $-\pi/a$ and $+\pi/a$ defines what is called the first Brillouin zone; those k values between $-2\pi/a$ and $-\pi/a$ and between $+\pi/a$ and $+2\pi/a$ define the second Brillouin zone, etc., as is indicated below the k axis of the figure.

13-7 EFFECTIVE MASS

When discussing the behavior of an electron in a periodic lattice under the application of an external electric field, it is very convenient to introduce the concept of the effective electron mass. This is done by using a relation developed in Section 3-4 to describe the motion of the electron in terms of a group of traveling waves. According to (3-13b), the velocity g of such a group equals the derivative of the frequencies $\nu$ of its component sinusoidal traveling waves with respect to their reciprocal wavelengths $\kappa$. That is

$$g = \frac{d\nu}{d\kappa} = \frac{d\omega}{dk}$$

where $\nu$ is converted to the angular frequency $\omega$, and $\kappa$ to the wave number k, by multiplying and dividing $d\nu/d\kappa$ by $2\pi$. To remind the student of the meaning of this relation, we shall apply it to the simple case of a free electron, whose energy is

$$\mathcal{E} = \frac{p^2}{2m} = \frac{\hbar^2k^2}{2m} = \hbar\omega$$

The last equality depends on the Einstein-de Broglie relation $\mathcal{E} = h\nu = \hbar\omega$.
Evaluating $d\omega/dk$ from this expression, we have

$$g = \frac{d\omega}{dk} = \frac{2\hbar k}{2m} = \frac{\hbar k}{m} = \frac{p}{m} = \frac{mv}{m} = v \qquad (13\text{-}17)$$

We obtain the correct result that the group velocity g equals the velocity v of the electron whose motion is represented by the group. Of course this result is of general validity.

Now we consider an electron in a one-dimensional lattice, whose wave number dependence of energy has the form $\mathcal{E}(k)$ that we have been discussing. To this system an external electric field E is applied. In time dt the electron of charge q moves distance dx, and the work done by the external field is the applied force qE multiplied by dx. Since this equals the magnitude of the change $d\mathcal{E}$ in the energy of the electron, we have, using (13-17)

$$d\mathcal{E} = qE\,dx = qE\,\frac{dx}{dt}\,dt = qEv\,dt = qEg\,dt$$

But we also have, from $\mathcal{E} = \hbar\omega$

$$d\mathcal{E} = \hbar\,d\omega = \hbar\,\frac{d\omega}{dk}\,dk = \hbar g\,dk$$

Comparison then shows that $qE\,dt = \hbar\,dk$ or

$$\hbar\,\frac{dk}{dt} = qE \qquad (13\text{-}18)$$

If we take the time derivative of

$$g = \frac{d\omega}{dk} = \frac{1}{\hbar}\frac{d\mathcal{E}}{dk}$$

we obtain

$$\frac{dg}{dt} = \frac{1}{\hbar}\frac{d^2\mathcal{E}}{dt\,dk} = \frac{1}{\hbar}\frac{d^2\mathcal{E}}{dk\,dt} = \frac{1}{\hbar}\frac{d^2\mathcal{E}}{dk^2}\frac{dk}{dt}$$

or, using (13-18)

$$\frac{dg}{dt} = \frac{1}{\hbar^2}\frac{d^2\mathcal{E}}{dk^2}\,qE$$

Employing (13-17) again, this can be written

$$\frac{dv}{dt} = \frac{qE}{m^*} \qquad (13\text{-}19a)$$

where

$$\frac{1}{m^*} = \frac{1}{\hbar^2}\frac{d^2\mathcal{E}}{dk^2} \qquad (13\text{-}19b)$$

The quantity 1/m* is the reciprocal of the effective mass of the electron in the crystal lattice. The electron we are studying moves under the influence of internal forces, exerted on it by the ions of the lattice, and an external force, exerted on it by the applied electric field E. If we wish, we can use (13-19a) to discuss its motion in terms of the external force alone since that equation is in the form of Newton's law of motion, acceleration equals external force divided by mass. Of course the effects of the internal forces are actually contained in the equation. They appear, however, only in the reciprocal effective mass 1/m*, which can have values quite different from the reciprocal of the true electron mass, 1/m. The properties of the lattice determine 1/m* because, as we saw in the preceding section, they determine the form of the function $\mathcal{E}(k)$ and so also the derivative $d^2\mathcal{E}(k)/dk^2$ appearing in (13-19b).

Figure 13-11 Illustrating the reciprocal effective mass at various locations in the first and second Brillouin zones of a one-dimensional lattice. The points on the k axis indicate the uniformly distributed allowed values of k.

Figure 13-11 shows the first, and part of the second, Brillouin zones of a one-dimensional crystal. The solid curve is $\mathcal{E}(k)$ and the parabolic dashed curve is the free electron relation $\mathcal{E} = \hbar^2k^2/2m$. Near the center of the first zone, where $\mathcal{E}(k) \simeq \hbar^2k^2/2m$, we have

$$\frac{1}{m^*} = \frac{1}{\hbar^2}\frac{d^2\mathcal{E}}{dk^2} \simeq \frac{1}{\hbar^2}\frac{d^2(\hbar^2k^2/2m)}{dk^2} = \frac{1}{m}$$

So in this region the lattice has very little effect on the electron, because its reciprocal effective mass is almost the same as its reciprocal true mass, and it responds to the applied electric field as if it were an essentially free electron. The curvature of the function $\mathcal{E}(k)$ changes significantly from the curvature of the parabola in proceeding in either direction from the center of the zone, which makes dramatic changes in the reciprocal of the effective mass of the electron and so in its response to the applied field. Since $d^2\mathcal{E}/dk^2$ goes through zero, and then becomes negative and of large magnitude as k approaches either boundary of the first zone, 1/m* does the same. Thus in the upper part of the energy range of the band corresponding to the first zone the electron in the lattice responds to the applied electric field very differently from the way it would if it were a free electron. Where 1/m* is zero a given applied force qE causes no acceleration of the electron, and where 1/m* is negative the force causes an acceleration in the opposite direction to that which would be experienced by a free electron. (This has nothing to do with the sign of the electron charge which, to avoid confusion, we have written as q instead of -e.) At the bottom of the energy band for the second Brillouin zone, 1/m* is positive but appreciably larger than 1/m for a free electron, so the applied force produces a relatively large acceleration of the electron in the lattice.

The response of an electron in a crystal to an applied electric field can be understood in terms of the way the electron wave is reflected by the potential barriers located between each pair of ions. At the bottom of the first energy band where the magnitude of the wave number has the value $|k| \simeq 0$ there is practically no reflection since the Bragg condition $|k| = \pi/a$ is far from being satisfied. When the field is applied the force it produces will increase the electron's momentum, and the work it does will increase the electron's energy, just as in the case of a free electron. Higher up in the band, where $|k|$ is closer to the critical Bragg value $\pi/a$, reflection starts to become appreciable. In this region the work done on the electron will still increase its energy, but this increases the amount of reflection, and reflection corresponds to reversing the sign of its momentum. At the point where 1/m* = 0, the gain in positive momentum due to the applied field acting directly on the electron is exactly compensated for by the gain in negative momentum due to the enhanced reflection of the electron by the lattice ions. Thus here the net change in electron velocity is zero, and from the point of view of its response to the applied field the electron effectively has infinite mass, or zero reciprocal mass. (Momentum is, of course, given to the lattice by the overall effect of applying the field, but not to the electron.) At the top of the band the reciprocal effective mass is large and negative because the enhanced reflection resulting from the closer approach to the Bragg condition of perfect reflection is much more significant in changing the electron momentum than the direct action of the applied field. The situation is reversed at the bottom of the next higher band, and so the reciprocal effective mass is large and positive there.

Effective mass is also used in a somewhat different way to compare, for various bands, the curvature of the function $\mathcal{E}(k)$ in the concave upward approximately parabolic regions found except near the tops of bands. If the zero of k is taken to be at the boundary of the second zone, and the zero of $\mathcal{E}$ is taken at the bottom of the corresponding band, then $\mathcal{E}(k)$ for the part of the second zone shown in Figure 13-11 can be written as

$$\mathcal{E}(k) \simeq \frac{\hbar^2k^2}{2m^*} \qquad (13\text{-}20)$$

If the curvature of $\mathcal{E}(k)$ is high, so that $\mathcal{E}$ increases rapidly with increasing k, then 1/m* in this expression is large. Since the allowed values of k are uniformly distributed along the k axis of Figure 13-11, the density of the corresponding energy levels along the $\mathcal{E}$ axis will be low if $\mathcal{E}$ increases rapidly with increasing k. So the reciprocal mass can also be used to compare level densities of bands, in the regions where they obey (13-20). If the level density is relatively low, 1/m* is relatively large; if the level density is relatively high, 1/m* is relatively small.

The concept of effective mass is useful in a variety of ways. For instance, the classical theory of the behavior of charge carriers under the influence of an applied electric field is summarized by (13-1b), which predicts that the electrical conductivity $\sigma$ of the material containing the carriers is proportional to the reciprocal of their masses. We can easily modify this to take into account the quantum behavior of charge carrying electrons in a crystal lattice by replacing the reciprocal true mass with the reciprocal effective mass, obtaining

$$\sigma \propto \frac{1}{m^*} \qquad (13\text{-}21)$$

Consider iron. The valence electrons in this metal partly fill its 3d bands, which are overlapping and narrow since 3d is an inner subshell in the transition element iron so the splitting of the atomic 3d level into the 3d bands is not very pronounced. Because the bands are narrow, the level density is high. Therefore the reciprocal effective mass is small for the electrons involved in electrical conduction in iron, the value of 1/m* being about 0.1/m. As a consequence, the metal is not a particularly good conductor. Copper, on the other hand, is a good electrical conductor. The reason is that for copper the 3d bands are filled, and the conduction electrons are 4s electrons which are in a very broad band (it overlaps the 3d bands) that has a low level density and a high reciprocal effective mass (1/m* is roughly equal to 1/m). The 4s band is broad because this is an outer subshell of the atom and so the splitting in the crystal of the 4s atomic level is large. The result is that the conductivity of copper is an order of magnitude higher than the conductivity of iron.

It should be pointed out that using the reciprocal effective mass in (13-21) amounts to accounting for the influence of a perfect crystal lattice on the accelerated motion of an electron in an applied electric field. As was discussed in Section 13-4, accelerated motion takes place between collisions of the electron with the imperfections that are actually found in the lattice of a real material, due to thermal motion of the ions or to impurity ions. These collisions tend to randomize the electron motion, and they cause the over-all electron motion to be a drift with velocity proportional to the strength of the applied field, in contrast to an ever increasing velocity with acceleration proportional to the strength of the field. If there were no lattice imperfections, after a fixed field was applied the electron current would increase in time until it reached such large values that it was limited by practical considerations having nothing to do with either the strength of the field or the properties of the material. In such circumstances the material could be said to have zero resistance (or at least it could be said not to obey Ohm's law). So the presence of nonzero resistance, or noninfinite conductivity, is due to the presence of lattice imperfections. This can be seen in the fact that the resistance of a metal increases with increasing temperature and with increasing impurity concentration. Nevertheless, the value of 1/m*, which has to do with the properties of a perfect lattice, influences the value of the resistivity or conductivity because it influences the average velocity gain between randomizing collisions with imperfections, and this determines the drift velocity.

In situations where all the levels of an isolated band are filled except for those near the very top, it is convenient to think in terms of holes representing the absence of electrons in an otherwise completely filled band. Since the absence of a negatively charged electron is equivalent to the presence of a positive charge, holes behave as if they are positively charged. Furthermore, since the effective mass is negative for the levels near the top of a band, holes, describing the absence of negative effective mass, behave as if they have positive effective mass.
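The way the reciprocal effective mass follows the curvature of the band can be illustrated numerically. The sketch below applies (13-19b) by a finite difference to two dispersion relations: the free electron parabola, and a cosine-shaped band $\mathcal{E}(k) = (W/2)(1 - \cos ka)$. The cosine band is a hypothetical model chosen only because it has the qualitative S shape of the first zone; it is not the $\mathcal{E}(k)$ of the text. Units with $\hbar = m = a = 1$ and the function names are likewise assumptions of this example.

```python
import math


def free_electron_energy(k):
    """Free electron relation E = hbar^2 k^2 / 2m, in units hbar = m = 1."""
    return 0.5 * k * k


def model_band_energy(k, width=1.0, a=1.0):
    """Hypothetical cosine band of the given width (an assumed model)."""
    return 0.5 * width * (1.0 - math.cos(k * a))


def reciprocal_effective_mass(energy, k, hbar=1.0, dk=1.0e-4):
    """(13-19b): 1/m* = (1/hbar^2) d^2E/dk^2, by a central difference."""
    second = (energy(k + dk) - 2.0 * energy(k) + energy(k - dk)) / dk**2
    return second / hbar**2


print("free electron, k = 0.3:",
      round(reciprocal_effective_mass(free_electron_energy, 0.3), 4))
for fraction in (0.0, 0.25, 0.5, 0.75, 0.9):
    k = fraction * math.pi  # first zone boundary is at k = pi / a
    value = reciprocal_effective_mass(model_band_energy, k)
    print(f"model band, k = {fraction:4.2f} pi/a: 1/m* = {value:+.4f}")
```

For the model band, 1/m* is positive near the zone center, passes through zero in the middle of the zone, and is negative near the zone boundary, which is the behavior described for the first band.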
We shall have more to say about holes after we have explained briefly one of the most useful procedures for determining experimentally the behavior of electrons in solids.

13-8 ELECTRON-POSITRON ANNIHILATION IN SOLIDS

The interaction of positrons with electrons provides a technique used, with great success, to measure the momenta of electrons in solids. The positron was introduced in Section 2-7. These particles have the same mass and the same magnitude of charge as electrons but positrons are positively charged. In that section the process of pair production, in which a photon disappears and is replaced by an electron-positron pair, was described. Of interest for the measurement of electron momentum, however, is the reverse process, pair annihilation, in which an electron and a positron disappear and are replaced by photons.

In the usual experiment, high energy positrons, from radioactive sources, are directed toward a sample. Once inside, they quickly lose energy, via scattering and electronic excitation, to the particles of the material. They generally reach their lowest quantum state in about $10^{-12}$ sec or less, after penetrating into the sample a distance on the order of $10^{-4}$ m. Annihilation takes place well inside the material and, at annihilation, the momentum of the positron is nearly zero. The most likely result of the annihilation event is the appearance of two photons, traveling in nearly opposite directions, each with energy nearly equal to the electron rest mass energy (511 keV). Slight deviations of the photon momenta from the same straight line can be used to obtain information about the electron momentum distribution in the sample.

The geometry is illustrated in Figure 13-12, which shows an electron incident on a positron at rest and the emission directions of the resulting photons. For the analysis which follows, the direction of one of the photons is taken as the z axis and the x axis is taken to be in the plane of the photons.
Experimentally the z axis is determined by the position of one of the photon detectors.

Total relativistic energy is conserved in the annihilation process, so

$2m_0c^2 = cp_1 + cp_2$

where $m_0$ is the rest mass of the electron (or positron), $p_1$ is the magnitude of the momentum of one photon, and $p_2$ is the magnitude of the momentum of the other photon. The kinetic and potential energies of the electron are small compared to its rest mass energy $m_0c^2$ and are neglected. Momentum is also conserved during annihilation, so

$p\cos\varphi = p_1 - p_2\cos\theta$ and $p\sin\varphi = p_2\sin\theta$

where $p$ is the magnitude of the electron momentum and the angles $\varphi$ and $\theta$ are defined in the figure.

Figure 13-12 Top: An electron incident on a positron at rest. Bottom: The momenta of the resulting photons.

The momentum equations are solved for $p_1$ and $p_2$ and the results are substituted into the energy equation to yield

$2m_0c = p(\cos\varphi\sin\theta + \sin\varphi\cos\theta + \sin\varphi)/\sin\theta$

For all electrons in solids $p \ll m_0c$ and $\theta$ is extremely small, usually around $10^{-3}$ radians. So $\sin\theta$ can be approximated by $\theta$, in radians, and $\cos\theta$ can be approximated by 1. The first term in the parentheses is small compared to the other two and is neglected. When the last equation is solved for $\theta$, using these approximations, the result is

$\theta = \dfrac{p\sin\varphi}{m_0c}$

The angle $\theta$ is measured in what is called an angular correlation experiment and the result is used to calculate the x component of the electron momentum, $p\sin\varphi$. The difference in the photon energies is given by

$\Delta E = cp_1 - cp_2 = cp\cos\varphi$

and, if this quantity is measured, the result can be used to calculate the z component of the electron momentum, $p\cos\varphi$. This is rarely done, however, since much finer resolution can be obtained for angular measurements than for energy measurements.

Example 13-3. Positron annihilation takes place in a free electron gas with the same concentration (number per unit volume) as the conduction electrons in lithium.
Find the largest correlation angle $\theta$, defined in Figure 13-12.

■ Consideration of the figure will make it apparent that $\theta$ has the largest value when the annihilated electron has the largest possible momentum magnitude and one of the photons is emitted in a direction perpendicular to the electron momentum. Electrons with an energy equal to the Fermi energy $\mathcal{E}_F$ have the largest momentum magnitude. This is the Fermi momentum $p_F$, where $\mathcal{E}_F = p_F^2/2m_0$. Since the Fermi energy depends only on concentration, according to (11-57), and since the Fermi energy for lithium is 4.72 eV, we have

$p_F = \sqrt{2m_0\mathcal{E}_F} = (2 \times 9.11 \times 10^{-31}\ \text{kg} \times 4.72\ \text{eV} \times 1.60 \times 10^{-19}\ \text{joule/eV})^{1/2}$

or $p_F = 1.17 \times 10^{-24}$ kg-m/sec. The maximum angle is

$\theta_F = \dfrac{p_F}{m_0c} = \dfrac{1.17 \times 10^{-24}\ \text{kg-m/sec}}{9.11 \times 10^{-31}\ \text{kg} \times 3.00 \times 10^{8}\ \text{m/sec}}$

or $\theta_F = 4.29 \times 10^{-3}$ rad. •

Figure 13-13 Number of two-photon annihilation events as a function of correlation angle $\theta$ (horizontal axis in units of $10^{-3}$ rad) for a typical metal. The small angle portion is due to annihilation by conduction electrons while the large angle portion is due to annihilation by core electrons.

For a metal, a typical graph of the number of two-photon events as a function of the correlation angle $\theta$ is like that shown in Figure 13-13. The curve is proportional to the number of electrons in the sample with x component of momentum equal to $p\sin\varphi$. The central part of the curve is due to annihilation of conduction electrons. For an electron gas, this has a parabolic shape and the shape is not much different for conduction electrons in metals and semiconductors. By taking measurements with the sample in various orientations relative to the z direction, it is possible to construct the momentum distribution of the electrons.
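The arithmetic of Example 13-3 can be checked with a short script. This is only a sketch: the function names are hypothetical, the constants are the rounded values quoted in the example, and the relations used are the small-angle results $\theta = p\sin\varphi/m_0c$ and $\Delta E = cp\cos\varphi$ derived above.

```python
import math

# Rounded constants as quoted in the worked example (SI units).
M0 = 9.11e-31   # electron rest mass, kg
C = 3.00e8      # speed of light, m/sec
EV = 1.60e-19   # joules per eV


def fermi_momentum(fermi_energy_ev):
    """Nonrelativistic Fermi momentum p_F = sqrt(2 m0 E_F), in kg-m/sec."""
    return math.sqrt(2.0 * M0 * fermi_energy_ev * EV)


def correlation_angle(p, phi=math.pi / 2):
    """Small-angle correlation angle theta = p sin(phi) / (m0 c), in radians."""
    return p * math.sin(phi) / (M0 * C)


def photon_energy_difference_ev(p, phi=0.0):
    """Photon energy difference Delta E = c p cos(phi), converted to eV."""
    return C * p * math.cos(phi) / EV


p_f = fermi_momentum(4.72)            # lithium, Fermi energy 4.72 eV
theta_max = correlation_angle(p_f)    # photon emitted perpendicular to p
print(f"p_F       = {p_f:.3g} kg-m/sec")
print(f"theta_max = {theta_max:.3g} rad")
print(f"largest photon energy difference = {photon_energy_difference_ev(p_f):.0f} eV")
```

The last line gives a feeling for why the angular measurement is preferred: the largest energy difference for a Fermi-surface electron is only a few keV out of 511 keV per photon.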
If the central portion of the curve is extrapolated, the correlation angle for annihilation of an electron whose energy equals the Fermi energy can be found and, from this, its Fermi momentum can be calculated. The wings of the curve, which generally have a Gaussian shape, are due to annihilation of electrons in atomic cores, which have higher momenta. The situation here is more complicated than for conduction electrons because the positron is repelled by the positively charged atomic core and may acquire a high momentum itself before annihilation. The curve reflects the momenta of both electrons and positrons.

In most molecular solids, including a great many organic materials, in amorphous materials, and perhaps in ionic solids, some positrons become bound to electrons and form hydrogenlike "atoms," called positronium. There are two states of interest: a singlet state with the spins of the particles essentially antiparallel and a triplet state with the spins essentially parallel. Annihilation from the singlet state produces two photons, and the lifetime of positronium, in this state, is short, on the order of $10^{-10}$ sec. In contrast, two-photon annihilation from the triplet state violates conservation of angular momentum and, instead, three photons are usually produced. The lifetime of triplet positronium is about $10^{-7}$ sec in free space. Detection of both prompt and delayed photons, in different events, is a signal that positronium has been formed. An external magnetic field is sometimes used to change the spin orientations, and the change in the relative yield of prompt and delayed photons provides further verification of positronium formation. Positronium does not form in materials, such as metals, in which electron concentrations are high and the positron suffers many collisions during its lifetime.
In solids, the triplet state lifetime shortens to around $10^{-9}$ sec, not as short as the singlet state lifetime but shorter than the free-space triplet state lifetime. This decrease occurs because the positron, while still bound to an electron, is annihilated by another electron, outside the positronium "atom." The lifetime is dependent on the electron concentration at the site of the positronium and so a lifetime measurement provides information about the concentration. Positronium is generally trapped in large open spaces between molecules of the material, and the positron samples the electron concentration of such a region. Both the number of such regions and the electron concentration in them undergo changes when the material changes phase, and positronium lifetime measurements are used to study phase transitions in amorphous substances, such as glasses, and in organic crystals.

13-9 SEMICONDUCTORS

Semiconductors are of much interest because their behavior is the basis for many practical electronic devices, such as transistors. Also, they are excellent illustrations of the ideas discussed in previous sections. Semiconductors are covalent solids that may be regarded as "insulators" because the valence band is completely full and the conduction band is completely empty at the absolute zero of temperature, but they have an energy gap between the valence and conduction bands of no more than about 2 eV. For silicon the energy gap is 1.14 eV and for germanium the gap is 0.67 eV. Although the value of the Fermi distribution function governing the relative population of an energy state in the conduction band to an energy state in the valence band is small, since $kT \simeq 0.025$ eV at room temperature, the number of available states in the conduction band is high.
Hence the thermal excitation from the valence band into the conduction band occurs for a significant number of electrons, this number being the product of the number of electrons per quantum state and the number of quantum states per energy interval. Furthermore, the conductivity of a semiconductor increases rapidly with rising temperature, the number of excited electrons in silicon, for example, increasing by a factor of about one billion with a doubling of temperature from 300°K to 600°K. Since the valence band is filled at low temperature, with the four valence electrons of silicon or germanium forming covalent bonds, each electronic excitation into the conduction band leaves a hole in the valence band. These holes, acting as positive charge carriers, also contribute to the conductivity. In Figure 13-14 we illustrate the semiconductor band scheme. The conductivity of the semiconductors arising from thermal excitation is called intrinsic conductivity. There are other ways to enhance the conductivity, such as by photoexcitation. The energy gap in semiconductors is equivalent to the energy of photons in the red or infrared portion of the electromagnetic spectrum so that semiconductors are photoconductive. This contribution to the conductivity increases with the intensity of the light and will drop to zero when the light source is turned off and the normal thermal equilibrium distribution of electrons is restored. Still another way to increase the conductivity is by adding impurities to the semiconductor. That is, we replace some atoms of the semiconductor with atoms of another element, having about the same size but a different valence. The resulting conductivity, whose origin we explain presently, is called extrinsic conductivity, and the procedure is called doping. If a small quantity of arsenic is added to molten germanium, the arsenic impurities will crystallize with the germanium into its diamondlike structure. 
Arsenic has five electrons per atom in the valence band and germanium has four electrons per atom in the valence band. Hence, four of the arsenic electrons are used for covalent binding and the fifth electron is nearly free. It cannot go into the filled valence band and is very weakly bound in an "orbit" of very large radius around the singly charged arsenic ion. The arsenic ion Coulomb attraction is largely shielded by polarization of the intervening germanium atoms; that is, the field of the ion is weakened by the dielectric nature of the germanium crystal. Because this fifth electron has such a small binding energy to the arsenic, it can be ionized, and go into the conduction band at a much lower temperature than would be needed for electrons in the valence band. Hence, this excess electron will occupy some one of a set of discrete energy levels just below the conduction band at a low temperature, but it can very easily be thermally excited into that band. At ordinary temperatures all of these excess electrons go into the conduction band. The electrical conductivity can be controlled by the amount of arsenic used as an impurity. A significant effect is obtained with as little as one impurity atom per million semiconductor atoms. An impurity that contributes electrons is called a donor impurity and the resultant semiconductor is called an n-type (negative) semiconductor because it has an excess of free electrons.

Figure 13-14 The band scheme of a semiconductor in which the energy gap between the initially full valence band and the initially empty conduction band is small. Thermal excitation raises some electrons over the gap into the conduction band, leaving holes in the valence band.

Example 13-4.
Make a rough estimate of the binding energy of the donor electron of arsenic in a germanium crystal, taking the dielectric constant of the crystal to have the value $K = 16$, and the effective mass of the electron to have the value $m^* = 0.2m$.

■ The donor electron moves in the field of the arsenic ion, As$^+$, and it behaves like the electron in the ground state of a hydrogenlike atom. The chief difference is that this electron moves in a polarizable lattice rather than in vacuum. Because the potential energy of the ion-electron system is now $-e^2/K4\pi\epsilon_0 r$, the corresponding hydrogenlike energy levels are given by replacing $4\pi\epsilon_0$ by $K4\pi\epsilon_0$ in the hydrogen energy-level formula, (4-18), and also by replacing the electron mass $m$ found there by the effective mass $m^*$ to take into account the fact that the electron is actually in a crystal lattice. Since the electron is near a lower band edge where $d^2\mathcal{E}/dk^2$ is large, $m^*$ is small; various evidence indicates the value is $m^* \simeq 0.2m$. So we have

$\mathcal{E} = -\dfrac{m^*}{m}\,\dfrac{1}{K^2}\,\dfrac{me^4}{(4\pi\epsilon_0)^2\,2\hbar^2}\,\dfrac{1}{n^2}$

where $K = 16$, $m^* = 0.2m$, and $n = 1$. Since for $K = 1$ and $m^* = m$ the energy $\mathcal{E}$ has the value $-13.6$ eV, it is easy to show that

$\mathcal{E} \simeq -0.01$ eV

Hence, according to our estimate, the energy required to ionize the arsenic donor electron in a germanium crystal to the bottom of the conduction band is about 0.01 eV. The value obtained directly from measurements of the photon energy required to ionize, or indirectly from measurements of the temperature dependence of the conductivity, is 0.0127 eV. See Table 13-2 for measured values in other cases.

Note that the radius of the Bohr-like orbit of the donor electron is $Km/m^* = 80$ times that of the ground state hydrogen atom, as can be seen by inspecting (4-16). So the electron moves in an orbit that contains a large number of germanium atoms.
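The scaling used in Example 13-4 is easy to tabulate. The sketch below assumes only the hydrogenlike scaling described in the example (energy reduced by $(m^*/m)/K^2$, radius enlarged by $Km/m^*$); the function names are hypothetical, and 13.6 eV and 0.529 Å are the hydrogen ground-state values.

```python
RYDBERG_EV = 13.6       # hydrogen ground-state binding energy, eV
BOHR_RADIUS_A = 0.529   # hydrogen ground-state orbit radius, angstroms


def donor_binding_energy_ev(k, mass_ratio, n=1):
    """Hydrogenlike binding energy scaled by (m*/m) / K^2, in eV.

    k          -- dielectric constant K of the host crystal
    mass_ratio -- effective mass ratio m*/m of the donor electron
    """
    return RYDBERG_EV * mass_ratio / (k ** 2 * n ** 2)


def donor_orbit_radius_a(k, mass_ratio, n=1):
    """Bohr-like orbit radius scaled by K m / m*, in angstroms."""
    return BOHR_RADIUS_A * k * n ** 2 / mass_ratio


# Arsenic donor in germanium with the values assumed in the example.
energy = donor_binding_energy_ev(k=16, mass_ratio=0.2)
radius = donor_orbit_radius_a(k=16, mass_ratio=0.2)
print(f"binding energy ~ {energy:.4f} eV")   # about 0.01 eV
print(f"orbit radius   ~ {radius:.1f} A")    # 80 times 0.529 A
```

With $K = 1$ and $m^*/m = 1$ the same functions return the hydrogen values, which is a quick consistency check on the scaling.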
This justifies the use, in our previous estimate, of the dielectric constant, which is a macroscopic rather than a microscopic quantity that characterizes the germanium crystal when it is regarded as a continuum. •

If a small quantity of gallium is added to germanium, the situation will be different from that just discussed. Gallium has three electrons per atom in the valence band, so that it has a deficiency of one electron per atom in forming the covalent bonds. The result is a hole, which can drift through the crystal, behaving like a positive charge and mass as successive electrons fill one hole and create another. From an energy point of view, this impurity introduces vacant discrete levels slightly above the top of the valence band. Valence electrons are then easily excited into these impurity levels, which can accept them, leaving holes in the valence band. The energy separation between the acceptor levels and the top of the valence band is small for the same reasons that give a small separation between the donor levels and the bottom of the conduction band: a high dielectric constant and a small effective mass. An impurity that is deficient in electrons is called an acceptor impurity and the resultant semiconductor is called a p-type (positive) semiconductor. Whether the conductivity of a semiconductor is p-type or n-type can be determined by the Hall effect. In Figure 13-15 we show schematically the energy-level diagram corresponding to each type.

Table 13-2 Donor and Acceptor Ionization Energies

Donor impurity       $\mathcal{E}_c - \mathcal{E}_{donor}$ (eV)
                     In Germanium    In Silicon
Arsenic              0.0127          0.049
Antimony             0.0096          0.039

Acceptor impurity    $\mathcal{E}_{acceptor} - \mathcal{E}_v$ (eV)
                     In Germanium    In Silicon
Gallium              0.0108          0.065
Indium               0.0112          0.16

Figure 13-15 Left: Schematic energy-level diagram of a germanium crystal containing donor impurity atoms (As in Ge). Right: Containing acceptor impurity atoms (Ga in Ge). The diagram is labeled with a gap of 0.67 eV between the valence band and the conduction band, and with the donor impurity levels 0.013 eV below the conduction band.
The localized energy levels of impurity atoms are not broadened into bands because these atoms are many lattice spacings apart and interact with each other very weakly. In Table 13-2 we list the energy of the levels introduced into germanium and silicon crystals by small amounts of common impurities. For donor impurities the energy from donor levels to the energy $\mathcal{E}_c$ at the bottom of the conduction band is given, whereas for acceptor impurities the energy from the top of the valence band $\mathcal{E}_v$ to the acceptor levels is given. Note that these energies are comparable to $kT = 0.025$ eV at room temperature. Therefore, we can expect to have plenty of thermal ionization at room temperature.

In an intrinsic semiconductor the number of vacant states in the valence band is equal to the number of occupied states in the conduction band, so that the Fermi energy is located somewhere in the gap between the bands. If the densities of states in the two bands are symmetrical then the Fermi energy will be in the middle of the gap. The Fermi energy, as the student will recall, is defined as the energy for which the average number of electrons that would occupy a quantum state there is 0.5, where we treat electron spin in such a way that the maximum occupancy is 1.0.

Example 13-5. Consider a forbidden band of width $\mathcal{E}_g$ that separates a valence band and a symmetrical empty conduction band in an intrinsic semiconductor. Show that the Fermi energy lies at the center of the forbidden band, i.e., that $\mathcal{E}_F = \mathcal{E}_g/2$ if $\mathcal{E} = 0$ is taken to be the upper edge of the valence band.

■ The proof can be followed by inspecting Figure 13-16. At the top of the figure we plot $N(\mathcal{E})$, the number of quantum states per unit energy interval, for the upper part of the valence band and the lower part of the conduction band. The figure tentatively places the Fermi energy $\mathcal{E}_F$ in the center of the gap of width $\mathcal{E}_g$ between the two bands.
The density of states $N(\mathcal{E})$ is drawn so that its descending behavior moving towards the top of the valence band is symmetrical to its ascending behavior moving away from the bottom of the conduction band. This is in qualitative agreement with the general behavior of $N(\mathcal{E})$ throughout an entire isolated band (see, for example, Figure 13-5).

Figure 13-16 The number of electrons as a function of energy in the valence and conduction bands of an insulator or semiconductor with a forbidden band width $\mathcal{E}_g$, as a product of the density of states $N(\mathcal{E})$ and the Fermi distribution $n(\mathcal{E})$.

In the middle of Figure 13-16, we show the Fermi distribution $n(\mathcal{E})$, which is the probable number of electrons per state. For clarity, it is constructed for an operating temperature where $kT$ is comparable to $\mathcal{E}_g$. It is also constructed for $\mathcal{E}_F$ in the center of the forbidden band. The solid curve in the bottom of Figure 13-16 shows the product $n(\mathcal{E})N(\mathcal{E})$, which gives the number of electrons per unit energy in various states at the temperature just mentioned. The dashed curve shows the same thing for a temperature of absolute zero. At $T = 0$, the valence states are completely filled and the conduction states are completely empty, so the dashed curve in the valence region is just $N(\mathcal{E})$, while it is the $\mathcal{E}$ axis in the conduction region. The area A between the dashed and solid curves is proportional to the number of valence states that electrons leave when the temperature is raised; that is, it is a measure of the number of holes created. The area B between the solid and dashed curves is proportional to the number of electrons that are promoted to states in the conduction band at the operating temperature. In an intrinsic semiconductor it is necessary that area A equal area B, since the density of holes in the valence band equals the density of electrons in the conduction band.
It is apparent that this condition is satisfied by Figure 13-16, because we have constructed it with $\mathcal{E}_F$ in the center of the forbidden band. A moment's consideration will show the student that it would not be satisfied for a different choice of $\mathcal{E}_F$, due to the symmetry of $n(\mathcal{E})$ about $\mathcal{E}_F$, and to the (approximate) symmetry of $N(\mathcal{E})$ about the center of the gap between the two allowed bands. •

Example 13-6. Make an estimate of the relative number of electrons in the conduction band of an insulator or semiconductor at temperature T.

■ Figure 13-16 also shows an exaggerated picture of the energy distribution of electrons as a product of the density of states $N(\mathcal{E})$ and the Fermi distribution $n(\mathcal{E})$ appropriate in the valence, forbidden, and the conduction bands of an insulator. If, in the Fermi distribution $n(\mathcal{E})$, we have $\mathcal{E} - \mathcal{E}_F \gg kT$, then

$n(\mathcal{E}) = \dfrac{1}{e^{(\mathcal{E}-\mathcal{E}_F)/kT} + 1} \simeq \dfrac{1}{e^{(\mathcal{E}-\mathcal{E}_F)/kT}} = e^{-(\mathcal{E}-\mathcal{E}_F)/kT}$

so that in such an energy range the Fermi distribution varies with energy like the Boltzmann distribution. We know from Example 13-5 that $\mathcal{E} - \mathcal{E}_F = \mathcal{E}_g/2$ at the bottom of the conduction band in an insulator, if we measure $\mathcal{E}$ from the top of the valence band. So the condition $\mathcal{E} - \mathcal{E}_F \gg kT$ is met since $\mathcal{E}_g \gg kT$ for an insulator. Thus we can take

$n(\mathcal{E}) = \dfrac{1}{e^{\mathcal{E}_g/2kT}} = e^{-\mathcal{E}_g/2kT}$

as the number of electrons per state in the conduction band of an insulator. The Fermi distribution falls in value by an order of magnitude in an energy range of about $\Delta\mathcal{E} = 2kT$ so that we get a good estimate of $\Delta\mathcal{N}$, the number of conduction electrons, by evaluating those in the range $2kT$ above the bottom of the conduction band. Since

$\Delta\mathcal{N} = n(\mathcal{E})N(\mathcal{E})\,\Delta\mathcal{E}$

we must now evaluate $N(\mathcal{E})$, the density of states. Because $N(\mathcal{E})$ starts at zero at the bottom of the conduction band, a good average value over the range $\Delta\mathcal{E} = 2kT$ is obtained by evaluating $N(\mathcal{E})$ at $\mathcal{E} = kT$ (measured from the bottom of the conduction band). Hence

$\Delta\mathcal{N} = n(\mathcal{E})N(\mathcal{E})\,\Delta\mathcal{E} = e^{-\mathcal{E}_g/2kT}\,N(kT)\,2kT$

Let us use here the result $\mathcal{N} = (2/3)\mathcal{E}_F N(\mathcal{E}_F)$ of Example 13-2 for a metal as an estimate of the total number of electrons, $\mathcal{N}$.
We also note from (13-4) that $N(kT)/N(\mathcal{E}_F) = (kT/\mathcal{E}_F)^{1/2}$, so we have

$\dfrac{\Delta\mathcal{N}}{\mathcal{N}} = \dfrac{e^{-\mathcal{E}_g/2kT}\,N(kT)\,2kT}{(2/3)\mathcal{E}_F N(\mathcal{E}_F)} = 3e^{-\mathcal{E}_g/2kT}\left(\dfrac{kT}{\mathcal{E}_F}\right)\left(\dfrac{kT}{\mathcal{E}_F}\right)^{1/2}$

or

$\dfrac{\Delta\mathcal{N}}{\mathcal{N}} = 3\left(\dfrac{kT}{\mathcal{E}_F}\right)^{3/2} e^{-\mathcal{E}_g/2kT}$

This is the relative number of conduction electrons for an insulator. This fraction is much smaller than the corresponding result $kT/\mathcal{E}_F$ of Example 13-2 for a metal, partly because the density of states $N(\mathcal{E})$ is smaller near the bottom of the conduction band in an insulator than at the Fermi energy in a metal, but principally because of the occupation factor $e^{-\mathcal{E}_g/2kT}$. Let us take $\mathcal{E}_g = 6$ eV as the gap in a typical insulator so that at room temperature this factor is $e^{-\mathcal{E}_g/2kT} = e^{-150} \simeq 10^{-65}$. Not only is the fraction $\Delta\mathcal{N}/\mathcal{N}$ insignificant, but the absolute number of conduction electrons is also negligible for an insulator. If, however, $\mathcal{E}_g = 1$ eV, as for a semiconductor, then although $e^{-\mathcal{E}_g/2kT} = e^{-25} \simeq 10^{-11}$ gives a very small fraction, the number of conduction electrons is no longer insignificant. •

In an impurity semiconductor containing donors, the Fermi energy lies above the middle of the forbidden band because there are more electrons in the conduction band than there are holes in the valence band. In an impurity semiconductor containing acceptors the Fermi energy is below the middle of the forbidden band because there are fewer electrons in the conduction band than there are holes in the valence band.

It is instructive to consider the combined effect of temperature and impurities on the Fermi energy. Let us begin at a temperature of absolute zero in an n-type semiconductor. The donor levels are all occupied but there are no electrons in the conduction band. The Fermi energy then must lie between the donor levels and the bottom of the conduction band, because the number of electrons per state $n(\mathcal{E})$ is one up to and including the donor levels and zero in the conduction band. Now, as the temperature is increased electrons are raised from donor levels to the conduction band.
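The Fermi distribution that underlies both Example 13-6 and this discussion of the donor levels can be evaluated directly. The sketch below uses hypothetical function names and assumes $kT = 0.02$ eV, chosen only because it reproduces the exponents 150 and 25 quoted in the example; it shows that the occupancy is exactly one half at the Fermi energy and that the Boltzmann form is an excellent approximation at the bottom of the conduction band.

```python
import math


def fermi(energy, fermi_energy, kt):
    """Fermi distribution: probable number of electrons per state."""
    return 1.0 / (math.exp((energy - fermi_energy) / kt) + 1.0)


def boltzmann_tail(energy, fermi_energy, kt):
    """Approximation to the Fermi distribution for energy - fermi_energy >> kT."""
    return math.exp(-(energy - fermi_energy) / kt)


KT = 0.02  # eV; assumed value that matches the exponents quoted in the example

# Occupancy is one half for a state located exactly at the Fermi energy.
print(fermi(0.5, 0.5, KT))

# Occupation factor at the bottom of the conduction band of an intrinsic
# material, where energy - fermi_energy equals half the gap.
for gap_ev in (6.0, 1.0):
    exact = fermi(gap_ev / 2.0, 0.0, KT)
    approx = boltzmann_tail(gap_ev / 2.0, 0.0, KT)
    print(f"gap = {gap_ev} eV: exponent = {gap_ev / (2 * KT):.0f}, "
          f"n = {exact:.2e}, Boltzmann form = {approx:.2e}")
```

Changing `KT` to 0.025 eV lowers the exponents to 120 and 20, which illustrates how sensitive the occupation factor is to the assumed temperature.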
At that temperature at which half the donor states are empty, the Fermi energy corresponds to the donor-level energy. With a further increase in temperature, electrons in the valence band are excited and the Fermi energy drops more. When the number of electrons from the valence band is a very large fraction of those in the conduction band, the semiconductor acts as though it were intrinsic and the Fermi energy drops to nearly the center of the gap. If we had started with a p-type semiconductor we would find in a similar manner that, as the temperature is raised, the Fermi energy moves from between the top of the valence band and the acceptor levels, at absolute zero, to the center of the gap at high temperatures. At low temperatures, where $kT \ll \mathcal{E}_g$, conduction is due mostly to the impurities because there is little excitation of valence electrons. At high temperatures the impurity levels have been used up, that is, they have either donated or accepted electrons, so that the semiconductor acts as though it were intrinsic. In Figure 13-17 we plot the Fermi energy as a function of temperature for impurity semiconductors.

Figure 13-17 Left: The Fermi energy as a function of temperature for n-type semiconductors of two different impurity concentrations. Right: For p-type semiconductors of two different impurity concentrations. (The curves are labeled for high and low concentration, and the level $\mathcal{E}_g/2$ is marked between the valence band and the conduction band.)

13-10 SEMICONDUCTOR DEVICES

We shall illustrate the use of impurity semiconductors in electronics by discussing briefly the operation of three semiconductor devices, the rectifier, the transistor, and the tunnel diode.

A rectifier is formed by having acceptor impurities (p-type) in one region of a crystal and donor impurities (n-type) in another region.
The boundary between these regions is called a p-n junction. Figure 13-18 shows the energy band structure of an unbiased p-n junction at room temperature. The boundaries of the bands must be warped in going from the p-region through the junction to the n-region because the Fermi energy is close to the top of the valence band in the p-region and close to the bottom of the conduction band in the n-region, yet the Fermi energy must have the same value everywhere. The reason is that if the Fermi energy were not the same in both regions the energy of the system would not be minimized. It could be reduced by electrons in one region flowing to unoccupied states of lower energy in the other region, and so the system would not be in equilibrium. Actually, considerable electron flow did take place in establishing equilibrium when the p-region was initially put into contact with the n-region. This led to an accumulation of electrons on the p-side of the junction, and a deficiency of electrons, or accumulation of holes, on the n-side of the junction. Thus the junction region has similarities to a plane parallel condenser with a negative charge on the p-side and a positive charge on the n-side, as shown in the figure. If an electron is moved through the electric field produced within this dipole layer, its energy will increase in going from the n-side to the p-side. This is reflected in the way the energy levels at the top of the valence band, and at the bottom of the conduction band, are displaced upward in going through the junction region.

Figure 13-18 Electron energy-level diagram for an unbiased p-n junction. (Labels in the figure: electron energy; valence band; end of rod; p-region, junction, n-region; thermal and recombination currents; unbiased electron current, reverse biased current, forward biased current.)
Even after equilibrium is established there is still a flow of electrons back and forth through the junction. For one thing, from time to time thermal excitation causes an electron to jump up to the conduction band of the p-region (leaving yet another hole in its valence band). The electron can move freely to the junction region, and then be accelerated by the potential hill it sees there into the n-region, constituting part of what is called the thermal current. Also, an electron in the conduction band of the n-region with energy slightly below the bottom of the conduction band in the p-region can gain a little extra energy in a fluctuation and be able to move into the p-region. There it may recombine with one of the many holes in the p-region. That electron is part of the so-called recombination current. There must be such a current because in equilibrium the thermal current must be balanced so that there is no net current across the junction.

Now consider an external voltage source applied across the ends of the device, with negative voltage applied to the p-region and positive voltage applied to the n-region. This will increase the energy of all the electrons in the p-region, and decrease the energy of all of those in the n-region, thereby increasing the height of the potential hill between the two regions. Since the junction region was already depleted of charge carriers, its resistance is relatively high and most of the voltage drop due to the applied voltage appears across the junction. As the amount of thermal current depends on the temperature and the width of the gap between the valence and conduction bands, neither of which is changed by applying the voltage, the thermal current will not change.
The recombination current will be decreased by a large factor, however, because the potential hill is higher, so now only the far fewer electrons farther out in the exponentially decreasing tail of the Fermi distribution in the n-region conduction band have a chance to move into the p-region conduction band. The net effect will be a small flow of electrons in the direction from the p- to the n-region, due to the unbalanced thermal current. This flow of electron current is, of course, in the direction that the applied voltage would be expected to produce. It is the small reverse bias current indicated by the arrows at the bottom of Figure 13-18.

The junction rectifier is given a forward bias by applying a positive voltage to the p-region and a negative voltage to the n-region. This decreases the height of the electron energy hill between the two regions. Again, there is no appreciable effect on the thermal current, but the recombination current is increased by a large factor. All of a sudden, the very many more electrons that are closer to $\mathcal{E}_F$ in the Fermi distribution of the n-region have enough energy to pass through the junction into the p-region conduction band, because the bottom of that conduction band has moved down in energy. These electrons do not instantaneously respond to the application of a forward bias, but instead they diffuse into the p-region in much the same way that the molecules of a gas would diffuse into a region of lower density that suddenly became accessible to them. The net electron current in a forward biased rectifier flows in the direction of the recombination electron current, as indicated at the bottom of Figure 13-18.

The junction is a rectifier because the magnitude of the forward bias current is much larger than the magnitude of the reverse bias current, for a given magnitude of bias voltage.
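The balance between the two currents can be made quantitative with a simple model. The sketch below is an illustration under stated assumptions, not a formula given in the text: it takes the thermal current to be independent of the bias and the recombination current to scale with the Boltzmann factor $e^{eV/kT}$ of the change in hill height, which is the standard ideal-junction idealization. The names `junction_current`, `i_thermal`, and `kt_ev` are hypothetical.

```python
import math


def junction_current(v_bias, i_thermal=1.0, kt_ev=0.025):
    """Net current through an idealized p-n junction.

    v_bias    -- applied voltage in volts, positive for forward bias
    i_thermal -- bias-independent thermal current (arbitrary units)
    kt_ev     -- kT in eV (about 0.025 eV at room temperature)

    Assumption: the recombination current equals the thermal current at
    zero bias and scales as exp(eV/kT) when the hill height changes by eV.
    """
    recombination = i_thermal * math.exp(v_bias / kt_ev)
    return recombination - i_thermal


for v in (-0.25, -0.05, 0.0, 0.05, 0.25):
    print(f"V = {v:+.2f} volts: net current = {junction_current(v):+.4g} "
          "(units of the thermal current)")
```

In this model the reverse current saturates at the thermal current while the forward current grows exponentially, so at a bias of 0.25 volts the two already differ by about four orders of magnitude.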
The forward bias current is so much larger because the reverse bias current is limited by the small value of the thermal current, whereas the forward bias current becomes very large as the height of the electron hill is made small by increasing the forward bias. Resistance to current flow in reverse bias is typically greater than resistance in forward bias by four or five orders of magnitude. Note that our explanation has been phrased in terms of electron flow. It could as well have been in terms of hole flow; both processes occur, and they result in the same rectifying properties of the junction.

A semiconductor rectifier has many advantages over a diode vacuum tube rectifier, including longer life and much smaller size. Like the diode, the p-n junction is a non-Ohmic element, the current-voltage relation being nonlinear, as shown in Figure 13-19. Unlike a vacuum tube, however, there is no need for a power-consuming filament in the semiconductor device so that its efficiency is greater.

A transistor can be regarded as a combination of two semiconductor rectifying junctions, such as a p-n-p or n-p-n combination. In Figure 13-20 we display a circuit that exhibits transistor behavior. The n-p-n regions are called emitter, base, and collector, respectively. The emitter-base connection is biased in the forward direction, so that the resistance to current flow is small in this part of the circuit. The base-collector connection is biased in the reverse direction, so that ordinarily there is higher resistance to current flow in that part of the circuit. However, when a voltage is applied in the emitter circuit so that a current is established there, the electrons arriving in the base region (which is very thin and of lower conductivity than the emitter) are attracted by the potential difference between the base and the collector. Hence, there will be a current in the collector circuit.
(Because the emitter has a higher conductivity than the base, most of the current across the emitter-base junction is carried by electrons moving from the emitter to the base, instead of holes moving from the base to the emitter.)

The basic idea of transistor action is that a current in the emitter circuit controls a current in the collector circuit. More than 90% of the current through the emitter passes through the collector, so that the currents are of similar magnitudes. But the voltage across the base-collector connection can be very much greater than that across the emitter-base connection, because the former is reverse biased, so the power output in the collector circuit can be very much larger than the power input in the emitter circuit. Hence, the transistor acts as a power amplifier. Characteristic current versus voltage curves are shown in Figure 13-20. Other circuit connections make transistors useful as current amplification or voltage amplification devices, as well.

Figure 13-19 Left: A circuit in which the voltage across a p-n junction can be varied. The voltage is taken as positive when the p-side is at higher potential. Right: Current through the junction as a function of the applied voltage. Note that very different scales are used for the forward- and reverse-biased portions of the curve. (Silicon rectifier type 1N256.)

Figure 13-20 Left: A circuit in which an n-p-n transistor acts as a power amplifier. Electrons flow in the direction shown by the arrow, from emitter to collector. Right: Characteristic curves for a transistor acting as a power amplifier, showing current (mA) versus $V_{CB}$ (volts) for emitter currents $I_E$ from 0 to 5 mA. (Silicon n-p-n transistor type 2N3646.)

A tunnel diode is a semiconductor device that makes use of the phenomenon of potential barrier penetration discussed in Section 6-5.
It is like a p-n junction made from semiconductors with very high impurity concentration. Figure 13-21a plots the electron energy across an unbiased junction. The bands are similar to those shown in Figure 13-18, except that (1) with a higher impurity concentration the junction is narrower, since a smaller length of semiconductor contains enough charge carriers to produce the required dipole layer across the junction, and (2) the donor and acceptor levels, in the n-type and p-type material, are no longer sharp but become broad bands which overlap the valence and conduction bands, since the donors, and also the acceptors, are so closely spaced that they interact. The Fermi energy thus moves up into the conduction band on the n-side and down into the valence band on the p-side. Because the junction is narrow ($\sim 10^{-8}$ m), electrons can pass through the forbidden band at the junction by a process that is in every respect the same as barrier penetration. For instance, the eigenfunction describing an electron tunneling through the forbidden band has the same exponential form as the eigenfunction for an electron tunneling through a barrier. At equilibrium, as shown in Figure 13-21a, the rate of electron tunneling through the barrier is the same in both directions. If now a small external voltage is applied across the ends of the rod with forward bias, electron tunneling from the n-side to the p-side is increased because there are empty allowed energy states in the p-side valence band, whereas electron tunneling in the other direction is decreased. Hence, there is a net current flow through the junction as shown in Figure 13-21b. As the applied voltage continues to be increased, the net current begins to decrease because the number of empty states available for electron tunneling decreases. In Figure 13-21c the net current is reduced almost to zero because electrons in the n-type material find no allowed energy states into which to flow.
With still higher applied voltage the electron current becomes that characteristic of a normal p-n junction. That is, electrons flow through the junction, without tunneling, into allowed energy states in the conduction band of the p-type material. This happens because the difference in the energies of the bands decreases, making it possible for electrons to diffuse through the junction into the conduction band of the p-region. This process is indicated in Figure 13-21d.

Figure 13-22 shows the current-voltage curve characteristic of a typical tunnel diode. The letters labeling points on the curve correspond to the four applied voltages of the previous figure. In the region between points b and c, the slope of the curve, $dI/dV$, is negative and the tunnel diode has a negative resistance, the current decreasing with increasing applied voltage. This feature makes it particularly useful in the switching circuits of computers.

The greatest advantage of the tunnel diode is its very fast response time when operating in the region a to c. The current flow in other kinds of semiconductor diodes and transistors always depends on the diffusion process. Since the rate of diffusion can change only as fast as the charge carrier distribution can be changed, these devices have relatively slow response (slower than vacuum tubes) and it is difficult to use them at high frequencies. But the rate of tunneling can change as fast as the energy bands can be changed by the applied voltage, and this is a much less serious limitation. Tunnel diodes have been used as oscillators at frequencies above $10^{11}$ Hz, and in switching circuits that operate in times less than $10^{-9}$ sec.

Figure 13-21 Electron energy-level diagrams for the n-type, junction, and p-type regions of a tunnel diode. In (a) the diode is unbiased. In (b) a small voltage is applied between the ends of the device, with the p-type end positive. In (c) and (d) the voltage is increased progressively. The arrows indicate the flow of electrons across the junction between the two regions.

Figure 13-22 The current flowing through a tunnel diode as a function of the applied potential difference $V$ (volts). The points labeled by letters correspond to the four applied voltages of Figure 13-21. Note that the resistance of the diode is negative for applied voltages between b and c. The dashed line indicates the characteristic current were no tunneling to take place, namely that for an ordinary germanium junction rectifier. (Germanium tunnel diode type 1N2940A.)

QUESTIONS
1. In the text the solid state is contrasted with the gaseous state in terms of atomic (or molecular) interactions. How would you characterize the liquid state in this regard?
2. Explain the statement that the exclusion principle prevents solids from collapsing to zero volume.
3. Is there an analogy between the splitting of an energy level as two atoms are brought together to form a molecule and the splitting of the resonant frequencies as two resonant electrical circuits are coupled? Why?
4. It is often said that a crystal is one giant molecule. Explain. Can we regard a diatomic molecule as a small solid?
5. Why does metallic binding usually occur with atoms having a small number of valence electrons?
6. Why is it, considering the very similar electronic structures, that lithium is a metal whereas hydrogen is a molecular solid?
7. Explain why metallic binding leads to a close-packed arrangement of atoms; i.e., explain why the lowest energy in metallic binding corresponds to the greatest number density of atoms.
8. Why are metallic solids mostly opaque, covalent solids sometimes opaque, and ionic solids hardly ever opaque to visible radiation?
9. Of the four types of binding in solids discussed in the text, which one (or ones) is most likely to produce an insulator? A conductor? A semiconductor?
10. Justify the statement that (13-1a) meets the criterion that a material obeys Ohm's law.
11. What mechanisms account for the ordinary electrical resistivity of metals? Which are temperature dependent?
12. How do electrons contribute to thermal conductivity? Are they better than lattice vibrations as carriers of heat energy?
13. Explain why the electrical conductivity of materials varies over a factor of $10^{24}$ whereas the thermal conductivity of materials only varies over a factor of about $10^{8}$.
14. Explain why we regard the sequential filling of holes by electrons as equivalent to a positive current. Could this process be regarded instead as an electron current?
15. How is the result of Example 13-2, concerning the fraction of conduction electrons that is thermally excited, related to the specific heats of metals at high temperatures?
16. Example 13-2 implies that only $\Delta\mathcal{N}/\mathcal{N}$ of the free electrons take part in the conduction of electricity, whereas certain other experiments, such as the Hall effect, indicate that all $\mathcal{N}$ electrons take part. Explain.
17. Explain why a negative effective mass does not lead to a violation of Newton's law of motion.
18. What techniques, other than electron-positron annihilation, might be useful in measuring the momenta of electrons in solids?
19. How is the optical transparency of a semiconductor related to the energy gap of the forbidden band?
20. What elements other than arsenic and antimony can be used as an impurity with germanium to form an n-type semiconductor? What elements other than gallium and indium can be used to form a p-type semiconductor?
21. Could the conductivity of a semiconductor be affected by electron bombardment? By bombardment by other particles?
22. What effect does an applied electric field have on an insulator?
23.
Experimentally the addition of impurities to metals increases their resistivity, but the addition of impurities to semiconductors decreases their resistivity. Explain. Many insulators, however, are not very pure. Why do impurities not affect the resistivity of insulators?
24. Name the properties of solids that are little affected by the presence of small concentrations of chemical impurities. Name the properties of solids that are greatly affected by the presence of small concentrations of chemical impurities.
25. Give an argument, similar to that given in the text for an n-type semiconductor, explaining the variation of $\mathcal{E}_F$ with $T$ in a p-type semiconductor.
26. Explain why the curves of Fermi energy as a function of temperature differ for different impurity concentrations, as shown in Figure 13-17.
27. Explain why the junction transition region is narrower in a semiconductor diode when the doping is heavy than it is when the doping is light.
28. Rephrase the discussion of the operation of a p-n rectifying junction in terms of hole flow.

PROBLEMS
1. In Figure 13-23 we illustrate schematically four charge density distributions for valence electrons as functions of the location of atoms, ions, or molecules (shown as dots at the bottom). For each distribution (a), (b), (c), (d), state to which type of binding in solids it most closely corresponds.
2. Each element of the row of the periodic table from lithium through neon has a solid form (some at very low temperatures). Solids can also be formed by certain compounds of two elements of this row. For all of these solids, describe the binding and state whether the solid is a metal, a semiconductor, or an insulator.
3. Describe the binding of solids formed by single elements of the column of the periodic table from carbon through lead, and state whether the solid is a metal, a semiconductor, or an insulator.
4. Determine the type of binding in each of the solids described here. (a) Reflects light in the visible; electrical resistivity increases with temperature; melting point below 1000°C. (b) Reflects light in the visible; electrical resistivity decreases with increasing temperature; melting point above 1000°C. (c) Transmits light in the visible; conducts electricity only at high temperatures. (d) Transmits light in the visible; does not conduct electricity at any temperature. (e) Transmits light in the visible; very low melting point.

Figure 13-23 Charge densities for valence electrons in four solids considered in Problem 1.

5. The field $\mathbf{E}$ produced at a point $\mathbf{r}$ by an electric dipole $\mathbf{p}$ is given by
$$\mathbf{E} = -\frac{1}{4\pi\epsilon_0}\left(\frac{\mathbf{p}}{r^3} - \frac{3(\mathbf{r}\cdot\mathbf{p})\,\mathbf{r}}{r^5}\right)$$
where the dipole is located at the origin of coordinates. (a) A molecule with an electric dipole moment $\mathbf{p}$ will induce an electric dipole moment $\mathbf{p}'$ in a nearby molecule, where $\mathbf{p}' = \alpha\mathbf{E}$, $\alpha$ being the polarizability of the nearby molecule. Show that the mutual potential energy of the interacting dipoles is
$$V = -\mathbf{p}'\cdot\mathbf{E} = -\frac{\alpha}{(4\pi\epsilon_0)^2}\,(1 + 3\cos^2\theta)\,\frac{p^2}{r^6}$$
where $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{p}$. (b) Show the force is attractive and varies as $r^{-7}$.
6. Find the order of magnitude of the electric field needed in ionic solids to free electrons from the filled shells of ions. (Hint: Consider the binding energy of an electron and the approximate dimensions of an ion.)
7. Find the region of the electromagnetic spectrum at which crystals of Si, Ge, CdS, KCl, and Cu become opaque. The band gap energies $\mathcal{E}_g$ are Si = 1.14 eV; Ge = 0.67 eV; CdS = 2.42 eV; KCl = 7.6 eV; Cu = 0 eV.
8. (a) Using classical physics show that the resistivity of a metal near room temperature is proportional to the 3/2 power of the absolute temperature, in disagreement with the linear temperature dependence experimentally observed. (Hint: Show that $\bar{v} \propto T^{1/2}$ and $\lambda \propto T^{-1}$.) (b) How does the application of the ideas of quantum mechanics and quantum statistics yield the proper temperature dependence of the resistivity?
9. Compare the values of (a) the drift velocity, (b) the thermal velocity, and (c) the velocity corresponding to the Fermi energy, or Fermi velocity, for electrons in copper at room temperature. (Hint: Use Table 11-2. A current of 5 amp can easily be carried in a copper wire 0.1 cm in diameter.)
10. Show that, according to the free-electron model, the resistance $R$ of a length $L$ of wire is given by
$$R = \frac{mL}{nAe^2\tau}$$
where $A$ is the cross-sectional area of the wire and $\tau$ is the mean time between collisions.
11. An aluminum wire has a resistance of 0.01 ohm and a diameter of 0.83 mm; the mean collision time is $2.0 \times 10^{-12}$ sec. (a) If the effective electron mass is $0.97m$, find the length of the wire. (b) Find the mean free path for an electron having the Fermi energy. Use data from Table 13-1.
12. Calculate the number of electrons per atom of aluminum that conduct electricity from the value, $-0.3 \times 10^{-10}$ m$^3$/coul, of the Hall coefficient. The density of aluminum is $2.7 \times 10^{3}$ kg/m$^3$. What does the result suggest about the band structure of aluminum?
13. (a) Show that the Hall coefficient for a semiconductor in which there is conduction by both holes and electrons is given by $(p\mu_p^2 - n\mu_n^2)/e(p\mu_p + n\mu_n)^2$. (b) If in a certain semiconductor there is no Hall effect, what fraction of the current is carried by holes?
14. Copper is a monovalent metal with a density of 8 g/cm$^3$ and an atomic weight of 64. (a) Calculate the Fermi energy in electron volts at 0°K. (b) Estimate the width of the conduction band.
15. (a) Calculate the Fermi energy of an alloy of 10% zinc (which is divalent) in copper, assuming that the alloy has the same atomic spacing and structure as Cu. (b) How does the width of the conduction band of the alloy compare to that of copper? The assumption used in (a) is not strictly accurate.
16. Make an estimate of the width of a conduction band in a metal whose internuclear spacing has the typical value $3.5 \times 10^{-10}$ m.
17. The Fermi temperature is defined by $T_F = \mathcal{E}_F/k$. (a) Using Table 11-2, calculate the Fermi temperature for sodium. (b) What does this tell us about the applicability of classical considerations to metals near room temperature? (c) What does this tell us about the density of conduction electrons in a metal at room temperature?
18. The Fermi energy of lithium is 4.72 eV. (a) Calculate the Fermi velocity. (b) Calculate the de Broglie wavelength of an electron moving at the Fermi velocity and compare it to the interatomic spacing.
19. The Fermi energy for lithium is 4.72 eV at $T = 0$°K. Find the density of states at 3.0 eV.
20. Calculate an approximate ratio of the electronic specific heat to the lattice specific heat of lithium at room temperature. (Hint: Use the results of Example 13-2, and justify this use.)
21. (a) Show that the effect of a lattice periodicity $a$ on periodic potentials having Bloch function solutions is to modulate the free-electron solution so that $\psi(x + a) = \psi(x)e^{ika}$. (b) Show that $e^{ika} = -1$ at the Brillouin zone boundaries. Comment on the meaning of this result.
22. For a three-dimensional free electron gas confined to a cube, the allowed values of the momentum are distributed uniformly in momentum space. Assume that for each value of the momentum with magnitude less than the Fermi momentum $p_F$ (the momentum corresponding to the Fermi energy) there are two electrons which have that momentum and that there are no electrons with momentum greater than $p_F$. Show that the number of electrons that have a given $x$ component $p_x$ of momentum is proportional to $1 - (p_x/p_F)^2$. This result explains the parabolic shape of the angular correlation curves for positron annihilation in metals.
23.
(a) For sodium use the concentration of conduction electrons to estimate the Fermi energy, the Fermi momentum, and the maximum correlation angle $\theta_F$ for photons from positron annihilation events involving conduction electrons. Sodium has a cubic unit cell with edge $a = 4.22$ Å and there are two atoms per cube. (b) Repeat the calculations for potassium. Potassium has the same crystalline structure as sodium but the cube edge is 5.22 Å. (c) In positron annihilation experiments, which of these two metals produces the greater fraction of photon pairs with correlation angle greater than $\theta_F$?
24. At what temperature will the number of conduction electrons increase by a factor of 20 over the number at room temperature for germanium? The gap energy is 0.67 eV.
25. (a) Show that the number of electrons per unit volume in the conduction band of an intrinsic semiconductor is given by $\mathcal{N}_c e^{-(\mathcal{E}_c - \mathcal{E}_F)/kT}$, where $\mathcal{N}_c = 2(2\pi mkT)^{3/2}/h^3$, and where $\mathcal{E}_c$ is the conduction band-edge energy. (b) Show that the number of holes per unit volume in the valence band of an intrinsic semiconductor is given by $\mathcal{N}_v e^{-(\mathcal{E}_F - \mathcal{E}_v)/kT}$, where $\mathcal{N}_v = 2(2\pi mkT)^{3/2}/h^3$, and where $\mathcal{E}_v$ is the valence band-edge energy.
26. Use the expression for the number of electrons in the conduction band, and the number of holes in the valence band, given in Problem 25, and charge neutrality to find the position of the Fermi energy in an intrinsic semiconductor.
27. (a) Show that the product of the number of holes in the valence band and the number of electrons in the conduction band depends only on temperature and the gap energy. (b) Show that the conductivity $\sigma$ of an intrinsic semiconductor can be used to measure the gap energy by calculating $\ln\sigma$.
28. Write exact expressions for $\mathcal{N}_d^+$ and $\mathcal{N}_d^0$, the concentration of ionized and neutral donors respectively, in a semiconductor doped to a concentration of $\mathcal{N}_d$.
29. (a) The position of the Fermi energy in a doped semiconductor can be found from the condition of charge neutrality:
$$\mathcal{N}_n + \mathcal{N}_a^- = \mathcal{N}_p + \mathcal{N}_d^+$$
where $\mathcal{N}_n$ is the number of electrons in the conduction band, $\mathcal{N}_a^-$ is the number of ionized acceptors, $\mathcal{N}_p$ is the number of holes in the valence band, and $\mathcal{N}_d^+$ is the number of ionized donors. Assuming $\mathcal{N}_a^- = 0$ and $\mathcal{N}_n \gg \mathcal{N}_p$, show that charge neutrality leads to an equation quadratic in $e^{\mathcal{E}_F/kT}$ which has the solution
$$e^{\mathcal{E}_F/kT} = \frac{-1 + \sqrt{1 + 4(\mathcal{N}_d/\mathcal{N}_c)\,e^{(\mathcal{E}_c - \mathcal{E}_d)/kT}}}{2e^{-\mathcal{E}_d/kT}}$$
where $\mathcal{E}_c$ is the conduction band-edge energy, and $\mathcal{E}_d$ is the donor-level energy. (b) This equation is soluble in two limits. One is
$$4\,\frac{\mathcal{N}_d}{\mathcal{N}_c}\,e^{(\mathcal{E}_c - \mathcal{E}_d)/kT} \ll 1$$
This means $\mathcal{N}_d$ small or $T$ large. Use a binomial expansion of the square root to show that $\mathcal{N}_n = \mathcal{N}_d$ and $\mathcal{E}_F = \mathcal{E}_c + kT\ln(\mathcal{N}_d/\mathcal{N}_c)$. This is the exhaustion region. All the donors are ionized but no electrons are excited from the valence band. (c) In the other limit
$$4\,\frac{\mathcal{N}_d}{\mathcal{N}_c}\,e^{(\mathcal{E}_c - \mathcal{E}_d)/kT} \gg 1$$
Also $\mathcal{N}_d$ is large and $T$ is small. Show that
$$\mathcal{N}_n = \sqrt{\mathcal{N}_d\,\mathcal{N}_c}\;e^{-(\mathcal{E}_c - \mathcal{E}_d)/2kT} \qquad\text{and}\qquad \mathcal{E}_F = \frac{\mathcal{E}_c + \mathcal{E}_d}{2} + \frac{kT}{2}\ln\frac{\mathcal{N}_d}{\mathcal{N}_c}$$
This is the extrinsic region. Here the donors are being ionized.
30. Draw an energy-level diagram like that of Figure 13-18 for an n-p-n junction transistor and describe the power amplifier action of the transistor in terms of the figure.
31. The current which flows in a p-n junction is proportional to the number of electrons in the conduction band. (a) For an unbiased p-n junction, show that the current from the p-region to the n-region is proportional to $e^{-(\mathcal{E}_g - \mathcal{E}_F)/kT}$ and that this current is equal to the current from the n-region to the p-region, so that no net current flows. (b) When a bias potential $V$ is applied, show that the net charge flow per unit area of junction is proportional to
$$e^{-(\mathcal{E}_g - \mathcal{E}_F)/kT}\left(e^{eV/kT} - 1\right)$$
where $eV$ is positive for forward bias and negative for reverse bias.
32. A p-n junction is a double layer of opposite charges separated by a small distance and has the properties of a capacitance. The resistivity of a semiconductor can be controlled by doping. Thus the elements in the transistor circuit of Figure 13-24a can be manufactured on a p-n-p semiconductor with appropriate layers etched away as shown in Figure 13-24b. This is an integrated circuit. Label the appropriate parts of Figure 13-24b with the corresponding numbers and letters of Figure 13-24a.

Figure 13-24 An integrated circuit considered in Problem 32.

33. A tunnel diode junction is approximated by a rectangular barrier 100 Å thick and 3.3 eV high. If $1.00 \times 10^{25}$ electrons strike the barrier each second with kinetic energy 3.1 eV, and the effective electron mass is $0.30m$, what current passes the junction?

14 SOLIDS - SUPERCONDUCTORS AND MAGNETIC PROPERTIES

14-1 SUPERCONDUCTIVITY
review of independent electron motion theories of conductivity; temperature dependence of conductivity; resistanceless current in superconductors; critical temperature; Meissner effects and their relation to resistanceless current; critical field; isotope effect evidence for importance of lattice vibrations; attractive electron-electron interactions by means of phonon exchange; conditions for formation of Cooper pairs; ordered pair motion under applied electric field; pair binding energy; origin of energy gap; gap width and relation to critical temperature; estimate of size and density of pairs; applications of superconductivity; Type II superconductors; flux quantization
14-2 MAGNETIC PROPERTIES OF SOLIDS
relations between magnetic induction, magnetization, magnetic field strength, and magnetic susceptibility; diamagnetism and Lenz's law; comparison of diamagnetic, paramagnetic, and ferromagnetic susceptibilities
14-3 PARAMAGNETISM
role of independent permanent magnetic dipole moments; calculated susceptibility of system of atoms with two spin orientations;
Curie's law as an approximation; comparison with experiment; paramagnetic susceptibility in metals
14-4 FERROMAGNETISM
Curie temperature; failure of classical dipole-dipole interaction explanation; role of exchange interactions; structure of 3d bands in transition elements; partial bands; origin of ferromagnetism; domains; hysteresis; permanent magnetism
14-5 ANTIFERROMAGNETISM AND FERRIMAGNETISM
properties; role of exchange interactions
QUESTIONS
PROBLEMS

14-1 SUPERCONDUCTIVITY

Shortly after the discovery of the electron it was recognized that the high electrical and thermal conductivities of metals could be attributed to the motion of electrons in the metal. Classical theories of metallic conduction treated these electrons as a gas of independent particles within the metals colliding with lattice imperfections. Using methods of the classical kinetic theory, many experimental facts of electrical and thermal conductivity could be explained. With the advent of quantum mechanics, it became possible to take into account the wave nature of electrons and the exclusion principle. A number of phenomena not previously explainable then became clear. For example, the need to use the Fermi distribution for free electrons led to an understanding of the electronic contribution to the specific heats of solids. The further application of wave ideas led to quantization of energy levels and the band theory of solids, which accounted for the wide range in conductivities observed in normal solids. The free-electron model approximation averaged out variations in the interactions of electrons with one another and with the lattice ions, and it could account for resistance to electron flow under normal conditions. A major failure of this independent particle model, however, is its inability to explain superconductivity.
To understand that phenomenon requires taking into account the collective behavior of electrons and ions, or the so-called many-body effects, in solids. Let us now examine superconductivity.

Many factors contribute to the electrical resistivity of a solid, as we have seen. Electrons are scattered by the deviations from a perfect lattice due to structural defects or impurities in a crystal. In addition, there are vibrations of the lattice ions in normal modes that constitute something like sound waves traveling through the solid; we refer to such waves as phonons. The higher the temperature is, the more phonons there are present in the lattice. When phonons are present, there is an electron-phonon interaction which scatters conduction electrons and causes further resistance. Hence, the electrical resistance of a solid should decrease as the temperature decreases, but we expect a residual resistance even near absolute zero due to the crystal imperfections. It therefore seems remarkable that the electrical resistance of some solids disappears completely at sufficiently low temperatures.

In 1911, Kamerlingh Onnes found that the electrical resistance of solid mercury drops to an immeasurably small value when cooled below a certain temperature, called the critical temperature $T_c$. Mercury goes from a normal state to a superconducting state as the temperature drops below $T_c = 4.2$°K. Many other elements, and many compounds and alloys, have since been found to be superconductors with critical temperatures as high as 23°K. But not all materials superconduct. Figure 14-1 shows the resistivity at very low temperatures for a superconductor, tin, and a nonsuperconductor, silver. In a superconductor, currents can be set up which persist for years with no detectable decay.

In 1933, Meissner and Ochsenfeld found that as a superconducting substance is cooled below its critical temperature in the presence of an applied magnetic field, it expels all magnetic flux from its interior.
If the field is applied after the substance has been cooled below its critical temperature, the magnetic flux is excluded from the superconductor. Hence, a superconductor acts like a perfect diamagnet. Both Meissner effects are illustrated in Figure 14-2.

According to Lenz's law, when the magnetic flux through a circuit is changing, an induced current is established in such a direction as to oppose the change in flux. In a diamagnetic atom, the orbital electrons adjust their rotational motion to produce a net magnetic moment opposite to the externally applied magnetic field. We can say analogously that an external magnetic field does not penetrate the interior of a superconducting substance because in a superconductor the conduction electrons, whose motion is as unimpeded as in an atom, adjust their motion to produce a counteracting magnetic field. The entire superconductor behaves like a single diamagnetic atom in this respect. Hence, the two principal characteristics of superconductors, namely the exclusion of magnetic flux and the absence of resistance to current flow, are related to one another. It is necessary to have a persisting (resistanceless) current to maintain the flux exclusion when the external field is on.

Figure 14-1 A plot of resistivity $\rho$ versus temperature $T$ (°K), showing the drop to zero at the critical temperature $T_c$ for a superconductor, and the finite resistivity of a normal metal at absolute zero.

Figure 14-3 shows a photograph of superconducting levitation. If a small permanent magnet is placed over a perfectly conducting surface, it will float there. If the magnet is placed on a surface which thereafter is made superconducting (by lowering its temperature), it will rise and float.
A repulsive force large enough to overcome the weight of the magnet exists between the magnet and the diamagnetic superconductor, because the superconducting body excludes the magnetic lines of flux associated with the magnet. Serious engineering studies have indicated the feasibility of using this phenomenon to provide very smooth support for high-speed passenger trains.

It is found that if the external field is increased beyond a certain value, called the critical field $H_c$, the metal ceases to be superconducting and becomes normal. The value of this critical field for a given material depends on the temperature, as shown for the case of lead in Figure 14-4. As the external magnetic field increases, therefore, the critical temperature is lowered until, when $H > H_c(0\text{°K})$, there is no superconductivity for that material at any temperature. We can understand this as follows. Suppose that at some temperature below $T_c$ we turn on a magnetic field; the superconductor will act to exclude this field (the Meissner effect). The energy decrease of the magnetic field appears as increased energy of the electrons that make up the superconducting current. As the strength of the external magnetic field is increased, the energy acquired by the superconductor also increases. At the critical value of the field, $H_c$, the energy of the superconducting state becomes higher than the energy of the normal state, so that the material becomes normal.

Figure 14-2 Left: A schematic illustration of expulsion. Right: The exclusion of magnetic flux in a superconductor. Both are called Meissner effects. (Panels are labeled by field and temperature: $H = 0$, $T < T_c$; $H \neq 0$, $T < T_c$; $H \neq 0$, $T > T_c$; $H \neq 0$, $T < T_c$.)

Figure 14-3 A permanent magnet floating over a superconducting surface.
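The interplay between critical field and critical temperature described above can be made concrete with a small sketch. The text gives no formula for $H_c(T)$, so the code assumes the commonly used empirical parabolic form $H_c(T) \approx H_c(0)\,[1 - (T/T_c)^2]$; this is an approximation introduced here for illustration, not a result derived in this section. To avoid committing to material constants, everything is expressed in the reduced variables $t = T/T_c$ and $h = H/H_c(0)$, and the function names are hypothetical.

```python
import math


def reduced_critical_field(t):
    """H_c(T)/H_c(0) as a function of t = T/T_c.

    Assumption: empirical parabolic law, H_c(T) = H_c(0) * (1 - t**2).
    Above the critical temperature there is no superconducting state.
    """
    if t >= 1.0:
        return 0.0
    return 1.0 - t * t


def is_superconducting(t, h):
    """True if a sample at reduced temperature t stays superconducting
    in a reduced applied field h = H/H_c(0)."""
    return h < reduced_critical_field(t)


def reduced_critical_temperature(h):
    """Critical temperature in an applied field, in units of the
    zero-field T_c; zero if the field exceeds H_c(0)."""
    if h >= 1.0:
        return 0.0
    return math.sqrt(1.0 - h)


# Raising the field lowers the temperature below which the
# superconducting state survives, until it vanishes at h = 1.
for h in (0.0, 0.25, 0.75, 1.0):
    print(h, reduced_critical_temperature(h))
```

The loop reproduces the qualitative statement in the text: the larger the applied field, the lower the critical temperature, and for $H > H_c(0\text{°K})$ there is no superconductivity at any temperature.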
Evidence that the lattice vibrations play an important role in the phenomenon of superconductivity came in 1950 when experiment revealed that the critical temperature of crystals made from different isotopes of the same element depends on the isotopic mass. The dependence, given by
$$M^{1/2}\,T_c = \text{const} \tag{14-1}$$
in which $M$ is the average isotopic mass of the solid, is called the isotope effect. This relation shows that the critical temperature would go to zero (hence, no superconductivity) in the absence of lattice vibrations (when $M \to \infty$). The importance of lattice vibrations suggests that an electron-phonon interaction is responsible for superconductivity. We can no longer ignore those very interactions which were neglected in the independent particle model of a solid (the electron-phonon and also the electron-electron interactions) if we hope to get a theoretical explanation of superconductivity. In 1957, Bardeen, Cooper, and Schrieffer proposed a detailed microscopic theory, now known as the BCS theory, in which these interactions are included. The predictions of the BCS theory are in excellent agreement with experimental results. Let us now consider a qualitative picture of it.

Figure 14-4 The variation with temperature $T$ (°K) of the critical field $H_c$ for lead. Note that $H_c$ is zero when the temperature $T$ equals the critical temperature $T_c$.

An electron in a solid passing by adjacent ions in the lattice can act on these ions with a set of Coulomb attractions which gives each of them momentum that causes them to move slightly together. Because of the elastic properties of the lattice, this region of increased positive charge density will then propagate as a wave, which carries momentum, through the lattice. The electron has emitted a phonon! The momentum the phonon carries is supplied by the electron, whose momentum changed when the phonon was emitted.
If a second electron subsequently passes by the moving region of increased positive charge density, it will experience an attractive Coulomb interaction, and thereby it can absorb all the momentum the moving region carries. That is, the second electron can absorb the phonon, thereby absorbing the momentum supplied by the first electron. The net effect is that the two electrons have exchanged some momentum with each other, and thus they have interacted with each other. Although the interaction was a two-step one, involving a phonon as an intermediary, it certainly was an interaction between the two electrons. Furthermore, it was an attractive interaction, since the electron involved in each of the steps participated in an attractive Coulomb interaction. The BCS theory shows that in certain conditions the attraction between two electrons due to a succession of phonon exchanges can exceed slightly the repulsion which they exert directly on each other because of the (shielded) Coulomb interaction of their like charges. Then the electrons will be weakly bound together, and form a so-called Cooper pair. We shall see that Cooper pairs are responsible for superconductivity. 
The conditions for their formation, in numbers large enough to allow superconductivity, are (1) that the temperature be low enough to make the number of random thermal phonons present in the lattice small (they would inhibit the ordered processes involved in superconductivity); (2) that the interaction between an electron and a phonon be strong (so that a substance which has a relatively low resistance at room temperature, because its conduction electrons interact weakly with thermal lattice vibrations, will not be a possible superconductor at low temperature); (3) that the number of electrons in states lying just below the Fermi energy be large (these are the electrons which are energetically able to form Cooper pairs); (4) that the two electrons have "antiparallel" spins (then their space eigenfunction will be symmetric in a label exchange, which means that they will be close enough together to form a pair); and (5) that, in the absence of an externally applied electric field, the two electrons of a pair have linear momenta of equal magnitude but opposite direction (as will be explained next, this facilitates the participation of the maximum number of electrons in pair formation).

Because Cooper pairs are weakly bound, they are constantly breaking up and then reforming, usually with different partners. Also, because they are weakly bound they are large. (In Example 14-2 we shall estimate the typical separation of two electrons in a pair to be of the order of $10^4$ Å.) Thus, within the region occupied by the electrons of a pair, there are very many other electrons that would also like to participate in the pairing process. The system will be most tightly bound, and therefore most stable, if they can do so. The system achieves this by having the total linear momentum of each pair equal to zero, in the absence of an applied electric field.
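The claim that a pair's region contains very many other electrons can be made concrete with a rough count. The sketch below is only an order-of-magnitude illustration: it assumes the pair size of about $10^4$ Å and the paired fraction of about $10^{-4}$ that Example 14-2 arrives at, together with the conduction-electron density of about $10^{22}$/cm³ that the text attributes to Example 13-1. The function and constant names are hypothetical, chosen for this illustration.

```python
ANGSTROM_IN_CM = 1e-8  # 1 Å expressed in cm


def pair_volume_cm3(pair_size_angstrom):
    """Volume of a cube whose edge equals the pair size, in cm^3."""
    return (pair_size_angstrom * ANGSTROM_IN_CM) ** 3


def count_in_pair_volume(density_per_cm3, pair_size_angstrom):
    """Number of objects of the given number density inside one pair volume."""
    return density_per_cm3 * pair_volume_cm3(pair_size_angstrom)


# Order-of-magnitude inputs quoted in the text (assumed here, not measured):
PAIR_SIZE_ANGSTROM = 1e4   # typical pair size, Example 14-2
ELECTRON_DENSITY = 1e22    # conduction electrons per cm^3, Example 13-1
PAIRED_FRACTION = 1e-4     # fraction of electrons forming pairs, Example 14-2

electrons_inside = count_in_pair_volume(ELECTRON_DENSITY, PAIR_SIZE_ANGSTROM)
pairs_inside = count_in_pair_volume(
    ELECTRON_DENSITY * PAIRED_FRACTION, PAIR_SIZE_ANGSTROM
)

print(f"conduction electrons inside one pair volume: ~{electrons_inside:.0e}")
print(f"overlapping pairs inside one pair volume:    ~{pairs_inside:.0e}")
```

With these inputs the count comes out near $10^{10}$ conduction electrons and $10^6$ pairs per pair volume, which is the heavy overlap the argument relies on.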
The discussion of the formation of a pair shows that the total momentum of any pair is a constant, since the net result of exchanging a phonon between the two electrons is to preserve the total momentum of the pair. If all the pairs have the same constant total momentum, then there will be no inhibition to the unavoidable process of old pairs breaking up and new pairs reforming, because any pair can be converted to any other pair by phonon exchange, and so the maximum number of pairs will be present. This conclusion is plausible from the qualitative argument we have given. It is put on a completely firm foundation by the quantitative calculations of the BCS theory, which show that the wave functions describing pair formation are in phase, and thus add constructively and lead to a large total probability for pair formation, when the pairs all have the same total momentum. In the absence of an applied electric field, symmetry considerations obviously demand that the common value of the pair total momentum be zero. So we see why the two electrons of each pair have linear momenta of equal magnitude, but opposite direction, in such circumstances.

We also see that the ground state of the system is very highly ordered, in that all the pairs in the lattice are doing exactly the same thing as far as the motion of their centers of mass is concerned. This order extends through the lattice, and not just through the region occupied by a pair, because the pairs are relatively large and there are many of them so there is multiple overlapping. The order propagates through adjacent overlapping regions.

When an external electric field is applied, the pairs, which behave rather like particles with two electron charges, move through the lattice under the influence of the field. But they do it in such a way as to continue to maintain the order, because that will maintain their number at a maximum.
Thus they carry current by moving through the lattice with all of their centers of mass having exactly the same momentum. The motion of each pair is locked into the motion of all the rest, and so none of them can be involved in the random scatterings from lattice imperfections that cause low-temperature electrical resistance. This is why the system is a superconductor.

It is tempting to think of a Cooper pair as acting like a boson, since it contains two fermions. If this could be done, superconductivity would be simply another example of Bose condensation, as in the superfluidity of liquid helium. That is, it would be the completely correlated motion of a set of bosons all in the same quantum state due to the effect of the $(1 + n)$ boson enhancement factor discussed in Chapter 11. Theories which preceded the BCS theory tried unsuccessfully to use this approach. The reason why it is not valid is that the individual electrons in each pair are weakly bound to the pair, which also means the pair is large. As a consequence, the eigenfunction for the system of overlapping pairs must take into account the exchange of labels of one electron from one pair and one electron from another pair, as well as the exchange of labels of one complete pair and another complete pair. In the latter exchange the system eigenfunction will not change sign because two fermion labels are being exchanged, but in the former the eigenfunction does change sign since only one fermion label is being exchanged. So Cooper pairs are neither purely bosonlike (no sign change), nor purely fermionlike (sign change) with respect to all eigenfunction label exchanges that must be considered. In a system of tightly bound helium atoms, the only type of label exchange that must be considered is an exchange of the label of one atom with the label of another.
Such an exchange actually involves an even number of fermion label exchanges (each atom contains two electrons, two protons, and two neutrons), so the eigenfunction does not change sign and the atoms of the system act like bosons.

According to the BCS theory, the binding energy of a Cooper pair at absolute zero is about $3kT_c$. As the temperature rises, the binding energy is reduced, and goes to zero when the temperature equals the critical temperature $T_c$. Above $T_c$ a Cooper pair is not bound. With a binding electron-electron interaction at absolute zero, it is energetically advantageous for two electrons, each in single-particle states just below the Fermi energy, $\mathcal{E}_F$, to promote themselves to vacant states just above $\mathcal{E}_F$ where they can interact in such a way as to form a Cooper pair. The energy required to put the electrons into the higher single-particle states is more than compensated for by the energy made available by the binding of the Cooper pair they form. Thus the zero temperature Fermi distribution of a superconductor is unstable, in the sense that electrons in states within a range of the order of $kT_c$ below the Fermi energy will leave those states and enter states within a similar range above the Fermi energy, where they will form pairs. The result is that the $T = 0$ distribution of occupied states of a superconductor looks something like a $T = T_c$ Fermi distribution for a normal conductor. The reason why the electrons must be above $\mathcal{E}_F$ to be able to freely form pairs is that a large number of unoccupied states are found only above $\mathcal{E}_F$, and unoccupied states must be available for the two electrons of a pair to enter after they change their momenta by one emitting and the other absorbing a phonon.

Although there is an almost continuous distribution of single particle states available to each electron in a superconductor at $T = 0$, the distribution of states available to the system is anything but continuous. As far as the system is concerned, there is its superconducting ground state, then an energy gap of width $\mathcal{E}_g$ in which there are no states at all, and above the gap a set of states which are nonsuperconducting. The gap width $\mathcal{E}_g$ equals the binding energy of a Cooper pair. The gap arises because if one electron of the system in a single particle state in the region of width $\sim kT_c$ surrounding $\mathcal{E}_F$ absorbs energy from some source, so that it makes a transition from that state to another single particle state only infinitesimally different in energy, then the pair of which it had been a member will be broken and the binding energy of the pair will be lost to the system. Thus the source must be able to supply an energy equal to a pair binding energy before an electron near $\mathcal{E}_F$ can make a transition to the energetically nearest state. (Even more energy must be supplied to excite an electron well below $\mathcal{E}_F$, despite the fact that it is not in a pair, since all the nearby states are already occupied.) Therefore the minimum energy that can be accepted by the ground state system, which is the width of its energy gap, is the binding energy of a Cooper pair. The states which begin at the top of the gap are not superconducting since in them the system has enough energy for pairs to be broken. The width of the gap at $T = 0$ is $\mathcal{E}_g \simeq 3kT_c$. But it narrows as the temperature rises, and it becomes of zero width at $T = T_c$ where the pairs are no longer bound.

At temperatures below $T_c$ the superconducting ground state corresponds to a large scale quantum state in which the motions of all the electrons and ions are highly correlated. It takes the gap energy $\mathcal{E}_g$ to excite the system to the next higher state, which is not superconducting, and this is more energy than the thermal energy available to the system. For instance, at $T = 0.1T_c$ the value of the gap energy is still about $\mathcal{E}_g = 3kT_c$, while the thermal energy is about $kT = 0.1kT_c$. For most superconductors near $T = 0$ the energy needed to bridge the gap corresponds to photons in the very far infrared, or microwave, portion of the electromagnetic spectrum. The existence and width of the gap are established experimentally by the abrupt change in absorption of far infrared or microwave radiation when the photon energy $h\nu$ drops below the gap energy.

Example 14-1. The critical temperature of mercury is 4.2°K. (a) What is the energy gap in electron volts at $T = 0$?

As stated earlier, the Cooper pair binding energy, or gap energy, is $\mathcal{E}_g \simeq 3kT_c$. So

$$\mathcal{E}_g \simeq 3 \times 1.4 \times 10^{-23}\ \text{joule/°K} \times 4.2\,\text{°K} = 1.8 \times 10^{-22}\ \text{joule} = 1.1 \times 10^{-3}\ \text{eV}$$

(b) Calculate the wavelength of a photon whose energy is just sufficient to break up Cooper pairs in mercury at $T = 0$. In what region of the electromagnetic spectrum are such photons found?

The energy is $\mathcal{E}_g = h\nu = hc/\lambda$. So the wavelength is

$$\lambda = \frac{hc}{\mathcal{E}_g} = \frac{6.6 \times 10^{-34}\ \text{joule-sec} \times 3 \times 10^{8}\ \text{m/sec}}{1.8 \times 10^{-22}\ \text{joule}} = 1.1 \times 10^{-3}\ \text{m}$$

These photons are in the very short wavelength part of the microwave region.

(c) Does the metal look like a superconductor to electromagnetic waves having wavelengths shorter than that found in part (b)? Explain.

No, since the energy content of shorter wavelength photons is sufficiently high to break up the Cooper pairs, or excite the conduction electrons through the energy gap into the nonsuperconducting states above the gap.

Example 14-2. (a) Estimate the size of a Cooper pair of binding energy $\mathcal{E}_g$.

The wave function of a Cooper pair is made up of waves, describing its two component electrons, with wave numbers drawn from a range $\Delta k$ corresponding to an energy range $\Delta\mathcal{E} \sim \mathcal{E}_g$. The energy range is centered on $\mathcal{E}_F$, and the wave number range is centered on the corresponding $k_F$. Since the energy of one of the electrons is
$$\mathcal{E} = \frac{p^2}{2m^*} = \frac{\hbar^2 k^2}{2m^*}$$

we have

$$\Delta\mathcal{E} = \frac{\hbar^2\, 2k\,\Delta k}{2m^*} = \frac{\hbar^2 k\,\Delta k}{m^*}$$

and

$$\frac{\Delta k}{k} = \frac{m^*\,\Delta\mathcal{E}}{\hbar^2 k^2} = \frac{\Delta\mathcal{E}}{2\mathcal{E}} \sim \frac{\Delta\mathcal{E}}{\mathcal{E}}$$

Setting $\mathcal{E} = \mathcal{E}_F$, $k = k_F$, and $\Delta\mathcal{E} = \mathcal{E}_g$, we have

$$\frac{\Delta k}{k_F} \sim \frac{\mathcal{E}_g}{\mathcal{E}_F}$$

As $\mathcal{E}_g/\mathcal{E}_F \sim 10^{-4}$ in a typical case, we obtain $\Delta k \sim 10^{-4} k_F$. Since we saw in Chapter 13 that at the top of a band $k = \pi/a$, if the zeros of $k$ and $\mathcal{E}$ are taken at the bottom of the band as we do here, we can set $k_F \sim 1/a$. We also know that the lattice spacing is $a \sim 1$ Å. Thus we find that

$$\Delta k \sim \frac{10^{-4}}{1\ \text{Å}}$$

is the range of wave numbers contained in the wave function for a Cooper pair. A very general property of waves ((3-14), which leads to the uncertainty principle) then immediately tells us that the extent in space of the wave function is

$$\Delta x \sim \frac{1}{\Delta k} \sim 10^{4}\ \text{Å}$$

This is the size of a typical Cooper pair.

(b) Estimate the density of Cooper pairs in a superconductor.

Example 13-1 shows that the density of conduction electrons in a metal is $n \sim 10^{22}/\text{cm}^3$. The fraction that will form Cooper pairs in a superconductor is of the order of $\Delta k/k_F \sim 10^{-4}$. So

$$n_{\text{Cooper pairs}} \sim 10^{18}/\text{cm}^3$$

Note that the volume of one pair is $\sim (10^{4}\ \text{Å})^3 = (10^{-4}\ \text{cm})^3 = 10^{-12}\ \text{cm}^3$. So each such volume contains $\sim 10^{6}$ overlapping pairs!

The width of the forbidden gap, and the density of quantum states, in a superconductor can be determined from the current-voltage characteristic of a tunnel junction. In such junctions a thin oxide layer (about $10^{-9}$ m thick) separates a normal and a superconducting metal. Electrons tunnel through the barrier, which the nonconducting oxide layer represents, with the aid of an applied voltage. In 1962, Josephson predicted that if the metals on both sides of the junction are superconducting, a current can flow when no voltage is supplied. If a small voltage (a few millivolts) is applied, an alternating current of frequency in the microwave range results. These effects can be used to detect extremely small voltage differences and to measure with enormous precision the ratio $e/h$ used in determination of the fundamental physical constants. Other superconducting effects predicted by Josephson permit a number of quantum properties to be seen in a very simple way, particularly the quantization of magnetic flux, discussed below.

There are many important applications of superconductivity. An obvious application is to superconducting electromagnets, whose fields arise from resistanceless currents flowing through the magnet windings, for use in electric motors and generators. A difficulty is that magnetic fields tend to be induced in the wires of the windings, which tends to destroy their superconductivity. But progress is being made in finding what are called Type II superconductors, which have Cooper pairs whose dimensions are small enough to allow a magnetic field to thread its way through the length of a wire in a set of localized channels. These channels lose their superconductivity, but the channels in between them do not. Several niobium-titanium alloys have been found which are Type II superconductors, and they also have the convenience of relatively high critical temperatures ($T_c \simeq 20$°K). The absence of power dissipation in superconducting elements makes possible many electronic applications in which space requirements and transmission time requirements are limited, as in computers. Because superconductors are diamagnetic, they can be used to shield out unwanted magnetic flux. This can be put to use in shaping the magnetic lens system of an electron microscope, for example, to eliminate stray field lines and thereby to greatly improve the practical resolving power of the instrument.

Apart from such technological applications of superconductivity, of which a great many more can be cited, there is an increasing application of the theoretical ideas to other fields of physics. For example, these ideas have been applied to analyzing nuclear structure, with much success in accounting for otherwise unexplained experimental facts. In the next chapter we shall see similarities between the collective model of the nucleus and the BCS collective model of superconductivity. Some of the methods of superconductivity theory are being applied to the elementary particles of high-energy physics, as well, so that the theory suggests a unity underlying the various areas of quantum physics.

The Meissner effect can be stated in another way, namely, that it is possible to induce currents in a specimen in a time-invariant magnetic field simply by lowering the temperature. Such a statement contradicts Maxwell's equation $\oint \mathbf{E}\cdot d\mathbf{l} = -d\Phi_B/dt$ (or $\nabla \times \mathbf{E} = -\partial\mathbf{B}/\partial t$) and shows that the Meissner effect is not a classical effect but a quantum effect revealing itself on a macroscopic scale. This has been confirmed by experiments on a superconducting ring. If such a ring in a normal state is placed in a uniform magnetic field, and then cooled to the superconducting state, electric currents are established that flow in opposite directions on the inner and outer surfaces of the ring, as in the upper part of Figure 14-5. This excludes the field from the interior of the ring but does not affect the field inside the hole of the ring. When the external field is removed, the outside surface current disappears but the inside surface current persists. We say that the superconducting ring has trapped the original magnetic field in the hole, as in the lower part of Figure 14-5. When the magnetic flux trapped in the ring is measured as a function of the strength of the applied magnetic field, it is found that the flux is quantized, i.e., it increases in discrete steps. The system acts very much like a macroscopic Bohr atom in which one eigenfunction describes the correlated motion of the entire set of Cooper pairs traveling around the ring. Flux quantization arises because the eigenfunction must be single valued. The quantum of flux is $2\pi\hbar/q$, where $q$ is the charge carried by one pair. The measurements confirm the BCS prediction that $q = 2e$.

Figure 14-5 Top: A ring of superconducting material is cooled below the critical temperature in the presence of a uniform magnetic field. Currents are established as shown on the inner and outer surfaces of the ring, thereby excluding the field from the superconducting material comprising the ring. Bottom: The external field is removed. The outside surface current disappears, and the inside surface current persists. The result is that magnetic flux is trapped in the hole enclosed by the ring.

14-2 MAGNETIC PROPERTIES OF SOLIDS

Materials may have intrinsic magnetic dipole moments, or they may have magnetic dipole moments induced in them by an applied external magnetic field of induction. In the presence of a magnetic field of induction, the elementary magnetic dipoles, whether permanent or induced, will act to set up a field of induction of their own that will modify the original field. The student will recall that magnetic dipole moments, which can be regarded as microscopic currents (e.g., in atoms), are a source of magnetic induction $B$ just as are macroscopic currents (e.g., in magnet windings). In fact, we can write

$$B = \mu_0 H + \mu_0 M \tag{14-2}$$

in which $M$, called the magnetization, is the volume density of magnetic dipole moment, and $H$, called the magnetic field strength, is associated with macroscopic currents only.
The magnetic vector $H$, which can be written as $H = (B - \mu_0 M)/\mu_0$, plays a role in magnetism that is analogous to the role of $D$ in electricity, since $D$, the electric displacement, originates only with free charges, not polarization charges. The magnetic vector $M$, which can be written as $\mu/V$, the magnetic dipole moment per unit volume, has the same dimensions as $H$. For certain magnetic materials, it is found empirically that the magnetization $M$ is proportional to $H$. Hence, we can write

$$M = \chi H \tag{14-3}$$

in which the dimensionless quantity $\chi$ is called the magnetic susceptibility. The principal problem in studying the magnetic properties of such materials is to determine $\chi$ for them and to find how it depends, if at all, on the temperature $T$ and the value of $H$. Substituting $H = M/\chi$ from (14-3) into (14-2), the magnetization $M$ can be put in terms of $\chi$ and $B$ as

$$M = \frac{\chi B}{\mu_0 (1 + \chi)} \tag{14-4}$$

From this expression we can see that if the susceptibility $\chi$ is small compared to one, then $M \simeq \chi B/\mu_0$ and the contribution made to $B$ by the magnetic moments, that is $\mu_0 M$ in (14-2), is small. This applies in fact to magnetic materials which are diamagnetic or paramagnetic. Diamagnetism is negative magnetic susceptibility, and paramagnetism is positive susceptibility.

In diamagnetic materials the magnetization is opposite in direction to the field of induction, so that $\chi$ is negative in (14-4). The value of $B$ is smaller in the region of the diamagnetic material than it would be if the material were absent. The origin of diamagnetism is Lenz's law: the magnetic dipole moment arising from currents induced by an applied field opposes that field. A perfect diamagnet, such as a superconductor, excludes all flux from its interior so that $B = 0$ and $\chi = -1$ for such materials. For nonsuperconducting diamagnets, however, the magnitude of $\chi$ is generally less than $10^{-5}$. In a vacuum, there is, of course, no magnetization and $\chi = 0$.

All substances exhibit diamagnetism, but the induced magnetic dipole moment responsible for it is masked in most substances by the existence of a permanent magnetic dipole moment. In such substances, called paramagnetic, the permanent magnetic dipole moments of the atoms tend to line up in the direction of the applied field. Here the magnetization $M$ is in the direction of $B$ and the magnetic susceptibility $\chi$ is positive. For typical paramagnetic materials, $\chi \simeq 10^{-4}$. In the presence of a strong field of induction diamagnetic substances are weakly repelled and paramagnetic substances are weakly attracted by the field, corresponding to the fact that $\chi$ is relatively small for both types of substance though of opposite sign.

A third, and most important, type of magnetic material is ferromagnetic. Ferromagnetism is the presence of a spontaneous magnetization in materials even in the absence of an externally applied field of induction. The only ferromagnetic elements are iron, cobalt, nickel, gadolinium, and dysprosium, but there are many compounds and alloys of these and other elements that are ferromagnetic. Ferromagnetic substances are strongly attracted even by relatively weak fields, their magnetization being very large. Ferromagnetic susceptibilities are as large as $10^{5}$. There is a connection between ferromagnetism and paramagnetism, only those crystals whose atoms or molecules are individually paramagnetic being capable of exhibiting the kind of cooperative behavior that leads to ferromagnetism. In the succeeding sections we examine paramagnetism and ferromagnetism in greater detail, and we discuss their relationship to one another and to diamagnetism.

14-3 PARAMAGNETISM

In a paramagnetic material the atoms contain permanent magnetic dipole moments. These moments are associated with the intrinsic electron spin and the orbital motion of the electrons. (Nuclear magnetic dipole moments are three orders of magnitude smaller than the electronic magnetic dipole moments, and so they can be neglected for our purposes here.) An externally applied field of induction $B$ will tend to align these dipole moments parallel to the field. Because the energy is lower when the magnetic dipole moment is parallel to the field than when it is antiparallel, the parallel alignment is preferred. The result is an induced field that adds to the applied field so that the susceptibility is positive. In comparison, diamagnetic effects are negligible.

The tendency of magnetic dipole moments to line up in the field direction is opposed by the thermal motion which tends to make the directions of the magnetic dipoles random. Hence the susceptibility is temperature dependent, and its value is determined by the relative strength of the thermal energy $kT$ and the magnetic interaction energy $-\boldsymbol{\mu}\cdot\mathbf{B}$. We expect the susceptibility to decrease with increasing temperature and, indeed, Curie found at low fields and not too low temperatures that

$$\chi = \frac{C}{T}$$

where $C$ is a positive constant characteristic of the particular paramagnetic material. This is called the Curie law.

In atoms with filled subshells, the spin magnetic dipole moments, and separately the orbital magnetic dipole moments, cancel in pairs. Only unfilled subshells can have unpaired electrons, so that we expect paramagnetism only in materials containing atoms whose electronic subshells are partly filled. In such materials the orientation in space of the total magnetic dipole moments can change without changing the electronic configurations of the constituent atoms. The inert gases, and many ions, have closed subshell configurations, so that they do not exhibit paramagnetism and are excellent for diamagnetic studies. Likewise in materials in which the pairing of spins is required, such as in covalent crystals and many ionic crystals, the magnetic dipole moments cannot change direction and such materials are also diamagnetic.
The basic requirement for paramagnetism in solids is that the individual magnetic dipole moments have some degree of isolation. The atoms must act independently, for if the wave functions overlap significantly the operation of the quantum mechanical requirements concerning indistinguishable particles will tend to pair up the magnetic dipole moments. Many of the transition elements, and all of the rare earths, form paramagnetic solids. In these cases we have unfilled inner subshells, and the required isolation of the individual moments results from the shielding of these inner subshells by the filled outer subshells of the atoms.

Let us now calculate the paramagnetic susceptibility for the simplest kind of system, that is one containing separated atoms, in each of which the electronic orbital angular momentum is zero and there is an unpaired electron of spin angular momentum with two possible space orientations. We imagine unpaired electrons placed in a magnetic field $B$, and we neglect the interactions between such electrons. Let $n$ represent the number of unpaired magnetic dipole moments per unit volume. If $n_-$ represents the volume density of moments that are parallel to the field and $n_+$ represents the same for moments that are antiparallel, then $n_- + n_+ = n$. For a parallel alignment of the magnetic dipole moment $\mu$ the magnetic potential energy is $-\mu B$, and for an antiparallel alignment the energy is $\mu B$. Then, from the Boltzmann distribution, we have for the number in each energy state $n_- = cn\,e^{\mu B/kT}$ and $n_+ = cn\,e^{-\mu B/kT}$, in which $c$ is some constant of proportionality.
The resultant magnetization, i.e., the magnetic dipole moment per unit volume, is

$$M = \mu(n_- - n_+) = \mu cn\left(e^{\mu B/kT} - e^{-\mu B/kT}\right)$$

It is convenient to consider the average net moment, defined as $\bar{\mu} = M/n$ and given by

$$\bar{\mu} = \frac{M}{n} = \frac{M}{n_- + n_+} = \frac{\mu cn\left(e^{\mu B/kT} - e^{-\mu B/kT}\right)}{cn\left(e^{\mu B/kT} + e^{-\mu B/kT}\right)}$$

or

$$\bar{\mu} = \mu\,\frac{e^{\mu B/kT} - e^{-\mu B/kT}}{e^{\mu B/kT} + e^{-\mu B/kT}} \tag{14-5}$$

Since under ordinary circumstances $\mu B \ll kT$, we can expand the exponentials and obtain

$$\bar{\mu} \simeq \mu\,\frac{(1 + \mu B/kT) - (1 - \mu B/kT)}{(1 + \mu B/kT) + (1 - \mu B/kT)} = \frac{\mu^2 B}{kT}$$

The paramagnetic susceptibility then is given by

$$\chi = \frac{M}{H} = \frac{n\bar{\mu}}{H} = \frac{n\mu^2 B}{kTH} = \frac{\mu_0 n \mu^2}{kT} \tag{14-6}$$

where we have used (14-4), for small $\chi$, to write $B \simeq \mu_0 H$. Hence, we obtain an approximation to the Curie result $\chi = C/T$, in which $C = \mu_0 n\mu^2/k$ and the susceptibility varies inversely with the temperature.

Note that (14-5) shows that if the applied field $B$ is removed we have $\bar{\mu} = 0$, and there is no net magnetization. The alignment of the elementary dipoles depends on the presence of the field and, in its absence, the thermal motion randomizes the dipole directions so that the net magnetization is zero.

In the top of Figure 14-6 we plot the magnetization, $M = n\bar{\mu}$ from (14-5), as a function of the applied field $B$ for different temperatures. For small values of $B$, $M$ is essentially a straight line whose slope is greater the lower the temperature. As $B$ is increased the magnetization approaches the value $n\mu$ asymptotically. This is the saturation condition, in which all the unpaired magnetic dipole moments $\mu$ are aligned with the applied field $B$. The strength of the field required for saturation increases with the temperature. In the bottom of Figure 14-6 we plot the ratio $M/M_{\max}$, where $M_{\max}$ is the saturation magnetization, versus $B/T$ for a paramagnetic salt.
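The ratio of exponentials in (14-5) is the hyperbolic tangent of $\mu B/kT$, so the linear behavior at small $B/T$ and the saturation at large $B/T$ can be checked numerically. The following is a minimal sketch under stated assumptions: the moment is taken to be one Bohr magneton, the constants are the rounded values from the book's table of constants, and the field and temperature pairs are illustrative choices, not data from Figure 14-6. The function names are hypothetical.

```python
import math

K_BOLTZMANN = 1.381e-23  # joule/°K
MU_BOHR = 9.27e-24       # joule/tesla, assumed moment of one unpaired spin


def average_moment_ratio(B, T, mu=MU_BOHR):
    """Exact two-orientation result (14-5): mu_bar/mu = tanh(mu*B/(k*T))."""
    return math.tanh(mu * B / (K_BOLTZMANN * T))


def linear_moment_ratio(B, T, mu=MU_BOHR):
    """Small-field limit behind (14-6): mu_bar/mu ~ mu*B/(k*T)."""
    return mu * B / (K_BOLTZMANN * T)


# Illustrative (B in tesla, T in °K) pairs, from room temperature down to
# the liquid-helium range where saturation becomes visible.
for B, T in [(1.0, 300.0), (1.0, 4.2), (1.0, 1.3), (5.0, 1.3)]:
    exact = average_moment_ratio(B, T)
    linear = linear_moment_ratio(B, T)
    print(f"B = {B:3.1f} tesla, T = {T:5.1f} °K: "
          f"exact = {exact:.4f}, linear = {linear:.4f}")
```

At room temperature the two expressions should agree to many decimal places, while at the lowest temperature and highest field in the list the exact ratio approaches 1 and the linear estimate overshoots it, which is the saturation behavior described for Figure 14-6.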
The curve is predicted by the exact theoretical calculation, (14-5), which agrees very well with the experimental points. The Curie law prediction, (14-6), is seen to be a good approximation at small values of $B/T$.

Figure 14-6 Top: A plot of magnetization $M$ versus the magnetic induction $B$ in a paramagnetic substance for two temperatures $T_1$ and $T_2 = 3T_1$. Bottom: A plot of $M/M_{\max}$ versus $B/T$ for the paramagnetic salt potassium chromium sulfate. (Bottom panel: horizontal axis $B/T$ in $10^3$ gauss/°K; data points at 1.30°K, 2.00°K, 3.00°K, and 4.21°K; solid curve, theory; the Curie's law line is also marked.)

Example 14-3. (a) A magnetic field of induction achievable with an iron core electromagnet is 1.0 tesla. Compare the magnetic interaction energy of an electron spin magnetic dipole moment with this field to the thermal energy at room temperature.

We have for spin magnetic dipole moment

$$\mu = \mu_b = \frac{e\hbar}{2m} = 9.3 \times 10^{-24}\ \text{joule/tesla}$$

and for the magnetic interaction energy

$$\mu B = 9.3 \times 10^{-24}\ \text{joule/tesla} \times 1.0\ \text{tesla} = 9.3 \times 10^{-24}\ \text{joule} = 5.8 \times 10^{-5}\ \text{eV}$$

At room temperature, $T = 300$°K, the thermal energy is

$$kT = 8.6 \times 10^{-5}\ \text{eV/°K} \times 300\,\text{°K} = 2.6 \times 10^{-2}\ \text{eV}$$

so that

$$\frac{\mu B}{kT} = \frac{5.8 \times 10^{-5}\ \text{eV}}{2.6 \times 10^{-2}\ \text{eV}} = 2.2 \times 10^{-3}$$

Hence, the assumption $\mu B \ll kT$ is quite valid at ordinary temperatures and fields, $\mu B$ being about 0.2% of $kT$ in this example. In practice, the saturation region of Figure 14-6 is reached by going to lower temperatures rather than to higher fields.

(b) For this case estimate the paramagnetic susceptibility in a solid material having $n = 2.0 \times 10^{28}$ moments/m³, a typical value for substances with one unpaired electron per atom.
From (14-6) we have, when $\mu B \ll kT$,

$$\chi = \frac{\mu_0 n \mu^2}{kT} = \frac{4\pi \times 10^{-7}\ \text{tesla-m/amp} \times 2.0 \times 10^{28}/\text{m}^3 \times (9.3 \times 10^{-24}\ \text{joule/tesla})^2}{1.38 \times 10^{-23}\ \text{joule/°K} \times 300\,\text{°K}} = 5.2 \times 10^{-4}$$

The result is an estimate because the theory used is approximate, neglecting, as it does, interactions between the electrons. Most paramagnetic substances have measured values somewhat smaller than this result.

It is found that the Curie relation deduced above does not apply to metals, although it does apply to nonmetallic paramagnetic materials. Indeed, in metals the paramagnetic susceptibility is much smaller and virtually independent of temperature. We have a situation here somewhat like the one in Section 11-11 where we sought an understanding of the electronic contribution to the specific heats of metals. In the analysis leading to (14-6), we used the classical Boltzmann distribution. That was valid because the electrons were associated with different atoms and they could be distinguished by their location, but in metals we must use the Fermi distribution because the electrons behave there as a Fermi gas of indistinguishable particles. When we do so we get a smaller susceptibility than before, and one that is independent of temperature, as we now explain.

In Figure 14-7a we plot the energy distribution of electrons in a metal, the energy states that correspond to spin magnetic dipole moments aligned antiparallel to the field being plotted above the energy axis and those that correspond to moments aligned parallel being plotted below the axis. Here we imagine the field $B$ to be (nearly) zero. When $B$ is increased, at first all the electron energies shift, the energy rising by $\mu B$ for antiparallel moments and dropping by $\mu B$ for parallel moments, as shown in Figure 14-7b. Some electrons will subsequently make transitions from the higher energy antiparallel states to the lower energy parallel states, leading to the equilibrium situation of minimum total energy shown in Figure 14-7c.
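The arithmetic of Example 14-3 can be reproduced in a few lines. This is a sketch for checking the numbers only, using the rounded constants quoted in the example; the helper names are hypothetical and not part of the text.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # tesla-m/amp
K_JOULE = 1.38e-23         # Boltzmann's constant, joule/°K
MU_B = 9.3e-24             # Bohr magneton as rounded in the example, joule/tesla
JOULE_PER_EV = 1.602e-19   # joule per electron volt


def interaction_to_thermal_ratio(B, T, mu=MU_B):
    """Ratio mu*B/(k*T) that must be small for the expansion before (14-6)."""
    return mu * B / (K_JOULE * T)


def curie_susceptibility(n, T, mu=MU_B):
    """Paramagnetic susceptibility from (14-6): chi = mu_0*n*mu^2/(k*T)."""
    return MU_0 * n * mu ** 2 / (K_JOULE * T)


B_FIELD = 1.0      # tesla, part (a)
ROOM_TEMP = 300.0  # °K
MOMENT_DENSITY = 2.0e28  # moments per m^3, part (b)

mu_b_in_ev = MU_B * B_FIELD / JOULE_PER_EV
ratio = interaction_to_thermal_ratio(B_FIELD, ROOM_TEMP)
chi = curie_susceptibility(MOMENT_DENSITY, ROOM_TEMP)

print(f"mu*B      = {mu_b_in_ev:.2e} eV")
print(f"mu*B / kT = {ratio:.2e}")
print(f"chi       = {chi:.2e}")
```

The same `mu_b_in_ev` figure is the energy shift that the discussion of metals compares with the Fermi energy.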
We have seen in Example 14-3 that $\mu B \simeq 10^{-4}$ eV at B = 1.0 tesla, which is a very small energy shift compared to the Fermi energy, $\mathcal{E}_F \simeq 1$ eV. Hence, the number of electrons with parallel moments is only slightly larger than those with antiparallel moments, the randomizing thermal effect dominating, so that the susceptibility should have a small value. Furthermore the situation would not be expected to be sensitive to reasonable temperature changes so the susceptibility should be practically independent of temperature, as is observed experimentally for metals.

Figure 14-7 The distribution of electrons with energy in a metal, plotted as $n(\mathcal{E})N(\mathcal{E})$ along the energy axis; the electrons occupy states indicated by the shaded areas. States with spin magnetic dipole moments antiparallel to the applied field are plotted above the energy axis, and states with moments parallel to the field are plotted below. (a) The applied field is essentially zero. (b) The situation immediately after the field is increased to value B. (c) The equilibrium situation in applied field B. In these diagrams the magnetic interaction energy $\mu B$ is greatly exaggerated relative to the Fermi energy $\mathcal{E}_F$.

14-4 FERROMAGNETISM

Ferromagnetism is a spontaneous magnetization of small regions of a material that exists even in the absence of an external field of induction. Let us summarize the principal known features of ferromagnetism. First, the spontaneous magnetization in ferromagnetic materials varies with the temperature. The magnetization is a maximum at T = 0°K and drops to zero at a temperature $T_c$, called the ferromagnetic Curie temperature, as is illustrated in Figure 14-8. Secondly, at temperatures higher than $T_c$ the materials become paramagnetic and have a magnetic susceptibility that is given by the relation $\chi = C/(T - T_c)$. This is a modification of the Curie relation for paramagnetic materials, in which $\chi$ is not defined for temperatures below $T_c$ where the material has a permanent magnetization.
Thirdly, a ferromagnetic material is not magnetized in the same direction throughout its volume but has many smaller regions of uniform magnetization direction, called domains, that may be randomly oriented with respect to each other. Finally, the only ferromagnetic elements are iron, cobalt, nickel, gadolinium, and dysprosium.

There is a quantum theory of ferromagnetism that can explain all these observed properties. But before going into it, we show in the following example that a simple classical explanation, which obviously suggests itself, is not sufficient.

Figure 14-8 The spontaneous magnetization $M_s$ versus temperature T in a ferromagnetic material. $T_c$ is the ferromagnetic Curie temperature. (Axis marks: $n\mu$ on the $M_s$ axis, $T_c$ on the T axis.)

Example 14-4. The field of induction produced by a magnetic dipole of moment $\mu$ along a line parallel to its axis is given by $B = \mu_0\mu/2\pi x^3$, where x is the distance from the dipole. Calculate the interaction energy of two iron atoms, with parallel and collinear magnetic dipole moments of magnitude $\mu$ = 2.2 Bohr magnetons, separated by the interatomic spacing in iron, 3 Å. Then evaluate the temperature at which the magnetic interaction energy equals the thermal energy, to show that this classical dipole-dipole interaction will not explain ferromagnetism in iron.

The interaction energy, when one dipole aligns itself in the field produced by the other dipole, is negative (binding) and of magnitude

$$E = \frac{\mu_0\mu^2}{2\pi x^3} = \frac{4\pi \times 10^{-7}\ \text{tesla-m/amp} \times (2.2 \times 9.3 \times 10^{-24}\ \text{joule/tesla})^2}{2\pi \times (3 \times 10^{-10}\ \text{m})^3} = 3.1 \times 10^{-24}\ \text{joule}$$

Equating this energy to the thermal energy kT, and solving for T, we find

$$T = \frac{E}{k} = \frac{3.1 \times 10^{-24}\ \text{joule}}{1.38 \times 10^{-23}\ \text{joule/°K}} = 0.22\text{°K}$$

The temperature is very low because the dipole-dipole interaction energy is very small.
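The estimate of Example 14-4 can be reproduced numerically. The sketch below assumes the collinear-dipole expression $E = \mu_0\mu^2/2\pi x^3$ stated in the example; the function name is illustrative, and the constants are rounded values from the table of constants.

```python
import math

# Rounded constants (assumed values, taken from the table of useful constants).
MU_B = 9.27e-24               # Bohr magneton, joule/tesla
K_B = 1.381e-23               # Boltzmann's constant, joule/°K
MU_0 = 4.0 * math.pi * 1e-7   # permeability constant, tesla-m/amp


def collinear_dipole_energy(mu, x):
    """Magnitude of the interaction energy of two equal, parallel,
    collinear dipoles of moment mu (joule/tesla) a distance x (m) apart."""
    return MU_0 * mu ** 2 / (2.0 * math.pi * x ** 3)


mu_iron = 2.2 * MU_B          # moment per iron atom used in the example
spacing = 3.0e-10             # interatomic spacing in iron, 3 angstroms

energy = collinear_dipole_energy(mu_iron, spacing)   # joule
temperature = energy / K_B                           # °K at which kT = E

print(f"E = {energy:.2e} joule")
print(f"T = {temperature:.2f} °K")
print(f"kT at 300 °K exceeds E by a factor of about {300.0 / temperature:.0f}")
```

The last line makes the comparison with room temperature explicit: the thermal energy at 300°K exceeds the dipole-dipole energy by roughly a factor of a thousand.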
At room temperature, thermal energy is three orders of magnitude larger, and the randomizing tendency of thermal agitation would completely destroy the tendency for the dipole-dipole interaction to align the individual magnetic dipole moments and produce a large total magnetization. Such alignment is, however, actually found in iron at room temperature because it is ferromagnetic at that temperature. So we conclude that the explanation of ferromagnetism cannot be the very weak classical dipole-dipole interaction.

To illustrate the quantum theory of ferromagnetism consider iron, cobalt, or nickel, all of which are transition elements that have partially filled 3d inner subshells. The quantum numbers $m_l$ and $m_s$ for the 3d electrons in an atom of a ferromagnet containing such atoms will have those values that minimize the energy of the ferromagnetic system, consistent with the requirements of the exclusion principle. If the z component orbital angular momentum quantum numbers $m_l$ of two 3d electrons have the same values, for example, the z component spin angular momentum quantum numbers $m_s$ must have opposite values. If the $m_l$ values are different, the $m_s$ values can be the same, which means the spins can be essentially parallel. Now the g factor, which specifies the ratio of the total magnetic dipole moment to the total angular momentum, has a value for ferromagnetic materials near the value g = 2 that corresponds to electron spin (see Section 10-6, particularly (10-23)). This indicates that the magnetization is due to "parallel" spin rather than orbital magnetic dipole moments. Thus the electrons in the 3d subshell of an atom of iron align themselves so that the spins are essentially parallel. The reason is that it reduces the energy of the atom. That is, two 3d electrons stay farther apart on the average if their spins are "parallel" than if their spins are "antiparallel," and if they are farther apart their mutual Coulomb repulsion energy is reduced.
This is just the tendency (see Section 10-4) for the spins in an unfilled subshell to all couple "parallel" and maximize the total spin, to the extent allowed by the exclusion principle, because this minimizes the residual Coulomb energy. Thus a single atom of iron is paramagnetic, because it has a permanent spin magnetic dipole moment, basically because of the interaction between the spin coordinates and space coordinates imposed by the quantum mechanical requirements concerning the exchange of labels of indistinguishable particles. For this reason the spin coupling is sometimes said to be due to the strong exchange interaction operating within the atom.

Now consider a crystal lattice of iron atoms. There is also a strong exchange interaction between adjacent atoms of the lattice because the electrons in the atoms are indistinguishable and the atoms are close enough to each other that indistinguishability makes a difference. This exchange interaction will also lead to a coupling of spins, i.e., the total spins of adjacent atoms, but it is more complicated than the exchange interaction within a single atom because the geometry of the system of atoms is more complicated than the geometry of a single atom. The results of the exchange interaction can be that the lowest energy of the system occurs when the spins of adjacent pairs of atoms are "parallel," or that it occurs when they are "antiparallel." In the first case the system will be ferromagnetic; in the second it will be antiferromagnetic.

We can understand ferromagnetism by considering the five overlapping 3d energy bands of a crystal composed of one of the transition element atoms. The totality of these bands, which we shall here call the 3d band, can hold ten electrons per atom. When full, the band has five electrons with spin "up" and five with spin "down," per atom.
The band is narrow because the 3d subshell is an inner subshell, as we discussed in Section 13-7. In the ferromagnetic atoms, however, the 3d band is only partially filled. In iron, for example, there are six 3d electrons per atom. If we at first assumed that three of these electrons have spin with one orientation and three have spin with the other orientation, the electrons occupying the lowest energy available states in each of two partial bands of opposite spin, we could not be sure that this is the state of lowest energy for the system because the exchange interaction of the lattice will shift the partial bands of opposite spin with respect to each other. The partial band of one spin, i.e., the collection of energy levels in which all the electrons have one spin orientation, will be lowered in energy by the exchange interaction and the partial band of the other spin will be raised in energy by the interaction. We could have five electrons per atom in one partial band, and the sixth in the partial band of the opposite spin, if the total energy of the system is lowered more by the exchange interaction than it is raised by the higher energy resulting from the asymmetrical population of electron energy levels between the two partial bands. That is, competing with the desire of all electrons to go into the partial band of lowest energy is the fact that, if they do, some will be forced by the exclusion principle to go into the higher energy levels of that partial band. We shall soon present a figure that illustrates, and further explains, this competition. Calculations show that for a few elements one partial band will indeed be filled and the other will not, so that a large spontaneous magnetization will exist in them. When the interaction between spins is calculated as a function of the ratio of one-half the internuclear separation to the radius of the 3d subshell in transition elements, it is found that parallel spin alignment is favored if this ratio exceeds 1.5. 
Typical values of the ratio are Mn, 1.47; Fe, 1.63; Co, 1.82; Ni, 1.98; so that iron, cobalt, and nickel are expected to be ferromagnetic and manganese not to be. In fact manganese crystals are not ferromagnetic. The theory is further confirmed by the fact that certain compounds (such as the Heusler alloys) which contain manganese atoms that are farther apart are ferromagnetic.

Figure 14-9 The variation of the energy difference between unmagnetized and magnetized configurations with the ratio $R/2r_{3d}$ of the internuclear separation to the diameter of the 3d subshell, for some transition elements.

In Figure 14-9 we plot the energy difference between magnetized and unmagnetized configurations versus the ratio of half the internuclear separation to the 3d radius. As the separation between atoms is increased from the value giving the maximum, the 3d wave functions overlap less and less and the indistinguishability requirements soon cease to apply; hence, the exchange interaction reduces the energy less and less. If in a crystal lattice the valence electron subshell radii are small compared to the internuclear spacing, as in the rare earth elements, we expect the material to be paramagnetic because the individual spin magnetic dipole moments are isolated from one another. As the separation between atoms is decreased from the value which yields the maximum, the energy bands widen and the excess energy associated with the asymmetrical population in the magnetized state increases more than the exchange interaction reduces the energy. Indeed, we approach the situation in diatomic molecules wherein "antiparallel" spins give the lowest energy since the electrons spend most of their time between nuclei.
In elements with valence electrons in outer unfilled subshells, the subshell radius is large enough, compared to internuclear separation, that we expect all these electrons to form pairs having "antiparallel" spins. Then there will be no spin magnetic dipole moment and the material will be diamagnetic. Figure 14-10 illustrates schematically the population of two partial bands of opposite spin, for internuclear separation smaller than, equal to, and larger than the range of values that leads to ferromagnetism. We see that the ferromagnetic situation is a delicate one in which the valence subshell radius is large enough to permit sufficient space overlap to allow the requirements of indistinguishability to apply, but at the same time small enough to prevent the width of the valence band from becoming too large. In those cases in which the magnetized state is favored, the energy difference between magnetized and unmagnetized states is of the order of a tenth of an electron volt per atom. This situation makes it clear, therefore, that the spontaneous magnetization is temperature dependent and that additional thermal energy made available by an increase in temperature can eliminate the conditions favoring the spin alignment responsible for ferromagnetism. At T = 0°K all the spin alignment permissible exists, but as the temperature is raised successively more of the "parallel" alignments are made random by thermal motion. Just below the Curie temperature, Tc, the alignment breaks up rapidly (see Figure 14-8), and it is entirely gone above Tc . For iron the Curie temperature is 1043°K, for cobalt it is 1400°K, and for nickel 631°K. The origin of domains remains to be explained. Ferromagnetic materials are not observed to be magnetized unless they have been put in an external magnetic field previously. 
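As a rough numerical check on the energy scale quoted above for the magnetized state, the thermal energy $kT_c$ at the Curie temperature can be compared with a tenth of an electron volt. This is only a sketch: the variable names are illustrative, and Boltzmann's constant is the rounded value from the table of constants.

```python
# Boltzmann's constant in eV per °K (rounded value from the table of constants).
K_EV = 8.617e-5

# Curie temperatures quoted in the text, in °K.
curie_temperatures = {"Fe": 1043.0, "Co": 1400.0, "Ni": 631.0}

# Thermal energy k*Tc for each element, in eV.
thermal_energy_ev = {
    element: K_EV * t_c for element, t_c in curie_temperatures.items()
}

for element, energy in thermal_energy_ev.items():
    print(f"{element}: k*Tc = {energy:.3f} eV")
```

In each case $kT_c$ is of the order of a tenth of an electron volt, the same order as the energy difference per atom between the magnetized and unmagnetized states, which is why heating to $T_c$ is enough to destroy the spin alignment.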
The explanation is that, although spontaneous magnetization exists, the magnetization in one small region, or domain, of a ferromagnetic material can be oriented in a direction different from that in another domain, so that the macroscopic resultant magnetization can be zero. Domains arise in the first place because the energy of a large crystal is not a minimum when it is uniformly magnetized. The particular size and shape of a domain is determined by a process that minimizes the total of three different types of energy involved. There is first the magnetic field energy. If, for example, the entire solid specimen formed a single domain there would be a large external field and a large magnetic energy associated with the field. The external magnetic field can be greatly reduced, thereby decreasing the energy in it, by dividing the specimen into domains whose magnetizations tend to cancel one another as in Figure 14-11. However, the domain boundaries, or walls, are sites of highly localized and nonuniform magnetic fields of considerable intensity, and a second type of energy is required to create them. The third energy is the difference in energy between a situation where the specimen is magnetized in one direction relative to the axis of the crystal and a situation in which it is magnetized in another direction.

Figure 14-10 Illustrating schematically the valence band structure for three different internuclear spacings (left: smaller spacing; center: ferromagnetic spacing; right: larger spacing) of a system of atoms which are, individually, paramagnetic. With decreasing spacing, the wavefunctions of electrons in valence subshells of adjacent atoms overlap, and exchange effects set in. They cause the valence level to split into a band and, from the point of view of the band being decomposed into two partial bands of oppositely aligned spins, they also cause the partial bands to be displaced with respect to each other. The possibility of ferromagnetism arises because, in a favorable case such as is illustrated, with decreasing spacing the displacement at first increases about as rapidly as the band width increases. This relation is not maintained into very small spacings because the band width increases more and more rapidly with decreasing spacing (see Figure 13-3). At all spacings, the levels of the two partial bands will be occupied in such a way that the Fermi energies are equal, since this minimizes the total energy of the system. For the situation described by the central figure, the number of valence electrons in the total band is sufficient to completely fill all levels of the lower partial spin band, but only the lower levels of the upper partial spin band. The system is then ferromagnetic since most of the valence electron spins are aligned in the same direction. In the figure on the right this does not happen because the energies associated with both exchange effects are small compared to kT. It does not happen in the figure on the left because the band width is large compared to the partial band displacements. Thus ferromagnetism requires not only that there be a range of valence subshell overlap where the two exchange effects have a particular relation, but also that the internuclear spacing to valence subshell diameter ratio be such as to make the overlap in the actual system be in that range.

In an unmagnetized piece of iron the individual domains, within which the magnetic dipole moments are aligned, are oriented at random. As we magnetize the iron by placing it in an external magnetic field, two effects take place. One is a growth in size of the domains that are favorably oriented with respect to the field at the expense of those that are not, as shown in Figure 14-12. Another is a rotation of the direction of magnetization within a domain toward the direction of the applied field.
The well-known hysteresis effect, in which the magnetization of ferromagnetic materials does not return to zero as we first apply an external field and then remove it, is due to the fact that the domain boundaries do not move completely back to their original positions when the external field is removed. The motion of these boundary walls is not reversible and is affected by crystal imperfections such as impurities and strains. The material is left magnetized even though there is no externally applied field, a condition called permanent magnetism.

Figure 14-11 Ferromagnetic domains. Top left: In a single crystal the magnetization vectors must lie along equivalent axes of the crystal. This crystal has no net magnetization, although each domain is magnetized. Top right: In a polycrystalline substance the crystal axes are randomly oriented, so that the magnetization vectors are randomly oriented. Bottom: Domain patterns for a single crystal of iron containing 3.8% silicon. The white lines show the boundaries between the domains. (Courtesy H. J. Williams, Bell Telephone Laboratories)

Figure 14-12 Top: The growth of domains in a single crystal in an externally applied magnetic field H, showing schematically the unmagnetized state (H = 0), preferential domain growth, sudden domain rotation, and saturation. Bottom: An external magnetic field, directed to the right, is imposed on a specimen. The magnetization in each domain is shown by white arrows. The domain boundary moves down across a region in which there is a crystal imperfection as the preferentially oriented domain grows. (Scale mark: 0.01 mm. Courtesy H. J. Williams, Bell Telephone Laboratories)

Figure 14-13 Showing how elementary magnetic dipole moments are oriented by the interatomic exchange interaction in (a) ferromagnetism, (b) antiferromagnetism, and (c) ferrimagnetism.

14-5 ANTIFERROMAGNETISM AND FERRIMAGNETISM

Two other types of magnetism, closely related to ferromagnetism, are antiferromagnetism and ferrimagnetism. In antiferromagnetic materials, of which MnO2 is an illustration, the exchange interaction forces adjacent atoms to have "antiparallel" spin orientations, as in Figure 14-13b. In MnO2, for example, the negative oxygen ion has on each side a positive manganese ion; the magnetic dipole moments of the positive ions are aligned essentially antiparallel because each is paired with one of the oppositely oriented electron spins of the oxygen ion in the lowest energy configuration of the system. Hence such materials show very little gross external magnetism. If they are heated sufficiently the materials become paramagnetic, the exchange interaction ceasing to act.

In ferrimagnetic substances two different kinds of magnetic ions are present; in nickel ferrite the two ions are Ni++ and Fe+++. The exchange interaction locks the ions into a pattern like that of Figure 14-13c. The same antiferromagnetic exchange interaction exists, which aligns the magnetic dipole moments "antiparallel," but since ions with two different magnitudes of magnetic dipole moment are present, the net magnetization is not zero. The external magnetic effects are intermediate between ferromagnetism and antiferromagnetism, and here too the exchange interaction disappears if the material is heated above a certain characteristic temperature.
The ferrites are crystals having small electrical conductivity compared to ferromagnetic materials, and they are useful in high-frequency situations because of the absence of significant eddy current losses.

QUESTIONS

1. Why do superconducting currents flow on the surface of a superconductor? 2. Why is the electric field zero inside a superconductor? 3. Does perfect conductivity require that the interior magnetic field of a body be zero? What does it require of the interior magnetic field? 4. How would you measure the critical field of a superconductor as a function of temperature? 5. The critical external magnetic field at absolute zero varies with the material as $M^{-1/2}$. Explain. 6. Can you say whether lead or aluminum has the higher superconducting critical temperature from the fact that at room temperature the electrical conductivity of aluminum is much larger than that of lead? 7. A superconducting film can be used as a high sensitivity bolometer (an instrument for measurement of heat radiation). Explain. 8. To what extent can the two electrons in a Cooper pair be thought of as moving as if they were bound to opposite ends of a spring? What property of the system constitutes the spring? 9. Exactly what is the distinction between the energy states of an electron in a superconductor and the energy states of the superconductor itself? 10. Are there analogies between superconductivity and superfluidity? 11. Superconductors whose Cooper pairs are small enough to allow the existence of magnetic field carrying channels also have relatively high critical temperatures. What is the reason for this very convenient behavior of Type II superconductors? 12. Discuss the use of a paramagnet as a thermometer. In what temperature range would it be useful? 13.
The magnetization induced in a diamagnetic sphere by an external magnetic field does not vary with the temperature, in sharp contrast to the situation in paramagnetism. Make this plausible. 14. Does the orbital motion of an electron contribute to paramagnetic behavior of the atom or only the intrinsic spin of an electron? 15. The paramagnetic susceptibility of the rare earth elements is generally greater than that of the transition elements. Take into account the electronic shell structure and explain why. 16. Is the neglect of the nuclear spin magnetic dipole moment justifiable in our discussion of paramagnetism? Explain. 17. From the fact that most organic molecules have magnetic dipole moments of less than a few Bohr magnetons, show that life processes cannot be affected by laboratory magnetic fields. 18. Why do the ferromagnetic elements come from the middle of the group of transition elements or from the middle of the rare earth elements rather than the ends of the respective groups? 19. Copper has a filled inner 3d electronic subshell and one 4s valence electron. Explain why you would not expect it to be ferromagnetic. 20. Why is susceptibility not defined for temperatures below the Curie temperature in ferromagnetic materials? 21. Are the electronic configurations of gadolinium and dysprosium consistent with the fact that they are ferromagnetic elements? Explain. 22. Why can the exchange interaction have a significant effect on a narrow band with a high density of states (as the 3d band in the transition elements) although the interaction energy is small? 23. A nail is placed at rest on a smooth table top near a strong magnet. It is released and attracted to the magnet. What is the source of the kinetic energy the nail has just before it strikes the magnet? 24. Why, for permanent magnets, do we use materials composed of small crystals and having large imperfections? 
Also why, for transformer magnets, do we use materials composed of large crystals having few imperfections?

PROBLEMS

1. Estimate the size of a Cooper pair in mercury by equating the binding energy at 0°K to the electrostatic repulsion energy between the two electrons. 2. (a) Show, from Maxwell's equations, that resistivity $\rho = 0$ (a perfect conductor) implies that B = const inside the material. (b) Show, from Maxwell's equations, that B = 0 inside a material (a superconductor) implies that the resistivity of the material is $\rho = 0$. 3. Show from Lenz's law that the Meissner effect implies perfect conductivity, but that perfect conductivity does not imply the Meissner effect. 4. The critical field of tin at 2°K is 0.02 weber/m². Draw a graph of the magnetization at 2°K of a long thin sample of tin as a function of applied field. 5. Part of the $\mathcal{E}$ versus k diagram for electrons in a superconductor is shown in Figure 14-14. (a) Draw a curve of the density of electrons as a function of $\mathcal{E}$ for a superconductor at T = 0°K. (b) Draw a graph of the energy necessary to place holes in the superconducting state and electrons in the normal state. This is a graph of $(\mathcal{E} - \mathcal{E}_F)$ versus k; $\mathcal{E}_F$ is at the center of the gap for a superconductor. The notion that only electrons are in the normal state and only holes in the superconducting state is not accurate.

Figure 14-14 The energy as a function of positive wave number for a superconductor; for Problem 5.

6. When two metals are separated by a very thin insulator, electrons from one metal can tunnel through the insulator to the other metal. Electrons flow until the Fermi levels of the two metals are equal. When a battery is connected between the two metals, as shown in Figure 14-15, the Fermi levels are displaced and a current flows if there are filled electron levels in one metal opposite empty levels in the other metal. Draw current voltage characteristics for the following junctions. (a) Normal metal-normal metal.
(b) Normal metal-superconductor. (c) Superconductor-superconductor. (Hint: The Fermi energy of a superconductor lies at the center of the energy gap.)

Figure 14-15 Metals separated by a thin insulator; for Problem 6.

7. Use Faraday's law of induction to show that a hole in a superconductor will trap magnetic flux, i.e., dB/dt = 0 in the hole. Remember that the electric field E = 0 in any circuit through the superconductor which encloses the hole, and also that the Meissner effect does not apply to the hole. 8. Estimate the magnitude of the isotope effect for superconducting materials. Take the critical temperature for naturally occurring vanadium (99.76% $V^{51}$, with mass 50.9440u; 0.24% $V^{50}$, with mass 49.9472u) to be 5.300°K precisely. What is the critical temperature for pure $V^{50}$? 9. Derive (14-4) for the magnetization, using (14-2) and (14-3). 10. Show from (14-2) and (14-3) that $\chi = -1$ for a superconductor. Is this result consistent with (14-4)? 11. (a) Calculate the magnetization of 1 mole of oxygen at standard temperature and pressure in the earth's magnetic field. The susceptibility of oxygen is $2.1 \times 10^{-6}$ and the earth's field is $5 \times 10^{-5}$ tesla. (b) What is the saturation magnetization of 1 mole of oxygen? Its magnetic dipole moment is 2.8 Bohr magnetons. 12. (a) Find the value of $\mu B/kT$ for a paramagnetic material with a magnetization one-half the saturated value. (b) Use this result to find the magnetic dipole moment per molecule of potassium chromium sulphate. 13. Calculate the temperature of the sample of Example 14-3 when the magnetic field is reduced isentropically from 1 tesla at 1°K to 0.01 tesla, assuming Curie's law. (An isentropic process is one in which the populations of the states do not change. Hence the magnetization must remain constant.) This process is called adiabatic demagnetization and is useful in low-temperature physics. 14.
What is the magnetization of the two-level system, discussed in connection with (14-5), when $\mu B \gg kT$? 15. From Figure 14-7 it can be argued that the magnetization due to conduction electrons should be proportional to the number of electrons within $\mu B$ of the Fermi energy. (a) Show that this leads to the susceptibility being given approximately by

$$\chi = \frac{3\mathcal{N}\mu_0\mu_b^2}{2kT_F}$$

where $\mathcal{N}$ is the number of conduction electrons, $\mu_0$ is the permeability constant, $\mu_b$ is the Bohr magneton, and $T_F$ is the Fermi temperature. (b) Evaluate $\chi$ for copper. 16. (a) Show that the specific heat at constant field $C_H$ for the two-level system, discussed in connection with (14-5), is given by

$$C_H = \mathcal{N}k\left(\frac{2\mu B}{kT}\right)^2 \frac{e^{2\mu B/kT}}{\left(e^{2\mu B/kT} + 1\right)^2}$$

where $\mathcal{N}$ is the number of atoms in the system. This is the Schottky specific heat. (Hint: Take the energy of the dipoles aligned parallel to the field to be zero.) (b) What is the temperature dependence of $C_H$ at high and low temperatures? (c) Sketch $C_H$ as a function of T. Estimate (do not calculate) where $C_H$ will be a maximum. 17. A ferromagnet can be considered to be similar to a paramagnet except that there is an internal molecular field $H_w$ tending to spontaneously align the elementary dipoles. (a) The material will become spontaneously magnetized when the energy of interaction between the dipole and the molecular field is equal to $kT_c$. Calculate the value of $H_w$ for iron where the magnetic moment is 2.2 Bohr magnetons and $T_c$ is 1000°K. (b) What is the magnetization of a 1 cm³ sample of iron which has a single domain? (Density = 7.9 g/cm³; atomic weight = 56). (c) What is the energy in the field? 18. The molecular field of Problem 17 can be taken as proportional to the magnetization of the sample so that $H_w = \lambda M$. (a) Show that this leads to a susceptibility given by

$$\chi = \frac{C}{T - T_c}$$

where $T_c = C\lambda$. (b) Calculate the value of $\lambda$ for iron. 19.
A simple model for an antiferromagnet is a lattice of two kinds of paramagnetic ions such that the nearest neighbors of A atoms are B atoms. If the antiferromagnetic interactions are between nearest neighbors only, the magnetization of the sample above the Curie point can be written as

$$TM_A = C'(H - \lambda M_B) \quad \text{and} \quad TM_B = C'(H - \lambda M_A)$$

Here C' is the Curie constant for one sublattice only. The effective field in sublattice A is $H - \lambda M_B$, and positive $\lambda$ corresponds to antiferromagnetic interactions between A and B atoms. Show that this leads to a susceptibility above $T_c$ given by

$$\chi = \frac{C}{T + T_c}$$

where $C = 2C'$ and $T_c = C'\lambda$. 20. Sketch curves of $\chi^{-1}$ versus T for $T > T_c$ for (a) a paramagnet, (b) a ferromagnet, and (c) an antiferromagnet, and discuss the meaning of the intercept on the T axis.

15 NUCLEAR MODELS

15-1 INTRODUCTION
role of models; comparison of nuclear and atomic energy scales

15-2 A SURVEY OF SOME NUCLEAR PROPERTIES
previously considered and newly introduced information concerning nuclear masses, charges, radii, magnetic dipole moments, spin, symmetry, and electric quadrupole moments; nuclear forces and their strong, attractive, short range, charge independent character; neutrons as nuclear constituents

15-3 NUCLEAR SIZES AND DENSITIES
electron scattering measurements of nuclear charge distributions; charge density; half-value radii; surface thickness; similar value of interior mass density for all nuclei

15-4 NUCLEAR MASSES AND ABUNDANCES
mass spectrometry; mass unit; isotopes; energy balance in reactions; Q value relations; results of mass determinations; mass deficiency; binding energy per nucleon and its roughly constant value for nearly all nuclei; saturation; fission; fusion; relation between stable N and Z values; tendency for even N and even Z

15-5 THE LIQUID DROP MODEL
relation of universal values of interior mass density and binding energy per nucleon to properties of liquid drop; classical arguments for volume,
surface, and Coulomb terms of mass formula; introduction of asymmetry and pairing terms; parameters; use of formula to predict neutron binding energies; . Brueckner theory determination of volume term parameter 15-6 MAGIC NUMBERS 530 experimental evidence; analogy to atomic physics; apparent difficulties in considering independent particle motion 15 7 - THE FERMI GAS MODEL 531 net nuclear potentials; exclusion principle production of independent particle motion; estimate of Fermi energy; origin of asymmetry term in mass formula 15-8 THE SHELL MODEL 534 relation to Hartree theory; eigenfunctions; radial node quantum number n; ordering of energy levels according to n and l; centrifugal potentials; exclusion principle construction of nuclei; failure to explain higher magic numbers; introduction of strong, inverted, spin-orbit interaction 15 9 - PREDICTIONS OF THE SHELL MODEL spins at or near magic numbers; JJ coupling; attractive residual interaction; antiparallel pairing tendency; origin of pairing term in mass formula; spins 508 540 and parities for nuclei of odd A, or of even A with N and Z even; nuclei of even A with N and Z odd; difficulties with magnetic dipole moments o 0 C/) 545 deformable net nuclear potentials describing collective motions; satisfactory prediction of magnetic dipole moments; shell model difficulties with electric quadrupole moments and satisfactory collective model predictions 15-11 SUMMARY CD C) ^ ^ 1 549 tabulated features of nuclear models QUESTIONS 550 PROBLEMS 551 15-1 INTRODUCTION In the past chapters our considerations have taken us from atoms to the larger systems, molecules and solids, of which atoms are constituents. Now we reverse our direction and consider the smaller systems, nuclei, which are constituents of atoms. There is a pronounced difference between the theoretical study of atoms, or systems of atoms, and the theoretical study of nuclei. 
Long before the theory explaining the properties of atoms was being developed, the basic nature of the electromagnetic forces acting on individual electrons in atoms was known in complete detail. But during most of the period when the understanding of the properties of nuclei was being developed, very little was known about the details of the nuclear forces acting on the protons and neutrons in nuclei. Although a fairly complete knowledge of nuclear forces has recently become available, they turn out to be complicated enough that it has not yet been possible to use this knowledge to construct a comprehensive theory of nuclei. That is, we cannot explain all of the properties of nuclei in terms of the properties of the nuclear forces acting between their protons and neutrons. However, there are a number of models, or rudimentary theories of restricted validity. Each of these can explain a certain limited range of nuclear properties, using arguments that do not involve all the details of the nuclear forces. Even though progress is being made on the development of a comprehensive theory, an introductory study of nuclei is still largely the study of the various nuclear models. In this chapter we treat the most important models and use them to describe and explain the properties of nuclei in their ground states. In Chapter 16 we use these models to study nuclei in their excited states, and to study naturally occurring transitions between nuclear states (nuclear decay, including radioactivity) and artificially produced transitions (nuclear reactions, including fission and fusion). The detailed properties of nuclear forces are treated in Chapter 17, where we consider the elementary particles which are constituents of nuclei. A pronounced difference between the experimental study of atoms and the experimental study of nuclei arises from the difference between their characteristic energies. The energy characteristic of nuclei is of the order of 1 MeV. 
For instance, we saw in Chapter 6 that the attractive nuclear potential exerted on a neutron when it is in a nucleus is a few MeV deep, and that the height of the repulsive Coulomb barrier separating two positively charged nuclei is also a few MeV. We shall soon see that the same order of magnitude characterizes the binding energy of a proton or neutron to a typical nucleus, and the separation in energy between its ground state and first excited state. The energy characteristic of atoms is of the order of 1 eV. Because this is so low (not much higher than room temperature thermal energy $kT \simeq 0.025$ eV) atoms are easily excited, and they have little difficulty in combining to form molecules and solids. For nuclei, very special circumstances are required to produce excitation because of their very high characteristic energy.

Figure 15-1 The relative abundance of the elements. Note strong fluctuations superposed on a general decreasing trend with increasing A, the mass number.

Weisskopf has described the situation well:

"In our immediate environment atomic nuclei exist only in their ground state; they affect the world in which we live only by their charge and mass and not by their intricate dynamic properties. In fact, all the interesting nuclear phenomena ... come into play only under conditions which we have created ourselves in accelerating machines. It is to some extent a man-made world. It is not completely man made, however. The centers of all stars are regions of the universe where nuclear reactions go on, and thus where nuclear dynamics plays an essential role in the course of nature. Hence the nuclear phenomena are the basis of our energy supply on earth, in reactors as well as in the sun.
But nuclear physics is even more important for the world in which we live from the point of view of the history of the universe. The composition of matter as we see it today is the product of nuclear reactions which have taken place a long time ago in the stars or in star explosions, where conditions prevailed which we simulate in a very microscopic way within our accelerating machines. Hence the material basis of the world in which we live is a product of the laws of nuclear physics. I cannot better illustrate the interconnection of all facts of nature, the tightly woven net of the laws of physics, than by pointing to the chart of abundances of elements in our part of the universe (see Figure 15-1). Each maximum and minimum in the curve of abundances corresponds to some trait of nuclear dynamics, here a closed shell, there a strong neutron cross section, or a low binding energy. If the 7.65 MeV resonance in carbon did not exist, then, according to Hoyle and Salpeter, practically no carbon would have been formed and we would probably not have evolved to contemplate these problems. Whenever we probe nature, be it by studying the structure of nuclei, or by learning about macromolecules, or about elementary particles, or about the structure of solids, we always get some essential part of this great universe." (From "Problems of Nuclear Structure," by Victor Weisskopf, Physics Today 14: 7, 1961.)

15-2 A SURVEY OF SOME NUCLEAR PROPERTIES

We begin our study of nuclei by quickly reviewing what we have already learned about them in the process of studying atoms and molecules, and by adding some new information that is also obtained largely from atoms and molecules. The items of new information are considered here only briefly; each will be discussed in more detail later:

1. We have learned (Chapter 4) that the mass of a nucleus is only slightly less than the mass of an atom containing that nucleus. Thus the nuclear mass is approximately equal to the integer A times the mass of a hydrogen atom, or approximately equal to A times the mass of a proton, the nucleus of a hydrogen atom. The integer A, called the mass number, is the one closest to the atomic weight of the atom containing the nucleus in question. We have also learned (Chapters 4 and 9) that the charge of a nucleus is exactly equal to the atomic number Z of the corresponding atom, times the negative of the charge of an electron, or exactly Z times the charge of a proton. The atomic number gives the location of an atom in the periodic table of the elements. That table (Chapter 9) shows that A is roughly equal to 2Z, except for the proton for which Z = A = 1.

2. Analysis of α-particle scattering from nuclei of low A (Chapter 4) indicated that the radii of such nuclei are somewhat less than 10 F, where the radius is defined as the distance from the center of the nucleus at which the potential acting on the α particle first deviates from a Coulomb potential. Analysis of the rate of emission of α particles by radioactive nuclei of high A (Chapter 6) indicated that the radii of these nuclei, defined in the same way, are approximately 9 F. The symbol F represents the unit of length, called the fermi, used in nuclear physics. Its value is

$$1\ \mathrm{F} = 1 \times 10^{-15}\ \mathrm{m} \tag{15-1}$$

Note that this length, characteristic of nuclei, is five orders of magnitude smaller than the length 1 Å characteristic of atoms since 1 Å = $1 \times 10^{-10}$ m.

3. Both the α-particle scattering and the α-particle emission analyses showed that there is a nuclear force, which is attractive, acting between the particle and the nucleus, in addition to the repulsive Coulomb force acting between the two. They indicated that the nuclear force is of very short range, i.e., that it extends only for a distance appreciably less than 10 F. The analyses also indicated that the nuclear force is strong, compared to the Coulomb force, since it dominates the latter, which is repulsive, to produce an overall attraction on the α particle when it is very close to the nucleus. Modern experiments involving the scattering of protons from protons show that the range of the nuclear force is 2 F, and that the magnitude of the negative energy associated with the attractive force is larger than their Coulomb energy, when the two protons are separated by that distance, by roughly a factor of $10^2$. Furthermore, experiments involving the scattering of protons from neutrons indicate that the nuclear force is charge independent. That is, the nuclear force between protons and neutrons is the same as between protons and protons, or between neutrons and neutrons (except for exclusion principle effects that apply in the latter two cases only). Although the scattering experiments which provide direct experimental proof of the charge independence of nuclear forces are fairly recent, an educated guess was made at an early stage that the nuclear force would have this simplifying property. We shall consider the scattering experiments in Chapter 17, and certain other evidence for charge independence later in this chapter and in Chapter 16. Until then we too shall make the assumption that the nuclear force is charge independent. Finally, it should be mentioned that the nuclear force is extremely strong compared to the gravitational force. The magnitude of the energy associated with the nuclear force acting between two protons separated by less than 2 F is larger than their gravitational energy by a factor of about $10^{40}$.

4. It has been mentioned (Chapters 8 and 10) that nuclei have magnetic dipole moments. They arise from the intrinsic magnetic dipole moments of the protons and neutrons in the nuclei, and from the currents circulating in the nuclei due to the motion of the protons. Nuclear magnetic dipole moments are studied by using optical spectroscopic equipment of extremely high resolution to measure the hyperfine splitting of atomic energy levels, which results from the interaction of the dipole moments with the magnetic field produced by the atomic electrons. The value of the interaction energy $\Delta E$ depends on the orientation of the nuclear magnetic dipole moment in the internal magnetic field, and is given by the equation

$$\Delta E = C[f(f + 1) - i(i + 1) - j(j + 1)] \tag{15-2}$$

where j, i, and f are quantum numbers specifying the magnitudes of the atom's total electronic angular momentum, total nuclear angular momentum, and grand total angular momentum, respectively. This equation is completely analogous to (10-15), which describes the atomic spin-orbit interaction energy. The constant C is proportional to the magnitude of the nuclear magnetic dipole moment $\mu$. Measurements of $\Delta E$, and therefore of C, show that for all nuclei it is of the order of the nuclear magneton $\mu_n$. This quantity is

$$\mu_n = \frac{e\hbar}{2M} = 0.505 \times 10^{-26}\ \text{amp-m}^2 \simeq 10^{-3}\mu_b \tag{15-3}$$

where M is the proton mass and $\mu_b$ is the Bohr magneton. Measurements of hyperfine splitting also show that the sign of the nuclear magnetic dipole moment (giving the relative orientation of the magnetic dipole moment vector and the angular momentum vector of the nucleus) is positive (parallel) in some cases and negative (antiparallel) in others. Nuclei with both A and Z even have $\mu = 0$.

5. The total nuclear angular momentum quantum number i, usually called the nuclear spin, can be obtained simply by counting the number of energy levels of a hyperfine splitting multiplet. If the multiplet is associated with a value of j larger than i, then f can assume 2i + 1 different values so there will be 2i + 1 different energy levels.
It is found that i is an integer for nuclei of even A, with i = 0 if Z is also even, and that i is a half-integer for nuclei of odd A. The magnitude I of the total nuclear angular momentum is given in terms of i by the usual relation $I = \sqrt{i(i + 1)}\,\hbar$. The total angular momentum of a nucleus arises from the intrinsic spin angular momenta of its protons and neutrons and also from the orbital angular momenta due to the motion of these particles within the nucleus. It should be emphasized that in nuclear physics the word spin frequently refers to the total angular momentum of a nucleus, in contrast to atomic physics where the word refers to the intrinsic spin angular momentum only. When there is possibility of confusion, we shall henceforth use the terminology intrinsic spin angular momentum, and we shall continue to use the symbol s, when referring to that part of the angular momentum of a single particle that has nothing to do with orbital angular momentum (e.g., the intrinsic spin angular momenta of both protons and electrons are given by s = 1/2).

6. Closely related to the spin of a nucleus is the symmetry character of the eigenfunction for a system containing two or more nuclei of the same species (Chapter 9). This is studied by analyzing the spectra of diatomic molecules containing two identical nuclei (Chapter 12). It is found that nuclei with integral spin quantum number i (nuclei of even A) are of the symmetric type, i.e., they are bosons, while nuclei with half-integral i (nuclei of odd A) are of the antisymmetric type, i.e., they are fermions. Such molecular spectra also provide independent measurements of i, which confirm values obtained from hyperfine splitting.

7. As we have already indicated, nuclei are composed of protons and neutrons. The neutron is an uncharged particle of nearly the same mass as the proton, and precisely the same intrinsic spin angular momentum and symmetry character (s = 1/2, antisymmetric).
A nucleus with mass number A and atomic number Z contains A nucleons, a word used for both protons and neutrons, of which Z are protons and $A - Z$ are neutrons. This rule obviously leads to a mass and charge in agreement with item 1.

Before the discovery of the neutron, it was thought that a nucleus of mass number A and atomic number Z contains A protons and $A - Z$ electrons. This rule also leads to a mass and charge in agreement with item 1, but we have seen that the zero-point energy is unrealistically high if a particle as light as an electron is confined in a region as small as a nucleus (Chapter 6). Furthermore, the spin and symmetry character of nuclei composed of protons and neutrons are, in all cases, in agreement with the measurements described in items 5 and 6. For nuclei in which A is even and Z is odd, the spin and symmetry character disagree with the measurements if nuclei are composed of protons and electrons.

Example 15-1. The mass number and atomic number of the nucleus of the most prevalent variety of nitrogen are: A = 14, Z = 7. Its measured nuclear spin and symmetry character are: i = 1, symmetric. (See Examples 12-4 and 12-5.) Show that the spin and symmetry character disagree with the assumption that nuclei contain A protons and $A - Z$ electrons. Also show that the spin and symmetry character are in agreement with the assumption that nuclei contain A nucleons, of which Z are protons and $A - Z$ are neutrons.

If the nucleus contains 14 protons and 7 electrons, it contains an odd number, 14 + 7 = 21, of the particles that all have half-integral intrinsic spin angular momentum quantum numbers. (They all have s = 1/2.) The rules for combining angular momentum quantum numbers presented in Section 8-5 make it apparent that, whether or not these particles have orbital angular momenta, each of their total angular momentum quantum numbers will be half-integral since orbital angular momentum quantum numbers are always integral. Furthermore, it is apparent that a nucleus containing an odd number of particles, each with half-integral total angular momentum quantum number, can only have a half-integral total angular momentum quantum number. In other words, its nuclear spin will be half-integral, in disagreement with the measurements. It is also apparent from the discussion of Section 9-3 that the symmetry character of a nucleus containing an odd number of antisymmetric particles must be antisymmetric. The reason is that an exchange of labels of two such nuclei amounts to an odd number of exchanges of labels of antisymmetric particles. This multiplies the total eigenfunction of the system by an odd power of minus one, which equals minus one, so that the total eigenfunction is antisymmetric. Again we see that the nitrogen nucleus cannot contain 14 protons and 7 electrons, giving it an odd total number of particles, since the measurements show that it is a nucleus of the symmetric type.

If the nucleus contains 7 protons and 7 neutrons, the total number of particles is 7 + 7 = 14, an even number. Since neutrons have the same intrinsic spin angular momentum and symmetry character as protons (or electrons), we see that the nucleus will be symmetric because in an exchange of labels of two nuclei the total eigenfunction will be multiplied by an even power of minus one, and an even power of minus one equals plus one. Its nuclear spin will be integral since an even number of particles of half-integral intrinsic spin angular momentum quantum numbers must have an integral total angular momentum quantum number. Both of these predictions are in agreement with the measurements.

Some years before its discovery, Rutherford suggested the existence of a particle having the properties of what we now call the neutron. A number of people tried to devise experiments to detect it. But this was difficult because, being uncharged, the neutron does not easily ionize atoms when it passes through matter, and most devices for detecting particles depend on ionization. In 1932 Chadwick succeeded in detecting neutrons emitted from beryllium nuclei when they are bombarded with α particles obtained from a radioactive source. He used a Geiger counter behind a layer of paraffin. The neutrons collide with protons in the paraffin, and they transfer an appreciable fraction of their kinetic energy to the protons. The protons then penetrate the Geiger counter, where they are counted with high efficiency since they are charged and therefore produce much ionization. The experimental arrangement is indicated in Figure 15-2.

Figure 15-2 A schematic depiction of the experimental arrangement used by Chadwick in the discovery of the neutron. (Labeled components: α source, beryllium foil, paraffin wax film, Geiger counter; n marks the neutrons.)

8. Many nuclei are not precisely spherical in shape, but instead they are in the shape of an ellipsoid. The earliest evidence for this came from accurate measurements of the hyperfine splitting of the energy levels of atoms of these nuclei. If the hyperfine splitting were due entirely to the energy of orientation of the nuclear magnetic dipole moment in the internal magnetic field of the atom as assumed in (15-2), the analogy with (10-15) for the spin-orbit interaction would require that the pattern formed by the split atomic energy levels obey an interval rule like Landé's (10-16). But deviations from such an interval rule are seen in the hyperfine splitting of many atoms. The deviations indicate that in these atoms the hyperfine splitting is partly due to an electric interaction between an ellipsoidal distribution of the nuclear charge and the atomic electric field.
That is, in these atoms the energy depends on the orientation of the ellipsoidal nuclear charge distribution in the internal electric field of the atom, as well as on the orientation of the nuclear magnetic dipole moment in the internal magnetic field of the atom. The observed departure of the nuclear charge distribution from spherical symmetry is specified by the nuclear electric quadrupole moment q. As is illustrated in Figure 15-3, for q > 0 the ellipsoidal charge distribution is elongated in the direction of its symmetry axis, with the elongation increasing as q becomes more positive. For q < 0 the ellipsoidal charge distribution is flattened in the direction of its symmetry axis, with the flattening increasing as q becomes more negative. A more precise definition of q will be given in Section 15-10.

For nuclei with spin i ≥ 1, the hyperfine splitting measurements show that there are cases with electric quadrupole moment q > 0, as well as cases with q < 0. But for nuclei with i = 0 or i = 1/2, these measurements always yield q = 0; that is, no departures from spherical shape are observed for such nuclei in these measurements. It is easy to see why nuclei appear to have a spherical shape if they have zero nuclear spin. If they have no nuclear spin they do not have any particular orientation in space, as there is no total angular momentum vector that must maintain a fixed component in some direction. The nuclei must then have all possible orientations in space. So even though they are actually nonspherical, we cannot see this in the hyperfine splitting measurements because, averaged over a sample containing many nuclei, the nuclei would appear to be spherical. But we can see their true shape in measurements involving nuclear reactions. As will be discussed in the following chapter, they show that certain nuclei with nuclear spin i = 0 do have quadrupole moments.
Nuclei must also be observed to be spherical in hyperfine splitting measurements if they have nuclear spin i = 1/2. The reason is that for i = 1/2 there are only two possible orientations of the nuclear shapes relative to the direction defined by any electric field which is applied to the nuclei. Since both give the same energy of interaction between this field and the electric quadrupole moments, on the average the energy splitting is zero, and so no evidence of quadrupole moments can be observed in these measurements.

The largest values of q are found for nuclei in the region of the rare earth elements. In the most extreme case the largest dimension of the ellipsoidal charge distribution is along the direction of the symmetry axis, and it exceeds the smallest dimension by about 30%. But for typical nuclei with i > 1, the difference in the largest and smallest dimensions of the ellipsoid is only a few percent. So for most purposes it is a good approximation to assume that typical nuclei are spherical, particularly since more than half of all the nuclei have i = 0, and so they appear in most circumstances to be precisely spherical.

Figure 15-3 Left: A prolate (football-shaped) charge distribution gives rise to a positive quadrupole moment, q > 0. Right: An oblate (fat pumpkin-shaped) charge distribution gives rise to a negative quadrupole moment, q < 0. Both ellipsoids are symmetrical about the axis through their center.

15-3 NUCLEAR SIZES AND DENSITIES

We begin our detailed discussion of nuclei by considering the results of measurements of their sizes. The most straightforward and accurate measurements involve scattering of electrons, of several hundred MeV kinetic energy, from thin targets containing atoms whose nuclei are to be investigated. As nuclear forces do not act on an electron, its scattering is due to its Coulomb interaction with the nuclear charge distribution. An electron scattered through an appreciable angle has had a single close encounter with a nucleus, just as in α-particle scattering from nuclei (see Section 4-2). Therefore, measurements of electron scattering should be able to provide information about the nuclear charge distribution, such as its size. The charge distribution is, of course, only the distribution of protons in the nucleus, but there is much additional evidence indicating that the neutrons have approximately the same distribution as the protons.

The method can be thought of as the use of an "electron microscope" to "look at" the charge distribution. What is actually seen is not the charge distribution itself, but a diffraction pattern which it produces in scattering the electron wave function. Qualitatively, we know that the separation in angle between adjacent minima of the diffraction pattern, $\theta$, will obey the usual diffraction relation (see Chapter 3 and, in particular, Appendix L)

$$\theta \simeq \frac{\lambda}{r'} \tag{15-4}$$

where $\lambda$ is the electron de Broglie wavelength, and $r'$ is the radius of the charge distribution. Thus a measurement of $\theta$ gives immediately an estimate of $r'$, since $\lambda$ can be calculated from the known kinetic energy.

Example 15-2. Electrons of kinetic energy K = 500 MeV are scattered from a target of nuclei, of charge distribution radius $r'$, into a diffraction pattern that has minima with an average separation of $\theta \simeq 30°$. Estimate $r'$.

First we must evaluate the de Broglie wavelength $\lambda$ from the electron momentum p. Since the total energy E of the electrons is very high compared to their rest mass energy $m_0c^2 = 0.51$ MeV, we may use expressions that are valid in the extreme relativistic limit

$$p = \frac{E}{c} \simeq \frac{K}{c} = \frac{500\ \text{MeV}}{3 \times 10^{8}\ \text{m/sec}} \times \frac{1\ \text{joule}}{6.2 \times 10^{12}\ \text{MeV}} = 2.7 \times 10^{-19}\ \text{kg-m/sec}$$

Then the de Broglie relation gives

$$\lambda = \frac{h}{p} = \frac{6.6 \times 10^{-34}\ \text{joule-sec}}{2.7 \times 10^{-19}\ \text{kg-m/sec}} = 2.4 \times 10^{-15}\ \text{m}$$

Converting $\theta$ to radians, and invoking (15-4), we find

$$r' \simeq \frac{\lambda}{\theta} = \frac{2.4 \times 10^{-15}\ \text{m}}{0.53\ \text{rad}} = 4.5 \times 10^{-15}\ \text{m} = 4.5\ \text{F}$$

for an estimate of the charge distribution radius.

An accurate determination of the nuclear charge distribution can be obtained if the shape of the electron diffraction pattern is analyzed quantitatively. This involves adding up the portions of the electron wave function scattered from each region of the nucleus, in proportion to an assumed charge density in that region, and taking into account the phase differences that produce the constructive or destructive interference at different scattering angles which constitutes the diffraction pattern. The assumed charge distribution is varied until the best fit to the measured diffraction pattern is obtained. It is found that the fit is very sensitive to the details of the charge distribution, so that it can be well determined even if the diffraction pattern contains only one minimum. The analysis is related to the one-dimensional Schroedinger scattering calculations of Chapter 6. But it is much more complicated because it is three dimensional and because it is relativistic, so the Dirac version of quantum mechanics must be used. Thus we can only quote results.

Figure 15-4 An apparatus used to study the scattering of high-energy electrons from a target of nuclei. Only the end of the electron linear accelerator is shown. It is actually a very long evacuated tube in which radio frequency fields accelerate the electrons to the required energy. (Labeled components: deflecting magnet, target, scattering chamber, beam stopper, concrete shielding.)

Figure 15-4 indicates the experimental apparatus used by Hofstadter, and collaborators, to measure the scattering of high-energy electrons from various nuclei. The electrons are produced in a linear accelerator, part of which is shown. It operates something like a very large-scale version of the electron guns used in electron microscopes, or television tubes.
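The order-of-magnitude estimate of Example 15-2 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the diffraction relation in the form $r' \simeq \lambda/\theta$ used in that example, takes the standard values $hc \approx 1239.84$ MeV-F and 0.511 MeV for the electron rest energy, and uses the exact relativistic momentum rather than the extreme relativistic limit. The function names are hypothetical and chosen only for illustration.

```python
import math

HC_MEV_FM = 1239.84        # h*c in MeV*fermi (standard value, assumed here)
ELECTRON_REST_MEV = 0.511  # electron rest mass energy in MeV


def de_broglie_wavelength_fm(kinetic_mev, rest_mev=ELECTRON_REST_MEV):
    """de Broglie wavelength in fermis, from the exact relativistic momentum."""
    total = kinetic_mev + rest_mev
    pc = math.sqrt(total**2 - rest_mev**2)  # (pc)^2 = E^2 - (m0 c^2)^2
    return HC_MEV_FM / pc


def radius_estimate_fm(kinetic_mev, theta_deg):
    """Rough charge-distribution radius r' ~ lambda / theta, theta in radians."""
    theta = math.radians(theta_deg)
    return de_broglie_wavelength_fm(kinetic_mev) / theta


if __name__ == "__main__":
    # Values taken from Example 15-2: K = 500 MeV, minima separated by 30 degrees.
    lam = de_broglie_wavelength_fm(500.0)
    r = radius_estimate_fm(500.0, 30.0)
    print(f"lambda = {lam:.2f} F, r' = {r:.1f} F")
```

With the inputs of Example 15-2 this gives a wavelength of about 2.5 F and a radius of about 4.7 F, which agrees with the 4.5 F quoted there to within the rounding used in the example (2.4 F and 0.53 rad). At 500 MeV the exact momentum differs from the extreme relativistic value $K/c$ by only about 0.1 percent, so the approximation made in the example is well justified.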
The electrons are scattered from a thin target foil, whose atoms contain the nuclei of interest, located at the center of the evacuated scattering chamber. Scattered electrons are detected by the spectrometer, which determines their kinetic energy by bending them in its magnetic field. Only the elastically scattered electrons are counted, that is, those whose kinetic energy is the same as the electrons of the incident beam, less the small amount of kinetic energy of the recoiling nuclei. This requirement ensures that the nuclei remain undisturbed, so that their ground state charge distribution will be obtained.

Figure 15-5 shows results obtained in the scattering of 420 MeV electrons from the small mass number nucleus $_6$C. The ordinate is the differential scattering cross section $d\sigma/d\Omega$, a quantity defined in (4-8) which is proportional to the number of electrons scattered at each angle. The points with accuracy estimates are the data, and the solid curve is the best fit to the data obtained from the analysis. The radial distribution of nuclear charge density $\rho(r)$, which produces this fit, is shown by the curve labeled $_6$C in Figure 15-6.

Figure 15-5 A measure of the number of electrons scattered from $_6$C as a function of the scattering angle for 420 MeV incident electrons. The differential scattering cross section $d\sigma/d\Omega$ is the measure used. It is evaluated in terms of the area unit commonly employed in nuclear physics, called the barn; 1 bn = $10^{-24}$ cm². The curve is the fit to the data points obtained from the scattering analysis described in the text. (Abscissa: scattering angle $\theta$, from 30° to 90°.)

For a given electron energy, the diffraction patterns measured for nuclei of larger mass number A develop additional minima, which become more closely spaced as A increases. Equation (15-4) indicates this means the radius of the charge distribution increases with increasing A. The quantitative results are shown by the curves in Figure 15-6, which represent the charge densities $\rho(r)$ obtained for a number of nuclei. All of these charge densities can be described fairly accurately by the empirical equation

$$\rho(r) = \frac{\rho(0)}{1 + e^{(r - a)/b}} \tag{15-5}$$

where the parameters a and b have the values

$$a = 1.07A^{1/3} \times 10^{-15}\ \text{m} = 1.07A^{1/3}\ \text{F} \tag{15-6}$$

$$b = 0.55 \times 10^{-15}\ \text{m} = 0.55\ \text{F} \tag{15-7}$$

Figure 15-6 The charge densities of a number of nuclei, plotted against r (F). The charge density labeled $_6$C produced the fit to the scattering data shown in Figure 15-5. The half-value radius parameter a, surface thickness 2b, and interior charge density $\rho(0)$, are shown for $_6$C.

We draw the following conclusions from Figure 15-6 and (15-5) through (15-7):

1. The charge density of nuclei, which is essentially the distribution of protons in the nuclei, is constant in the nuclear interior and falls fairly rapidly to zero at the nuclear surface.

2. The radius at which the density has one-half its interior value, a, increases slowly with increasing number of nucleons in the nucleus, A. Specifically, the radius a is proportional to $A^{1/3}$.

3. The thickness of the nuclear surface is given approximately by the quantity 2b, since most of the drop in the value of the factor $1/[1 + e^{(r - a)/b}]$, from its interior value of one to its exterior value of zero, occurs when r changes from $a - b$ to $a + b$. This surface thickness 2b has approximately the same value for all nuclei.

4. The interior value of the charge density, $\rho(0)$, decreases slowly with increasing A.

5. If we assume that the distribution of protons in nuclei is approximately the same as the distribution of neutrons (there is good evidence for this assumption), then the charge density $\rho(r)$, which gives the density of protons in the nucleus, is the same as the mass density $\rho_M(r)$, which gives the density of all nucleons in the nucleus, except for a factor proportional to Z/A, the ratio of the number of protons to the total number of nucleons in the nucleus.
That is

$$\rho(r) \propto \frac{Z}{A}\,\rho_M(r) \tag{15-8}$$

Then the decrease of $\rho(0)$ with increasing $A$ is explained entirely by the decrease in $Z/A$ with increasing $A$. (The periodic table shows that $Z/A \simeq 1/2$ for $A \simeq 40$, while $Z/A \simeq 1/2.5$ for $A \simeq 240$.) This indicates that the interior value of the mass density, $\rho_M(0)$, is approximately the same for all nuclei.

Example 15-3. Evaluate approximately the interior mass density of a nucleus.

Approximate results can be obtained most easily by noting that the ratio of the density of a nucleus to the density of a solid, containing atoms with that nucleus, is

$$\frac{\text{density of nucleus}}{\text{density of solid matter}} \simeq \left[\frac{\text{volume of nucleus}}{\text{volume of atom}}\right]^{-1} = \left[\frac{\text{radius of nucleus}}{\text{radius of atom}}\right]^{-3}$$

For all nuclei

$$\frac{\text{radius of nucleus}}{\text{radius of atom}} \sim 10^{-5}$$

For instance, the radius of the outer shell of the $^{6}\mathrm{C}$ atom is a little less than 2 Å = $2 \times 10^{-10}$ m, while the half-value radius of its nuclear charge or mass distribution is a little more than 2 F = $2 \times 10^{-15}$ m. Thus we obtain

$$\frac{\text{density of nucleus}}{\text{density of solid matter}} \sim 10^{15}$$

Since the density of solid matter is of the order of $10^{3}$ kg/m$^3$, we find that the density of a nucleus has the extremely high value

$$\text{density of nucleus} \sim 10^{18}\ \text{kg/m}^3$$

The densities of nuclei are some 15 orders of magnitude larger than the densities encountered in the macroscopic world. It is, therefore, not surprising that other properties of nuclei can differ remarkably from the properties of macroscopic objects.

15-4 NUCLEAR MASSES AND ABUNDANCES

Very precise measurements of nuclear masses provide information about some of the most basic nuclear properties. Now the masses of atoms of a particular $Z$, but possibly a mixture of $A$, can be obtained to several significant figures by chemical techniques and a knowledge of Avogadro's number. Since the mass of a nucleus differs from the mass of the corresponding atom by a known amount, these techniques provide fairly accurate determinations of nuclear masses. But for the extremely accurate determinations needed in the study of nuclei, we must use the physical techniques of mass spectrometry or energy balance in nuclear reactions. Both give information about the masses of atoms of a particular $Z$ and $A$. From these masses, the masses of the corresponding nuclei can be evaluated by subtracting $Z$ times the electron mass. The mass equivalent of the electron binding energies is small enough to be ignored.

An example of one of the many types of mass spectrometers is the Bainbridge design, illustrated in Figure 15-7. The source produces singly ionized atoms with charge $+e$, mass $M$, and a distribution of velocities. These atoms travel through an evacuated region of crossed electric and magnetic fields which act as a velocity filter, passing only those with velocity $v$ satisfying the equation

$$eE = evB$$

The terms on the left and right are the magnitudes of the opposing electric and magnetic forces. Atoms of velocity $v = E/B$ enter a region of uniform magnetic field, are bent into a semicircle of radius $R$, and fall on a photographic plate where they produce an image. The distance from the diaphragm $S_2$ to the image is $2R$, where $R$ satisfies the equation

$$evB = \frac{Mv^2}{R}$$

The term on the right is the mass times the centripetal acceleration. Solving for $M$, we obtain

$$M = \frac{RBe}{v} = \frac{RB^2e}{E} \tag{15-9}$$

Figure 15-7 An apparatus used to measure atomic masses. Magnetic pole pieces above and below the plane of the paper provide a uniform magnetic field into the paper throughout the region enclosed by the dashed line. The entire apparatus shown is contained in a vacuum chamber.

The singly ionized atomic mass can be determined from absolute measurements of the quantities on the right side of (15-9). But in practice use is made of various hydrocarbon molecules to calibrate the apparatus over a wide range of masses, in terms of the standard mass of carbon.
The main reason that carbon is used as a standard, or unit, of mass is that many different hydrocarbons are readily available. In fact, the ion source usually produces some ionized hydrocarbons automatically, since hydrocarbons in the form of vacuum pump oil are present in the apparatus. The mass of the neutral atom can be obtained from that of the singly ionized atom by adding one electron mass.

With the mass spectrometry technique, extremely accurate measurements can be made. As an example, consider the nucleus $^{20}\mathrm{Ca}^{40}$. (The superscript before the chemical symbol gives the value of $Z$; the superscript after the symbol gives the value of $A$.) The mass of the atom with this nucleus is quoted as

$$M_{^{20}\mathrm{Ca}^{40}} = 39.962589 \pm 0.000004\,u$$

The symbol $u$ represents one mass unit; it is defined in terms of the prevalent species of carbon in such a way that

$$M_{^{6}\mathrm{C}^{12}} = 12.0000000\,u \tag{15-10}$$

A number of other examples of atomic masses are found in Table 15-1.

Table 15-1 Atomic Masses and Binding Energies

| Nuclide | Z | A | Mass in u (uncertainty in last digits) | Total binding energy $\Delta E$ (MeV) | Binding energy per nucleon $\Delta E/A$ (MeV) |
|---|---|---|---|---|---|
| $^{0}n^{1}$ | 0 | 1 | 1.0086654 (±4) | | |
| $^{1}\mathrm{H}^{1}$ | 1 | 1 | 1.0078252 (±1) | | |
| $^{1}\mathrm{H}^{2}$ | 1 | 2 | 2.0141022 (±1) | 2.22 | 1.11 |
| $^{1}\mathrm{H}^{3}$ | 1 | 3 | 3.0160500 (±10) | 8.47 | 2.83 |
| $^{2}\mathrm{He}^{3}$ | 2 | 3 | 3.0160299 (±2) | 7.72 | 2.57 |
| $^{2}\mathrm{He}^{4}$ | 2 | 4 | 4.0026033 (±4) | 28.3 | 7.07 |
| $^{4}\mathrm{Be}^{9}$ | 4 | 9 | 9.0121858 (±9) | 58.0 | 6.45 |
| $^{6}\mathrm{C}^{12}$ | 6 | 12 | 12.0000000 (±0) | 92.2 | 7.68 |
| $^{8}\mathrm{O}^{16}$ | 8 | 16 | 15.994915 (±1) | 127.5 | 7.97 |
| $^{29}\mathrm{Cu}^{63}$ | 29 | 63 | 62.929594 (±6) | 552 | 8.75 |
| $^{50}\mathrm{Sn}^{120}$ | 50 | 120 | 119.9021 (±1) | 1020 | 8.50 |
| $^{74}\mathrm{W}^{184}$ | 74 | 184 | 183.9510 (±4) | 1476 | 8.02 |
| $^{92}\mathrm{U}^{238}$ | 92 | 238 | 238.05076 (±8) | 1803 | 7.58 |

Using the first mass spectrometer, Thomson discovered the existence of isotopes in 1911. When the ion source contained a mixture of noble gases, he found an image on the photographic plate with mass corresponding to $A = 20$, and an associated weaker image corresponding to $A = 22$. A number of tests proved these were both due to a noble gas, and this could only be Ne, with chemical atomic weight of 20.18. He interpreted these results to mean that there are two chemically indistinguishable species of Ne atoms, called isotopes, one with $A = 20$ and relative abundance of about 91%, and one with $A = 22$ and relative abundance of about 9%. They are chemically indistinguishable since they have exactly the same structure of atomic electrons, because their nuclei have the same charge and therefore the same $Z$; but they are physically distinguishable since they have different masses, because their nuclei have different $A$. The nuclei of the Ne isotopes are $^{10}\mathrm{Ne}^{20}$, $^{10}\mathrm{Ne}^{21}$, and $^{10}\mathrm{Ne}^{22}$; the second occurs with relative abundance of about 0.3%, and it could not be detected by Thomson's apparatus. All three of these nuclei contain 10 protons; however, the first contains 10 neutrons, the second contains 11 neutrons, and the third contains 12 neutrons.

Modern mass spectrometers, using detectors which are very sensitive and have a linear response, provide accurate determinations of the relative abundance of the various isotopes. As an example, the abundances of the normally occurring mixture of $^{8}\mathrm{O}$ isotopes are

$^{8}\mathrm{O}^{16}$ = 99.759%
$^{8}\mathrm{O}^{17}$ = 0.037%
$^{8}\mathrm{O}^{18}$ = 0.204%

Another technique of accurate mass determination, which provides a supplement and check for the technique of mass spectrometry, is the study of energy balance in nuclear reactions. Consider the nuclear reaction

$$^{2}\mathrm{He}^{4} + {}^{7}\mathrm{N}^{14} \rightarrow {}^{8}\mathrm{O}^{17} + {}^{1}\mathrm{H}^{1} \tag{15-11}$$

A bombarding particle $^{2}\mathrm{He}^{4}$ (an $\alpha$ particle) interacts with a target nucleus $^{7}\mathrm{N}^{14}$ to produce a residual nucleus $^{8}\mathrm{O}^{17}$ and a product particle $^{1}\mathrm{H}^{1}$ (a proton). This was the first artificially produced nuclear reaction, discovered in 1919 by Rutherford, who used 7.7 MeV $\alpha$ particles from a radioactive source. Now $\alpha$ particles of a variety of energies obtained, perhaps, from an electrostatic generator would be used to investigate this typical reaction.

As is discussed in Appendix A, mass and kinetic energy are not separately conserved in nuclear reactions. Instead, there is conservation of total relativistic energy, $E = K + mc^2$, where $K$ is kinetic energy and $m$ is used here for rest mass. For the general case, illustrated in Figure 15-8, a bombarding particle $a$ interacts with a target nucleus $A$ to produce a residual nucleus $B$ and a product particle $b$; that is

$$a + A \rightarrow B + b \tag{15-12}$$

In this case the conservation of total relativistic energy in the laboratory frame of reference reads

$$(K_a + m_ac^2) + m_Ac^2 = (K_B + m_Bc^2) + (K_b + m_bc^2) \tag{15-13}$$

Figure 15-8 A nuclear reaction wherein a bombarding particle $a$ is incident on a target nucleus $A$. After the reaction takes place, the product particle $b$ is emitted at the angle $\theta$, and the residual nucleus $B$ recoils in such a way that momentum is conserved.

Note that $K_A = 0$ since $A$ is stationary in the laboratory frame. Because there can be an exchange of energy between kinetic energy and rest mass energy, it is possible for the final kinetic energy $K_B + K_b$ to be greater, or less, than the initial kinetic energy $K_a$. The difference is called the Q value of the reaction. That is

$$Q = K_B + K_b - K_a \tag{15-14}$$

From (15-13), this can also be written

$$Q = (m_a + m_A - m_B - m_b)c^2 \tag{15-15}$$

We see that a measurement of the Q value of a reaction gives information about the rest masses of the entities involved in the reaction.

The Q value can be measured by measuring $K_a$, $K_b$, and $K_B$. However, the latter quantity is usually difficult to measure. The difficulty can be avoided by using a relation that comes from the conservation of momentum to eliminate $K_B$ from (15-14). This is easy to do in the limit

$$K_a/m_ac^2 \ll 1 \qquad K_b/m_bc^2 \ll 1 \qquad K_B/m_Bc^2 \ll 1$$

where the classical expressions such as $K_a = m_av_a^2/2$ and $p_a = m_av_a$ can be used.
The result is that in this classical limit

$$Q = K_b\left(1 + \frac{m_b}{m_B}\right) - K_a\left(1 - \frac{m_a}{m_B}\right) - \frac{2}{m_B}(K_aK_bm_am_b)^{1/2}\cos\theta \tag{15-16}$$

where $\theta$ is the angle of emission of the product particle, defined in Figure 15-8. This result is of sufficient accuracy for the analysis of nuclear reactions at the energies which have been used in most experiments.

In (15-15), the masses refer to the rest masses of the nuclei $A$ and $B$, and to the rest masses of the completely ionized nuclear particles $a$ and $b$. However, to the accuracy of the approximation in which the mass equivalent of the electron binding energy is ignored, this equation can also be considered to read

$$Q = (M_a + M_A - M_B - M_b)c^2 \tag{15-17}$$

where the large $M$ refer to the masses of the neutral atoms. The second form is obtained from the first by adding $(Z_a + Z_A)mc^2$ to the first two terms and subtracting $(Z_B + Z_b)mc^2$ from the last two, where $mc^2$ is the rest mass energy of an electron. This procedure is valid since the relation

$$Z_a + Z_A = Z_B + Z_b \tag{15-18}$$

must be true in any nuclear reaction in order to have conservation of charge.

Example 15-4. In Rutherford's reaction, (15-11), bombarding $^{2}\mathrm{He}^{4}$ particles ($\alpha$ particles) of kinetic energy $K_a = 7.70$ MeV interact with $^{7}\mathrm{N}^{14}$ target nuclei to produce $^{8}\mathrm{O}^{17}$ residual nuclei and $^{1}\mathrm{H}^{1}$ product particles (protons). The protons emitted at 90° to the beam of bombarding $\alpha$ particles are found to have kinetic energy $K_b = 4.44$ MeV. (a) Determine the Q value of the reaction. (b) Then use it to determine the atomic mass of $^{8}\mathrm{O}^{17}$ in terms of the other three atomic masses involved in the reaction.

(a) Since the emission angle is $\theta = 90°$, (15-16) for the Q value simplifies to

$$Q = K_b\left(1 + \frac{m_b}{m_B}\right) - K_a\left(1 - \frac{m_a}{m_B}\right)$$

With sufficient accuracy, we can take $m_b/m_B$, the ratio of the product particle and residual nucleus masses, as 1/17; we can also take $m_a/m_B$, the ratio of the bombarding particle and residual nucleus masses, as 4/17. So

$$Q = K_b(1 + 1/17) - K_a(1 - 4/17) = 1.06K_b - 0.765K_a = 1.06 \times 4.44\ \text{MeV} - 0.765 \times 7.70\ \text{MeV} = -1.18\ \text{MeV}$$

(b) The atomic masses involved in the reaction are related to the Q value divided by $c^2$, which is

$$\frac{Q}{c^2} = \frac{-1.18\ \text{MeV}}{c^2}$$

To express this in mass units, we use the relation

$$uc^2 = 931.5\ \text{MeV}$$

which comes from evaluating the rest mass energy of a particle of rest mass $1u$. We obtain

$$\frac{Q}{c^2} = \frac{-1.18\ \text{MeV}}{931.5\ \text{MeV}}\,u = -0.00127u$$

According to (15-17), the atomic mass of $^{8}\mathrm{O}^{17}$ can be expressed in terms of the other atomic masses, and $Q/c^2$, as follows

$$M_{^{8}\mathrm{O}^{17}} = M_{^{2}\mathrm{He}^{4}} + M_{^{7}\mathrm{N}^{14}} - M_{^{1}\mathrm{H}^{1}} - \frac{Q}{c^2} = M_{^{2}\mathrm{He}^{4}} + M_{^{7}\mathrm{N}^{14}} - M_{^{1}\mathrm{H}^{1}} + 0.00127u$$

Thus the atomic mass of $^{8}\mathrm{O}^{17}$ can be determined from the measured Q value, if the other atomic masses are accurately known.

The analysis of energy balance in a large number of reactions has provided results which accurately check the results obtained by mass spectrometry. Furthermore, the agreement between these two methods provides the most accurate confirmation of the relativistic theory of mass and energy, upon which the energy balance is based. Table 15-1 lists a few of the many atomic masses that have been measured by these methods, as well as the mass of the neutron.

Now let us begin to extract information about the nuclei from the precise measurements of their masses.

Example 15-5. Use the data of Table 15-1 to compare the mass of the $^{2}\mathrm{He}^{4}$ atom with the mass of its constituent parts.

The mass of the $^{2}\mathrm{He}^{4}$ atom is

$$M_{^{2}\mathrm{He}^{4}} = 4.0026033u$$

The mass of its constituent parts is the mass of two $^{1}\mathrm{H}^{1}$ atoms plus the mass of two neutrons; that is

$$2M_{^{1}\mathrm{H}^{1}} + 2M_{^{0}n^{1}} = 2 \times 1.0078252u + 2 \times 1.0086654u = 4.0329812u$$

Both $M_{^{2}\mathrm{He}^{4}}$ and $2M_{^{1}\mathrm{H}^{1}} + 2M_{^{0}n^{1}}$ contain two electron rest masses. But the former is smaller than the latter by the amount

$$\Delta M = 4.0329812u - 4.0026033u = 0.0303779u$$

We shall see immediately that this result is a manifestation of the binding energy of the $^{2}\mathrm{He}^{4}$ nucleus.

For any atom, a calculation as in Example 15-5 will show that its mass is less than the mass of its constituent parts by an amount $\Delta M$ called the mass deficiency. The origin lies in the nucleus, and in the equivalence between energy and mass. For instance, consider any one of the four nucleons in the $^{2}\mathrm{He}^{4}$ nucleus. Since the nucleon is stably bound to the nucleus, it must be moving in some sort of an attractive potential representing the net attraction of the other three nucleons. Furthermore, to be bound it must have a negative energy $E < 0$. The situation is depicted in Figure 15-9. The energy required to remove the nucleon from the nucleus, leaving it a free nucleon of negligible kinetic energy at $r \rightarrow \infty$, is $|E|$. Conversely, if such a free nucleon comes in from $r \rightarrow \infty$ and combines with the other nucleons to form the nucleus, its energy must decrease by the amount $|E|$. The excess energy could be carried off by the emission of electromagnetic radiation. The same situation holds for the other nucleons in the nucleus. Thus we see that when a dispersed system of free nucleons combines to form a nucleus, the total energy of the system must decrease by an amount $\Delta E$, the binding energy of the nucleus.

Figure 15-9 A schematic representation of the potential and total energies of a nucleon in a helium nucleus. The potential extends beyond the nuclear mass distribution by about the range of the nuclear force, and then it rapidly goes to zero.

The decrease $\Delta E$ in the total energy of the system must, according to relativity theory, be accompanied by a decrease $\Delta M$ in its mass, where

$$\Delta Mc^2 = \Delta E \tag{15-19}$$

For $^{2}\mathrm{He}^{4}$, the mass deficiency is $\Delta M = 0.0303779u$. Therefore its binding energy is $\Delta E = \Delta Mc^2 = 28.3$ MeV, where we have used the convenient relation from Example 15-4

$$1u \times c^2 = 931.5\ \text{MeV} \tag{15-20}$$

This value of $\Delta E$ is listed in the next to last column of Table 15-1.
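The bookkeeping that leads from atomic masses to a binding energy can be followed in a few lines of Python. This is only a minimal sketch: the masses are copied from Table 15-1, the conversion factor is (15-20), and the function name `binding_energy_mev` is our own illustrative choice, not notation used elsewhere in the text.

```python
# Atomic masses in u, copied from Table 15-1.
M_H1 = 1.0078252    # mass of the 1H1 atom
M_N = 1.0086654     # mass of the free neutron
U_C2_MEV = 931.5    # 1 u x c^2 in MeV, relation (15-20)


def binding_energy_mev(Z, A, atomic_mass_u):
    """Return (mass deficiency in u, binding energy in MeV) for an atom.

    The constituents are taken to be Z hydrogen atoms plus (A - Z) neutrons,
    so the electron rest masses cancel, as in Example 15-5.
    """
    constituents = Z * M_H1 + (A - Z) * M_N
    delta_m = constituents - atomic_mass_u
    return delta_m, delta_m * U_C2_MEV


# 2He4, with the atomic mass quoted in Table 15-1.
delta_m, delta_e = binding_energy_mev(2, 4, 4.0026033)
print(f"dM = {delta_m:.7f} u, dE = {delta_e:.1f} MeV, dE/A = {delta_e / 4:.2f} MeV")
```

Calling the same function with the other masses of Table 15-1 gives values close to the tabulated binding energies; small differences in the last digit can arise from rounding of the conversion factor.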
The last column of the table lists $\Delta E/A$, called the average binding energy per nucleon, which is the binding energy of the nucleus divided by the number of nucleons it contains. For $^{2}\mathrm{He}^{4}$, the value of $\Delta E/A$ is 28.3 MeV/4 = 7.07 MeV.

One of the most important features of a nucleus is its average binding energy per nucleon. The quantity is plotted as a function of $A$ in Figure 15-10. The points are the data obtained from the measured masses in the manner just described. Note that $\Delta E/A$ at first rises rapidly with increasing $A$, but very soon $\Delta E/A$ is roughly constant at a value

$$\Delta E/A \simeq 8\ \text{MeV} \tag{15-21}$$

If each nucleon in a nucleus exerted the same attraction on all the other nucleons, the binding energy per nucleon would continue to increase as more and more nucleons were added to the nucleus; that is, $\Delta E/A$ would be proportional to $A$. The extremely important fact that $\Delta E/A$ is not proportional to $A$ is due, in part, to the short range of nuclear forces. A complete explanation of the saturation of nuclear forces, which is responsible for the fact that $\Delta E/A$ has approximately the same value throughout most of the periodic table, will be given in Chapter 17. This saturation has a certain analogy to the saturation of molecular forces in covalent bonding, but the origins of the two saturation phenomena have no relation to each other, as we shall see in that chapter.

Figure 15-10 The average binding energy per nucleon for stable nuclei. The smooth curve is obtained from the semiempirical mass formula developed in Section 15-5.

Inspection of Figure 15-10 shows that $\Delta E/A$ actually maximizes at about 8.7 MeV for $A \simeq 60$, and then decreases slowly to about 7.6 MeV for $A \simeq 240$. We shall find that the decrease is due to Coulomb repulsions between protons in the nucleus. One consequence is the phenomenon of nuclear fission, in which a large $A$ nucleus, such as $^{92}\mathrm{U}^{238}$, splits into two intermediate $A$ nuclei because the two intermediate $A$ nuclei are more stable than the large $A$ nucleus.

Example 15-6. Use Figure 15-10 to estimate the difference between the binding energy of a $^{92}\mathrm{U}^{238}$ nucleus and the sum of the binding energies of the two nuclei produced if it fissions symmetrically.

The figure shows that the average binding energy per nucleon for a nucleus of mass number around $A = 238$ is $\simeq 7.6$ MeV. So the binding energy of the nucleus present before the fission is $\simeq 238 \times 7.6$ MeV $\simeq 1810$ MeV. The figure also shows that the average binding energy per nucleon for a nucleus of mass number around $A = 238/2 = 119$ is $\simeq 8.5$ MeV. So each of the two nuclei present after the symmetrical fission has a binding energy of $\simeq 119 \times 8.5$ MeV $\simeq 1010$ MeV. The sum of their binding energies is $\simeq 2020$ MeV. This sum is larger than the initial binding energy 1810 MeV by about 210 MeV. Thus the final state (after the nucleus fissions) is more stable than the initial state (before the nucleus fissions), because the total binding energy is higher in the final state. When the total binding energy increases by about 210 MeV in the fission, energy in this amount is liberated. Most of it goes into the kinetic energy of the two nuclei produced in the fission. In a nuclear reactor this kinetic energy is degraded into thermal energy, which is the source of the power produced by the reactor.

In nuclear fusion two or more nuclei of very small $A$ combine to form a larger nucleus that has a higher average binding energy per nucleon because its value of $A$ is nearer the value $A \simeq 60$, at which $\Delta E/A$ maximizes. It might seem that only a few nuclei near $A = 60$ would be stable. This is not true because there are other factors, to be discussed later, which inhibit fission and fusion.

We conclude this section by considering the distribution of $Z$ and $A$ values of the stable nuclei, which is additional information obtained from the mass spectrometer measurements. The data are plotted in Figure 15-11. Each stable nucleus is indicated with a square whose abscissa is the neutron number $N = A - Z$, the number of neutrons in the nucleus, and whose ordinate is the atomic number $Z$, the number of protons in the nucleus.

Figure 15-11 The distribution of stable nuclei.

Note that for small $Z$ there is a tendency for stable nuclei to have $Z = N$. We shall see that this is due to the fact that nuclear forces operate symmetrically on neutrons and protons because nuclear forces are charge independent, as mentioned in Section 15-2. For large $Z$, stable nuclei tend to have $Z < N$. This is another effect of the Coulomb repulsion between protons, which produces a positive energy proportional to $Z^2$. The effect discriminates energetically against the presence of protons in nuclei of large $Z$, but it is not important in nuclei of small $Z$, where the $Z = N$ tendency dominates.

There is a tendency for stable nuclei to have even $Z$ and also even $N$. This can be seen from the data of Table 15-2, which lists the number of stable nuclei of various types. We shall find that this tendency is present because two nucleons of the same species can form a closely spaced pair in which they interact particularly strongly, and thereby make a particularly large contribution to the nuclear binding energy.

Table 15-2 The Distribution of Stable Nuclei

| A | N | Z | Number of Stable Nuclei |
|---|---|---|---|
| Even | Even | Even | 166 |
| Even | Odd | Odd | 8 |
| Odd | Even | Odd | 57 |
| Odd | Odd | Even | 53 |
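The classical-limit Q-value relation (15-16), applied in Example 15-4, can also be evaluated numerically. The following is a minimal sketch under the assumptions of that example: mass ratios are approximated by mass-number ratios, energies are in MeV, and the function name `q_value` is a hypothetical choice of ours.

```python
import math

U_C2_MEV = 931.5  # 1 u x c^2 in MeV, relation (15-20)


def q_value(K_a, K_b, m_a, m_b, m_B, theta_deg):
    """Q value from relation (15-16).

    K_a, K_b : kinetic energies of bombarding and product particles (MeV)
    m_a, m_b, m_B : masses in any common unit (only ratios matter)
    theta_deg : emission angle of the product particle in degrees
    """
    theta = math.radians(theta_deg)
    return (K_b * (1 + m_b / m_B)
            - K_a * (1 - m_a / m_B)
            - 2 * math.sqrt(K_a * K_b * m_a * m_b) * math.cos(theta) / m_B)


# Rutherford's reaction (15-11): alpha on 7N14, protons observed at 90 degrees.
Q = q_value(K_a=7.70, K_b=4.44, m_a=4, m_b=1, m_B=17, theta_deg=90.0)
mass_shift_u = Q / U_C2_MEV  # Q/c^2 expressed in mass units
print(f"Q = {Q:.2f} MeV, Q/c^2 = {mass_shift_u:.5f} u")
```

With the unrounded ratios 1/17 and 4/17 the sketch gives a Q value of about -1.19 MeV; the -1.18 MeV quoted in Example 15-4 comes from rounding the coefficients to 1.06 and 0.765 before multiplying.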
15-5 THE LIQUID DROP MODEL

We shall now employ the liquid drop model of the nucleus, and information obtained from the data concerning the distribution of $Z$ and $A$ values for stable nuclei, to obtain a formula for the masses of these nuclei. This formula will then be used in a variety of ways throughout our treatment of nuclei.

The liquid drop model is based on two properties that we have found are common to all nuclei, except those of very small $A$: (1) their interior mass densities are approximately the same and (2) their total binding energies are approximately proportional to their masses, since $\Delta E/A \simeq$ const. Both of these properties can be compared with analogous ones concerning macroscopic drops of some incompressible liquid. For such classical liquid drops of various sizes (1) their interior densities are the same and (2) their heats of vaporization are proportional to their masses. The second comparison is meaningful since the heat of vaporization is the energy required to disperse the drop into its constituent molecules, and so it is comparable to the binding energy of the nucleus. The mass formula will be developed by using the model to suggest other analogies between a nucleus and a classical liquid drop, but it will also be necessary to include terms in the formula that describe certain nuclear properties whose origins are nonclassical.

The liquid drop model approximates the nucleus as a sphere with a uniform interior density that abruptly drops to zero at its surface. The radius is proportional to $A^{1/3}$; the surface area is proportional to $A^{2/3}$; and the volume is proportional to $A$. Since the mass is also proportional to $A$, which is the number of nucleons in the nucleus, this gives the result that density = mass/volume $\propto A/A$ = const, in agreement with the electron scattering measurements.
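The statement that the density is independent of $A$ can be checked with a short calculation. This sketch makes two simplifying assumptions that go beyond the text: the sharp-surface radius of the model is identified with the half-value radius $a$ of (15-6), and the nuclear mass is taken to be $A$ mass units with $1u \approx 1.66 \times 10^{-27}$ kg. The function name is hypothetical.

```python
import math

U_KG = 1.66e-27   # one atomic mass unit in kg (rounded)
R0_M = 1.07e-15   # coefficient of A**(1/3) in (15-6), in metres


def uniform_sphere_density(A):
    """Mass density (kg/m^3) of a uniform sphere of radius 1.07 A^(1/3) F
    containing A nucleons of about one mass unit each."""
    radius = R0_M * A ** (1 / 3)
    volume = 4 / 3 * math.pi * radius ** 3
    return A * U_KG / volume


for A in (12, 63, 238):
    print(A, f"{uniform_sphere_density(A):.2e} kg/m^3")
```

Under these assumptions the density comes out the same for every $A$, a few times $10^{17}$ kg/m$^3$, which agrees in order of magnitude with the rough estimate of Example 15-3.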
The mass formula consists of a sum of six terms

$$M_{Z,A} = f_0(Z,A) + f_1(Z,A) + f_2(Z,A) + f_3(Z,A) + f_4(Z,A) + f_5(Z,A) \tag{15-22}$$

where $M_{Z,A}$ represents the mass of an atom whose nucleus is specified by $Z$ and $A$. The first term is the mass of the constituent parts of the atom

$$f_0(Z,A) = 1.007825Z + 1.008665(A - Z) \tag{15-23}$$

The coefficient of $Z$ is the mass of the $^{1}\mathrm{H}^{1}$ atom in mass units, and the coefficient of $(A - Z)$ is the mass of the neutron, $^{0}n^{1}$, in the same units. The remaining terms correct for the mass equivalents of various effects contributing to the total nuclear binding energy.

Of most importance is the volume term

$$f_1(Z,A) = -a_1A \tag{15-24}$$

This accounts for a binding energy proportional to the nuclear mass, or volume. The term describes the tendency to have the binding energy per nucleon a constant. Such a term would be present for a classical liquid drop. Because it is negative, it reduces the mass, and therefore increases the binding energy.

Next is the surface term

$$f_2(Z,A) = +a_2A^{2/3} \tag{15-25}$$

It is a correction proportional to the surface area of the nucleus. Since the term is positive, it increases the mass and consequently reduces the binding energy. In a classical drop of liquid, this term would represent the effect of the surface tension energy. It would arise from the fact that a molecule at the surface of the drop feels attractive forces only from one side, so its binding energy is less than the binding energy of a molecule in the interior, which feels attractive forces from all sides. Therefore, simply setting the total binding energy proportional to the volume of the drop overestimates the binding energy of the surface molecules, and a correction proportional to the number of such molecules, or to the surface area, must be made to reduce the binding energy. The same thing happens in a nucleus.
The Coulomb term is

$$f_3(Z,A) = +a_3\frac{Z^2}{A^{1/3}} \tag{15-26}$$

It accounts for the positive Coulomb energy of the charged nucleus, which is assumed to have a uniform charge distribution of radius proportional to $A^{1/3}$. This effect of the Coulomb repulsions between the protons increases the mass and reduces the binding energy. A similar term would be present for a charged drop of a classical liquid.

The next term brings in a property specific to nuclei. It is the asymmetry term

$$f_4(Z,A) = +a_4\frac{(Z - A/2)^2}{A} \tag{15-27}$$

which accounts for the observed tendency to have $Z = N$. Note that it is zero for $Z = N = (A - Z)$, or $2Z = A$, but is otherwise positive and increases with increasing departures from that condition. That is, the greater the departure from $Z = N$, the larger the mass or the smaller the binding energy. The form used in (15-27) is about the simplest one having these properties, but there is also some theoretical justification, involving the charge independence of nuclear forces, that will be indicated later.

The tendency of nuclei to have even $Z$ and even $N$ is accounted for by the pairing term

$$f_5(Z,A) = \begin{cases} -f(A) & \text{if } Z \text{ even},\ A - Z = N \text{ even} \\ 0 & \text{if } Z \text{ even},\ A - Z = N \text{ odd, or } Z \text{ odd},\ A - Z = N \text{ even} \\ +f(A) & \text{if } Z \text{ odd},\ A - Z = N \text{ odd} \end{cases} \tag{15-28}$$

It decreases the mass if both $Z$ and $N$ are even, and increases it if both $Z$ and $N$ are odd. Thus it maximizes the binding energy if both $Z$ and $N$ are even. A qualitative explanation of the origin of this term will be given later; it involves the quantum mechanical properties of indistinguishability of identical particles. But the exact form of the function $f(A)$ is usually determined by fitting the data.
For a simple power law, the best fit is obtained with

$$f(A) = a_5A^{-1/2} \tag{15-29}$$

Gathering together (15-22) through (15-29), we have

$$M_{Z,A} = 1.007825Z + 1.008665(A - Z) - a_1A + a_2A^{2/3} + a_3Z^2A^{-1/3} + a_4(Z - A/2)^2A^{-1} + \begin{Bmatrix} -1 \\ 0 \\ +1 \end{Bmatrix} a_5A^{-1/2} \quad (\text{in } u) \tag{15-30}$$

This is called the semiempirical mass formula because the parameters $a_1$ through $a_5$ are obtained by empirically fitting the measured masses. A formula of this type was first developed by Weizsäcker in 1935. Determinations of the parameters have since been made on several occasions. One set providing good results is

$$a_1 = 0.01691 \quad a_2 = 0.01911 \quad a_3 = 0.000763 \quad a_4 = 0.10175 \quad a_5 = 0.012 \quad (\text{in } u) \tag{15-31}$$

Using these parameters, the formula yields excellent agreement with the average trend of the measured masses of all the stable nuclei except those of very small $A$. A comparison is shown in Figure 15-10, in which the smooth curve is $\Delta E/A$ evaluated from the sum of the volume, surface, Coulomb, and asymmetry terms. Figure 15-12 shows these terms individually.

Figure 15-12 Illustrating how the volume, surface, Coulomb, and asymmetry terms of the semiempirical mass formula combine to yield the average binding energy per nucleon. (Abscissa: mass number $A$.)

The semiempirical mass formula is of great practical utility because it is a simple formula that predicts with considerable accuracy the masses, and therefore the binding energies, of some 200 stable nuclei, and many more unstable nuclei. As we shall see in the following example, predictions of nuclear binding energies can lead immediately to predictions of other quantities of interest.

Example 15-7. Use the semiempirical mass formula to predict the binding energy made available if a $^{92}\mathrm{U}^{235}$ nucleus captures a neutron. This is the energy which induces fission of the $^{92}\mathrm{U}^{236}$ nucleus that is formed in the capture.

The binding energy is

$$E_n = \{[M_{92,235} + M_{0,1}] - [M_{92,236}]\}c^2$$

The term in the first square bracket is the mass of a $^{92}\mathrm{U}^{235}$ atom plus the mass of a neutron, which are the constituents of the $^{92}\mathrm{U}^{236}$ atom whose mass appears in the second square bracket. Since the neutron mass, $M_{0,1}$, is precisely $1.008665u$, the first two terms from the semiempirical mass formula, (15-30), cancel out in the expression for $E_n$. Then we obtain

$$E_n = \left\{\left[-a_1(235) + a_2(235)^{2/3} + a_3\frac{(92)^2}{(235)^{1/3}} + a_4\frac{(92 - 235/2)^2}{235}\right] - \left[-a_1(236) + a_2(236)^{2/3} + a_3\frac{(92)^2}{(236)^{1/3}} + a_4\frac{(92 - 236/2)^2}{236} - \frac{a_5}{(236)^{1/2}}\right]\right\}c^2$$

$$= \left\{a_1 - a_2\left[(236)^{2/3} - (235)^{2/3}\right] + a_3(92)^2\left[\frac{1}{(235)^{1/3}} - \frac{1}{(236)^{1/3}}\right] - a_4\left[\frac{(26.0)^2}{236} - \frac{(25.5)^2}{235}\right] + \frac{a_5}{(236)^{1/2}}\right\}c^2$$

$$\simeq \{0.0169 - 0.0191 \times 0.11 + 0.00076 \times 1.9 - 0.1018 \times 0.097 + 0.012 \times 0.065\}c^2$$

$$\simeq \{0.0169 - 0.0021 + 0.0014 - 0.0099 + 0.0008\}c^2 = \{0.0071u\}c^2 = 6.6\ \text{MeV}$$

where we have used (15-20) to convert to MeV.

If the neutron has negligible kinetic energy before it is captured, the $^{92}\mathrm{U}^{236}$ nucleus is formed in a state of excitation energy equal to $E_n$. As we shall discuss at length in the next chapter, the excitation energy often sets the nucleus into a vibration in which it oscillates between being elongated (having a positive quadrupole moment) and being flattened (having a negative quadrupole moment). This vibration cannot take place without the excitation energy, since the surface term of the semiempirical mass formula inhibits departures of the nucleus from the approximately spherical shape it has in its ground state. When the nucleus has a maximum elongation, the effect of the Coulomb term can cause it to fission.

Of great importance in nuclear reactor technology is the fact that $E_n$ for neutron capture by a $^{92}\mathrm{U}^{238}$ nucleus is about 1.5 MeV smaller than the value just calculated for capture by $^{92}\mathrm{U}^{235}$. The terms in the preceding expressions have almost the same values, except that the contribution of the pairing term (the last term) is negative instead of positive. Since all $^{92}\mathrm{U}$ nuclei require an excitation of about 6 MeV to overcome the surface term inhibition, $^{92}\mathrm{U}^{238}$ will fission only if the neutron it captures brings in more than about 1 MeV of kinetic energy, in addition to its binding energy. We shall see that this means $^{92}\mathrm{U}^{238}$ is not very useful in the "chain reaction" that takes place in reactors.

The liquid drop model is the oldest, and most classical, nuclear model. At the time the semiempirical mass formula was first developed, mass data were available, but not much else was known about nuclei. The parameters were purely empirical, and there was not even a qualitative understanding of the asymmetry and pairing terms. Nevertheless, the formula was significant because it described fairly accurately the masses of hundreds of nuclei in terms of only five parameters. At present we do have an insight into the origin of the two terms mentioned. And the most important parameter, the $a_1$ of the volume term, is no longer purely empirical. Nuclear theory has been developed to the point that it predicts the value of $a_1$ reasonably well, in terms of the detailed properties of nuclear forces. The nuclear theory, which is largely the work of Brueckner, is very similar to the Hartree theory of the atom in the sense that it involves self-consistent calculations for a system of fermions, but the calculations are even more complicated because of the complicated nature of nuclear forces. We shall make no attempt to describe them.

15-6 MAGIC NUMBERS

The liquid drop model gives a good account of the average behavior of nuclei in regard to mass, or binding energy. Since binding energy is a direct measure of stability (the higher the binding energy of a nucleus, the more stable it is), the liquid drop model describes well the average behavior of nuclei in regard to their stability.
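To make this average behavior concrete, the semiempirical mass formula (15-30) with the parameter set (15-31) can be evaluated directly. The sketch below is only an illustration: the function names are hypothetical, and the neutron capture energy is computed as in Example 15-7, as the mass of the target atom plus a neutron minus the mass of the atom formed.

```python
# Parameters (15-31), in mass units u.
A1, A2, A3, A4, A5 = 0.01691, 0.01911, 0.000763, 0.10175, 0.012
M_H1, M_N = 1.007825, 1.008665   # coefficients of (15-23), in u
U_C2_MEV = 931.5                 # relation (15-20)


def pairing_sign(Z, A):
    """Return -1, 0, or +1 according to the three cases of (15-28)."""
    N = A - Z
    if Z % 2 == 0 and N % 2 == 0:
        return -1
    if Z % 2 == 1 and N % 2 == 1:
        return +1
    return 0


def semf_mass(Z, A):
    """Atomic mass in u from the semiempirical mass formula (15-30)."""
    return (M_H1 * Z + M_N * (A - Z)
            - A1 * A                        # volume term
            + A2 * A ** (2 / 3)             # surface term
            + A3 * Z ** 2 * A ** (-1 / 3)   # Coulomb term
            + A4 * (Z - A / 2) ** 2 / A     # asymmetry term
            + pairing_sign(Z, A) * A5 * A ** -0.5)  # pairing term


def neutron_capture_energy_mev(Z, A):
    """Binding energy (MeV) released when nucleus (Z, A) captures a neutron."""
    return (semf_mass(Z, A) + M_N - semf_mass(Z, A + 1)) * U_C2_MEV


for A in (235, 238):
    print(f"92U{A}: E_n = {neutron_capture_energy_mev(92, A):.1f} MeV")
```

For $^{92}\mathrm{U}^{235}$ the sketch gives about 6.7 MeV, slightly above the 6.6 MeV of Example 15-7 because the example rounds each term before adding. For $^{92}\mathrm{U}^{238}$ the pairing contribution changes sign and the result is smaller, in line with the discussion following that example.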
However, nuclei with certain values of Z and/or N show significant departures from this average behavior by being unusually stable. These values of Z and/or N are the magic numbers

Z and/or N = 2, 8, 20, 28, 50, 82, 126    (15-32)

The situation is analogous to the unusual stability of the electron shells of noble gas atoms containing Z = 2, 10, 18, 36, 54, 86 electrons. But in the nuclear case the indications are not as pronounced as in the atomic case, and it is necessary to consider several of them to demonstrate the "magic" character of the numbers quoted in (15-32). The two most convincing are:

1. Nuclei prefer having magic Z and/or N. This can be seen by inspecting Figure 15-11. To quote just two examples, there are six stable isotopes for Z = 20, whereas the average number of stable isotopes in that region is about two. For Z = 50 there are ten stable isotopes, whereas the average number in that region of the periodic table is about four. All plausible explanations of how nuclei were originally formed relate this type of abundance to stability; i.e., the more stable a particular type of nucleus is, the more numerous are its stable isotopes.

2. Figure 15-10 shows that the average binding energy per nucleon is significantly higher for nuclei that have Z and/or N equal to 2 or 8 than it is for neighboring nuclei. The outstanding example is 2He4, for which Z = N = 2. The effect is even more pronounced if a measure of stability more sensitive than $\Delta E/A$ is considered. This is $E_n$, or $E_p$, the minimum energy required to separate a neutron, or proton, from the nucleus; it is usually called the binding energy of the "last" neutron, or proton. As an example, for 2He4 the value of $E_n$ is 20.6 MeV (i.e., this much energy is required to produce the reaction 2He4 → 2He3 + 0n1). The value of $E_p$ for 2He4 is 19.8 MeV. These are abnormally high. Figure 15-13 is a plot of the difference between the value of $E_n$ measured for a number of nuclei, and the value predicted by the semiempirical mass formula. Except for the effect of the pairing term, the predicted value is a smooth function that decreases slowly from around 8 MeV for intermediate values of N to around 6 MeV for large values of N (as we saw in Example 15-7 where we predicted $E_n$ for 92U236). The unusual stability of nuclei with N = 28, 50, 82, 126 is shown by the exceptionally large energy required to remove their last neutron.

Figure 15-13 The difference between the binding energy of the last neutron and the prediction of the semiempirical mass formula, as a function of the number of neutrons in the nucleus. These data provide clear evidence for the magic numbers 28, 50, 82, and 126, for neutrons. Similar evidence shows that 20, 28, 50, and 82 are also magic numbers for protons. But there is no concrete evidence, pro or con, concerning 126 for protons since nuclei with such large Z values have not yet been detected.

There are a number of other somewhat less convincing pieces of evidence for the magic numbers, such as the fact that for most of the known spontaneous neutron emitters, like 8O17, 36Kr87, and 54Xe137, N equals a magic number plus one. This implies an unusually small affinity for the extra neutron.

The analogy between nuclear and atomic magic numbers prompted many people to look for an explanation of the nuclear phenomenon that was similar to the explanation of the atomic phenomenon. The student will recall that the key point in that explanation is the formation of closed shells by the electrons moving independently in the atomic potential. However, when the nuclear magic numbers were first being discussed seriously, around 1948, it seemed very difficult to understand how nucleons could move independently in a nucleus. The reason was that the liquid drop model had been dominant for a number of years, and it seemed basic to this model that a nucleon in a nucleus (of density ~ $10^{18}$ kg/m$^3$!) would constantly interact with its neighbors through the strong nuclear force. If so, the nucleon would be repeatedly scattered in traveling through the nucleus, and it would follow an erratic path, resembling Brownian motion much more than the motion of an electron moving independently through its orbit in an atom.

15-7 THE FERMI GAS MODEL

Weisskopf first pointed out that there is a simple explanation of how nucleons can move independently through a nucleus in its ground state. The explanation is based on the Fermi gas model of the nucleus. This model is essentially the same as the free-electron gas model of the conduction electrons in a metal, considered in Section 11-11. It assumes that each nucleon of the nucleus moves in an attractive net potential that represents the average effect of its interactions with other nucleons in the nucleus. The net potential has a constant depth inside the nucleus since the distribution of nucleons is constant in this region; outside the nucleus it goes to zero within a distance equal to the range of nuclear forces. Thus the net potential is approximately like a three-dimensional finite square well of radius a little larger than the nuclear radius, and of depth that will be determined in Example 15-8.

In the ground state of the nucleus, its nucleons, which are all fermions of intrinsic spin s = 1/2, occupy the energy levels of the net potential in such a way as to minimize the total energy without violating the exclusion principle. Figure 15-14 indicates the quantum states filled by the neutrons in the ground state of a nucleus. Since protons are distinguishable from neutrons, the exclusion principle operates independently on the two types of nucleons, and we must imagine a separate and independent diagram representing the quantum states filled by the protons.

Figure 15-14 A schematic representation of the energy levels filled by the neutrons in the ground state of a nucleus. The lowest levels are filled, according to the limitations of the exclusion principle, up to the Fermi energy $\mathcal{E}_F$.

It is immediately apparent from these diagrams why the exclusion principle prevents almost all the nucleons from scattering from each other when the nucleus is in its ground state. The point is that almost all the states which are energetically accessible are already completely filled, and so there can be essentially no collisions except those in which two nucleons of the same type exchange quantum states. The net effect of such an exchange of two indistinguishable particles is, however, the same as if there had been no collision at all. Of course, if there is a set of partly filled degenerate states at the Fermi energy, the few nucleons in these states can collide with each other, but only a small fraction of the total number of nucleons can be in such states. Thus we see why almost all of the nucleons that compose a nucleus can move freely within the nucleus if it is in its ground state.

Example 15-8. Evaluate the Fermi energy of a typical nucleus, and use the results to determine the depth of the net nuclear potential.

• The Fermi energy, $\mathcal{E}_F$, is the energy indicated in Figure 15-14 of the nucleon in the highest filled level of the system, measured from the bottom of the potential well. It is related to the nucleon mass M, and nucleon density $\rho$, by (11-57), which we write here as

$$\mathcal{E}_F = \frac{\pi^2\hbar^2}{2M}\left(\frac{3\rho}{\pi}\right)^{2/3} \qquad (15\text{-}33)$$

(This expression can be obtained directly from the equation for the energies of the levels of a three-dimensional square well simply by filling its lowest levels up to the Fermi energy.)
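As a rough numerical sketch of (15-33), the snippet below evaluates the Fermi energy for the neutron density this example adopts (N ≈ 0.60A neutrons in a sphere of radius $aA^{1/3}$ with a ≈ 1.1 F). The constants are those of the book's table of useful constants; the function names are illustrative only. Since no intermediate rounding is applied, the result lands a couple of MeV away from the rounded value quoted in the example rather than reproducing it digit for digit.

```python
import math

HBAR_EV_S = 0.6582e-15    # hbar in eV-sec
C_M_PER_S = 2.998e8       # speed of light in m/sec
NEUTRON_MC2_MEV = 939.6   # neutron rest energy in MeV
FERMI_IN_M = 1e-15        # 1 F = 1e-15 m


def fermi_energy_mev(density_per_f3, mc2_mev=NEUTRON_MC2_MEV):
    """Fermi energy in MeV from (15-33), for a number density in F^-3."""
    # hbar*c expressed in MeV-F (about 197 MeV-F).
    hbar_c = HBAR_EV_S * C_M_PER_S / FERMI_IN_M * 1e-6
    # pi^2 hbar^2 / 2M, written as pi^2 (hbar c)^2 / (2 M c^2), in MeV-F^2.
    prefactor = math.pi ** 2 * hbar_c ** 2 / (2 * mc2_mev)
    return prefactor * (3 * density_per_f3 / math.pi) ** (2 / 3)


def neutron_density_per_f3(a_fermi=1.1, neutron_fraction=0.60):
    """Neutron number density for N = 0.60 A in a sphere of radius a*A^(1/3).

    The mass number A cancels, so the density depends only on a.
    """
    return neutron_fraction / ((4 / 3) * math.pi * a_fermi ** 3)


if __name__ == "__main__":
    rho = neutron_density_per_f3()
    print(f"neutron density: {rho:.3f} per cubic F")
    print(f"Fermi energy:    {fermi_energy_mev(rho):.0f} MeV")
```

Note that the density, and hence the Fermi energy, is the same for all nuclei in this approximation, because both N and the nuclear volume are proportional to A.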
Let us consider the Fermi gas of neutrons in a uniform spherical nucleus of radius

$$r' = aA^{1/3}$$

For a typical nucleus, the number of neutrons is

$$N \simeq 0.60A$$

Thus

$$\rho = \frac{N}{\frac{4}{3}\pi a^3 A}$$

gives

$$\rho = \frac{0.60A}{1.33\pi a^3 A} = \frac{0.45}{\pi a^3}$$

and the Fermi energy is

$$\mathcal{E}_F = \frac{\pi^2\hbar^2}{2Ma^2}(0.26) \qquad (15\text{-}34)$$

Using a radius constant $a \simeq 1.1$ F consistent with the electron scattering measurements as summarized by (15-6), and evaluating the other parameters, we obtain

$$\mathcal{E}_F \simeq 43\ \mathrm{MeV}$$

The relations between the depth of the potential $V_0$, the Fermi energy $\mathcal{E}_F$, and the binding energy of the last neutron $E_n$, are shown in Figure 15-15. As mentioned in the previous section, $E_n$ is approximately equal to 7 MeV for a typical nucleus. Thus for this nucleus the depth of the net nuclear potential acting on its neutrons is

$$V_0 = \mathcal{E}_F + E_n \simeq 43\ \mathrm{MeV} + 7\ \mathrm{MeV} = 50\ \mathrm{MeV}$$

A very similar result is obtained for the net nuclear potential for protons. (Of course protons also feel a net Coulomb potential exerted by the charges of other protons in the nucleus.) •

Figure 15-15 Illustrating the relation between the depth $V_0$ of a nuclear square well potential of radius $r' = aA^{1/3}$, the Fermi energy $\mathcal{E}_F$, and the binding energy $E_n$ of the last neutron.

There is evidence from a number of studies of the behavior of nucleons of various energies that the depth of the net nuclear potential, $V_0$, is not a constant, but instead it decreases slowly, and approximately linearly, as the energy of the nucleon increases. This causes no difficulty because its effect on the dynamics of nucleon motion in the net potential can be completely described by introducing an effective nucleon mass, in much the same way as we did in Section 13-7 when treating the independent particle motion of a conduction electron in the net potential for a crystal lattice. That is, it is possible to continue treating $V_0$ as a constant with the value we have obtained in Example 15-8, if the actual nucleon mass M is replaced by an effective nucleon mass M*. Furthermore, because the actual change in $V_0$ is slow, M* is not very different from M, and so for most considerations involving nucleons of not too high energy it is permissible to take M* = M, i.e., to completely ignore the fact that $V_0$ is not quite a constant.

There is also a dependence of the depth of the net nuclear potential $V_0$ seen by a proton, or by a neutron, on the difference between the number Z of protons and number N of neutrons that the nucleus contains. This is described by adding to $V_0$ a term $\Delta V_0 \propto \pm(N - Z)/A$, with the plus sign used for the potential seen by a proton and the minus sign used for the potential seen by a neutron. The dependence is a result of the exclusion principle, which restricts the interactions between two protons, or two neutrons, to certain quantum states, but puts no restrictions on the interactions between a proton and a neutron. Consequently, the attractive interaction between two nucleons in a nucleus is stronger between a proton and a neutron than between two protons or between two neutrons. Thus the net nuclear potential acting on a proton is deeper than that acting on a neutron if the nucleus contains more neutrons than protons in proportion to the fractional neutron excess, and vice versa if there is a proton excess. This dependence plays an important role in the effect described by the asymmetry term of the semiempirical mass formula, as we shall indicate. In most other considerations it is not so important and can be ignored.

The tendency for nuclei to have Z = N also has a simple explanation in the Fermi gas model. Consider a nucleus of very small Z, for which the Coulomb force acting between protons can be ignored in comparison to the stronger nuclear force. In this nucleus there are two independent Fermi gases, the neutrons and the protons. Both move in net nuclear potentials which, in this approximation, are the same, basically because the nuclear force acting between neutrons is the same as the nuclear force acting between protons since the nuclear force is charge independent. As is indicated in Figure 15-16, the energy levels of the two systems must then also be the same in this approximation. For a given value of A, the total energy of the nucleus is obviously minimized if the levels are filled with Z = N, because nucleons would occupy levels of energy higher than necessary if this condition were violated. A nucleus can adjust its N and Z values while maintaining a fixed value of A = N + Z by using the beta decay process (discussed in Chapter 16) to convert neutrons to protons, or vice versa. When the argument is made quantitative, it leads to the mathematical expression, (15-27), used in the asymmetry term of the semiempirical mass formula. The reason why the factor 1/A appears in the term is that the levels of a three-dimensional potential well are more closely spaced the larger the value of A. So with increasing A there is a scaling down of the energy penalty, associated with violating the N = Z condition, that is described by the factor $(Z - A/2)^2$.

Figure 15-16 A schematic representation of independent Fermi gases of neutrons and protons in the minimum energy state of a nucleus of very small Z, which is indicated by a square well with rounded edges.

The effect of the term $\Delta V_0 \propto \pm(N - Z)/A$ in the depth of the net nuclear potential, explained previously, also contributes significantly to the presence of the asymmetry term in the semiempirical mass formula, and its consequences. Consider a typical nucleus containing N neutrons and Z protons, with N > Z. The contribution of the $\Delta V_0$ term to the total binding energy from the Z protons is canceled by its contribution from the first Z neutrons.
But there is an uncanceled contribution from the remaining (N - Z) neutrons which decreases the total binding energy, or increases the nuclear mass, in proportion to $(N - Z)^2/A \propto (Z - A/2)^2/A$.

15-8 THE SHELL MODEL

The Fermi gas model establishes the validity of treating the motion of the bound nucleons in a nucleus in terms of the independent motion of each nucleon in a net nuclear potential. The next step is obviously to solve the Schroedinger equation for that potential, and to obtain a detailed description of the behavior of the nucleons. This procedure is employed in the shell model of the nucleus. The shell model plays a role in nuclear physics comparable to that played by the Hartree theory in atomic physics. But the shell model is cruder since the exact form of the net atomic potential is internally determined by the self-consistent atomic theory, while the exact form of the net nuclear potential must be inserted into the nuclear model. Of course, some general information about the net nuclear potential is available from the Fermi gas model.

The procedure of the shell model involves first finding the neutron and proton energy levels for an assumed form of the net potential of a particular nucleus. That is, if each nucleon is treated as moving independently in a net nuclear potential V(r), the nucleon has allowed energy levels which are determined by the form of V(r), and which are found by solving the Schroedinger equation for that potential. The only forms for the net potential considered are spherically symmetrical functions, V(r), where r is the distance from a nucleon to the center of the nucleus; other forms would greatly increase the difficulty of solving the Schroedinger equation. Just as in the Hartree theory of atoms, it is found that the energy of a nucleon energy level of the net nuclear potential V(r) depends on quantum numbers n and l, which specify the radial and angular behavior of a nucleon in the level.
The quantum number l is just the same as the one we encounter throughout atomic physics when dealing with any spherically symmetrical potential like V(r). The quantum number n used in nuclear physics is related to, but not the same as, the quantum number of atomic physics that is symbolized by the same letter. Because of the approximate square well form of the net potential V(r) which arises in nuclear physics, it is more convenient in that field to use what is called the radial node quantum number n.

Figure 15-17 contains schematic illustrations of some of the energy levels, and associated eigenfunctions, of the bound states of a three-dimensional square well V(r). On the left, the n dependence of the energies of the levels is indicated for a well which is wide and deep enough to bind a 1s, 2s, and 3s state. The radial behaviors of the corresponding eigenfunctions $\psi(r,\theta,\varphi) = R(r)\Theta(\theta)\Phi(\varphi)$ are indicated by plotting for each rR(r), whose square is proportional to the radial probability density, using the appropriate energy level as an r axis. The notation 1s means n = 1 and l = 0, as usual. Note that for fixed l, the energy increases with increasing n. The reason is that rR(r) for n = 1 contains essentially one-half of an oscillation within the well region, rR(r) for n = 2 contains two half oscillations, and rR(r) for n = 3 contains three half oscillations. So the eigenfunctions $\psi$ for higher n necessarily have higher curvature, and higher curvature requires higher kinetic or total energy. Note also that the number of nodes within the well of the radial dependence of r times each eigenfunction is just equal to n, as its name implies.

There are bound states in the well of Figure 15-17 for values of l other than l = 0. On the right side of that figure the l dependence, for fixed n, of the energies of the levels, and r times the radial behavior of the corresponding eigenfunctions, are indicated by showing them for the 1s, 1p, and 1d states.
Since all of these have n = 1, all the rR(r) have only one radial node. Nevertheless, the radial behavior of the eigenfunction $\psi$ changes with changing l because of the property expressed by (7-32)

$$\psi \propto R(r) \propto r^l \qquad r \to 0$$

and discussed at length in Chapters 7 and 9. This is the familiar tendency of a particle in states of any spherically symmetrical potential, for which orbital angular momentum is constant so that l is a good quantum number, to avoid the origin more and more as l gets larger. Thus, with increasing l the one-half of an oscillation in the various rR(r) for n = 1 is contained within a smaller and smaller region of the r axis. So the eigenfunctions $\psi$ have higher curvature, and the corresponding energy levels are found higher in the well.

The results concerning three-dimensional square wells that are of most consequence are that the energies of bound levels increase with increasing n, for a given l, and that they also increase with increasing l, for given n. The student should further observe that when using the radial node quantum number n of nuclear physics there is no restriction on the largest possible value of l for a given n. There is such a restriction in atomic physics because the quantum number n used there, called the principal quantum number, is just equal to the sum of the radial node quantum number and the orbital angular momentum quantum number. That is

$$n_{\mathrm{principal}} = n_{\mathrm{radial}} + l$$

Figure 15-17 Left: Illustrating qualitatively the product rR of the radial coordinate r and the radial dependence R of the eigenfunction $\psi$ for states, of the indicated three-dimensional square well, with l = 0 and n = 1, 2, 3. Each is shown by using its energy level as an r axis.
Since the radial probability density is $P = 4\pi r^2 R^*R = 4\pi(rR)^2$, if the student visualizes the squares of the functions depicted he can make comparisons with the radial probability densities for states of a one-electron atom Coulomb potential, or a multielectron atom Hartree net potential, by looking also at Figures 7-5 or 9-10. In so doing, he should keep in mind that the quantum number n is used differently in atomic physics. The fact that the radial node quantum number n of nuclear physics just specifies the number of nodes of rR within the well is made apparent by this figure. Right: The same for states with n = 1 and l = 0, 1, 2. The way that what might be called a centrifugal effect tends to prevent a nucleon from approaching r = 0 as the orbital angular momentum quantum number l becomes larger than 0 is seen in this figure.

Since the minimum value of $n_{\mathrm{radial}}$ is 1, the largest possible value of l for a given $n_{\mathrm{principal}}$ is ($n_{\mathrm{principal}}$ - 1). The reason why $n_{\mathrm{principal}}$ is used in atomic physics is that when V(r) is an attractive Coulomb potential, $V(r) \propto -1/r$, the way the energy of a level increases with increasing $n_{\mathrm{radial}}$ happens to be precisely the same as the way it increases with increasing l. Thus the energy of the levels of a Coulomb potential does not depend on both $n_{\mathrm{radial}}$ and l, but only on their sum $n_{\mathrm{principal}}$. This gives yet another insight into the origin of the degeneracy of the energy levels of the hydrogen atom.

Additional insight into the properties of the quantity rR can be obtained by considering the radial part of the time-independent Schroedinger equation for a spherically symmetrical potential V(r), which is (7-17).
Inspection will show that we can immediately put it in the form

$$-\frac{\hbar^2}{2\mu}\frac{d^2(rR)}{dr^2} + \left[\frac{l(l+1)\hbar^2}{2\mu r^2} + V(r)\right](rR) = E(rR)$$

This is seen to be equivalent to the Schroedinger equation in the function rR for motion in one dimension, r, except that the term $l(l+1)\hbar^2/2\mu r^2 = L^2/2\mu r^2$ is added to the potential V(r). This term is often called the centrifugal potential, for reasons which can be seen by considering the energy conservation equation for a classical particle of mass $\mu$ moving under the influence of a potential V(r). As a particle will move in a plane containing the origin, it can be described by the coordinates r, $\theta$, and the equation is

$$E = \frac{1}{2}\mu\left(\frac{dr}{dt}\right)^2 + \frac{1}{2}\mu\left(r\frac{d\theta}{dt}\right)^2 + V(r)$$

Also the orbital angular momentum of the particle is a constant

$$L = \mu r^2\frac{d\theta}{dt}$$

so the energy equation can be written

$$E = \frac{1}{2}\mu\left(\frac{dr}{dt}\right)^2 + \frac{L^2}{2\mu r^2} + V(r)$$

This is seen to be the energy conservation equation for classical motion in one dimension, if r is the one-dimensional coordinate, with the term $L^2/2\mu r^2$ added to the potential V(r). This positive term acts like a repulsive potential, tending to keep the particle away from the origin. The higher the value of L, the stronger is the effect, in agreement with our usual conclusion.

Note also that for l = 0 the differential equation for rR is mathematically identical to the one-dimensional time-independent Schroedinger equation for $\psi$. This is why the plots of rR in Figure 15-17 for 1s, 2s, and 3s states look so much like the plots of $\psi$ for a one-dimensional square well potential in that they are both sinusoidal within the well and decreasing exponential outside. They are not identical, however, because rR necessarily has the value zero in all states at the point r = 0.
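The two qualitative results about square-well levels (energy rising with n at fixed l, and with l at fixed n) can be checked numerically from the one-dimensional form of the radial equation for rR. The sketch below is one possible approach, not the method used in the text: it integrates a finite-difference version of the equation outward from r = 0 and locates each level by counting sign changes of rR and bisecting on the energy. The well depth of 50 MeV follows Example 15-8; the radius of 5.5 F (roughly a = 1.1 F with A = 125), the rounded value of $\hbar^2/2M$, the sharp well edge, and all function names are assumptions made for illustration.

```python
HBAR2_OVER_2M = 20.7  # hbar^2 / 2M for a nucleon in MeV-F^2 (assumed, rounded)


def count_nodes(energy, l, depth, radius, r_max=20.0, steps=2000):
    """Count sign changes of u = rR on (0, r_max] at a trial energy.

    Uses u(r+h) = 2u(r) - u(r-h) + h^2 [V_eff(r) - E] u(r) / (hbar^2/2M),
    where V_eff is the square well plus the centrifugal term.
    """
    h = r_max / steps
    u_prev, u = 0.0, 1.0  # u(0) = 0; the starting slope is arbitrary
    nodes = 0
    for i in range(1, steps):
        r = i * h
        v_eff = (-depth if r < radius else 0.0)
        v_eff += HBAR2_OVER_2M * l * (l + 1) / r ** 2
        u_next = 2 * u - u_prev + h * h * (v_eff - energy) / HBAR2_OVER_2M * u
        if u_next * u < 0:
            nodes += 1
        u_prev, u = u, u_next
    return nodes


def level_energy(n_radial, l, depth=50.0, radius=5.5):
    """Energy (MeV, zero at the top of the well) of the level (n_radial, l).

    Returns None if the level is not bound. The number of sign changes at
    energy E equals the number of levels lying below E, so bisection on
    that count brackets the level.
    """
    low, high = -depth, 0.0
    if count_nodes(high, l, depth, radius) < n_radial:
        return None
    for _ in range(40):
        mid = 0.5 * (low + high)
        if count_nodes(mid, l, depth, radius) >= n_radial:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    for label, n, l in [("1s", 1, 0), ("1p", 1, 1), ("1d", 1, 2), ("2s", 2, 0)]:
        print(label, level_energy(n, l))
```

With l = 0 the centrifugal term vanishes and the calculation reduces to a one-dimensional square well with a wall at r = 0, which is the correspondence noted in the text.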
Having found the nucleon energy levels in the assumed square-well-like form of the net nuclear potential V(r), the next step of the shell model is to "construct" the nucleus by filling them, in order of increasing energy, with the N neutrons and Z protons that the nucleus contains. The exclusion principle limits the occupancy of each level to 2(2l + 1) neutrons, or protons. This occupancy corresponds to the 2 possible values of the quantum number $m_s$, which specifies the orientation of the intrinsic spin angular momentum of a nucleon, and the (2l + 1) possible values of the quantum number $m_l$, which specifies the orientation of the orbital angular momentum of the nucleon. These two z component angular momentum quantum numbers are the same as in the Hartree theory of atoms. And the procedure for constructing a nucleus by filling its nucleon energy levels is just the same as that used in the Hartree theory to construct an atom by filling its electron energy levels, except that in a nucleus there are particles of two distinguishable species (the neutrons and the protons) to which the exclusion principle applies independently.

Originally, it was hoped that a particular form for the potentials V(r) of the various nuclei could be found in which the ordering and spacing of the nucleon energy levels would be such that an unusually tightly bound level, containing an appropriate number of neutrons or protons, would completely fill in those nuclei having values of N or Z equal to the magic numbers, just as the filling of unusually tightly bound electron energy levels leads to the noble gas atoms for Z equal to the atomic magic numbers. Many different detailed forms for the radial dependence of the nuclear potential were tried (including one aptly called the "wine bottle potential," a square well with a bump centered in the bottom, like the profile of a wine bottle bottom, which suppresses somewhat the l dependence of the energy).
It was found that there is no form for V(r) which leads even to the ordering of the nucleon energy levels required to explain the magic numbers.

The mystery of the magic numbers was solved in 1949 by Mayer, and independently by Jensen, who introduced the idea of a nuclear spin-orbit interaction. They proposed that each nucleon in a nucleus feels, in addition to the net nuclear potential, a strong inverted spin-orbit interaction proportional to S • L, the dot product of its spin and orbital angular momentum vectors. Strong means that the interaction energy is much (about 20 times) larger than would be predicted by using the atomic spin-orbit formula, (8-35), equating V(r) to the net nuclear potential and m to the nucleon mass. Inverted means that the energy of the nucleon is decreased when S • L is positive, and increased when it is negative. Thus the sign of the interaction is opposite to the sign of the magnetic spin-orbit interaction experienced by an electron in an atom; that is, the interaction energy is negative when the total angular momentum of the nucleon J = S + L has its maximum possible magnitude (i.e., when S and L are as parallel as possible, and S • L is positive). However, as the magnitude of the spin-orbit interaction is proportional to S • L just as it is for an atomic electron, the magnitude of the spin-orbit splitting of the nucleon energy levels will be approximately proportional to the value of the quantum number l, just as it is for the electron energy levels. Although there are similarities between the atomic and nuclear spin-orbit interactions, their differences make it clear that the latter is not magnetic in origin. Instead, it is an attribute of the nuclear force whose origin will be explained in Chapter 17.
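The statement that the splitting grows roughly in proportion to l can be made explicit with the standard evaluation of S • L. The following is a short worked step that uses only J = S + L and s = 1/2; the radial factor multiplying S • L in the interaction is left unspecified here, and treating it as roughly the same for the two members of a doublet is an assumption.

$$\mathbf{S}\cdot\mathbf{L} = \tfrac{1}{2}\left(J^2 - L^2 - S^2\right) \quad\Rightarrow\quad \langle\mathbf{S}\cdot\mathbf{L}\rangle = \frac{\hbar^2}{2}\left[j(j+1) - l(l+1) - s(s+1)\right]$$

$$j = l + \tfrac{1}{2}:\quad \langle\mathbf{S}\cdot\mathbf{L}\rangle = +\frac{\hbar^2}{2}\,l \qquad\qquad j = l - \tfrac{1}{2}:\quad \langle\mathbf{S}\cdot\mathbf{L}\rangle = -\frac{\hbar^2}{2}\,(l+1)$$

$$\langle\mathbf{S}\cdot\mathbf{L}\rangle_{j=l+1/2} - \langle\mathbf{S}\cdot\mathbf{L}\rangle_{j=l-1/2} = \frac{\hbar^2}{2}\,(2l+1)$$

For an inverted interaction the j = l + 1/2 level is therefore the lower of the pair, and the separation between the two scales as (2l + 1), which is approximately proportional to l for the large-l levels that matter most for the higher magic numbers.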
The left-hand part of Figure 15-18 shows the ordering and approximate spacing of the energy levels which nucleons are filling in nuclei with potentials V(r) in the form of square wells with rounded edges, like the potential shown in Figure 15-16. As the levels are filled, in proceeding up the periodic table, the depth of the potentials is held constant while their radii increase in proportion to the cube root of the number of nucleons they contain in the filled levels. The same general features seen in the left part of Figure 15-18 are found in all spherically symmetrical potentials that have a form bearing any resemblance to an attractive square well. Of course, the details of the ordering and spacing of the nucleon energy levels depend on the details of the competition between the n dependence and the l dependence of the energy, and this depends on the details of the radial behavior of the nuclear potential; but any reasonable nuclear potential gives essentially the same ordering of the levels according to n and l as that for square wells with rounded edges, and it also gives gaps between the levels in essentially the same places. Since, as we saw in Example 15-8, the net nuclear potential is related to the nuclear mass density, square wells with rounded edges are most certainly the correct forms for the potential as they reflect the constant interior values, and fairly gradual changes at the nuclear surface, of the mass densities. But as we have already said, and will see specifically in Example 15-9, the ordering and spacing of the energy levels for these potentials, shown in the left-hand part of Figure 15-18, does not lead to the observed magic numbers if there is no spin-orbit interaction.

The right-hand part of Figure 15-18 shows how the nucleon energy levels are split by the nuclear spin-orbit interaction.
In the presence of the spin-orbit interaction, $m_l$ and $m_s$ are no longer useful quantum numbers because the z components of the orbital and intrinsic spin angular momenta of a nucleon are no longer constants when these angular momenta are coupled by the interaction. Thus n, l, j, $m_j$ must be used to label the split energy levels. The quantum number j specifies the magnitude of the total angular momentum, J, of a nucleon, which is the sum of its spin and orbital angular momenta; and $m_j$ is the quantum number specifying the z component of its total angular momentum, $J_z$.

Figure 15-18 Left: The order of filling, as the occupancy and well radius increase, of the levels of rounded edge square wells with no spin-orbit interaction. Right: The levels that arise when a strong inverted S • L interaction is added. The column marked (2j + 1) shows the number of like nucleons that may occupy the corresponding level without violating the exclusion principle. The column marked $\Sigma(2j + 1)$ gives for each level the cumulative number of nucleons that lie in all levels up through that level. Significant energy gaps lie above each of the levels marked with a magic number in the last column.
As a result of the spin-orbit interaction, the energies of the levels depend on j as well as on n and l, with the larger j (corresponding to the larger value of J, or S • L) yielding the smaller energy since the sign of the nuclear spin-orbit interaction is inverted. According to the exclusion principle, each of these levels has a capacity of (2j + 1), which is equal to the number of possible values of $m_j$. This is shown in the first column on the right in the figure. The second column shows the total capacity of the levels up to and including the level in question. The third column shows the same thing for each level which lies unusually far below the next higher level. Since these are the levels which will be unusually tightly bound, we see that the shell model with strong inverted spin-orbit interaction predicts precisely the magic numbers of (15-32).

Figure 15-18 is so frequently used by nuclear physicists that many of them have it memorized. An easier procedure is to construct it by using the acrostic

spuds if pug dish of pig

which means: (eat) potatoes if the pork is bad. Deletion of all vowels, except the last, yields

spdsfpgdshfpig

This is the ordering of l for all the unsplit levels, through those leading to the magic number 126. The values of n are assigned easily since the first s level is 1s, the second is 2s, etc. The remainder of the figure is constructed by applying an inverted spin-orbit splitting, proportional to l.

It should also be pointed out that Figure 15-18 is not an energy-level diagram for any particular nucleus; instead it gives the order in which the nuclear levels appear below the Fermi energy as the radius of the nuclear potential increases in proportion to $A^{1/3}$. That is, it gives the order in which the highest energy levels of the various nuclei fill.
It also gives an indication of the relative magnitudes of the separation between adjacent levels as they are filling. So it is analogous to the diagram that could be constructed for atoms by using only the left side of Figure 9-14. Finally, we should mention that there is some recent experimental and theoretical evidence showing that there may be small but important changes from Figure 15-18 in the filling order of the highest levels in the case of protons. We shall discuss this in Section 16-2.

Example 15-9. Use Figure 15-18 to predict the first four magic numbers for nuclei with potentials in the form of square wells with rounded edges (a) under the assumption that there is no spin-orbit interaction, and (b) under the assumption that there is a strong inverted spin-orbit interaction.

(a) If there is no spin-orbit interaction then the nucleon energy levels are simply those shown on the left-hand part of the figure. Recalling that the capacity of each level is 2(2l + 1), and that s, p, d, f, g, ... mean l = 0, 1, 2, 3, 4, ..., we see that the first few levels, and their capacities, are, in order of increasing energy: 1s, capacity 2; 1p, capacity 6; 1d, capacity 10; 2s, capacity 2; 1f, capacity 14; 2p, capacity 6; 1g, capacity 18. The first magic number will be the number of nucleons required to fill the first level, i.e., 2. The next magic number will be the number required to fill the first two levels, i.e., 2 + 6 = 8. If the third and fourth levels are very close in energy, as indicated in the figure, the next magic number will be the number of nucleons required to fill the first four levels, i.e., 2 + 6 + 10 + 2 = 20. So far these magic numbers are in agreement with the observed magic numbers: 2, 8, 20, 28, 50, 82, 126. But the next magic number predicted in the absence of spin-orbit interaction will be the total number of nucleons required to fill the first five levels, or the first six levels, depending on whether or not the fifth and sixth levels are considered to be very close in energy. The two possibilities are 2 + 6 + 10 + 2 + 14 = 34, or 2 + 6 + 10 + 2 + 14 + 6 = 40. Both disagree with the observed magic number 28. Similar numerology will make it apparent that the higher predicted magic numbers also disagree with those that are observed, and that there is no way to remove the discrepancy by rearranging the spacing, or even the ordering, of the nucleon energy levels in the absence of spin-orbit interaction.

(b) If there is a strong inverted spin-orbit interaction, then the nucleon levels are split into the filling pattern shown on the right-hand part of Figure 15-18. The figure also shows the capacity (2j + 1) of each level, as well as the sum Σ(2j + 1) of its capacity and the capacity of all the lower energy levels, as explained in the text. The spin-orbit interaction splitting does not change the first three predicted magic numbers, 2, 8, 20, as is clear from the figure, so the agreement with observation is maintained. But agreement is also obtained with the higher magic numbers. For instance, the spin-orbit interaction splits the 1f level into the 1f7/2, whose energy is depressed, and the 1f5/2, whose energy is elevated. Since the capacity of the 1f7/2 level is (2j + 1) = 2 x 7/2 + 1 = 8, the magic number after 20 is predicted to be 20 + 8 = 28, in agreement with the observation. The observed magic number 50 is obtained because the 1g9/2 level, with a capacity of 2 x 9/2 + 1 = 10, is depressed in energy and so comes close to the 2p level. Since the total number of nucleons filling the levels up to and including the 2p is 40, as we saw earlier, the total number filling the levels up to and including the 1g9/2 is 40 + 10 = 50.
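The counting in this example can be reproduced with a few lines of Python. This is only a sketch: it assumes the unsplit ordering given by the acrostic letters and the capacities 2(2l + 1) and (2j + 1) quoted in the text, and the helper names (`unsplit_levels`, `subshell_capacity`) are invented here for illustration.

```python
# Unsplit level ordering taken from the acrostic "spdsfpgdshfpig".
ORDER = "spdsfpgdshfpig"
L_OF = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4, "h": 5, "i": 6}


def unsplit_levels(order=ORDER):
    """Return (name, capacity, running total) for each unsplit level."""
    seen = {}      # how many times each letter has appeared so far (gives n)
    levels = []
    total = 0
    for letter in order:
        seen[letter] = seen.get(letter, 0) + 1
        capacity = 2 * (2 * L_OF[letter] + 1)   # 2(2l + 1), no spin-orbit splitting
        total += capacity
        levels.append((f"{seen[letter]}{letter}", capacity, total))
    return levels


def subshell_capacity(two_j):
    """Capacity (2j + 1) of a spin-orbit split level; j is passed as 2j."""
    return two_j + 1


levels = unsplit_levels()
running = {name: total for name, _, total in levels}

# Part (a): running totals with no spin-orbit interaction.
for name, capacity, total in levels[:7]:
    print(name, capacity, total)

# Part (b): the depressed 1f7/2 and 1g9/2 levels close the shells at 28 and 50.
print(running["2s"] + subshell_capacity(7))
print(running["2p"] + subshell_capacity(9))
```

The running totals for part (a) pass through 34 and 40 but never 28, while the two sums for part (b) give the totals 28 and 50 obtained in the example.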
Inspection of Figure 15-18 makes the origin of the remaining magic numbers apparent. Note that the fact that the spin-orbit splitting increases in magnitude with increasing l plays an important role in achieving agreement with the observations.

15-9 PREDICTIONS OF THE SHELL MODEL

The shell model can do much more than predict the magic numbers, and all their consequences. For instance, it can also predict the total angular momentum of the ground states of almost all the nuclei. Consider nuclei for which both N and Z are magic, such as ₈O¹⁶, ₂₀Ca⁴⁰, and ₈₂Pb²⁰⁸. According to the model, they will contain only completely filled subshells of neutrons and protons, and the exclusion principle therefore requires that, for both the neutron and proton systems, the intrinsic spin and orbital angular momentum vectors of all the nucleons couple together (add up) to yield zero total angular momentum. (The formal proof of this obvious requirement is essentially the same as that given in Appendix P.) This agrees with the measurements, discussed in Section 15-2, which show that for these nuclei the total angular momentum quantum number, called the nuclear spin, is i = 0.

For nuclei which contain a magic number of nucleons of one type, and a magic number plus, or minus, one of nucleons of the other type, the exclusion principle demands that the total angular momentum of the nucleus be the total angular momentum of the extra nucleon, or (compare Appendix P) of the hole. For such nuclei the nuclear spin i should equal the total angular momentum quantum number j of the extra nucleon, or hole.

Example 15-10. Use Figure 15-18, and the exclusion principle argument just stated, to predict the ground state spins of the following nuclei: (a) ₇N¹⁵, (b) ₈O¹⁷, (c) ₁₉K³⁹, (d) ₈₂Pb²⁰⁷, and (e) ₈₃Bi²⁰⁹.

(a) Figure 15-18 predicts that ₇N¹⁵ is doubly magic except for a proton hole in the 1p1/2 subshell. So it should have a spin i equal to the value j = 1/2 for that subshell.
This prediction agrees with measurement. It will also be obtained from a somewhat different point of view in Example 15-11.

(b) The figure predicts that ₈O¹⁷ is doubly magic except for an extra neutron in the 1d5/2 subshell. So it should have i = j = 5/2, in agreement with measurement.

(c) ₁₉K³⁹ is predicted to be doubly magic except for a proton hole in the 1d3/2 subshell, so it should have i = j = 3/2. It does.

(d) According to Figure 15-18, ₈₂Pb²⁰⁷ is doubly magic except for a neutron hole in the 1i13/2 subshell. So the exclusion principle predicts that it should have a spin i = j = 13/2. However, the measured spin is i = 1/2. This is not a failure of the exclusion principle, but instead is a failure of Figure 15-18, as we shall explain shortly.

(e) The figure predicts that ₈₃Bi²⁰⁹ is doubly magic except for an extra proton in the 1h9/2 subshell. So its spin should be i = j = 9/2. This agrees with measurement.

Now consider nuclei for which N and/or Z are not near magic numbers. These nuclei contain subshells with several nucleons, or holes, and the problem of how the intrinsic spin and orbital angular momenta of these nucleons couple is much the same problem as that studied in Chapter 10 in connection with the behavior of electrons in atoms. But there are important differences between atoms and nuclei in this regard. One is that most atoms obey what is called LS coupling, while essentially all nuclei obey what is called JJ coupling. The difference in the angular momentum coupling schemes obeyed by atoms and nuclei has to do with the fact that the spin-orbit interaction is relatively weak in atoms, and quite strong in nuclei (see Section 10-3). Thus in nuclei the spin-orbit interaction dominates the coupling. That is, in JJ coupling, the intrinsic spin angular momentum of a nucleon couples strongly with its own orbital angular momentum to form the total angular momentum for that nucleon. This happens for each nucleon.
Finally, the several total angular momenta that have been formed couple together less strongly to form the total angular momentum for the nucleus.

Another difference between the angular momentum couplings in atoms and nuclei is that the final coupling which forms the total angular momentum of the nucleus is particularly simple. This is apparent from the fact that all nuclei with even N and even Z are found to have a total angular momentum given by i = 0, as stated in Section 15-2. An explanation is that, whenever there is an even number of nucleons of a given species in a subshell, the total angular momenta of each of these nucleons couple together to yield a total angular momentum for the nucleus, which is zero. This is true, but the coupling is even simpler. There is much evidence indicating that the total angular momenta of the protons in a subshell couple together in pairs, with the total angular momentum of each pair of protons equal to zero, and that the same thing happens for pairs of neutrons in a subshell.

Some of the evidence for the pairing tendency has been presented before in discussing the abundance of stable nuclei, and the semiempirical mass formula. It arises from a pairing interaction. This is a residual nuclear interaction, i.e., a part of the total nuclear interaction experienced by the nucleons that is not described by the spherically symmetrical net potential V(r) of the shell model, or by the spin-orbit interaction. Although not described by these attributes of the shell model, the pairing interaction can be predicted from them. The net potential V(r) represents the interactions experienced by a nucleon on the average. The pairing interaction represents a departure from the average interaction described by V(r), that arises when the nucleon is particularly close to another nucleon with which it can have an individual interaction.
It involves the collision of nucleons in degenerate states of a partly filled subshell, mentioned in Section 15-7. A pair of nucleons having the same values of j but opposite values of m_j (e.g., j = 5/2, m_j = 5/2; j = 5/2, m_j = -5/2) collide with each other in such an interaction, and after the collision enter previously empty states that have different but still opposite values of m_j (e.g., j = 5/2, m_j = 3/2; j = 5/2, m_j = -3/2). It is clear that angular momentum is conserved in such collisions, and that the collisions are not inhibited by the exclusion principle. The energy of the system is reduced because, when colliding, the nucleons are particularly close together, and the exclusion principle does not prevent them from exerting on each other the strongly attractive short range nuclear force.

Because the nuclear force exerted between two nucleons is strong and short range, the departures from the average described by the pairing interaction are pronounced. Thus the pairing interaction is fairly strong, although it is less strong than the spin-orbit interaction. It is short range, just like the nuclear force leading to the fluctuation it represents. It is attractive because that force is attractive.

A similar interaction resulting from a departure from the average, called the residual Coulomb interaction, arises in the treatment of atoms, as we have seen in Section 10-3. In atoms, the repulsive residual Coulomb interaction between the electrons in a subshell tends to make them form parallel couplings of their angular momenta. In nuclei the tendency is for antiparallel couplings because the residual nuclear interaction between the nucleons is attractive. The reason can be understood by carrying through arguments similar to those used for the atomic couplings (see Section 10-3), in the case of an attractive residual interaction.
Briefly, these arguments show that since two nucleons of the same species are described by an antisymmetric total eigenfunction, on the average they are closer to each other if their spin angular momenta are essentially antiparallel. Also they are closer on the average if their orbital angular momenta are essentially antiparallel, because then they move in opposite directions around the same "orbit" and so frequently pass by each other. Thus they form a closely spaced pair if their total angular momentum vectors are essentially antiparallel. When they form such a closely spaced pair with zero total angular momentum, the attractive nuclear force acting between them makes a larger contribution to the binding energy of the nucleus, and so makes the nucleus more stable. Hence the tendency to form a pair, and maintain essentially antiparallel total angular momentum vectors throughout their sequence of collisions with each other. These collisions change the orientation of their orbit, but they always move in opposite directions through whatever orbit they happen to be in.

The energy decrease, arising from the coupling of a pair of nucleons of the same type, or pairing energy, gives rise to the preference for nuclei to have even Z and even N, and to the pairing term of the semiempirical mass formula. It is also responsible for the occasional failure of Figure 15-18 to predict correctly the ground state nuclear spins. For the case of ₈₂Pb²⁰⁷, considered in Example 15-10, the nuclear spin is 1/2 because it is energetically favorable for a neutron from the 3p1/2 subshell to pair with the odd neutron in the 1i13/2 subshell, leaving a hole in the 3p1/2 subshell.
The reason is that the pairing energy is larger the larger the l values of the components of the pair, because with increasing l the nucleons move in a more classical way (i.e., more like particles confined to orbits in a plane), and this increases the overlap of their wave functions (i.e., they get closer together). Since the two subshells have very nearly the same energy, the pairing effect dominates.

If a subshell contains an even number of nucleons, their total angular momenta should couple together in pairs to yield zero total angular momentum. If one more nucleon is added, it should be difficult for it to disturb the pairs that were already there, because the pairing interaction is fairly strong. Thus the total angular momentum of the whole subshell should be due entirely to the odd nucleon. Therefore, the entire angular momentum of an odd-A nucleus should be due to the total angular momentum of the single odd nucleon in the highest energy occupied subshell, and the nuclear spin i should be equal to the value of the quantum number j for that subshell. With only one or two exceptions, this rule allows the observed values of i for all odd-A nuclei to be explained in terms of Figure 15-18. It is, however, necessary to allow for occasional interchanges of the filling order of some closely spaced levels because of the pairing effect discussed in the preceding paragraph.

For odd-A nuclei, the shell model is also quite successful in predicting the parities of the nuclear eigenfunctions, i.e., whether they are even or odd functions of their space variables (see (8-44) and (8-45)). Because the nucleons in the shell model are, basically, moving independently, a nuclear eigenfunction can be written as a product of the eigenfunctions for each of its nucleons just as in the Hartree theory of atoms. We shall see in Example 15-11 that the parity of the nuclear eigenfunction is just the parity of the eigenfunction for the odd nucleon.
Because (8-47) shows that the parity of that eigenfunction is determined by $(-1)^l$, we find that if the odd nucleon is in a subshell in which l is even, the nuclear parity is even; if l is odd, the parity is odd. In the next chapter we shall find that the nuclear parity is extremely important in determining the types of transitions that occur in certain kinds of radioactivity and nuclear reactions because there are selection rules that involve parity.

It should be apparent that the shell model predicts that for even-A nuclei, with N and Z even, the nuclear spin is i = 0 and the nuclear parity is even. This agrees with experiment. For even-A nuclei, with N and Z odd, the value of j and the parity of the eigenfunctions are predicted for each of the two odd nucleons. From this the nuclear parity can be obtained immediately, but it is only possible to set limits on the nuclear spin and to say that it must have an integral value. However, there are only a few odd-N, odd-Z nuclei.

The arguments of the last two paragraphs can also be extended to provide information about the spins and parities of low-lying excited states of nuclei. As we shall see later, this information is dependable only if the N and/or Z values lie near the magic numbers.

Example 15-11. Predict the ground state nuclear spin and parity for the following nuclei: (a) ₈O¹⁶, (b) ₈O¹⁷, (c) ₈O¹⁸, (d) ₇N¹⁵, (e) ₇N¹⁴.

(a) The ₈O¹⁶ nucleus has even N and even Z, and it is also doubly magic since both N and Z equal 8. It has two neutrons in the 1s1/2 subshell which couple together in a pair to yield zero total angular momentum. Both of these neutrons are described by even parity eigenfunctions, since l = 0, so their part of the product eigenfunction for the nucleus is even. There are four neutrons in the 1p3/2 subshell, that couple into two pairs, both of which have zero total angular momentum.
All four of these neutrons are described by odd parity eigenfunctions since l = 1, but the product of four odd functions is an even function, so their part of the product eigenfunction for the nucleus is also even. There are two neutrons in the 1p1/2 subshell, which form a pair of zero total angular momentum. They contribute two odd eigenfunctions to the product eigenfunction for the nucleus, so their part of the product eigenfunction is also even. Exactly the same remarks apply to the protons. The net result is that the nuclear spin is zero, and the nuclear parity is even.

(b) ₈O¹⁷ is an odd-N, even-Z nucleus. Its neutrons and protons are doing the same things as the neutrons and protons in ₈O¹⁶, except that it has a single extra unpaired neutron in a 1d5/2 subshell. This gives the nucleus a spin of i = 5/2. The parity of the eigenfunction for the unpaired neutron is even since l = 2, so the nuclear parity is even.

(c) ₈O¹⁸ is an even-N, even-Z nucleus. The predicted spin and parity are i = 0, and even. The reasons are that there are two neutrons in the 1d5/2 subshell, which form a pair of zero total angular momentum, and which both have even parity eigenfunctions.

(d) ₇N¹⁵ is an even-N, odd-Z nucleus. Its neutrons and protons behave as in ₈O¹⁶, except that it has only one unpaired proton in the 1p1/2 subshell. This odd proton gives the nucleus a spin of i = 1/2. Since the eigenfunction for the proton is odd because l = 1, the nuclear parity is odd. Note that we predicted the nuclear spin, from a somewhat different point of view, in Example 15-10.

(e) ₇N¹⁴ is an odd-N, odd-Z nucleus. It has an unpaired proton in the 1p1/2 subshell, and also an unpaired neutron in the 1p1/2 subshell. Both have a total angular momentum quantum number of j = 1/2. We cannot say precisely what the nuclear spin should be without knowing how these two different particles couple their angular momenta. But we can say that there are only two possibilities for the nuclear spin, i = 0, or i = 1.
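The two possibilities just quoted, and the parity rule used throughout this example, can be checked with a short Python sketch. It assumes the standard angular momentum addition rule, that i runs from |j1 - j2| to j1 + j2 in integer steps, which the text uses implicitly, together with the $(-1)^l$ parity factor for each unpaired nucleon. The function names are invented for illustration.

```python
from fractions import Fraction


def allowed_spins(j1, j2):
    """Possible total spins when two angular momenta j1 and j2 couple.

    Assumes the standard rule |j1 - j2| <= i <= j1 + j2 in integer steps.
    """
    j1, j2 = Fraction(j1), Fraction(j2)
    lowest, highest = abs(j1 - j2), j1 + j2
    return [lowest + k for k in range(int(highest - lowest) + 1)]


def product_parity(l_values):
    """Parity of a product eigenfunction: each nucleon contributes (-1)**l."""
    sign = 1
    for l in l_values:
        sign *= (-1) ** l
    return "even" if sign == 1 else "odd"


# Part (e): unpaired proton and unpaired neutron, both in 1p1/2 (j = 1/2, l = 1).
n14_spins = allowed_spins(Fraction(1, 2), Fraction(1, 2))
n14_parity = product_parity([1, 1])
print([str(i) for i in n14_spins], n14_parity)

# Parts (b) and (d): a single unpaired nucleon fixes the parity.
print(product_parity([2]), product_parity([1]))
```

The sketch only sets limits on the spin of the odd-N, odd-Z nucleus, as the text says; choosing between the allowed values requires knowing how the two nucleons actually couple.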
Experiments show that i = 1 is the correct value. We can predict unambiguously that the nuclear parity will be even, since the unpaired proton and the unpaired neutron both contribute an odd eigenfunction to the product eigenfunction for the nucleus, and the product of two odd functions is an even function. This prediction is borne out by the experiments, as are all the predictions made in the earlier parts of this example.

The shell model is not so successful in predicting the magnetic dipole moments of nuclei. It says that the magnetic dipole moment of an odd-A nucleus (i.e., even N and odd Z, or odd N and even Z) should be due entirely to that of the single odd (unpaired) nucleon. The reason is that the magnetic dipole moments of the other nucleons would be expected to cancel out in pairs, if their total angular momenta do the same. The experimental data are illustrated in the two parts of Figure 15-19, for even-N, odd-Z nuclei and for odd-N, even-Z nuclei. The data are obtained in the manner indicated in Section 15-2. Also shown in the figure are the so-called Schmidt lines, which represent the predictions of the shell model for cases in which the spin and orbital angular momenta of the odd nucleon are either essentially parallel or essentially antiparallel, that is for the two possible cases j = l + 1/2 or j = l - 1/2. The data show only a barely recognizable tendency to follow qualitatively the predictions of the shell model. The failure in the model is due to its assumption that the nuclear magnetic dipole moment is due entirely to the single odd nucleon.
It is not true that all the other nucleons are always paired off with total angular momenta and magnetic dipole moments that strictly cancel. The assumption is good enough to lead to the prediction of correct magnitude for the total angular momentum of the nucleus, since this quantity is quantized. If occasionally the pairs have a nonzero total angular momentum, then at that time the odd nucleon must have exactly the right total angular momentum to compensate and keep the magnitude of the total angular momentum of the nucleus constant. This kind of compensation cannot also take place for the magnetic dipole moments since the g factors, which relate the magnitudes of the magnetic dipole moments to the magnitudes of the angular momenta, change as the angular momentum couplings change (see Section 10-6). And since the nuclear magnetic dipole moment does not have a quantized magnitude, there is nothing to enforce such a compensation.

[Figure 15-19: two panels, Odd-Z (top) and Odd-N (bottom), each showing measured magnetic dipole moments against nuclear spin (1/2 to 9/2) together with the upper and lower Schmidt lines.]

Figure 15-19 Top: Measured magnetic dipole moments of even-N, odd-Z nuclei and the shell model predictions. The upper line is the prediction if the spin and orbital angular momenta of the odd proton are assumed to be essentially parallel (j = l + 1/2), and the lower line is the prediction if they are assumed to be essentially antiparallel (j = l - 1/2). Bottom: The same for odd N, even Z. Here, the lower line is for the "parallel" assumption and the upper line is for the "antiparallel" assumption.

15-10 THE COLLECTIVE MODEL

The shell model is based upon the idea that the constituent parts of a nucleus move independently. The liquid drop model implies just the opposite, since in a drop of incompressible liquid the motion of any constituent part is correlated with the motion of all the neighboring parts. The conflict between these ideas emphasizes that a model provides a description of only a limited set of phenomena, without regard to the existence of contrary models used for the description of other sets. A theory, such as relativity or quantum theory, provides a description of a very large set of phenomena. At the border lines between its own set of phenomena and other sets of phenomena, a theory fuses without conflict into the theories used for the description of the other sets.

As nuclear physics evolves, attempts are made to remove conflicts between various models and unify them into more comprehensive models. The most successful and most important example is the collective model of the nucleus, which combines certain features of the shell and liquid drop models. It is partly the work of Aage Bohr, whose father developed the Bohr model of the atom.

The collective model assumes that the nucleons in unfilled subshells of a nucleus move independently in a net nuclear potential produced by the core of filled subshells, as in the shell model. However, the net potential due to the core is not the static spherically symmetrical potential V(r) used in the shell model; instead it is a potential capable of undergoing deformations in shape. These deformations represent the correlated, or collective, motion of the nucleons in the core of the nucleus that are associated with the liquid drop model. As in the shell model, the nucleons fill the energy levels of the potential, which are split by the same spin-orbit interaction and lead to the same magic numbers, and nuclear spin and parity predictions.

Consider a nucleus with one more than a magic number of nucleons. Inspection of the shell model energy levels of Figure 15-18 will show that the extra nucleon will have a relatively large orbital angular momentum. Classically, it will move in an orbit of relatively large radius, near the surface of the core of completely filled subshells. Because of the attractive nuclear interaction between the extra nucleon and the nucleons in the core, the core is distorted. Bulges circulate around the surface of the core, following the motion of the extra nucleon. The effect is very much like the tides at the surface of the earth, which follow the motion of the moon, and arise from the attractive gravitational interaction. If there are two extra nucleons of the same species, classically they will move in opposite directions around the surface of the core in orbits that are essentially in the same plane. The reason is that their pairing interaction produces "antiparallel" coupling of their angular momenta. This increases the distortion of the core.

Physically, the distortion of the core affects the motion of the extra nucleons. Mathematically, this is handled by distorting the net potential in which these nucleons move. One result is a considerable complication of the necessary task of solving the Schroedinger equation for the potential. Another result is a considerable extension of the set of phenomena that can be described accurately by the model.
For instance, in the collective model, part of the total angular momentum of the nucleus is carried in the form of orbital angular momentum by the "tidal waves" circulating around the surface of the core. A moving deformation, partly composed of protons, constitutes a current that produces a magnetic dipole moment proportional to its angular momentum. This is also true in the case of the single moving nucleon that the shell model says is totally responsible for the nuclear magnetic dipole moment, but the proportionality constants differ. The moving deformation produces less magnetic dipole moment than a moving proton, and more than a moving neutron, relative to the angular momentum it carries. These changes are exactly what is required to remove the discrepancies between the measured nuclear magnetic dipole moments and the shell model predictions, shown by the Schmidt lines in Figure 15-19. The student may notice an analogy between the behavior of two electrons always moving in opposite directions with antiparallel spins in a Cooper pair of a superconductor, and two neutrons or two protons always moving in opposite directions in an unfilled subshell of a nucleus with spins that, because of the nuclear pairing interaction, are also antiparallel. Another analogy is that in both cases the behavior of a pair of interacting particles influences, and is influenced by, the behavior of the other particles in the system, which move collectively. Analogies are also found between the mathematical procedures used in BCS superconductivity calculations and in nuclear collective model calculations. A nuclear property which can be explained quite well in terms of the collective model is the electric quadrupole moment q. 
The hyperfine splitting measurements yielding q were briefly explained in Section 15-2, and there it was also stated that q is a measure of the departure from spherical symmetry of the nuclear charge distribution, as observed in measurements such as hyperfine splitting which are sensitive to the average of this departure over a sample containing many nuclei. The exact definition of the electric quadrupole moment is

$q = \int \rho\,[3z^2 - (x^2 + y^2 + z^2)]\,d\tau$    (15-35)

where ρ is the average nuclear charge density in units of proton charges, and where the three-dimensional integral is taken over the nuclear volume with dτ the volume element. Note that q is equal to Z, the number of protons in the nucleus, multiplied by the average over ρ of the difference between three times the square of the z coordinate and the sum of the squares of all the coordinates. That is

$q = Z\,[3\overline{z^2} - (\overline{x^2} + \overline{y^2} + \overline{z^2})]$    (15-36)

It is clear then that q = 0 if the average nuclear charge density ρ is spherically symmetrical, since in that case $\overline{x^2} = \overline{y^2} = \overline{z^2}$. If ρ is not spherically symmetrical, it must at least have symmetry about the axis of the cone on which the total angular momenta of the nuclei are found. In typical cases the average charge density is an ellipsoid with such a symmetry axis. For (15-35) and (15-36), the symmetry axis is taken as the z axis. The second of these equations shows immediately that q > 0 if ρ is elongated in the z direction so that $\overline{z^2} > \overline{x^2} = \overline{y^2}$, and that q < 0 if ρ is flattened in the z direction so that $\overline{z^2} < \overline{x^2} = \overline{y^2}$.

The measured values of the average nuclear electric quadrupole moment q are shown in Figure 15-20. Some features of the data shown in the figure can be understood qualitatively in terms of the shell model. For example, that model predicts q < 0 for an even-N, odd-Z nucleus with Z equal to a magic number plus one.
The reason is that the nucleus contains only completely filled proton subshells, which have a spherically symmetrical charge distribution, plus one odd proton moving in an "orbit" near a plane perpendicular to its symmetry axis. Thus the charge distribution is flattened in the direction of the symmetry axis. For an even-N, odd-Z nucleus with Z one less than a magic number, the shell model correctly predicts q > 0 since this nucleus would contain one proton hole (the absence of charge) moving in a similar orbit. These shell model arguments are illustrated in Figure 15-21. They make plausible the observations (1) that q is positive for an even-N, odd-Z nucleus if Z is in a range just below a magic number, (2) that q is zero if Z is at the magic number, and (3) that q is negative if Z is in a range just above the magic number.

[Figure 15-20: measured values of q/Zr'^2 for odd-A nuclei, plotted against the number of odd nucleons.]

Figure 15-20 The nuclear electric quadrupole moment, q, about the symmetry axis, divided by $Zr'^2$, for odd-A nuclei. The distance r' is the average from the center of the ellipsoidal distribution, of charge +Ze, to the surface. The quantity $1 + q/Zr'^2$ is approximately equal to the ratio of the distances from the center to the surface measured parallel to, and perpendicular to, the symmetry axis.

Figure 15-21 Left: Illustrating schematically an odd proton in a nucleus with Z equal to one more than a magic number.
To a fair approximation the proton moves in an orbit of radius equal to the nuclear radius. Averaged over time, its charge distribution looks like a ring. The same is true at any time of the charge distribution averaged over a sample containing many such nuclei. The total charge distribution contains an excess of charge, relative to a spherical distribution, in a plane perpendicular to the symmetry axis (the z axis). Thus the nucleus has a negative quadrupole moment. Right: Illustrating a proton hole in a nucleus with Z equal to one less than a magic number. The hole leads, on the average, to a ring containing a deficiency in charge in a plane perpendicular to the symmetry axis. The electric quadrupole moment is positive because the charge distribution has an excess of charge, relative to a spherical distribution, in the direction of the symmetry axis (the z axis).

However, the shell model is not capable of yielding correct quantitative results for electric quadrupole moments. Its predictions for the magnitude of q are generally low, and for some nuclei between magic numbers they are lower than the observed magnitude by more than a factor of 10.

Example 15-12. Estimate the shell model prediction for the average electric quadrupole moment q of the nucleus ₅₁Sb¹²³, and compare with the measured value shown in Figure 15-20.

According to the shell model, the charge distribution of this nucleus is due to a spherically symmetrical core of completely filled proton subshells, plus a single odd proton in a 1g7/2 subshell. Since the orbital angular momentum of this proton is high (l = 4), to a fair approximation it can be thought of as moving in a Bohr-like orbit of radius about equal to the nuclear radius r'. (Recall we found in Section 7-8 that orbital motion approaches the classical limit as l becomes large.) Thus an average of the nuclear charge distribution looks something like that shown on the left of Figure 15-21.
The spherical core makes no contribution to the nuclear electric quadrupole moment q. So, if we take the symmetry axis perpendicular to the orbit as the z axis, we have

$$q = \int \rho\,[3z^2 - (x^2 + y^2 + z^2)]\,d\tau$$

where $\rho$ is approximately the charge density for a uniformly charged ring, of radius r', in a plane perpendicular to z. This $\rho$ is zero except where $x^2 + y^2 = r'^2$ and $z = 0$. Thus

$$q \simeq -r'^2 \int \rho\,d\tau$$

The integral of $\rho$ yields one since the ring contains the charge of one proton and $\rho$ is measured in units of proton charges. Therefore, the result we obtain for an estimate of the shell model prediction of q for $_{51}\mathrm{Sb}^{123}$ is

$$q \simeq -r'^2$$

Figure 15-20 shows that the measured value of q for this nucleus is such that

$$\frac{q}{Zr'^2} \simeq -0.09$$

or

$$q \simeq -0.09\,Zr'^2 = -0.09 \times 51\,r'^2 \simeq -5r'^2$$

Another prediction of the shell model is that the value of the electric quadrupole moment for odd-A nuclei depends significantly on whether they have odd N, even Z or even N, odd Z. The reason is simply that the odd nucleons are uncharged neutrons in the first case and charged protons in the second case. But Figure 15-20 shows that the value of q for odd-A nuclei depends on only the number of odd nucleons, independent of whether or not the odd nucleons are charged.

The collective model explains all the features of the measured electric quadrupole moments that are incorrectly predicted by the shell model. It leads to large enough values of q because the core can be deformed so that the charges of many protons contribute to the total electric quadrupole moment. For nuclei between the magic numbers the core deformations become quite large, and therefore the electric quadrupole moments also become quite large. As the deformations can be due to extra nucleons of either species, the collective model explains why the observed values of q do not depend significantly on whether the odd nucleons are neutrons or protons.
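As a rough numerical check of the ring estimate in Example 15-12, the following Python sketch sums the quadrupole integrand over a ring of point charges and compares the result with the measured value quoted from Figure 15-20. It works in units where r' = 1 and the proton charge is 1. The function name `ring_quadrupole` and the choice of 360 sample points are illustrative assumptions, not part of the text.

```python
import math

def ring_quadrupole(radius=1.0, total_charge=1.0, n_points=360):
    """Sum charge * [3z^2 - (x^2 + y^2 + z^2)] over a ring in the z = 0 plane.

    Assumed units for this sketch: lengths in units of r',
    charge in units of the proton charge.
    """
    dq = total_charge / n_points
    q = 0.0
    for k in range(n_points):
        phi = 2.0 * math.pi * k / n_points
        x, y, z = radius * math.cos(phi), radius * math.sin(phi), 0.0
        q += dq * (3.0 * z**2 - (x**2 + y**2 + z**2))
    return q

Z = 51                       # atomic number of antimony
q_shell = ring_quadrupole()  # shell model ring estimate, in units of r'^2
q_measured = -0.09 * Z       # from q/(Z r'^2) = -0.09 quoted in the example
ratio = q_measured / q_shell

print(f"shell model estimate: q = {q_shell:.2f} r'^2")
print(f"measured value:       q = {q_measured:.2f} r'^2")
print(f"measured / estimate   = {ratio:.1f}")
```

With these inputs the ratio falls between 4 and 5, which is the sense in which the single-proton ring picture underestimates the measured magnitude.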
In addition to the collective rotations of the nuclear core that we have been considering, there are also collective vibrations. Certainly the most spectacular example is nuclear fission. This will be discussed in the next chapter.

15-11 SUMMARY

Table 15-3 briefly summarizes this chapter by listing the nuclear models we have treated, and some of their most significant features.

Table 15-3 Nuclear Models and the Ground State Properties of Nuclei

Name | Assumptions | Theory Used | Properties Predicted
Liquid drop model | Nuclei have similar mass densities, and binding energies nearly proportional to masses, like charged liquid drops | Classical (asymmetry and pairing terms introduced with no justification) | Accurate average masses and binding energies through semiempirical mass formula
Fermi gas model | Nucleons move independently in net nuclear potential | Quantum statistics of Fermi gas of nucleons | Depth of net nuclear potential; asymmetry term
Shell model | Nucleons move independently in net nuclear potential, with strong inverted spin-orbit coupling | Schroedinger equation solved for net nuclear potential | Magic numbers; nuclear spins; nuclear parities; pairing term
Collective model | Net nuclear potential undergoes deformations | Schroedinger equation solved for nonspherical net nuclear potential | Magnetic dipole moments; electric quadrupole moments

(Conclusion of Example 15-12: The magnitude of the shell model prediction is too low, compared to the measurements, by about a factor of 5.)

We have seen that each model can provide satisfactory explanations of certain properties of nuclei in their ground states (but no single model can explain all the properties). In the next chapter we shall find that these models can provide explanations of the properties of nuclear decay and nuclear reactions. In that chapter we shall also come across another important nuclear model, not listed in Table 15-3. This is the optical model, which
is a generalization of the shell model that describes the behavior of an unbound nucleon moving through a nucleus.

QUESTIONS

1. Was there a stage in the development of atomic physics in which models played a role comparable to that now played by models in nuclear physics? Are models used now in atomic physics?
2. In those regions of the universe where thermal energy is $kT \simeq 10^{6}$ eV, are atomic processes more apparent than nuclear processes? What about those regions where $kT \simeq 10^{-6}$ eV?
3. All nuclei have an electric monopole moment (which measures their total charge). Some nuclei have an electric quadrupole moment (which measures the departure from a spherical shape of their charge distribution). No nuclei have an electric dipole moment (which would measure the departure of the center of their charge distribution from the center of their mass distribution). Why would we not expect electric dipole moments for nuclei?
4. Nuclei have magnetic dipole moments. Why do they not have magnetic monopole moments? What about magnetic quadrupole moments?
5. If an electron of kinetic energy 100 keV passed through a typical atom it could be scattered through a fairly large angle in a close collision with an atomic electron. If its kinetic energy is 100 MeV it could be scattered through a fairly large angle only in a close collision with the nucleus. Why?
6. Why is the mass unit not defined in terms of the mass of the hydrogen atom? (Hint: Use Table 15-1 to make a quick estimate of the mass of $_{92}\mathrm{U}^{238}$ if the mass of $_{1}\mathrm{H}^{1}$ is 1.000000u.)
7. Since atomic and molecular reactions also involve binding energies, why did the nineteenth century chemists not observe mass deficiencies and thereby discover relativity theory?
8. Many textbook problems in mechanics consider zero Q-value collisions between idealized classical particles. Is the Q value exactly zero in collisions between real classical particles (like real billiard balls)? What is the sign of the Q value?
9.
Why are the most stable nuclei found in the region near $A \simeq 60$? Why do not all nuclei have $A \simeq 60$?
10. The semiempirical mass formula contains five parameters, and it predicts quite accurately more than 500 masses. How does its ratio of predictions to parameters compare with other empirical formulas of physics or engineering?
11. Why does the pairing term make a negative contribution to the energy liberated when a neutron is captured by $_{92}\mathrm{U}^{238}$, and a positive contribution in the case of $_{92}\mathrm{U}^{235}$? What are the practical consequences of this situation?
12. Why are the atomic magic numbers not the same as the nuclear magic numbers?
13. Explain why there can be no collisions between a typical nucleon and another in a nucleus in its ground state. If a high-energy nucleon, say from a cyclotron beam, enters a nucleus in its ground state, can it collide with a nucleon in the nucleus?
14. What fundamental law of physics is most responsible for the existence of nuclear magic numbers?
15. Is there a relation between the l dependence of the spin-orbit splitting of nuclear levels and the Landé interval rule for the spin-orbit splitting of atomic energy levels?
16. Why do most nuclei obey JJ coupling, whereas most atoms obey LS coupling?
17. Use the argument associated with Figure 9-4 to explain why there is a tendency for the intrinsic spin angular momenta of a pair of identical nucleons to be essentially antiparallel in order to minimize their average separation. Then modify the argument illustrated in Figure 10-2 to explain why the average separation of the pair is minimized if their orbital

PROBLEMS

1. The analysis of the optical spectrum of an atom shows that there are four energy levels in a certain hyperfine splitting multiplet. The analysis also shows that the value of the total electronic angular momentum quantum number for that multiplet is j = 2.
Determine the value of the nuclear angular momentum quantum number, or nuclear spin i, for the nucleus of the atom.
2. The nuclear spin and symmetry character of the boron nucleus with Z = 5 and A = 10 are: i = 3, symmetric. (a) Show that the mass, charge, nuclear spin, and symmetry character agree with the assumption that nuclei contain Z protons and A − Z neutrons. (b) Which of these four properties disagree with the assumption that nuclei contain A protons and A − Z electrons?
3. (a) Evaluate, in MeV, the energy of gravitational attraction for two spherically symmetrical protons with a center-to-center separation of 2 F. (b) Do the same for the energy of Coulomb repulsion at that separation. (c) Compare your results with the energy of nuclear attraction, which is about −10 MeV at that separation.
4. Electrons of kinetic energy 1000 MeV are scattered from a target containing $_{79}\mathrm{Au}$ nuclei. (a) Use data from Figure 15-6 to find the radius at which the nuclear charge density is half its interior value. (b) Then use this radius to predict the approximate separation in angle between adjacent minima of the diffraction pattern that is observed in the scattering.
5. Use the empirical equation representing the measured nuclear charge densities, (15-5), and the parameter b quoted in (15-7), to determine the distance in which the nuclear charge densities fall from 90% to 10% of their internal values.
6. Show that for $_{6}\mathrm{C}^{12}$ the nuclear density given by (15-5) is one-half the central density at a radius differing from the parameter a by 0.0126 F approximately.
7. A mass spectrometer selects ions moving at $4.8 \times 10^{5}$ m/sec; the magnetic field is 0.22 tesla. A sample of triply ionized oxygen atoms is analyzed. How far apart are the images produced by $_{8}\mathrm{O}^{16}$ and $_{8}\mathrm{O}^{18}$ ions on the photographic plate?
8. Estimate the pressure in a mass spectrometer with an ion path radius of about 10 cm by setting the mean free path equal to the length of the trajectory.
17. (continued) angular momenta are also essentially antiparallel. Do these arguments explain why the pairing interaction tends to make the total angular momenta of the pair essentially antiparallel?
18. If one factor in a nuclear eigenfunction consists of a product of an even number of eigenfunctions for nucleons in a particular subshell, why is the parity of the factor even, independent of whether the parities of the nucleon eigenfunctions are all even or all odd? How does this lead to the rule for predicting the parities of odd-A nuclear eigenfunctions?
19. How can the magnetic dipole moment data of Figure 15-19 be used to identify the orbital angular momentum quantum number l, of many nuclei, in terms of the measured value of their total angular momentum quantum number j?
20. If the tidal waves circulating around the nuclear core in the collective model were entirely composed of protons, instead of being composed partly of protons and partly of neutrons, what would be the effect on the magnetic dipole moments predicted by the model?
21. What is the simplest distribution of point charges that has an electric quadrupole moment?
22. Is a positive electric point charge surrounded by a concentric circular ring of negative charge, of total magnitude equal to that of the point charge, an electric monopole, dipole, quadrupole, or something else?
23. Why are there no magic numbers that are odd?
24. Why is the nuclear shell model called a model, while the comparable atomic Hartree theory is called a theory? Generally speaking, how does a model differ from a theory?

PROBLEMS (continued)

9. Derive (15-16), which relates the Q value of a nuclear reaction to the dynamical quantities involved in the reaction. (Hint: Write equations for the conservation of the components of linear momentum in the directions parallel to and perpendicular to the direction of the incident particle. Then eliminate from these the angle between the direction of the residual nucleus and the direction of the incident particle.)
10.
(a) Use (15-16) to calculate the energy of protons emitted in the direction of incidence of the 7.70 MeV α particles in the Rutherford reaction of (15-11). The Q value of the reaction is −1.18 MeV. (b) Compare your results with Example 15-4.
11. How much energy in MeV would have to be supplied to a nucleus of $_{24}\mathrm{Cr}^{52}$ in order to split it into two identical fragments? The atomic mass of $_{24}\mathrm{Cr}^{52}$ is 51.94051u, and that of $_{12}\mathrm{Mg}^{26}$ is 25.98260u.
12. Since the reaction $_{1}\mathrm{H}^{2} + {}_{1}\mathrm{H}^{3} \rightarrow {}_{2}\mathrm{He}^{4} + {}_{0}n^{1}$ has a high positive Q value, it is frequently used to obtain high-energy neutrons, $_{0}n^{1}$, from a low-energy electrostatic generator accelerating a beam of deuterons, $_{1}\mathrm{H}^{2}$, into a target of tritons, $_{1}\mathrm{H}^{3}$. (a) Use information presented in Table 15-1 to calculate the Q value for the reaction. (b) Use (15-16) to calculate the energy of the neutrons emitted from the reaction in the same direction as the incident beam of deuterons, if the energy of the deuterons is 0.500 MeV.
13. Use the masses quoted in Table 15-1 to verify that the binding energy per nucleon of $_{6}\mathrm{C}^{12}$ has the value quoted in that table.
14. (a) Use information presented in Table 15-1 to evaluate, in MeV, the energy released in the fusion of two $_{1}\mathrm{H}^{2}$ nuclei to form a $_{2}\mathrm{He}^{4}$ nucleus. (b) Also evaluate, in MeV, the height of the Coulomb repulsion barrier which must be overcome before there is an appreciable probability that the two nuclei can get close enough together for fusion to take place. Treat the $_{1}\mathrm{H}^{2}$ nuclei as uniformly charged spheres of radius 1.5 F, and evaluate the energy of Coulomb repulsion when they are just touching.
15. (a) The Coulomb energy of a uniformly charged sphere of radius r', i.e., the energy required to assemble the charge, is

$$V = \frac{3}{5}\,\frac{Z^2e^2}{4\pi\epsilon_0 r'}$$

Take $r' = 1.1A^{1/3}$ F, which is consistent with the electron scattering measurements, and show that V then assumes the form of the Coulomb term of the semiempirical mass formula.
(b) Evaluate, in mass units, the coefficient of $Z^2/A^{1/3}$ in the expression obtained for V, and compare with the empirical value of the coefficient $a_3$ given in (15-31).
16. The nuclei $_{5}\mathrm{B}^{11}$ and $_{6}\mathrm{C}^{11}$ are said to be a pair of mirror nuclei because they have the same number of nucleons, and the number of protons in one equals the number of neutrons in the other. If nuclear forces are charge independent, their total binding energies should differ only in that the Coulomb energy is higher in $_{6}\mathrm{C}^{11}$. The atomic mass of $_{5}\mathrm{B}^{11}$ is 11.009305u, and the atomic mass of $_{6}\mathrm{C}^{11}$ is 11.011432u. (a) Evaluate the difference in their total binding energies. (b) Assuming both nuclei to be uniformly charged spheres of the same radius r', and using the expression for the Coulomb energy given in Problem 15, find the value of r' that leads to a difference in Coulomb energy that agrees with the difference in binding energy. (c) Compare this charge distribution radius with the radial dependence of the charge density for the similar nucleus $_{6}\mathrm{C}^{12}$ shown in Figure 15-6.
17. (a) Evaluate the terms of the semiempirical mass formula for $_{26}\mathrm{Fe}^{56}$. (b) Convert them to their equivalents in MeV, divide by A, and then compare them with Figure 15-12. (c) Use the terms to predict the atomic mass. (d) Evaluate the average binding energy per nucleon, and compare with Figure 15-10.
18. According to the α-particle model of the nucleus, $_{6}\mathrm{C}^{12}$ consists of three α particles, i.e., $_{2}\mathrm{He}^{4}$ nuclei, and $_{8}\mathrm{O}^{16}$ consists of four α particles. (a) Use Table 15-1 to evaluate the difference between the total binding energy of $_{6}\mathrm{C}^{12}$ and the total binding energies of three α particles. (b) Evaluate the difference between the total binding energy of $_{8}\mathrm{O}^{16}$ and the total binding energies of four α particles.
(c) Draw schematic diagrams of $_{6}\mathrm{C}^{12}$ and $_{8}\mathrm{O}^{16}$ according to the α-particle model, and use them to show that there can be three "bonds" connecting the α particles in $_{6}\mathrm{C}^{12}$, while there can be six bonds connecting the α particles in $_{8}\mathrm{O}^{16}$. The exact nature of a bond was not specified in the model, but it was thought that they were somehow analogous to bonds in molecules. (d) Use the results of parts (a) and (b) to show that the total binding energies of $_{6}\mathrm{C}^{12}$ and $_{8}\mathrm{O}^{16}$ could be accounted for by saying that every possible bond contributes a binding energy of a little over 2 MeV. The α-particle model is not highly regarded because little more can be done with it than has been done in this problem.
19. Use the acrostic explained in Section 15-8 to construct the diagram giving the ordering and approximate spacing of the energy levels which the nucleons are filling in the shell model. After you have finished, compare with Figure 15-18.
20. Use the exclusion principle argument of Example 15-10 to predict from the shell model diagram of Figure 15-18 the nuclear spins of: $_{20}\mathrm{Ca}^{40}$, $_{20}\mathrm{Ca}^{39}$, $_{20}\mathrm{Ca}^{41}$.
21. (a) Use the existence of the pairing interaction to predict from the shell model diagram of Figure 15-18 the nuclear spins and parities of $_{6}\mathrm{C}^{11}$, $_{20}\mathrm{Ca}^{44}$, $_{28}\mathrm{Ni}^{61}$, $_{32}\mathrm{Ge}^{73}$. Briefly justify each prediction. (b) The observed spins and parities are: (3/2, odd), (0, even), (3/2, odd), (9/2, even). Give an explanation of any discrepancies you find.
22. (a) Predict from the shell model diagram of Figure 15-18 the possible values of the nuclear spins, and also predict the parities, of the following odd-N, odd-Z nuclei: $_{5}\mathrm{B}^{10}$, $_{19}\mathrm{K}^{40}$, $_{23}\mathrm{V}^{50}$. (b) The observed spins and parities are: (3, even), (4, odd), (6, even). Does there seem to be any preferential tendency in the coupling of the angular momenta of the odd neutron and odd proton?
23. Use the shell model to predict for the ground state of $_{8}\mathrm{O}^{17}$ (a) the spin; (b) parity; (c) sign of the magnetic dipole moment; (d) sign of the electric quadrupole moment.
24. The measured nuclear spin of $_{23}\mathrm{V}^{51}$ is 7/2. Since this is an even-N, odd-Z nucleus, the nuclear spin is due to the odd proton that has a total angular momentum quantum number j = 7/2. Since there are two possible relations between j and the orbital angular momentum quantum number l for that proton, namely j = l − 1/2 and j = l + 1/2, the value of l could be either 3 or 4. (a) Use the measured value of the magnetic dipole moment and its relation to the Schmidt lines, shown in Figure 15-19, to predict the most likely value of l. (b) Use the shell model diagram of Figure 15-18 to predict the value of l, and compare with (a).
25. (a) Use the measured electric quadrupole moment of $_{73}\mathrm{Ta}^{181}$, presented in Figure 15-20, to evaluate approximately the ratio of the distances from the center to the surface of its ellipsoidal charge distribution, measured parallel to and perpendicular to its symmetry axis. (b) Use the electron scattering charge distribution radius a, from (15-6), to evaluate approximately the average of these distances. (c) From the answers to (a) and (b) evaluate approximately these distances, which are the semimajor and semiminor axes of the ellipsoidal charge distribution. (d) Make a sketch, to scale, of the charge distribution.
26. A solid right circular cylinder of radius R and length L has uniform charge density ρ. Find its electric quadrupole moment, indicating for what ratios L/R it will be positive, negative, or zero.
16 NUCLEAR DECAY AND NUCLEAR REACTIONS

16-1 INTRODUCTION
information provided by decay and reactions

16-2 ALPHA DECAY
relation between decay and reactions; radioactivity; parent and daughter nuclei; decay energy and nuclear models; barrier penetration; decay rate; exponential decay; lifetime; half-life; equilibrium in series decay; radioactive series; spontaneous fission; superheavy elements

16-3 BETA DECAY
presence in radioactive series; decay energetics for electron emission and capture, and positron emission; energy sharing; neutrinos; momentum spectra; matrix elements; β-decay interaction and coupling constant; Kurie plot; decay rate; FT value; selection rules; forbidden decays

16-4 THE BETA DECAY INTERACTION
coupling constant evaluation; comparison of strength to other interactions; range; Reines-Cowan experiment; Wu experiment; parity nonconservation; helicity

16-5 GAMMA DECAY
experimental techniques; comparison to atomic radiation; electric and magnetic radiation; multipolarity; shell model transition rates; selection rules and their origin; internal conversion; lifetimes and widths

16-6 THE MOSSBAUER EFFECT
resonant absorption; phonons; Doppler shift; applications to uncertainty principle, solids, and relativity

16-7 NUCLEAR REACTIONS
conservation laws and their application; processes occurring in reactions; Coulomb and nuclear potential scattering; optical model; size resonances and single-particle states; direct interactions; compound nucleus reactions; compound nucleus resonances and many particle states; Breit-Wigner formula

16-8 EXCITED STATES OF NUCLEI
general survey; low-lying shell model states; rotational states; vibrational states; states of mirror nuclei

16-9 FISSION AND REACTORS
chain reactions; bombs and reactors; fission energetics; spontaneous and induced fission; fission neutrons; moderators; control rods; breeder reactors

16-10 FUSION AND THE ORIGIN OF THE ELEMENTS
thermal fusion; fusion reactors; big-bang processes; stellar formation; proton-proton cycle; carbon cycle; formation of elements

QUESTIONS
PROBLEMS

16-1 INTRODUCTION

In the preceding chapter we used the properties of the ground states of stable nuclei to introduce the most important nuclear models. In this chapter we use these models to consider the decay of unstable nuclei, and also to consider nuclear reactions involving both stable and unstable nuclei. Our considerations will concern excited states of nuclei, as well as their ground states.

Nuclear decay divides itself into three categories. One is α decay, the spontaneous emission of an α particle from a nucleus of large atomic number. We shall see that this process, or the closely related process of spontaneous fission, is responsible for setting an upper limit on the atomic numbers of the chemical elements occurring in nature. A second type of nuclear decay is β decay, the spontaneous emission or absorption of an electron or positron by a nucleus. It is particularly interesting because it will tell us much about the β-decay interaction, which is one of the fundamental interactions, or forces, of nature. A third type of nuclear decay is γ decay, the spontaneous emission of high-energy photons when a nucleus makes transitions from an excited state to its ground state. We shall find that γ decay gives detailed information about the excited states of nuclei that can be used to improve the nuclear models. We shall also find that γ decay is used in the Mössbauer effect to make extremely high-resolution energy measurements in many different fields of physics.

Nuclear reactions will provide us with additional information about excited states of nuclei, since the residual nucleus in a reaction is typically formed in an excited state. Among the nuclear reactions that we shall consider are those that occur in the nuclear fission reactors that are now used as inexpensive sources of energy.
We shall also consider the reactions that may some day be used to produce energy on earth by nuclear fusion and that have been used for a long time by stars to produce the energy, and the chemical elements, of which nature is composed.

16-2 ALPHA DECAY

Nuclear decay occurs, sooner or later, whenever a nucleus containing a certain number of nucleons is put in an energy state which is not the lowest possible one for a system with that number of nucleons. Invariably, the nucleus is put into the unstable state as a consequence of a nuclear reaction. But in some cases the nuclear reaction responsible for producing the unstable nucleus took place recently in a man-made particle accelerator, while in other cases it took place in natural events that happened billions of years ago when our part of the universe was formed. Unstable nuclei that originate from the natural events are often called radioactive; the processes that occur in their decay are often called radioactive decay, or radioactivity. One of the reasons why radioactive decay is interesting is that it provides clues about the origin of the universe.

A process that is particularly important in radioactive decay is α decay, occurring commonly in nuclei with atomic number greater than Z = 82. It involves the decay of an unstable parent nucleus into its daughter nucleus by the emission of an α particle, the nucleus $_{2}\mathrm{He}^{4}$. The process takes place spontaneously because it is energetically favored, the mass of the parent nucleus being greater than the mass of the daughter nucleus plus the mass of the α particle. The reduction in nuclear mass in the decay is primarily due to a reduction in the Coulomb energy of the nucleus when its charge Ze is reduced by the charge 2e carried away by the α particle.
The energy made available in the decay is the energy equivalent of the mass difference. This decay energy is carried away by the α particle as kinetic energy. Ignoring the mass equivalents of atomic electron binding energies, the α-decay energy E can be written in terms of the atomic masses of the parent nucleus, $M_{Z,A}$, of the daughter nucleus, $M_{Z-2,A-4}$, and of the α particle, $M_{2,4}$, as

$$E = [M_{Z,A} - (M_{Z-2,A-4} + M_{2,4})]c^2 \qquad (16\text{-}1)$$

Figure 16-1 displays the decay energies E for parent nuclei in the α-emitting range of Z, or A. The data are obtained from direct measurements of the kinetic energy of the α particles by bending them in a magnetic field, and/or by using (16-1) with the measured masses. The dashed line represents the general trend for the parent nuclei to become increasingly unstable to α decay as A gets further away from the value $A \simeq 60$, where the average binding energy per nucleon, $\Delta E/A$, maximizes. It also represents the predictions of the liquid drop model. Superimposed on the general trend is a peak, roughly 4 MeV high, occurring at the parent nucleus $_{84}\mathrm{Po}^{212}$. The peak is explained by the shell model as due to the particular stability of the associated daughter nucleus, $_{82}\mathrm{Pb}^{208}$. Since the daughter has magic Z = 82 and magic N = 126, it is about 4 MeV more tightly bound than typical nuclei in this region of A. (Figure 15-13 shows that about 2 MeV of extra binding energy is found at each magic number.) Note that the α-decay energies range from 8.9 MeV for $_{84}\mathrm{Po}^{212}$ to 4.1 MeV for $_{90}\mathrm{Th}^{232}$.

Figure 16-1 Alpha-decay energies for nuclei in the α-emitting region, plotted against A. The dashed curve represents the general trend predicted by the semiempirical mass formula.

The moderately energetic particles emitted in α decay of radioactive nuclei were put to very good use by Rutherford, and others, in the scattering experiments that led to the discovery of nuclei (see Chapter 4).
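To illustrate how (16-1) is used with measured masses, here is a minimal Python sketch for the decay of $_{84}\mathrm{Po}^{212}$ into $_{82}\mathrm{Pb}^{208}$. The three atomic masses are assumptions of this sketch: they are approximate values recalled from standard mass tables, not numbers given in the text, and should be checked against Table 15-1 or a current mass evaluation before being relied on. The conversion 1 u = 931.5 MeV/c² is the value quoted in the book's table of constants.

```python
U_TO_MEV = 931.5  # energy equivalent of 1 u, in MeV (book's constants table)

def alpha_decay_energy(m_parent_u, m_daughter_u, m_alpha_u):
    """Decay energy E of (16-1), in MeV, from atomic masses given in u."""
    return (m_parent_u - (m_daughter_u + m_alpha_u)) * U_TO_MEV

# Assumed atomic masses in u (approximate, not taken from the text).
M_PO212 = 211.988868
M_PB208 = 207.976652
M_HE4 = 4.002603

energy_mev = alpha_decay_energy(M_PO212, M_PB208, M_HE4)
print(f"E = {energy_mev:.2f} MeV")
```

With these assumed masses the result is close to 9 MeV, consistent with the 8.9 MeV quoted for $_{84}\mathrm{Po}^{212}$. Note that the mass difference is only about one part in 20,000 of the parent mass, so the masses must be known to six or more significant figures for the subtraction to be meaningful.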
Similar scattering experiments with α particles from radioactive sources continued to be used in investigating nuclear structure, until the invention of cyclotrons by Lawrence in the early 1930s. Cyclotrons, and other types of particle accelerators, produce particles of higher energy which can be used in more precise measurements because they have shorter de Broglie wavelength. Accelerators also produce more intense beams of particles than can be obtained from radioactive sources, and this makes the measurements easier to carry out.

Example 16-1. An α particle is emitted by the parent nucleus $_{84}\mathrm{Po}^{212}$. Estimate the Coulomb potential it feels at the nuclear surface, and then make an approximate plot of the sum of the Coulomb and nuclear potentials acting on the α particle in various locations.

If we approximate the daughter nucleus and the α particle as uniformly charged spheres, the Coulomb repulsion potential energy when they are just touching will be

$$V_0 = +\frac{2Ze^2}{4\pi\epsilon_0 r'}$$

where +2e is the α-particle charge, +Ze is the daughter nucleus charge, and r' is the sum of the radii of the α-particle and daughter nucleus uniform charge distributions. We can estimate these radii by using the charge density half value radii a of the actual charge distributions found in the electron scattering measurements, and quoted in (15-6)

$$a = 1.07A^{1/3}\ \mathrm{F}$$

We obtain for the sum of the radii

$$r' = (4^{1/3} + 208^{1/3})\,1.07\ \mathrm{F} = 8.0\ \mathrm{F}$$

So

$$V_0 = \frac{2 \times 82 \times (1.6 \times 10^{-19}\ \mathrm{coul})^2}{1.1 \times 10^{-10}\ \mathrm{coul^2/nt\text{-}m^2} \times 8.0 \times 10^{-15}\ \mathrm{m}} = 4.8 \times 10^{-12}\ \mathrm{joule} = 30\ \mathrm{MeV}$$

Figure 16-2 indicates the total (Coulomb plus nuclear) potential acting on the α particle. As it approaches the nucleus, it feels the repulsive Coulomb potential increasing in inverse proportion to the distance between the centers of the α particle and nucleus, and reaching the value of $V_0$ when this distance equals r'. Inside the surface it feels a rapid onset of the strong attractive nuclear potential, which soon dominates.
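The arithmetic of the barrier-height estimate in Example 16-1 can be reproduced with a short Python sketch. It uses the constants quoted in the book's table of constants ($1/4\pi\epsilon_0 = 8.988 \times 10^{9}$ nt-m²/coul², $e = 1.602 \times 10^{-19}$ coul) and the radius rule $a = 1.07A^{1/3}$ F; the function names are illustrative assumptions.

```python
COULOMB_K = 8.988e9      # 1/(4 pi eps0), in nt-m^2/coul^2
E_CHARGE = 1.602e-19     # electron charge magnitude, in coul
JOULE_PER_MEV = 1.602e-13

def touching_separation_fm(a_alpha, a_daughter, r0_fm=1.07):
    """Sum of the two half-value radii a = r0 * A^(1/3), in F."""
    return r0_fm * (a_alpha ** (1.0 / 3.0) + a_daughter ** (1.0 / 3.0))

def coulomb_barrier_mev(z_daughter, separation_fm, z_alpha=2):
    """Coulomb energy of two touching charged spheres, in MeV."""
    separation_m = separation_fm * 1.0e-15
    v_joule = COULOMB_K * z_alpha * z_daughter * E_CHARGE**2 / separation_m
    return v_joule / JOULE_PER_MEV

# Po-212 decays to Pb-208 (Z = 82, A = 208) plus an alpha particle (A = 4).
r_prime = touching_separation_fm(4, 208)
v0 = coulomb_barrier_mev(82, r_prime)

print(f"r' = {r_prime:.1f} F")
print(f"V0 = {v0:.0f} MeV")
```

Carrying more digits than the two-figure arithmetic of the example gives a barrier slightly under 30 MeV, which rounds to the value quoted in the text.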
(The onset of the nuclear potential is, of course, not quite as rapid as shown in Figure 16-2.) Also indicated is the $_{84}\mathrm{Po}^{212}$ α-decay energy E = 8.9 MeV, which is the energy of the emitted α particle. Note that it is much less than $V_0$, the height of the Coulomb barrier.

Figure 16-2 An approximate representation of the Coulomb plus nuclear potential V acting on an α particle emitted from a $_{84}\mathrm{Po}^{212}$ nucleus, and the total energy E of the α particle. (Horizontal axis: center-to-center separation, in F.)

Since every decay energy shown in Figure 16-1 is far less than the height of the Coulomb barriers, which is about 30 MeV for all α decays, the α particle tends to be trapped by the barrier in every decay. It can escape only by the quantum mechanical process of barrier penetration. We have previously gone through a detailed treatment of this process, so here we shall only remind the student of the results, but he would be well advised to look again at Section 6-6. At least he should inspect Figure 6-20, which plots the probability per second that a nucleus will emit an α particle, called the decay rate R, versus the decay energy E. The figure shows that the decay rate decreases extremely rapidly as the decay energy decreases and the α particle tunnels more deeply through the Coulomb barrier.

Now consider a system containing many nuclei of the same species at some initial time. The nuclei α decay (or, equally well, β or γ decay) at the decay rate R. We shall calculate the number of undecayed nuclei present at some subsequent time. If there are N undecayed nuclei at time t, then the number decaying in the following time interval dt can be written dN. Since R is the probability that a particular nucleus will decay in 1 sec, R dt is the probability that it will decay during the time interval, and NR dt is the probability that any one of the nuclei will decay in that interval.
Thus the average number of decaying nuclei is

$$dN = -NR\,dt \qquad (16\text{-}2)$$

where the minus sign accounts for the fact that dN is intrinsically negative since N decreases. Rearranging the terms, and integrating, we obtain

$$\frac{dN}{N} = -R\,dt$$

$$\int_{N(0)}^{N(t)} \frac{dN}{N} = -R\int_{0}^{t} dt$$

$$\ln N(t) - \ln N(0) = \ln\frac{N(t)}{N(0)} = -Rt$$

or

$$\frac{N(t)}{N(0)} = e^{-Rt}$$

so

$$N(t) = N(0)\,e^{-Rt} \qquad (16\text{-}3)$$

In this expression N(0) is the number of undecayed nuclei at the initial time 0, and N(t) the number of undecayed nuclei at the subsequent time t. Since the calculation involves probabilities, its results are correct only on the average, but fluctuations from the average are very small in the typical case in which the number of nuclei involved is very large.

Figure 16-3 is a plot of (16-3), which is called the exponential decay law. Also indicated in Figure 16-3 is the lifetime T characteristic of the decay. This is the average time a nucleus survives before it decays. It is obvious from their definitions that T is inversely proportional to the decay rate R. In fact, it is easy to show from a simple integration of the decay law that

$$T = \frac{1}{R} \qquad (16\text{-}4)$$

Using this relation in (16-3), we conclude that in one lifetime the number of undecayed nuclei decreases by a factor of e, as indicated in the figure. Further indicated is the half-life $T_{1/2}$, which is the time required for the number of undecayed nuclei to decrease by a factor of 2. The relation between the two times is obtained directly from the decay law

$$T_{1/2} = (\ln 2)\,T = 0.693\,T \qquad (16\text{-}5)$$

Figure 16-3 The exponential decay law for N(t), the number of nuclei surviving at time t. Also shown are the lifetime T and half-life $T_{1/2}$. Note that N(t) is expressed in units of the original number of nuclei N(0), while time is expressed in units of the lifetime T.

In a more typical system, there are several related radioactive nuclei decaying successively into each other by α decay (and/or other decay processes).
For instance, $_{92}\mathrm{U}^{234}$ $\alpha$ decays into $_{90}\mathrm{Th}^{230}$, which $\alpha$ decays into $_{88}\mathrm{Ra}^{226}$, etc. Thus a system initially filled with $_{92}\mathrm{U}^{234}$ will eventually contain a mixture of all these nuclei. Differential equations governing the general behavior of such a family can be written down easily, and they can be solved with not much more difficulty in certain cases. In the most important case, the significant features of the solution can be discerned from the following qualitative argument.

Consider a family of decays in which the parent has by far the smallest decay rate, or longest lifetime. The situation is indicated schematically in Figure 16-4. On a time scale comparable with the parent lifetime, the population of the parents decreases exponentially. But on the much shorter time scale comparable to the daughter lifetimes, the population of the parents remains essentially constant, and so the total number decaying per second into the first daughters is essentially constant. Since the first daughters decay rapidly after they are formed, their population is governed by the constant resupply from decay of the parents. Thus the population of the first daughters remains constant. The same is true for the second daughters, since they are being formed at a constant rate from the decay of the constant population of the first daughters. In fact, the populations of all the daughters will remain constant as long as we consider times short compared to the parent lifetime, so that the population of the parents remains essentially constant. (If we consider longer times, all that happens is that the population of the parents, and of all the daughters, decreases exponentially at the same rate, following the slow decay of the parents.) Thus, on the shorter time scale, we have an equilibrium condition, which requires that the following relation be satisfied

$$ N_0R_0 = N_1R_1 = N_2R_2 = \cdots \tag{16-6} $$

Figure 16-4 A schematic representation of a family of successive decays.
For instance, the left side of the first equality is the total number of parents decaying per second to form first daughters, while the right side is the total number of first daughters decaying per second. If the total rate of formation of first daughters did not equal their total rate of decay, their population would not remain constant. Equation (16-6) describes the most important case of a family of decays. It is sometimes used to determine the values of the $R_i$, or $T_i$, from measurements of the $N_i$ and one known $R$.

We can now understand how $\alpha$-decaying nuclei with very short lifetimes can be found in nature. For example, $_{84}\mathrm{Po}^{212}$, with $T \sim 10^{-6}$ sec, can be extracted from naturally occurring minerals that presumably have been in existence for billions of years. The reason is simply that the short lifetime $\alpha$ emitters are in equilibrium in decay families with long lifetime parents, called radioactive series. There are three such series that occur naturally: the $4n$ series whose parent is $_{90}\mathrm{Th}^{232}$ with $T = 2.01 \times 10^{10}$ yr, the $4n + 2$ series whose parent is $_{92}\mathrm{U}^{238}$ with $T = 6.52 \times 10^{9}$ yr, and the $4n + 3$ series whose parent is $_{92}\mathrm{U}^{235}$ with $T = 1.02 \times 10^{9}$ yr. The names characterize the $A$ values for the members of the series. For instance, the parent of the $4n + 3$ series has $A$ equal to four times an integer plus three, where the integer is 58. Since each $\alpha$ decay reduces $A$ by four (and the other decay processes do not change $A$), all the daughters of this series will also have $A$ equal to four times some smaller integer plus three.

There is evidently also room for a $4n + 1$ series. Actually there is such a series, whose parent is $_{93}\mathrm{Np}^{237}$ with lifetime $T = 3.25 \times 10^{6}$ yr.
The series can be produced artificially by using a nuclear reaction to make the parent, but it is not found in nature since the lifetime of the parent is very short compared to the age of the earth, which is estimated from geological and cosmological evidence to be $\sim 10^{10}$ yr (see Example 16-2). Consequently any parent nuclei initially present have decayed away. In this connection note that Figure 16-1 shows that the decay energies of the parents of the three naturally occurring series are particularly low. If they were higher by even less than 1 MeV, their decay rates would be so much higher, and their lifetimes so much shorter than $\sim 10^{10}$ yr, the age of the earth, that the naturally occurring elements would stop at $Z = 82$ instead of $Z = 92$. The same figure indicates why the presently known naturally occurring elements do stop at $Z = 92$. It is because the $\alpha$-decay energies for nuclei with $Z > 92$ are large enough to lead to lifetimes short compared to the age of the earth. Finally, an extrapolation of Figure 16-1 to $Z < 82$ shows that the corresponding elements are apparently stable to $\alpha$ decay because their decay energies are so small that the lifetimes are immeasurably long.

Students frequently wonder why nuclei of large $Z$ spontaneously emit $\alpha$ particles, $_{2}\mathrm{He}^{4}$, but do not spontaneously emit any of the particles $_{2}\mathrm{He}^{3}$, $_{1}\mathrm{H}^{2}$, or $_{1}\mathrm{H}^{1}$, even though emitting any of these particles reduces the Coulomb energy of the nucleus. The reason is simply that for the particles other than $_{2}\mathrm{He}^{4}$ the binding energy per nucleon, $\Delta E/A$, is much smaller than it is for a typical nucleus. Thus their emission is not energetically favorable. The emission of a $_{6}\mathrm{C}^{12}$ particle from a nucleus of large $Z$ would be energetically favorable because it has a high $\Delta E/A$ and also reduces considerably the Coulomb energy of the nucleus. And the emission of a particle of even larger $Z$ would be even more so because of the increased reduction of the Coulomb energy. Such a process is called spontaneous fission.
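Returning to the radioactive series discussed above, the statement that a parent with a lifetime short compared to the age of the earth has decayed away can be checked directly from the exponential decay law (16-3) with $R = 1/T$. The following is a minimal sketch, not part of the original text: the function and variable names are illustrative assumptions, the lifetimes are the values quoted for the four series parents, and the elapsed time is taken to be the rough figure of $10^{10}$ yr.

```python
import math

def surviving_fraction(t, lifetime):
    """Fraction N(t)/N(0) left after time t, from the decay law (16-3) with R = 1/T."""
    return math.exp(-t / lifetime)

def half_life(lifetime):
    """Half-life from the lifetime, T_1/2 = (ln 2) T, as in (16-5)."""
    return math.log(2) * lifetime

# Lifetimes T (in years) quoted in the text for the parents of the four series.
PARENT_LIFETIMES_YR = {
    "Th232 (4n)": 2.01e10,
    "Np237 (4n+1)": 3.25e6,
    "U238 (4n+2)": 6.52e9,
    "U235 (4n+3)": 1.02e9,
}

# Assumption: elapsed time set to the order-of-magnitude age of the earth used in the text.
AGE_YR = 1.0e10

if __name__ == "__main__":
    for name, T in PARENT_LIFETIMES_YR.items():
        frac = surviving_fraction(AGE_YR, T)
        print(f"{name:14s} T_1/2 = {half_life(T):.3g} yr   surviving fraction = {frac:.3g}")
```

With these inputs the surviving fraction for the $4n + 1$ parent is zero to machine precision, while the other three parents retain nonzero fractions, in line with the three series found in nature.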
For naturally occurring nuclei of the highest $Z$ values, i.e., for $Z$ in the range just below 92, the decay rate for spontaneous fission is very much smaller than the decay rate for emitting an $\alpha$ particle because of the very much reduced probability of a more massive particle penetrating a higher Coulomb barrier. As $Z$ becomes larger than about 100, the decay rate for spontaneous fission becomes comparable to, and eventually larger than, the decay rate for $\alpha$-particle emission. The reason is that with increasing $Z$ the decay energy for spontaneous fission increases more rapidly than the decay energy for $\alpha$-particle emission, so the spontaneous fission Coulomb barrier becomes relatively easier to penetrate.

There is an as yet unverified prediction that the nucleus of the element with $Z = 110$ and $A = 294$ might have a lifetime as long as $\sim 10^{8}$ yr. If so, a little of it could possibly still be present on the earth if enough of it were formed $\sim 10^{10}$ yr ago. The prediction follows from the prediction that the proton magic number after $Z = 82$ is $Z = 114$, not $Z = 126$ as indicated in Figure 15-18 of the shell model. Of course the prediction of that figure that $N = 126$ is a neutron magic number is abundantly verified by experiment, and it is also believed that $N = 184$ is a neutron magic number as predicted by the figure. But there is no experimental evidence concerning $Z$ values much beyond 100 since the corresponding nuclei have not been discovered yet, so $Z = 126$ is not actually known to be magic. The difference between the recent shell model predictions for the higher proton and neutron magic numbers arises because for protons there is, in addition to the nuclear potential, a repulsive Coulomb potential that becomes large for nuclei of large $Z$. It tends to raise all the proton levels, but more so for levels of small $l$ whose probability densities extend deeper into the nuclear center where the Coulomb potential is stronger. The result is to raise the $2f$ and $3p$ levels relative to the $1i$ level, making the $1i_{13/2}$ level lie just above the $2f_{7/2}$ level, and creating a proton magic number at $Z = 100 + 14 = 114$. Thus the nucleus with $Z = 114$, and $N = 184$, is believed to be doubly magic. That nucleus also lies near, but not on, the curve of maximum stability obtained from an extrapolation of the semiempirical mass formula of the liquid drop model. In other words, $Z = 114$ and $N = 184$, or $Z = 114$ and $A = 298$, is expected to be doubly magic and also to have almost the most stable value of $Z$ for that value of $A$. Collective model calculations indicate that the best compromise between the requirements for stability of the shell and liquid drop models is obtained by removing four protons to reduce the Coulomb energy, which is extremely important for nuclei of such large $Z$. Thus these calculations predict maximum stability at $Z = 110$ and $A = 294$. They also predict a lifetime of $\sim 10^{8}$ yr against decay by $\alpha$-particle emission or spontaneous fission into two smaller nuclei. The fission process is actually the most likely decay because it is more effective in reducing the Coulomb energy. So $Z = 110$ and $A = 294$ is predicted to be "an island of stability in a sea of spontaneous fission."

Example 16-2. In the mixture of isotopes normally found on the earth at the present time, $_{92}\mathrm{U}^{238}$ has an abundance of 99.3% and $_{92}\mathrm{U}^{235}$ has an abundance of 0.7%. The measured lifetimes of these radioactive isotopes are $6.52 \times 10^{9}$ yr and $1.02 \times 10^{9}$ yr, respectively. By assuming that they were equally abundant when the uranium in the earth was originally formed, estimate how much time has elapsed since the time of formation. (That is, assume that pairing effects in the initial formation ratios are small compared to lifetime effects in the present abundance ratios.)

If the number of $_{92}\mathrm{U}^{238}$ nuclei originally formed is $N$, the number present now is

$$ N_{238} = Ne^{-Rt} = Ne^{-t/T} = Ne^{-t/6.52} $$

where $t$ is the elapsed time in units of $10^{9}$ yr. Since the number of $_{92}\mathrm{U}^{235}$ nuclei originally formed is, by assumption, also $N$, the number now present is

$$ N_{235} = Ne^{-t/1.02} $$

The present abundance of $_{92}\mathrm{U}^{235}$ is

$$ 7 \times 10^{-3} = \frac{N_{235}}{N_{235} + N_{238}} \simeq \frac{N_{235}}{N_{238}} = \frac{Ne^{-t/1.02}}{Ne^{-t/6.52}} = e^{-(t/1.02 - t/6.52)} = e^{-0.827t} $$

So

$$ e^{0.827t} = \frac{1}{7 \times 10^{-3}} = 143 $$

$$ 0.827t = \ln(143) = 4.96 $$

$$ t = \frac{4.96}{0.827} = 6.0 $$

That is, the elapsed time is $t \simeq 6.0 \times 10^{9}$ yr.

The estimate obtained from this simple argument is in reasonable agreement with the estimates of the age of the earth, or of the solar system, obtained from more sophisticated geological and cosmological arguments.

16-3 BETA DECAY

A more complete description of the processes occurring in the $4n$ radioactive series is plotted in Figure 16-5. In addition to $\alpha$ decay, there is also $\beta$ decay. For the radioactive series, $\beta$ decay involves a nucleus $Z, A$ emitting a negatively charged electron and being transformed into the nucleus $Z + 1, A$. There are also two other types of $\beta$ decay that will be considered shortly. It is instructive to superimpose Figure 16-5 on Figure 15-11, the plot of the $Z$ and $N$ values of the stable nuclei. The result, shown in Figure 16-6, makes it clear that the radioactive series uses $\beta$ decay to keep as good a match as possible between the average slope of the path traced out by its decay and the average slope of the "curve of stability." Another way of saying this is that the $\alpha$-decay energy of a nucleus is relatively small if the nucleus it would $\alpha$ decay into is too far from the curve of stability. But in just these circumstances the $\beta$-decay energy is relatively large.
As the decay rates for both processes increase rapidly with increasing decay energy, the nucleus in question will $\beta$ decay because that process has a larger decay energy, and so a much larger decay rate. In some cases, the decay rates for the two competing processes are comparable, both processes occur, and the series branches (see $_{84}\mathrm{Po}^{216}$ and $_{83}\mathrm{Bi}^{212}$ in the $4n$ series).

Figure 16-5 The decay processes occurring in the $4n$ series. (Axes: $Z$ versus $N$.)

Figure 16-6 Illustrating why $\beta$ decay occurs in the $4n$ and other radioactive series. (Axes: $Z$ versus $N = A - Z$, with the curve of stability and the $4n$ series indicated.)

In the first part of this section we shall study the energetics of $\beta$ decay. Then we shall study the dependence of the decay rate on the decay energy. There we shall see that the decay rate also depends strongly on the spins and parities of the nuclear states involved in the decay. This dependence on spin and parity makes the $\beta$-decay process a very useful tool in the investigation of nuclei.

To discuss the energetics of $\beta$ decay, we plot atomic masses $M_{Z,A}$, in the region of the curve of stability, as a function of $Z$ for fixed $A$. Figure 16-7 shows typical results for odd $A$, and Figure 16-8 shows results typical for even $A$. Except near magic numbers, all the results are well described by the semiempirical mass formula. For odd $A$, the values of $M_{Z,A}$ are found to lie on a parabola. For even $A$, there are two parabolas corresponding to the two possible signs of the pairing term, (15-28); the upper one is for odd $Z$, odd $N$, and the lower one is for even $Z$, even $N$. These curves are really cross cuts through the curve of stability, showing its structure. They specify how the masses increase when the $Z$ values depart from their most stable values for a given $A$. Note that for an odd value of $A$, there is generally only one most stable value of $Z$. (Rarely there are two values straddling the bottom of the parabola that happen to lead to almost the same mass.) For a given even $A$, there are generally two stable values of $Z$ (but occasionally there are three).

Figure 16-7 The masses of atoms with a given odd value of $A$. The value $A = 135$ is chosen for this example.

Figure 16-8 The masses of atoms with a given even value of $A$. The value $A = 102$ is chosen for this example. (The pairing term is $+f(A)$ for odd $Z$ and $-f(A)$ for even $Z$.)

Nuclei whose $Z$ values are not the most stable, in consideration of their $A$ values, can change $Z$ to attain stability by three different $\beta$-decay processes. One is the process of electron emission that occurs in the radioactive series. In this process, a negatively charged electron is emitted by the nucleus, so $Z$ increases by one, $N$ decreases by one, and $A$ remains fixed. The other processes are electron capture and positron emission. In the former the nucleus captures a negatively charged atomic electron, and in the latter it emits a positively charged positron. In both, $Z$ decreases by one, $N$ increases by one, and $A$ remains fixed.

Electron emission takes place if the mass $m_{Z,A}$ of the initial nucleus exceeds the mass $m_{Z+1,A}$ of the final nucleus plus one electron rest mass $m$. The mass excess times $c^2$ equals the energy $E$ made available in the decay. That is, the decay energy is

$$ E = [m_{Z,A} - (m_{Z+1,A} + m)]c^2 \tag{16-7a} $$

This energy must be positive for the decay to occur.
We can write it in terms of atomic masses by adding and subtracting $Z$ electron rest masses, to yield

$$ E = [m_{Z,A} + Zm - (m_{Z+1,A} + Zm + m)]c^2 $$

Neglecting the binding energies of atomic electrons, we obtain the simple result that the decay energy in electron emission is

$$ E = [M_{Z,A} - M_{Z+1,A}]c^2 \tag{16-7b} $$

We see that electron emission occurs when the initial atomic mass exceeds the final atomic mass because the mass of the electron added to the atom is compensated for by the mass of the electron emitted by the nucleus.

Electron capture takes place if the mass $m_{Z,A}$ of the initial nucleus plus one electron rest mass $m$ exceeds the mass $m_{Z-1,A}$ of the final nucleus. The energy made available in the decay is

$$ E = [(m_{Z,A} + m) - m_{Z-1,A}]c^2 = [m_{Z,A} - (m_{Z-1,A} - m)]c^2 \tag{16-8a} $$

or

$$ E = [m_{Z,A} + Zm - (m_{Z-1,A} + Zm - m)]c^2 $$

In terms of atomic masses, the decay energy in electron capture is

$$ E = [M_{Z,A} - M_{Z-1,A}]c^2 \tag{16-8b} $$

When the energy is positive, electron capture occurs. This simple result is obtained because the mass of the electron taken from the atom in the capture is compensated for by the mass of the electron captured by the nucleus.

Positron emission requires that the mass $m_{Z,A}$ of the initial nucleus exceed the mass $m_{Z-1,A}$ of the final nucleus plus one positron rest mass, which also equals $m$. The energy made available in the decay is

$$ E = [m_{Z,A} - (m_{Z-1,A} + m)]c^2 \tag{16-9a} $$

or

$$ E = [m_{Z,A} + Zm - (m_{Z-1,A} + Zm - m) - 2m]c^2 $$

In terms of atomic masses, this expression says that the decay energy in positron emission is

$$ E = [M_{Z,A} - M_{Z-1,A} - 2m]c^2 \tag{16-9b} $$

In positron emission the atom must emit one electron since its nucleus emits one positron and has, therefore, one less positive charge. Thus there cannot be the compensation of electron masses found in the other $\beta$-decay processes. The result is that in order to have the decay energy in positron emission positive, which is a necessary condition for the process to occur, the initial atomic mass must exceed the final atomic mass by more than two electron rest masses, $2m = 0.00110u$.

We conclude that if $M_{Z,A} > M_{Z+1,A}$ then electron emission can occur. If $M_{Z,A} > M_{Z-1,A}$ then electron capture can occur. But positron emission can occur only if $M_{Z,A} > M_{Z-1,A} + 2m$; and in this case electron capture can also occur. Thus there is a range in which the difference in atomic masses is such that electron capture is possible while positron emission is energetically forbidden. In practice, atomic mass differences frequently fall in this range and so there are relatively few positron emitters in nature. In all these processes the decay energy $E$ varies from case to case from a small fraction of 1 MeV to more than 10 MeV, and typically it is somewhat less than 1 MeV.

Example 16-3. The only known nuclei with $A = 7$ are $_{3}\mathrm{Li}^{7}$, whose atomic mass is $M_{3,7} = 7.01600u$, and $_{4}\mathrm{Be}^{7}$, whose atomic mass is $M_{4,7} = 7.01693u$. Which of these nuclei is stable to $\beta$ decay? What process is employed in the $\beta$ decay of the unstable nucleus to the stable nucleus?

Since the atomic mass of $_{3}\mathrm{Li}^{7}$ is the lowest, it is the nucleus which is $\beta$ stable. As far as charge conservation is concerned, the $\beta$-unstable $_{4}\mathrm{Be}^{7}$ could decay into the stable nucleus either by capturing an atomic electron or by emitting a positron. But as far as energy conservation is concerned, only electron capture is possible since the difference in the atomic masses, $M_{4,7} - M_{3,7} = 7.01693u - 7.01600u = 0.00093u$, is less than two electron masses, $2m = 0.00110u$. Thus electron capture is the process employed in the $\beta$ decay of $_{4}\mathrm{Be}^{7}$ into $_{3}\mathrm{Li}^{7}$.

Now let us consider the very interesting question of what happens to the decay energy in $\beta$-decay processes. Take the most common one, electron emission. A nucleus $Z, A$, which we assume to be stationary in the initial state, emits an electron and recoils, as indicated in Figure 16-9. If there are just two particles in the final state, there can be only one linear momentum conserving way in which the available energy, which is the decay energy $E$, can be shared. In fact, since nuclei are so massive their recoil velocities are extremely low and they carry practically no kinetic energy. Thus the electron should carry away almost all of the decay energy $E$ in the form of kinetic energy. But measurements made at an early stage in the study of radioactivity, using bending magnets, showed that the electrons are emitted with a spectrum of kinetic energies $K_e$, as indicated in Figure 16-10. For many years, the fact that electrons are emitted in $\beta$ decay with a spectrum of energies was very mysterious and very disturbing. Electrons emitted at the end point $K_e^{\max}$ of the spectrum carry away all the decay energy $E$, since $K_e^{\max}$ was observed to equal $E$ within experimental accuracy. That is

$$ K_e^{\max} = E \tag{16-10} $$

But typical electrons carry away much less than the energy $E$ which, the measured mass differences show, must be released in the process. It would appear that some of this energy has vanished! Several attempts were made to detect the missing energy, for instance by placing the $\beta$-decaying material inside a calorimeter with very thick lead walls, but they were fruitless.

Figure 16-9 The electron emission process, assuming (incorrectly, as we shall see) that only two particles comprise the final state. (Initial state: nucleus $Z, A$. Final state: nucleus $Z + 1, A$ and electron.)

Figure 16-10 The spectrum of electrons emitted in the $\beta$ decay of $_{83}\mathrm{Bi}^{210}$. (Horizontal axis: kinetic energy of electrons, $K_e$, in MeV; the end point $K_e^{\max}$ is marked.)
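As an aside on the energetics, the atomic-mass conditions (16-7b), (16-8b), and (16-9b) amount to a simple decision rule, which can be sketched in a few lines. This is an illustrative sketch only: the function name and argument names are assumptions, the electron rest mass is taken as $0.00055u$ so that $2m = 0.00110u$ as quoted in the text, and electron binding energies are neglected as they are in the derivation.

```python
# Assumption: electron rest mass in u, chosen so that 2m = 0.00110 u as quoted in the text.
ELECTRON_MASS_U = 0.00055

def allowed_beta_processes(M_parent, M_z_plus_1=None, M_z_minus_1=None, m=ELECTRON_MASS_U):
    """List the beta-decay processes permitted by energy conservation.

    All arguments are atomic masses in u for fixed A. Pass None for a
    neighboring nucleus that is not being considered.
    """
    allowed = []
    if M_z_plus_1 is not None and M_parent > M_z_plus_1:
        allowed.append("electron emission")      # (16-7b): M(Z,A) > M(Z+1,A)
    if M_z_minus_1 is not None:
        if M_parent > M_z_minus_1:
            allowed.append("electron capture")   # (16-8b): M(Z,A) > M(Z-1,A)
        if M_parent > M_z_minus_1 + 2 * m:
            allowed.append("positron emission")  # (16-9b): needs an extra 2m
    return allowed

if __name__ == "__main__":
    # Atomic masses from Example 16-3.
    M_LI7 = 7.01600
    M_BE7 = 7.01693
    print("Be7:", allowed_beta_processes(M_BE7, M_z_minus_1=M_LI7))
    print("Li7:", allowed_beta_processes(M_LI7, M_z_plus_1=M_BE7))
```

For the masses of Example 16-3 the rule allows only electron capture for $_{4}\mathrm{Be}^{7}$ and nothing for $_{3}\mathrm{Li}^{7}$, which is the conclusion reached in that example.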
The situation was grave enough that some physicists were beginning to seriously consider abandoning the law of conservation of relativistic energy, when Pauli proposed a less repugnant alternative.

In 1931 Pauli postulated that a particle, now called the antineutrino $\bar{\nu}$, is also emitted in the electron emission process, but it is not normally detected because its interaction with matter is extremely weak. He also postulated that the antineutrino has (1) zero charge, (2) intrinsic spin $s = 1/2$, and (3) zero rest mass. The first property permits charge conservation to be maintained in electron emission. The second property allows angular momentum to be conserved. Consider the nucleus $Z, A$ emitting an electron to become the nucleus $Z + 1, A$ and assume, for example, that $A$ is even. Then the nuclear spin $i$ is an integer for both the initial and final nuclei. If only the electron with intrinsic spin $s = 1/2$ were emitted, it would be impossible to conserve angular momentum, because the sum of a half-integral angular momentum (the electron) and an integral angular momentum (the final nucleus) can only be half-integral. If an antineutrino with $s = 1/2$ is also emitted, the difficulty is removed. The third property was postulated to agree with measurements showing that the end point $K_e^{\max}$ of the electron spectrum equals the decay energy $E$, to the accuracy of the measurements. When an electron happens to be emitted at the end point, it carries away all the decay energy and none is left for rest mass energy of the antineutrino.

In positron emission and electron capture, the particle that is emitted, but very difficult to detect, is called the neutrino $\nu$. It has the same zero charge, spin 1/2, and zero rest mass as the antineutrino. The relation between neutrinos and antineutrinos is explained by Dirac's relativistic quantum mechanics. This theory shows that every particle with intrinsic spin $s = 1/2$ has its antiparticle.
A familiar, and closely related, example is the electron and its antiparticle called the positron. (Unrelated examples are the proton and antiproton, and neutron and antineutron.) The theory also shows that when a particle is produced a related antiparticle must be produced. The familiar example is, again, the electron and positron, which are produced in pairs. This is also found in the three $\beta$-decay processes. In electron emission a particle (electron) is produced with an antiparticle (antineutrino), while in positron emission a particle (neutrino) is produced with an antiparticle (positron). Electron capture fits into this scheme since in the Dirac theory the destruction of an electron is identical to the creation of a positron.

Figure 16-11 Electron and neutrino Dirac energy-level diagrams illustrating pair production, electron emission, and positron emission.

Figure 16-11 schematically illustrates electron and positron emission in terms of Dirac energy-level diagrams for the related particles, electrons and neutrinos. We saw in the discussion of Figure 2-15 that in pair production the energy of an absorbed photon makes possible the transition of an electron of rest mass $m$ from one of the all pervading sea of filled electron levels that extend downward from $-mc^2$ to one of the empty levels that extend upward from $+mc^2$. The result is an electron in a positive energy level, and a hole in a negative energy level, which is a positron. Such a transition could be represented by a vertical arrow connecting the lower and upper electron levels. In a similar way, an electron emission transition can be represented by a diagonal arrow connecting a filled neutrino level with an empty electron level, as shown in Figure 16-11.
The energy made available by the difference in the nuclear masses converts a neutrino from the neutrino sea into an electron, leaving a hole in a neutrino level, which is an antineutrino. The diagonal arrow connecting a filled electron level with an empty neutrino level represents positron emission since the result is a hole in an electron level, or positron, and a neutrino. Note that there is no gap separating the filled and empty neutrino levels because neutrinos are postulated to have zero rest mass. Also note that the minimum energy that the nuclear mass difference must provide to make either $\beta$-decay process possible is one electron rest mass energy, $mc^2$, in agreement with (16-7a) and (16-9a).

There is an obvious distinction between a particle and its antiparticle if they are charged, because their charges are of opposite sign. The distinction is more subtle if the particle and antiparticle are neutral, like the neutrino and antineutrino. Nevertheless, there really is a distinction. Recent evidence that we shall discuss soon shows the component of intrinsic spin angular momentum along the direction of motion is always $-\hbar/2$ for a neutrino and always $+\hbar/2$ for an antineutrino.

The problem concerning the emission of electrons with a spectrum of energies is resolved by the postulate that an antineutrino is also emitted in the $\beta$ decay, since then the decay energy $E$ can be shared between the electron kinetic energy $K_e$ and the antineutrino kinetic energy $K_{\bar{\nu}}$. That is

$$ K_e + K_{\bar{\nu}} = E \tag{16-11} $$

where we neglect the nuclear recoil energy. As there are very many ways in which this energy division can be made, the values of $K_e$ form a spectrum.

Detailed agreement with the measured forms of the $\beta$-decay spectra can be obtained if the argument is made quantitative. This involves the use of statistical procedures, similar to but somewhat more complicated than those used in Chapters 1 and 11, to determine the number of energy divisions in each range of $K_e$.
(See also Appendix K.) The results are most conveniently expressed, and explained, in terms of the momentum spectrum $R(p_e)$, which is the rate of emission of electrons with linear momentum $p_e$ per unit time and per unit momentum. It is found that

$$ R(p_e) = \left[\frac{(E - K_e)^2\,p_e^2}{2\pi^3\hbar^7c^3}\right] M^*M \tag{16-12} $$

where $M$ is the $\beta$-decay matrix element

$$ M = \int \psi_f^*\,\beta\,\psi_i\,d\tau \tag{16-13} $$

In (16-12) the term $(E - K_e)^2 = K_{\bar{\nu}}^2$ is proportional to $p_{\bar{\nu}}^2$, the square of the antineutrino linear momentum. So the rate $R$ is proportional to the product of two factors, each of which is the square of the momentum of one of the particles emitted in the $\beta$ decay. These $p^2$ factors are just measures of the number of quantum states per unit momentum interval into which the antineutrino, or electron, can be emitted in the decay. Both can be obtained by a trivial modification of the argument in Example 1-3. If the allowed wavelength $\lambda$ in Figure 1-7 is taken to be the de Broglie wavelength of a particle in a box, then (1-15) can immediately be converted from the form $N(r) \propto r^2$ to the form $N(p) \propto p^2$ since the quantity $r$ in that equation is inversely proportional to $\lambda$ and, according to de Broglie, $\lambda$ is inversely proportional to the particle's momentum $p$. Thus we see that $N(p)$, the number of allowed states per unit momentum interval for an antineutrino or electron of momentum $p$, which is confined to a box, is proportional to $p^2$. The box is a mathematical one that is used to normalize the free particle eigenfunctions representing the emitted antineutrino, or electron, as discussed in Section 6-2. In other words, if a particle is confined to a box (of arbitrarily large dimensions) so that its eigenfunction can be normalized, it is no longer strictly a free particle and thus has a discrete (albeit arbitrarily closely spaced) set of quantum states available to it. The number of these states per unit momentum is proportional to the square of its momentum.
If we then make the usual statistical assumption that all possible divisions of energy, or momentum, occur with the same probability, the rate for a $\beta$ decay with a particular division will be proportional to the total number of states for that division, which is the number of states for one particle times the number of states for the other. Thus the rate $R$ will be proportional to the momentum density of states factor for the antineutrino times the momentum density of states factor for the electron. So we see how the shape of the electron momentum spectrum is governed by the bracketed terms of (16-12). Crudely speaking, the spectrum is symmetrical about a maximum at the momentum which represents equal momentum sharing between the electron and antineutrino. The reason is that if one of these particles takes more momentum in the decay, the other must take less, and this will decrease the value of the product of the two density of states factors.

The term $M^*M$ in (16-12) governs the magnitude of the momentum spectrum, and therefore the overall rate of emission of electrons in the $\beta$ decay. Equation (16-13) shows that $M$ depends on the value of a quantity $\beta$, which will be identified in the following paragraphs. It also depends on the eigenfunction $\psi_i$ of the $\beta$-decaying nucleus in its initial state (before the decay) and on the complex conjugate of the eigenfunction $\psi_f$ of the nucleus in its final state (after the decay). We shall see that the $\beta$-decay matrix element $M$ is really a measure of how easy it is for the nucleus to change from the initial to the final state.

Equations (16-12) and (16-13) are analogous to (8-42) and (8-43), which we derived for the rate of emission of photons in the decay of an excited state of an atom. In particular, the $\beta$-decay matrix element is analogous to the electric dipole moment matrix element $\int \psi_f^*\,e\mathbf{r}\,\psi_i\,d\tau$ that enters in the theory of the "photon decay" of atoms.
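The shape set by the bracketed factor of (16-12) can be explored numerically. The sketch below is illustrative only: it keeps just the factor $(E - K_e)^2 p_e^2$, dropping the constants and $M^*M$, relates $K_e$ to $p_e$ through the relativistic expression $K_e = \sqrt{p_e^2c^2 + m^2c^4} - mc^2$, and uses a hypothetical decay energy of 1 MeV. The function names and the grid search are assumptions made for the example.

```python
import math

MC2 = 0.511  # electron rest energy m c^2 in MeV

def kinetic_energy(pc):
    """Electron kinetic energy K_e (MeV) for momentum given as p_e c in MeV."""
    return math.sqrt(pc**2 + MC2**2) - MC2

def max_pc(E):
    """Value of p_e c at the end point, where K_e equals the decay energy E."""
    return math.sqrt(E * (E + 2 * MC2))

def spectrum_shape(pc, E):
    """(E - K_e)^2 p_e^2, the momentum dependence of (16-12); zero beyond the end point."""
    K = kinetic_energy(pc)
    return (E - K) ** 2 * pc**2 if K <= E else 0.0

def peak_momentum(E, steps=10000):
    """Grid search for the p_e c (MeV) at which the spectrum shape is largest."""
    grid = [max_pc(E) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda pc: spectrum_shape(pc, E))

if __name__ == "__main__":
    E = 1.0  # hypothetical decay energy in MeV
    pc_peak = peak_momentum(E)
    antineutrino_pc = E - kinetic_energy(pc_peak)  # massless antineutrino: p c = K
    print(f"end point p_e c     = {max_pc(E):.3f} MeV")
    print(f"peak p_e c          = {pc_peak:.3f} MeV")
    print(f"antineutrino p c    = {antineutrino_pc:.3f} MeV at the peak")
```

Comparing the electron and antineutrino momenta at the peak gives a feel for how crude the "equal momentum sharing" statement is for a given decay energy.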
The $\beta$-decay matrix element is a volume integral of the quantity $\beta$, taken between the eigenfunction of the nucleus in its initial state and the complex conjugate of the eigenfunction of the nucleus in its final state. So $M$ is something like an average of the quantity $\beta$, evaluated while the nucleus is in the process of decaying and is in a mixture of the two states. Thus $\beta$ plays a role in governing the rate of $\beta$ decay much like the role played by the electric dipole moment, $e\mathbf{r}$, in governing the rate of photon decay by atoms.

Equations (16-12) and (16-13) were first obtained by Fermi, under the simplifying assumption that the Coulomb interaction between the nuclei and the emitted electrons could be neglected. He also assumed that $\beta$ is a universal constant, called the $\beta$-decay coupling constant. Then the $\beta$-decay matrix element $M$ immediately reduces to

$$M = \beta \int \psi_f^* \psi_i \, d\tau = \beta M' \qquad (16\text{-}14)$$

where $M'$ is the so-called nuclear matrix element

$$M' = \int \psi_f^* \psi_i \, d\tau \qquad (16\text{-}15)$$

Fermi's theory of electron emission from nuclei is closely related to the theory of photon emission from atoms. Perhaps the biggest difference is that Fermi's theory is complicated by the fact that two particles are emitted and share the available energy. Certainly the biggest similarity is that in both theories none of the particles emitted are considered to have prior existences; they are created at the time of emission.

It should be emphasized that $\beta$ decay is not a consequence of the nuclear force, or interaction. Instead, $\beta$ decay is a consequence of an interaction that we have not previously encountered in our study of quantum physics: the $\beta$-decay interaction. This is one of the four interactions of nature. The other three are the nuclear, electromagnetic, and gravitational interactions. In the next section we shall study the properties of the $\beta$-decay interaction, and we shall find that it is set apart from the other interactions observed in nature by the very different magnitude of its strength, which is governed by the value of the $\beta$-decay coupling constant $\beta$. We shall also find that the $\beta$-decay interaction has properties concerning parity which are strikingly different from the other interactions.

The function $R(p_e)$, of (16-12), is the momentum spectrum of the emitted electrons. It also applies to positron emission. The equation predicts that a plot of $[R(p_e)/p_e^2]^{1/2}$ versus $(E - K_e)$, or simply versus $K_e$, should yield a straight line. Figure 16-12 shows such a Kurie plot for the simplest of all electron emission processes

$${}_0n^1 \rightarrow {}_1\mathrm{H}^1 + e + \bar{\nu} \qquad (16\text{-}16)$$

the decay of a free neutron ${}_0n^1$ into a proton ${}_1\mathrm{H}^1$ plus an electron $e$ and an antineutrino $\bar{\nu}$. The neutron decays because $[M_{0,1} - M_{1,1}]c^2 = +0.78$ MeV, and the lifetime $T$ of the decay is about 1000 sec. (A neutron in a stable nucleus does not $\beta$ decay into a proton because it is prevented from so doing by the nuclear interaction, which is much stronger than the $\beta$-decay interaction.)

Figure 16-12 A Kurie plot for the $\beta$ decay of the neutron. (Horizontal axis: $K_e$ in MeV, from 0.1 to 1.0.)

The comparison in Figure 16-12 is typical of the good agreement obtained between the theory and experiment for the $\beta$ decay of nuclei of low $Z$. Small downward deviations of the experimental data at low energies are sometimes seen, but they usually represent experimental problems with self-absorption of low-energy electrons in the source of $\beta$-decaying material. For nuclei of high $Z$, there are real deviations between the predictions of the Fermi theory and experiment. They are due to the neglect of the Coulomb interaction between the final nucleus and the emitted electron, or positron. This interaction decelerates the electrons, or accelerates the positrons. Its effect is to enhance the low-energy or momentum end of the electron spectra, or to deplete that end of the positron spectra.
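The straight-line behavior of the Kurie plot can be checked numerically. The following is a minimal sketch, assuming that the bracketed terms of (16-12) make $R(p_e)$ proportional to $p_e^2 (E - K_e)^2$, as the Kurie prescription quoted above implies. The overall constant is dropped, the Coulomb interaction is neglected as in the Fermi theory, and the function names are illustrative only.

```python
import math

ME_C2 = 0.511    # electron rest energy in MeV
E_DECAY = 0.78   # neutron decay energy quoted in the text, MeV


def kinetic_energy(pc):
    """Relativistic electron kinetic energy (MeV) for momentum times c, pc (MeV)."""
    return math.sqrt(pc**2 + ME_C2**2) - ME_C2


def spectrum(pc, e_decay=E_DECAY):
    """Unnormalized R(p_e), assumed proportional to p_e^2 (E - K_e)^2."""
    k = kinetic_energy(pc)
    if k >= e_decay:
        return 0.0  # no electrons are emitted beyond the end point
    return pc**2 * (e_decay - k)**2


def kurie_ordinate(pc, e_decay=E_DECAY):
    """[R(p_e)/p_e^2]^(1/2), the quantity plotted in a Kurie plot."""
    return math.sqrt(spectrum(pc, e_decay)) / pc


# Momentum (times c) at the end point, where K_e = E.
PC_MAX = math.sqrt((E_DECAY + ME_C2)**2 - ME_C2**2)

if __name__ == "__main__":
    for i in range(1, 10):
        pc = PC_MAX * i / 10
        print(f"K_e = {kinetic_energy(pc):.3f} MeV   "
              f"Kurie ordinate = {kurie_ordinate(pc):.3f}")
```

In this idealized form the Kurie ordinate equals $E - K_e$ exactly, so the printed pairs fall on a line that reaches zero at the end point energy. Real data depart from it at low energies for the experimental and Coulomb reasons described above.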
By integrating the momentum spectrum of (16-12) over all electron momenta up to the maximum momentum $p_e^{max}$, an expression is obtained for the total rate of emission of electrons. Since this is just the decay rate $R$, according to (16-4) its reciprocal is the lifetime $T$. The results are

$$R = \frac{1}{T} = \frac{m^5 c^4}{2\pi^3 \hbar^7}\, \beta^2 M'^* M' F \qquad (16\text{-}17)$$

where $F$ is a function of the maximum momentum $p_e^{max}$, or of the corresponding maximum kinetic energy, which is the end point energy $K_e^{max}$. In Figure 16-13, $F$ is plotted as a function of $K_e^{max}$. Note that $F$ increases fairly rapidly with increasing $K_e^{max}$. Corrections made to the theory to account for the effect of the Coulomb interaction on the emitted electron change the values of $F$. For small $Z$ the change in $F$ is negligible. But for $Z = 100$, and $K_e^{max} = 1$ MeV, $F$ is increased by about a factor of 100 for electron emission, or decreased by about a factor of 10 for positron emission.

Figure 16-13 A base-10 logarithmic plot of the function $F$ versus the end point energy $K_e^{max}$ of the $\beta$ decay of nuclei of very small $Z$. The decay rate is proportional to $F$. Thus as $F$ increases with increasing end point energy, the decay rate increases and the lifetime decreases. (Horizontal axis: end point energy $K_e^{max}$ in MeV, from 0.01 to 10.)

We see from (16-17) that the lifetime $T$ of a $\beta$-decaying nucleus decreases fairly rapidly with increasing end point energy $K_e^{max}$, or decay energy $E = K_e^{max}$, because of the increase in the value of $F$ with increasing energy. For naturally occurring $\beta$-decaying nuclei, $T$ ranges from $\sim 1$ sec for $E$ around several MeV, to $\sim 10^8$ sec for $E$ around several hundredths of an MeV. We also see from (16-17) that the quantity

$$FT = \frac{2\pi^3 \hbar^7}{m^5 c^4}\, \frac{1}{\beta^2 M'^* M'} \qquad (16\text{-}18)$$
depends on a collection of universal constants, and on the value of the nuclear matrix element

$$M' = \int \psi^*_{Z\pm 1,A}\, \psi_{Z,A} \, d\tau \qquad (16\text{-}19)$$

This expression for the nuclear matrix element is just (16-15), with the subscripts on the initial and final eigenfunctions rewritten to indicate that the theory applies to both electron and positron emission.

The quantity $FT$ is sometimes called the comparative lifetime. It can be used to compare $\beta$ decays of different decay energy, and rank them according to the lifetimes they would have if they all had the same decay energy. That is, multiplying $T$ by $F$ removes the energy dependence, and so produces a quantity whose value depends only on a collection of universal constants and on the value of the nuclear matrix element. Since the matrix element contains the eigenfunctions for the nuclear states involved in a $\beta$ decay, it is apparent that the $FT$ value for the decay can provide information about those nuclear states.

Example 16-4. One of the simplest $\beta$ decays is

$${}_1\mathrm{H}^3 \rightarrow {}_2\mathrm{He}^3 + e + \bar{\nu}$$

The measured values of the decay energy and half-life are $E = 0.0186$ MeV and $T_{1/2} = 12.3$ yr. Calculate the value of $FT$.

Since $Z$ is very small, we can evaluate $F$ from Figure 16-13, using $K_e^{max} = E = 0.0186$ MeV. We find $\log F \simeq -5.7$, or

$$F \simeq 2.1 \times 10^{-6}$$

Converting $T_{1/2}$ in years to the lifetime $T$ in seconds gives

$$T = \frac{T_{1/2}}{0.693} = \frac{12.3 \text{ yr} \times 365 \text{ day/yr} \times 24 \text{ hr/day} \times 60 \text{ min/hr} \times 60 \text{ sec/min}}{0.693} \simeq 5.6 \times 10^{8} \text{ sec}$$

so

$$FT \simeq 2.1 \times 10^{-6} \times 5.6 \times 10^{8} \text{ sec} = 1.2 \times 10^{3} \text{ sec}$$

This is one of the smallest $FT$ values observed. In other words the $\beta$ decay is inherently fast because its lifetime $T$ is small, in consideration of the value of $F$ dictated by the value of the decay energy $E$. In Example 16-5 we shall see that this fact has some important theoretical consequences. It also has some important practical consequences. Uncontrolled testing of hydrogen bombs in the 1950s produced large amounts of ${}_1\mathrm{H}^3$ (also called tritium) in the atmosphere. Since the $\beta$ decay of this radioactive isotope is inherently fast, most of it has by now decayed into the harmless stable isotope ${}_2\mathrm{He}^3$.

Since (16-18) shows that the $FT$ value is inversely proportional to the value of $M'^*M'$, the nuclear matrix element times its complex conjugate, we see that $FT$ is a minimum when $M'^*M'$ is a maximum. This happens when the initial nuclear eigenfunction $\psi_{Z,A}$ is identical to the final nuclear eigenfunction $\psi_{Z\pm 1,A}$, because then the normalization condition for eigenfunctions requires that (16-19) yield $M' = 1$. If the eigenfunctions are not identical, $M'^*M' < 1$, and it becomes smaller as the eigenfunctions become less similar. In fact, $M'$, and therefore $M'^*M'$, is exactly zero if $\psi_{Z,A}$ and $\psi_{Z\pm 1,A}$ are so dissimilar as to correspond to different values of nuclear spin $i$, or opposite nuclear parities. These two properties immediately give the Fermi selection rules:

$$\Delta i = 0 \qquad (16\text{-}20)$$

The nuclear parity must not change

If either is violated the $\beta$ decay will not take place, according to the Fermi theory. The first restriction reflects the fact that no allowance is made in the theory for the emitted particles to carry angular momentum, so the conservation law demands there be no change in the nuclear angular momentum. The second restriction arises because the integrand will be of odd parity if the eigenfunctions have opposite parity, and then the contribution to the integral from the point $x, y, z$ will be canceled by the contribution from the point $-x, -y, -z$. (Recall the arguments at the end of Section 8-7.)

A theory developed later by Gamow and Teller takes into account the spins of the emitted particles, and it shows that the first Fermi selection rule is too restrictive. The Fermi theory restriction arises from the circumstance that the matrix element in (16-13) does not involve spins.
In the Gamow-Teller theory the corresponding matrix element contains the spin of the neutron that is being converted into a proton, and the spin of the neutrino that is being converted into an electron, in the $\beta$ decay. If the two particles emitted in the decay have their $s = 1/2$ intrinsic spins essentially parallel, $\Delta i = \pm 1$ is also allowed. Thus we have the Gamow-Teller selection rules:

$$\Delta i = 0, \pm 1 \quad (\text{but not } i_i = 0 \rightarrow i_f = 0) \qquad (16\text{-}21)$$

The nuclear parity must not change

The reason why $\Delta i = 0$ is allowed by the Gamow-Teller rules is that it is possible for the two particles to be emitted with essentially parallel spins in a Gamow-Teller decay, thereby carrying away one unit of angular momentum, with the nucleus changing the orientation in space, but not the magnitude, of its spin. But this is not possible if the nuclear spin is zero, as is indicated by the qualification in parentheses. In a Fermi decay the particle spins are "antiparallel," and the nuclear spin may be zero.

Even if $\Delta i$ is larger than one, $\beta$ decay still can occur in such a way that angular momentum is still conserved, since the particles can be emitted with orbital angular momentum. But the decay rates for these forbidden processes are much smaller than for the allowed processes that satisfy the Fermi or Gamow-Teller selection rules. The decay rate decreases by something like a factor of 10' for each unit of orbital angular momentum carried away by the particles. These inhibition factors result from the low probabilities of emitting a particle with orbital angular momentum of one or more $\hbar$ units from a system of radius as small as that characteristic of a nucleus, if the particle has linear momentum as small as that characteristic of $\beta$ decay.

For many nuclear physicists $\beta$ decay is a favorite field of investigation because it provides valuable information about the nuclei involved in the decay.
A measurement of the end point $K_e^{max}$, or of the atomic masses to determine the decay energy $E$, is used to obtain the value of $F$ from a curve like Figure 16-13, if $Z$ is small. If $Z$ is not small, the value of $F$ is obtained from tables that are available of $F$ versus $K_e^{max}$ and $Z$. Next, $FT$ is calculated from the measured value of the half-life, or lifetime, as in Example 16-4. Then (16-18) is used to evaluate the nuclear matrix element $M'$. The order of magnitude of $M'$ is often enough to give information about the spins and parities of the nuclear states participating in the decay, and more accurate values of $M'$ can give details about the eigenfunctions of these states, through (16-19). Of course, it is first necessary to know the value of the $\beta$-decay coupling constant $\beta$. This quantity is evaluated experimentally from $\beta$ decays involving certain very simple nuclear states, for which $M'$ is already known from other considerations to be discussed next.

16-4 THE BETA-DECAY INTERACTION

The $\beta$-decay interaction is the least familiar of the four interactions (nuclear, electromagnetic, $\beta$ decay, gravitational) that govern the operation of everything in the universe. In this section we shall explore some of its properties.

Figure 16-14 Shell model descriptions of the ground states of the pair of nuclei ${}_1\mathrm{H}^3$ and ${}_2\mathrm{He}^3$. (Panels show the neutron and proton levels of each nucleus.)

We begin by using the ${}_1\mathrm{H}^3$ to ${}_2\mathrm{He}^3$ $\beta$ decay, considered in Example 16-4, to determine the value of the $\beta$-decay coupling constant, $\beta$, which specifies the strength of the interaction. Since we found in Example 16-4 that the $FT$ value for the $\beta$ decay ${}_1\mathrm{H}^3 \rightarrow {}_2\mathrm{He}^3 + e + \bar{\nu}$ is particularly small, the inverse proportionality between $FT$ and $M'^*M'$, of (16-18), tells us that the nuclear matrix element $M'$ is particularly large for this decay. In fact, there is reason to believe that it assumes the maximum value allowed by the normalization condition, $M' = 1$.
Figure 16-14 gives the shell model description of the ground states of the two nuclei, which are the states involved in the $\beta$ decay. Since the nucleons are in the $1s_{1/2}$ subshell, which has $j = 1/2$ and even parity, according to the shell model both ground states should have nuclear spin $i = 1/2$ and even parity. These predictions are confirmed by independent measurements of the spins and parities. Thus the $\beta$ decay between these states is certainly allowed by the Fermi selection rules. But the shell model makes the even stronger prediction that $M' = 1$, almost exactly, in this decay. Since all the nucleons are in the same subshell, the eigenfunctions for the two nuclei can differ only if the Coulomb, or nuclear, interactions between the nucleons differ. The Coulomb interactions do differ for the two nuclei, but they are negligible compared to the strong nuclear interactions. And there is much other evidence that the nuclear interactions are the same because they are charge independent and so make no distinction between neutrons and protons. Thus the two eigenfunctions should be essentially identical and, if the eigenfunctions are properly normalized, the integral will yield

$$M' = \int \psi^*_{2,3}\, \psi_{1,3} \, d\tau = \int \psi^*_{1,3}\, \psi_{1,3} \, d\tau = 1$$

Knowing the value of $M'$, we can then use the measured $FT$ value to evaluate $\beta$, the $\beta$-decay coupling constant. It should be emphasized that the conclusion that $M' = 1$ depends on the particular symmetry found between the behavior of the neutrons and protons in the two nuclei involved in the decay. In the first nucleus there are a pair of nucleons of one species and an unpaired nucleon of the other species in the same subshell; in the second nucleus exactly the same is true, although the species of the nucleons are reversed.

Example 16-5. Use the $FT$ value for the $\beta$ decay of Example 16-4, plus the conclusion that $M' = 1$ for that decay, to evaluate the $\beta$-decay coupling constant, $\beta$.
Equation (16-18) gives

$$\beta^2 = \frac{2\pi^3 \hbar^7}{FT\, m^5 c^4}\, \frac{1}{M'^* M'}$$

So we have

$$\beta^2 \simeq \frac{2\pi^3 (1.05 \times 10^{-34} \text{ joule-sec})^7}{1.2 \times 10^{3} \text{ sec} \times (0.91 \times 10^{-30} \text{ kg})^5 \times (3.0 \times 10^{8} \text{ m/sec})^4} \times \frac{1}{1}$$

or

$$\beta^2 \simeq 1.4 \times 10^{-123} \text{ joule}^2\text{-m}^6$$

Thus

$$\beta \simeq 3.7 \times 10^{-62} \text{ joule-m}^3$$

There are several other pairs of nuclei whose ground states have shell model descriptions with the same kind of symmetry between neutrons and protons as in Figure 16-14. An example of such a pair is ${}_3\mathrm{Li}^7$ and ${}_4\mathrm{Be}^7$. One member of each pair $\beta$ decays into the other, with a nuclear matrix element $M'$ that must certainly be almost precisely equal to 1. The measured $FT$ values of these decays lead, through calculations like the one in Example 16-5, to values of $\beta$ which are in good agreement with the value obtained there. Thus we conclude that the $\beta$-decay coupling constant has the very small value

$$\beta \sim 10^{-62} \text{ joule-m}^3 \qquad (16\text{-}22)$$

If we divide $\beta$ by the volume of a typical nucleus, $\sim (10^{-14} \text{ m})^3 \simeq 10^{-42} \text{ m}^3$, we obtain $10^{-62} \text{ joule-m}^3 / 10^{-42} \text{ m}^3 = 10^{-20}$ joule $\simeq 10^{-7}$ MeV. We can then make a comparison of this characteristic energy to the energy of the order of 1 MeV that characterizes the nuclear interaction. As it is the square of the $\beta$-decay coupling constant that enters into measurable quantities, such as the $FT$ value, it is appropriate to say that the $\beta$-decay interaction is weaker than the nuclear interaction by a factor of $(10^{-7})^2 = 10^{-14}$. Since the nuclear interaction is only about two orders of magnitude stronger than the electromagnetic interaction (see Section 15-2), the $\beta$-decay interaction is also very much weaker than the electromagnetic interaction. On the other hand, the gravitational interaction is weaker than the nuclear interaction by about 40 orders of magnitude (see also Section 15-2), so the $\beta$-decay interaction is stronger than the gravitational interaction by about 26 orders of magnitude.
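The arithmetic of Examples 16-4 and 16-5, and the strength comparison that follows them, can be reproduced with a short script. This is only a sketch under the same assumptions as the text: $F = 2.1 \times 10^{-6}$ as read from Figure 16-13, $M' = 1$, a nuclear volume of $(10^{-14} \text{ m})^3$, and a nuclear energy scale of 1 MeV. The constant and function names are illustrative, and the conversion factor of $1.602 \times 10^{-13}$ joule per MeV follows from the table of constants.

```python
import math

HBAR = 1.05e-34            # joule-sec, rounded as in Example 16-5
M_E = 0.91e-30             # electron rest mass in kg, rounded as in the text
C = 3.0e8                  # m/sec
JOULE_PER_MEV = 1.602e-13  # 1 MeV expressed in joules


def lifetime_from_half_life(t_half_sec):
    """Lifetime T = T_half / 0.693, as used in Example 16-4."""
    return t_half_sec / 0.693


def coupling_constant(ft_sec, m_prime_sq=1.0):
    """beta in joule-m^3 from (16-18), for a given FT and M'*M'."""
    beta_sq = 2 * math.pi**3 * HBAR**7 / (ft_sec * M_E**5 * C**4 * m_prime_sq)
    return math.sqrt(beta_sq)


# Example 16-4: tritium, T_half = 12.3 yr, F read from Figure 16-13.
T_HALF = 12.3 * 365 * 24 * 60 * 60
F = 2.1e-6
FT = F * lifetime_from_half_life(T_HALF)

# Example 16-5: coupling constant, assuming M' = 1.
BETA = coupling_constant(FT)

# Characteristic energy: beta divided by a typical nuclear volume.
NUCLEAR_VOLUME = (1e-14) ** 3                       # m^3
ENERGY_MEV = BETA / NUCLEAR_VOLUME / JOULE_PER_MEV

# Strength relative to the nuclear interaction (energy scale of 1 MeV), squared.
STRENGTH_RATIO = (ENERGY_MEV / 1.0) ** 2

if __name__ == "__main__":
    print(f"FT   = {FT:.2e} sec")
    print(f"beta = {BETA:.2e} joule-m^3")
    print(f"characteristic energy = {ENERGY_MEV:.1e} MeV")
    print(f"strength ratio        = {STRENGTH_RATIO:.1e}")
```

Because the script carries unrounded intermediate values, its results differ slightly in the second digit from the rounded figures quoted in the examples, but the orders of magnitude are the same.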
Thus there are extremely pronounced differences in strength between the $\beta$-decay interaction and the other interactions observed in nature. These matters will be discussed at more length in the following chapters where it will be seen, for instance, that the gravitational interaction is the most obvious one in the everyday world, despite the fact that it is inherently the weakest by far, because it has a long range and always has the same sign.

The range of an interaction is a characteristic as important as its strength. The gravitational interaction has a long range since the gravitational interaction energy between two massive objects decreases quite slowly as their separation $r$ increases (in proportion to $1/r$). The electromagnetic interaction also has a long range since the interaction energy between two charged objects has the same slow dependence on their separation. The nuclear interaction has a short range because the interaction energy cuts off abruptly when two nucleons are separated by more than about 2 F. The $\beta$-decay interaction has an extremely short range. Some evidence for this is found from the following considerations. The form for the $\beta$-decay matrix element $M$ used in the Fermi theory, (16-14)

$$M = \beta \int \psi_f^* \psi_i \, d\tau$$

is obtained from the assumption that the extension in space of the $\beta$-decay interaction is very small compared to the dimensions of the nucleus. Without this assumption, the integrand in $M$ would not be $\psi_f^* \psi_i$, but $\psi_f^* \psi_i$ averaged over a volume of dimensions equal to the range of the interaction. If this were the case, $M$ would be affected in such a way as to change the predictions of the theory for the shape of the momentum spectra of the electrons emitted in the $\beta$ decay. But the observed momentum spectra are in good agreement with the theoretical predictions as they stand.
Thus the assumption of a very short range $\beta$-decay interaction, which the predictions stand upon, is probably correct. Additional evidence supporting this conclusion will be presented in the following chapters.

The very small value of $\beta$ is responsible for the fact that neutrinos and antineutrinos interact so weakly with matter that they are very difficult to detect. Calculations show that when they are produced in $\beta$ decay following nuclear reactions in the center of the sun, they can travel all the way to the surface with little chance of being absorbed. This has an effect on the production of solar energy. The $\beta$-decay interaction of electrons and positrons is equally weak, but since these particles also interact with matter through the electromagnetic interaction they are easy to detect.

Despite the obvious difficulties due to the extreme weakness of their interaction with matter, antineutrinos were detected in 1953 by Reines and Cowan. They used the reaction

$$\bar{\nu} + {}_1\mathrm{H}^1 \rightarrow {}_0n^1 + \bar{e}$$

where the symbol $\bar{e}$ stands for a positron. This is the inverse of the reaction

$${}_0n^1 + \bar{e} \rightarrow {}_1\mathrm{H}^1 + \bar{\nu}$$

which is the alternative form of neutron decay, (16-16)

$${}_0n^1 \rightarrow {}_1\mathrm{H}^1 + e + \bar{\nu}$$

(Note that the two forms of neutron decay indicate the equivalence of the destruction of an antiparticle, the positron, and the creation of the associated particle, the electron. In the Dirac theory the processes are identical.) The Reines-Cowan reaction took place in the hydrogen of a very large hydrogenous scintillation counter (a modern version of Rutherford's ZnS counter, using photocells instead of eyes to detect the light flashes). The counter was exposed to the enormous flux of antineutrinos emitted from the fission-induced $\beta$ decays in a nuclear reactor, and the positrons were detected by the scintillations they produced in the same counter. Elaborate methods were required to minimize background scintillation. This was necessary because only about one reaction per minute was obtained, despite the intense flux of antineutrinos and the huge size of the target, due to the weakness of the $\beta$-decay interaction.

Now we shall briefly discuss two other experiments, performed in the 1950s, that tell us about a unique property of the $\beta$-decay interaction. Wu, and collaborators, studied the decay

$${}_{27}\mathrm{Co}^{60} \rightarrow {}_{28}\mathrm{Ni}^{60} + e + \bar{\nu}$$

by measuring the direction of emission of the electrons relative to the orientation of the magnetic dipole moments of the ${}_{27}\mathrm{Co}^{60}$ nuclei. The magnetic dipole moments were aligned by using a very strong external magnetic field, and a very low temperature to minimize thermal disorder. Figure 16-15 is a schematic drawing of the experiment, showing a typical nucleus and a typical emitted electron. To make the drawing closer to physical reality, a current loop of positive charge is used to indicate the orientation of the magnetic dipole moment. Wu found that the electrons are not emitted symmetrically with respect to the plane of the current loop. Instead, there is a preferred direction of emission that is related to the circulation of the current loop in the same way as the direction of advance of a left-hand screw is related to its rotation. The figure also shows the experiment, as seen when looking in a mirror. The preferred direction of emission appears to be the same, but the circulation of the current loop appears to have reversed. As viewed in the mirror, the results of the experiment are described by saying the relation between the direction of the typical electron and the circulation of the current loop is like that of a right-hand screw. Thus a description of this $\beta$ decay (and others) is not the same as a description of the mirror image. This seems to be a property unique to the $\beta$-decay interaction, among all the observed interactions of nature (nuclear, electromagnetic, $\beta$ decay, and gravitational). For instance, charges circulating around a macroscopic current loop emit photons by the electromagnetic interaction, because the charges are accelerating. But the photons are emitted symmetrically with respect to the plane of the loop, so the mirror description of this process cannot differ from the normal description. Since the operation of taking a mirror image is related to the parity operation in the manner illustrated in Figure 16-16, it is said that $\beta$ decay is not invariant to the parity operation, or that parity is not conserved in $\beta$ decay (but it is in the electromagnetic interaction).

Figure 16-15 A schematic drawing of the experiment which proved that parity is not conserved in $\beta$ decay. Also shown is a mirror image of the experiment. (Labels: preferred direction of electron emission; circulation of positive charges in current loop; normal view.)

Figure 16-16 The parity operation $(x, y, z) \rightarrow (-x, -y, -z)$. In this figure the operation is carried out by reversing the direction of each of the coordinate axes, keeping the location of the representative point $P$ fixed (compare with Figure 8-15). Before the operation we have a set of right-hand axes; i.e., a right-hand screw, rotated in the sense that would carry the $x$ axis into the $y$ axis, would advance in the direction of the $z$ axis. After the parity operation they become a set of left-hand axes. This change can also be obtained by the operation of taking a mirror image, which converts right-hand axes into left-hand axes. So the mirror image operation is related to (but not identical to) the parity operation. (Panels: before parity operation; after parity operation.)
Figure 16-17 The helicities of a right-hand screw and a left-hand screw. (Labels: right-hand screw (antineutrino); left-hand screw (neutrino); direction of spin angular momentum; direction of advance and of linear momentum.)

Measurements of Goldhaber, and collaborators, have shown that the so-called helicity of the antineutrino is responsible for the results of the Wu experiment. The method is a little too complicated to explain here. But they found that in the normal view of nature the spin angular momentum of an antineutrino is, within the accuracy of their measurements, always essentially parallel to the direction of its linear momentum. It is said that the antineutrino has the helicity of a right-hand screw, depicted in Figure 16-17. They also found that the neutrino has the helicity of a left-hand screw; i.e., within experimental accuracy its spin angular momentum is always essentially antiparallel to its linear momentum in the normal view.

Now the $\beta$ decay studied by Wu is between an $i = 5$, even parity, ground state of ${}_{27}\mathrm{Co}^{60}$, and an $i = 4$, even parity, excited state of ${}_{28}\mathrm{Ni}^{60}$. So it is a Gamow-Teller allowed transition in which angular momentum conservation requires the antineutrino and electron to be emitted with their spin angular momentum vectors essentially parallel to that of ${}_{27}\mathrm{Co}^{60}$, or to a vector representing its magnetic dipole moment. Furthermore, in such a transition the antineutrino and electron tend to be emitted with linear momentum vectors in opposite directions. Figure 16-18 shows how these relations between the vectors, plus the parallel relation between the spin and linear momentum vectors of the antineutrino demanded by its helicity, cause the typical electron to be emitted in the direction described. As viewed in a mirror, the helicity of the antineutrino changes, just as the helicity of a real screw changes, and this leads to the change in the mirror image description of the Wu experiment.
It should be noted that there is no violation of parity conservation by the nuclei in the ${}_{27}\mathrm{Co}^{60}$ to ${}_{28}\mathrm{Ni}^{60}$ decay. Both nuclear states involved are of even parity so there is no nuclear parity change, in agreement with the Gamow-Teller selection rules. It should also be noted that it is not possible for an antineutrino, or neutrino, to have a definite helicity in the normal view of nature unless its rest mass is zero. If it had a nonzero rest mass, it would travel with velocity less than $c$, and we could always find a moving frame of reference in which its linear momentum would be reversed in direction. As its spin would be unchanged by such a transformation, its helicity would be reversed. But the Goldhaber experiment shows that antineutrinos and neutrinos do have definite helicities, and this would not be possible if their helicities depended on the motion of the reference frame from which they are viewed. So we can conclude that their rest masses are zero, within the accuracy of the experiment. Direct measurements of the rest masses of these particles confirm this conclusion.

Figure 16-18 The $\beta$ decay of aligned ${}_{27}\mathrm{Co}^{60}$. The vectors give the directions only of $\boldsymbol{\mu}$ and $\mathbf{I}$, $\mathbf{S}_{\bar{\nu}}$ and $\mathbf{p}_{\bar{\nu}}$, and $\mathbf{S}_e$ and $\mathbf{p}_e$, which are the nuclear magnetic dipole moment and spin, the antineutrino spin and linear momentum, and the electron spin and linear momentum. Parity is not conserved because $\mathbf{S}_{\bar{\nu}}$ and $\mathbf{p}_{\bar{\nu}}$ are always essentially parallel.

16-5 GAMMA DECAY

There are $\gamma$ rays emitted from many of the nuclei of the radioactive series. These are photons of electromagnetic radiation that carry away the excess energy when nuclei make $\gamma$-decay transitions from excited states to lower energy states. As the energy differences in nuclear excited states range upwards from $\sim 10^{-3}$ MeV, $\gamma$ rays have energies greater than $\sim 10^{-3}$ MeV (see Figure 2-4).
Most typically, $\gamma$ decay will arise when a preceding $\beta$ decay has produced some of the daughter nuclei in states of several MeV excitation, because the $\beta$-decay selection rules prevent the decay from obeying the tendency, imposed by the energy dependence, for transitions to go overwhelmingly to the ground state. An example is shown in the ${}_{17}\mathrm{Cl}^{38}$ decay scheme of Figure 16-19. There are also many other ways to produce nuclei in excited states, which subsequently $\gamma$ decay. For instance, states of excitation energy around 7 or 8 MeV are produced when this much binding energy is liberated by the capture of a low-energy neutron in a nucleus.

Figure 16-19 The decay scheme of ${}_{17}\mathrm{Cl}^{38}$. The half-life, spin, and parity of the ground state of this $\beta$-unstable nucleus are shown as well as the energy of the state relative to the ground state of ${}_{18}\mathrm{A}^{38}$. Also shown are the energies, spins, and parities of the ground and first two excited states of ${}_{18}\mathrm{A}^{38}$, and the relative probabilities that the $\beta$ decay goes to each of these states. When the excited states are populated, they $\gamma$ decay to the ground state. The $\beta$ decay to the (3, odd) state is allowed by the Gamow-Teller selection rules, while the other $\beta$ decays are both forbidden by these and the Fermi selection rules. They nevertheless occur with appreciable probabilities because of the way the rates for all decays, allowed and forbidden, increase rapidly as the decay energy increases. (Levels of ${}_{18}\mathrm{A}^{38}$, in order of decreasing energy: (3, odd), (2, even), (0, even).)

The most accurate technique for measuring the energy of $\gamma$ rays is to study their diffraction from a crystal lattice of known lattice spacing. This is exactly the technique of x-ray diffraction, but since $\gamma$ rays have somewhat higher energies than x rays, their wavelengths are somewhat shorter, and this forces the use of diffraction apparatus of inconveniently large dimensions in order to measure accurately the small diffraction angles. The most widely used technique for measuring $\gamma$-ray energies involves letting the photons transfer their energies to electrons by one of the processes described in Chapter 2, namely, the Compton effect, the photoelectric effect, or pair production. The energies of the electrons are measured by using a NaI scintillation counter, or a semiconductor counter, which has a response proportional to the energy a charged particle deposits in it. The measured energy spectrum of $\gamma$ rays emitted in transitions between the excited states of a nucleus is used to determine the energies of these states just like the spectrum of photons emitted from an atom is used to determine the energies of atomic states. Of course, this provides very valuable information about the nucleus.

Another valuable source of information is the $\gamma$-decay transition rate $R$ of each excited state. In some cases $R$ can be measured directly. In other cases it can be obtained indirectly by measuring the lifetime $T$ of the state. If the state makes only a single transition to a lower energy state, (16-4) tells us $T = 1/R$ (after correction is made for the "internal conversion" process to be discussed at the end of this section). When $T > 10^{-10}$ sec, it can be determined by electronically timing the average delay between the excitation of a state and its decay. When $T$ is shorter than this figure, in some cases it can be determined by using the Mössbauer effect (discussed in the next section) to determine the energy spread, or "width," of the state, and then employing the energy-time uncertainty principle. With these different techniques, transition rates have been observed ranging from $R \sim 10^{-8}$ sec$^{-1}$ to $R \sim 10^{18}$ sec$^{-1}$.

The energies of the excited states of nuclei will be considered in a subsequent section. Here we shall consider their transition rates for $\gamma$ decay.
As we shall use the ideas developed in treating optical transitions of atoms in Section 8-7, the student certainly should review that material before proceeding.

For an atom, only electric dipole radiation is important. This is the radiation produced by oscillations in its electric dipole moment. In principle, radiation can be emitted by a more complicated behavior of the atomic electrons, such as an oscillation of the magnetic dipole moment or of the electric quadrupole moment. In practice, for an atom such radiation can be ignored because the transition rate is very much smaller than for electric dipole radiation. Electromagnetic considerations show that the transition rate for magnetic dipole radiation should be smaller than for electric dipole radiation by a factor of the order of $(v/c)^2 \sim (10^{-2})^2 = 10^{-4}$, where $v$ is the typical velocity of the electrons and $c$ is the velocity of light. Geometrical considerations show that the transition rate for electric quadrupole radiation should be smaller than for electric dipole radiation by a factor of the order of $(r'/\lambda)^2 \sim (10^{-10} \text{ m}/10^{-7} \text{ m})^2 = 10^{-6}$, where $r'$ and $\lambda$ are typical values of the atomic radius and the wavelength of the radiation. If the selection rules prevent an atom from emitting electric dipole radiation, it is almost always deexcited by hitting some other atom long before it can emit magnetic dipole or electric quadrupole radiation.

For a nucleus the same factors suppress the transition rates for magnetic dipole and electric quadrupole radiation, but their values are not so small: $(v/c)^2 \sim (10^{-1})^2 = 10^{-2}$; $(r'/\lambda)^2 \sim (10^{-14} \text{ m}/10^{-12} \text{ m})^2 = 10^{-4}$. Furthermore, the Coulomb barrier keeps nuclei from getting close enough to deexcite each other. So if the selection rules prevent a nucleus with several MeV of excitation from emitting electric dipole radiation, it must wait until it can decay by emitting some other electromagnetic radiation (or by the related process of internal conversion).
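The order-of-magnitude suppression factors quoted above are easy to reproduce. The sketch below uses the typical values given in the text for $v/c$, $r'$, and $\lambda$. As an added assumption it also includes the standard relation $\lambda = hc/E$ with $hc \simeq 1.24 \times 10^{-6}$ eV-m, to show where the nuclear wavelength of about $10^{-12}$ m comes from. The function names are illustrative.

```python
HC_EV_M = 1.24e-6  # h*c in eV-m (that is, about 1240 eV-nm)


def wavelength_m(photon_energy_ev):
    """Photon wavelength in meters for a photon energy in eV."""
    return HC_EV_M / photon_energy_ev


def suppression_factors(v_over_c, radius_m, wavelength):
    """Return (magnetic dipole, electric quadrupole) rate factors
    relative to electric dipole radiation."""
    magnetic_dipole = v_over_c ** 2
    electric_quadrupole = (radius_m / wavelength) ** 2
    return magnetic_dipole, electric_quadrupole


# Typical values quoted in the text.
ATOM = suppression_factors(1e-2, 1e-10, 1e-7)
NUCLEUS = suppression_factors(1e-1, 1e-14, 1e-12)

if __name__ == "__main__":
    print("atom:    M1 %.0e, E2 %.0e" % ATOM)
    print("nucleus: M1 %.0e, E2 %.0e" % NUCLEUS)
    print("wavelength of a 1 MeV gamma ray: %.2e m" % wavelength_m(1e6))
```

The nuclear factors are each about two orders of magnitude larger than the atomic ones, which is why higher multipole radiation is observable in nuclei.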
The transition rates for various types of electromagnetic radiation can be calculated by extensions of the procedure developed in Section 8-7. Since the calculations are very sensitive to the detailed behavior of the nucleons in the states involved in the decays, and since the nuclear models only provide approximate descriptions of this behavior, the results can only be expected to give rough ideas of general trends. Table 16-1 shows transition rates obtained by Weisskopf from calculations, based on the shell model, for a nucleus of radius r' = 7 F. The integer L labels the multipolarity of both the electric and magnetic transitions; it is 1 for dipole, 2 for quadrupole, 3 for octupole, etc. Note that for 1 MeV γ rays, predicted rates for magnetic transitions are smaller than for electric transitions, of the same L, by about $10^{-2} \sim (v/c)^2$. At that typical energy, predicted rates for both types of transitions decrease by about $10^{-4} \sim (r'/\lambda)^2$ for each unit increase of L. Also note that the dipole transition rates have approximately an $E^3 \propto \nu^3$ dependence on the energy or frequency of the emitted γ ray. We have seen this $\nu^3$ dependence before in the electric dipole transition rates for atoms, (8-43). Since $(r'/\lambda)^2 \propto \nu^2 \propto E^2$, the quadrupole transition rates depend approximately on $E^5$ and the octupole transition rates depend approximately on $E^7$.

Table 16-1  Shell Model γ-Decay Transition Rates in sec^-1 for a Nucleus of Radius r' = 7 F

                              γ-Ray Energy
Transition           L    10 MeV       1 MeV        0.1 MeV
Elec. dipole         1    2 x 10^18    2 x 10^15    2 x 10^12
Mag. dipole          1    2 x 10^16    2 x 10^13    2 x 10^10
Elec. quadrupole     2    1 x 10^16    1 x 10^11    1 x 10^6
Mag. quadrupole      2    1 x 10^14    1 x 10^9     1 x 10^4
Elec. octupole       3    1 x 10^13    1 x 10^6     1 x 10^-1
Mag. octupole        3    1 x 10^11    1 x 10^4     1 x 10^-3
Elec. sixteenpole    4    1 x 10^10    1 x 10^1     1 x 10^-8
Mag. sixteenpole     4    1 x 10^8     1 x 10^-1    1 x 10^-10
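The energy dependence described above can be checked against Table 16-1 with a short Python sketch. It starts from the 1 MeV column and scales each rate by $E^{2L+1}$. The text states this power only for L = 1, 2, 3 (as $E^3$, $E^5$, $E^7$); extending it to L = 4 as $E^9$ is an assumption made here, and the names `RATES_AT_1_MEV` and `scaled_rate` are illustrative.

```python
# 1 MeV column of Table 16-1 (rates in sec^-1), keyed by (type, L),
# where "E" means electric and "M" means magnetic.
RATES_AT_1_MEV = {
    ("E", 1): 2e15, ("M", 1): 2e13,
    ("E", 2): 1e11, ("M", 2): 1e9,
    ("E", 3): 1e6,  ("M", 3): 1e4,
    ("E", 4): 1e1,  ("M", 4): 1e-1,
}


def scaled_rate(kind: str, L: int, energy_mev: float) -> float:
    """Scale the 1 MeV rate to another energy, assuming rate ~ E^(2L+1)."""
    return RATES_AT_1_MEV[(kind, L)] * energy_mev ** (2 * L + 1)


for (kind, L) in RATES_AT_1_MEV:
    hi = scaled_rate(kind, L, 10.0)
    lo = scaled_rate(kind, L, 0.1)
    print(f"{kind}{L}: 10 MeV ~ {hi:.0e} sec^-1, 0.1 MeV ~ {lo:.0e} sec^-1")
```

Under this assumption the scaled values reproduce the 10 MeV and 0.1 MeV columns of the table, which suggests the tabulated rates follow the same power law for every multipolarity listed.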
The calculations also show that the γ-decay selection rules are:

For electric transitions
$|i_i - i_f| \le L \le i_i + i_f$ (but not $i_i = 0$ to $i_f = 0$)    (16-23)
The nuclear parity must change if L is odd, and it must not change if L is even.

For magnetic transitions
$|i_i - i_f| \le L \le i_i + i_f$ (but not $i_i = 0$ to $i_f = 0$)    (16-24)
The nuclear parity must change if L is even, and it must not change if L is odd.

In these expressions, $i_i$ is the nuclear spin of the initial state and $i_f$ is the nuclear spin of the final state of the decaying nucleus. The decay will, of course, always proceed by the allowed transition having the largest transition rate. Because of the strong L dependence of the transition rate, it follows that the dominant transition will have $L = |i_i - i_f|$. If this value of L is odd, it will be an electric transition when the initial and final states are of the opposite parity, and a magnetic transition when these states are of the same parity. If this value of L is even, it will be an electric transition when these states are of the same parity, and a magnetic transition when they are of the opposite parity.

Example 16-6. Use the information in the decay scheme of Figure 16-19 to determine the types of radiation emitted by ${}_{18}\mathrm{A}^{38}$ in the γ decays between its three lowest energy states.

In the decay between the states of i = 3, odd parity, and i = 2, even parity, we have $|i_i - i_f| = 1 = L$. Since this value is odd, and since the nuclear parity changes, the radiation is electric dipole. In the decay between the states of i = 2, even parity, and i = 0, even parity, we have $|i_i - i_f| = 2 = L$. Since this value is even, and since the nuclear parity does not change, the radiation is electric quadrupole.

By running the arguments of Example 16-6 in the reverse direction, information about the spins and parities of the nuclear states can be obtained if the types of radiation emitted in transitions between the states are known.
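The selection rules (16-23) and (16-24) lend themselves to a small lookup routine. The Python sketch below returns the dominant multipole for given initial and final spins and parities, using the rule $L = |i_i - i_f|$ stated above. The function name and the "E1", "M2" style labels are illustrative. One step is an assumption added here rather than taken from the rules as quoted: when the two spins are equal and nonzero, the sketch uses L = 1, the smallest value a photon can carry.

```python
from fractions import Fraction


def dominant_multipole(spin_i, parity_i, spin_f, parity_f):
    """Return a label such as 'E1' or 'M4' for the dominant gamma transition.

    Spins may be ints, Fractions, or strings like '3/2'.
    Parities are the strings 'even' or 'odd'.
    Returns None for a 0 -> 0 transition, which the rules exclude.
    """
    for parity in (parity_i, parity_f):
        if parity not in ("even", "odd"):
            raise ValueError("parity must be 'even' or 'odd'")
    i_i, i_f = Fraction(spin_i), Fraction(spin_f)
    if i_i == 0 and i_f == 0:
        return None  # excluded by (16-23) and (16-24)
    diff = abs(i_i - i_f)
    if diff.denominator != 1:
        raise ValueError("spins of one nucleus must differ by an integer")
    # Assumption: for equal nonzero spins take L = 1, the lowest photon value.
    L = max(int(diff), 1)
    parity_changes = parity_i != parity_f
    # Electric: parity changes for odd L. Magnetic: parity changes for even L.
    is_electric = parity_changes if L % 2 == 1 else not parity_changes
    return ("E" if is_electric else "M") + str(L)


# The two decays of Example 16-6.
print(dominant_multipole(3, "odd", 2, "even"))   # electric dipole
print(dominant_multipole(2, "even", 0, "even"))  # electric quadrupole
```

Running the routine in reverse, as the text suggests, amounts to asking which spin and parity pairs are compatible with an observed label.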
The types of radiation can be identified from approximate measurements of the transition rates (or from measurements, described later, of internal conversion). Since the transition rates are very sensitive to the behavior of the nucleons in the nucleus, their accurate measurement provides information that is currently being used to improve the nuclear models.

The parts of the selection rules relating L to the nuclear spins arise from the requirement that angular momentum be conserved in γ decay. The student can verify this with ease, if he will accept a result obtained from quantum electrodynamics: a γ ray emitted in a transition of multipolarity L carries L units of angular momentum. (Since it is not possible for a system of particles to have an oscillating electric monopole moment, or to have any magnetic monopole moment at all, it immediately follows from this result that there is no way to produce an L = 0 γ ray, or an L = 0 photon in any region of the electromagnetic spectrum. Thus we see why all photons must carry at least one unit of angular momentum.)

The parts of the selection rules relating L to the nuclear parities arise from symmetry properties of the matrix elements for the transitions. In Example 8-6, we saw that the electric dipole matrix element can be broken into components, the first of which is
$M \propto \int \psi_f^* \, x \, \psi_i \, d\tau$    (16-25)
The factor x enters because it is proportional to the x component of the electric dipole moment. Calculations show that the first component of the electric quadrupole matrix element is
$M \propto \int \psi_f^* \, x^2 \, \psi_i \, d\tau$    (16-26)
The factor $x^2$ is proportional to one of the components of the electric quadrupole moment. (There are generally more than three since a quadrupole generally must be described in terms of a tensor.) For the magnetic dipole matrix element, the first component turns out to be
$M \propto \int \psi_f^* \, L_x \, \psi_i \, d\tau$
where $L_x$ is the x component of orbital angular momentum. This factor enters because it is proportional to the x component of the magnetic dipole moment (if we assume, for simplicity, that it is purely orbital). Since
$L_x = (\mathbf{r} \times \mathbf{p})_x = y p_z - z p_y = m(y v_z - z v_y) = m\left(y \frac{dz}{dt} - z \frac{dy}{dt}\right)$
the magnetic dipole matrix element component can also be written
$M \propto \int \psi_f^* \left(y \frac{dz}{dt} - z \frac{dy}{dt}\right) \psi_i \, d\tau$    (16-27)
At the end of Section 8-7 we proved that the integral in (16-25) yields zero unless $\psi_i$ and $\psi_f$ have opposite parities. We leave it to the student to prove from similar arguments that the integrals in (16-26) and (16-27) yield zero unless $\psi_i$ and $\psi_f$ have the same parities. These results are precisely the parity selection rules for the three transitions we have taken as examples.

In many γ decays, several groups of monoenergetic electrons are emitted along with the γ rays. (If there is a preceding β decay these groups will be superimposed on the continuous β-decay spectrum.) The energies $\mathcal{E}$ of these electrons are found to be related to the decay energy E by the equation
$\mathcal{E} = E - W$    (16-28)
where W for the most prominent group equals the binding energy of a K shell electron of the γ-decaying atom, and W for the other groups equals the binding energies of electrons in the L, M, etc., shells. The process involved is called internal conversion. It consists of a direct transfer of energy through the electromagnetic interaction between a nucleus in an excited state and one of the electrons of its atom. The nucleus decays to a lower state, without ever producing a γ ray. But the decay is still electromagnetic, depending on an interaction between the electron and the longitudinal components of the electric field produced by the oscillating multipole moment of the nucleus. The transverse components are responsible for γ decay (see Appendix B).
Figure 16-20 shows calculated values of the K shell internal conversion coefficient, $\alpha_K$, for the ${}_{40}\mathrm{Zr}$ atom. This is the ratio of the probability that a K electron will be emitted, in a decay of its nucleus, to the probability that a γ ray will be emitted. The calculations should be very accurate because factors involving not too well known nuclear properties cancel out of the ratio. Since the chances for internal conversion increase rapidly as the value at the nucleus of the bound electron eigenfunction increases, $\alpha_K$ rapidly becomes larger as the Coulomb attraction becomes larger with increasing Z. For the same reason, at a given Z and E, the quantity $\alpha_K$ is usually larger than the quantity $\alpha_L$. Furthermore, at a given Z and E, the quantity $\alpha_K/\alpha_L$ depends strongly on the L value of the γ-ray transition, and on whether it is electric or magnetic. Accurate measurements of $\alpha_K/\alpha_L$, which are relatively easy to make, therefore provide a good method of identifying the type of transition, and of determining thereby the relative spins and parities of the nuclear states involved.

Internal conversion does not compete with γ-ray emission in the sense that one process inhibits the other. The processes are independent alternatives, so the total rate $R_t$ for transitions between the initial and final nuclear states is the sum
$R_t = R + R_{ic}$    (16-29)
where R and $R_{ic}$ are the transition rates for γ emission and for internal conversion. This can be written as
$R_t = R + \alpha_t R = R(1 + \alpha_t)$
where $\alpha_t = \alpha_K + \alpha_L + \alpha_M + \cdots$ is the total internal conversion coefficient. If the initial state can decay only to a single final state, as is usually true for longer lifetime decays, then from (16-4)
$T = \frac{1}{R_t} = \frac{1}{R(1 + \alpha_t)}$    (16-30)
The experimental values of the lifetime T can thus be used to obtain the transition rate R, since $\alpha_t$ can be accurately calculated.

Figure 16-20  K-shell internal conversion coefficients for ${}_{40}\mathrm{Zr}$.
The solid curves are for electric transitions and the broken curves are for magnetic transitions. The numbers refer to the multipolarity L. (Horizontal axis: nuclear transition energy in MeV.)

Figure 16-21  Lifetimes for a group of magnetic sixteenpole γ-decay transitions. The base-10 logarithm of the product of the lifetime T (in sec) and the sixth power of the nuclear radius r' (in F) is plotted as a function of the energy of the γ ray (in keV). The points are experimental and the straight line is the prediction of the shell model.

Figure 16-21 is a comparison of the transition rates so obtained, and the predictions of the shell model calculations, for a group of transitions that have been identified as magnetic sixteenpole (L = 4, parity change). The agreement is fair. Inspection of the shell model diagram of Figure 15-18 will demonstrate that all such transitions are between states quite near those filled at the magic numbers. So this is where the calculations should be at their best. For other transitions shell model predictions are in poor agreement with measurement. But collective model predictions can be used in these cases to obtain good agreement since the collective model can describe quite accurately the complicated oscillations in the charge, or current, distributions that are responsible for the emission of electric, or magnetic, radiation. The lifetime of an excited state is frequently expressed in terms of its width.
According to the energy-time uncertainty principle, if an average nucleus survives in an excited state only for the lifetime T of the state, then its energy in the state can be specified only within an energy range Γ, satisfying approximately the relation
$\Gamma = \frac{\hbar}{T}$    (16-31)
Excited states are, therefore, not perfectly sharp. Instead, they are spread over an energy range of width Γ. A detailed treatment shows that (16-31) is actually satisfied exactly, providing Γ is the full width at half-maximum of the energy profile of the state indicated in Figure 16-22. Let us estimate the width of a typical γ-decaying state of lifetime $T \simeq 10^{-10}$ sec. We find
$\Gamma = \frac{\hbar}{T} \simeq \frac{10^{-15}\ \text{eV-sec}}{10^{-10}\ \text{sec}} = 10^{-5}$ eV

Figure 16-22  The width Γ of an excited state. A mathematical expression for the shape shown in this figure is given in (16-32). (Horizontal axis: energy.)

In comparison to the typical energy $E \simeq 1$ MeV of such a state, Γ is extremely small. In fact, the minute value of the ratio
$\frac{\Gamma}{E} \simeq \frac{10^{-5}\ \text{eV}}{10^{6}\ \text{eV}} = 10^{-11}$
explains why we have hitherto neglected the widths of the lower energy states that are excited in radioactive decay. When we consider the higher energy states excited in nuclear reactions, we shall see that some of them have widths that are too large to be neglected.

16-6 THE MÖSSBAUER EFFECT

In 1958 a graduate student named Mössbauer made a discovery that allows the extremely small width to energy ratio of low-lying excited states to be used in many different applications as an energy spectrometer of extremely good resolution. The basic idea of the Mössbauer effect is illustrated in Figure 16-23. A source nucleus in an excited state makes a transition to its ground state, emitting a γ ray. The γ ray is subsequently caught by an unexcited absorber nucleus of the same species, which ends up in the same excited state.
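The width estimate made from (16-31) above can be reproduced with a few lines of Python. This sketch uses the value of ħ from the book's table of constants (0.6582 x 10^-15 eV-sec) instead of the rounded 10^-15 eV-sec used in the estimate, so the width comes out slightly below 10^-5 eV. The names `HBAR_EV_SEC` and `width_ev` are illustrative.

```python
# hbar in eV-sec, as listed in the book's table of useful constants.
HBAR_EV_SEC = 0.6582e-15


def width_ev(lifetime_sec: float) -> float:
    """Width of a state from its lifetime, Gamma = hbar / T, as in (16-31)."""
    return HBAR_EV_SEC / lifetime_sec


typical_lifetime_sec = 1e-10   # typical gamma-decaying state, from the text
typical_energy_ev = 1e6        # typical excitation energy of 1 MeV

gamma_ev = width_ev(typical_lifetime_sec)
ratio = gamma_ev / typical_energy_ev

print(f"width ~ {gamma_ev:.1e} eV")
print(f"width / energy ~ {ratio:.1e}")
```

The ratio is of order 10^-11, which is the resolution that makes the resonant absorption described here so sensitive to small energy changes.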
The potentialities as an energy spectrometer become clear when it is realized that changes in the source energy, the absorber energy, or the energy of the γ ray in flight, will destroy the "resonant" absorption, even if the energy change is only a few parts in $10^{11}$! For some years physicists had been attempting to utilize these potentialities, but with little success. The problem had to do with recoil of the nuclei upon emission and absorption of the γ ray, as we see in the following example.

Example 16-7. Mössbauer's original resonant absorption experiments used γ rays emitted in transitions from the 0.129 MeV first excited state to the ground state of ${}_{77}\mathrm{Ir}^{191}$. (a) Consider the recoil of the nucleus, assumed to be free, when it emits the γ ray, and determine the downward shift in the energy of the γ ray that results from the energy taken by the nuclear recoil. (b) Then compare this energy shift to the width of the first excited state of ${}_{77}\mathrm{Ir}^{191}$, which has a measured lifetime of $T = 1.4 \times 10^{-10}$ sec.

(a) Since the total linear momentum of the decaying nucleus is zero before emitting the γ ray, the magnitude of the nuclear recoil momentum $p_n$ after the emission must equal the magnitude of the momentum $p_\gamma$ carried by the emitted γ ray. As the nuclear mass M is high, its recoil velocity is low, so we may use the classical expression
$p_n = \sqrt{2MK}$
to relate $p_n$ to the kinetic energy of nuclear recoil K. The γ-ray momentum $p_\gamma$ is related to its energy E by the relativistic expression
$p_\gamma = \frac{E}{c}$
Thus we have
$p_\gamma = \frac{E}{c} = p_n = \sqrt{2MK}$
or
$\frac{E^2}{c^2} = 2MK$
$K = \frac{E^2}{2Mc^2}$
Since the sum of the γ-ray energy E and the nuclear recoil energy K must equal the energy available in the γ decay, i.e., the 0.129 MeV energy of the first excited state of the decaying nucleus, we see that E is less than the energy of the first excited state by an amount K. This is the downward shift ΔE in the energy of the γ ray due to nuclear recoil. That is
$\Delta E = -K = -\frac{E^2}{2Mc^2}$
Because M is so large, ΔE is very small compared to E, and we may evaluate it approximately by setting E = 0.129 MeV. Using the relation 931 MeV = $uc^2$ to express the nuclear rest mass energy $Mc^2$ in MeV, we have
$\Delta E \simeq -\frac{(0.129)^2\ \text{MeV}^2}{2 \times 191 \times 931\ \text{MeV}} = -4.7 \times 10^{-8}\ \text{MeV} = -4.7 \times 10^{-2}$ eV
The same result could be obtained by considering the γ ray to be emitted from a moving source, the recoiling nucleus, and using the longitudinal Doppler shift formula of Example 2-7 to evaluate the downward shift in its frequency, or energy.

(b) If the lifetime of the first excited state of ${}_{77}\mathrm{Ir}^{191}$ is $T = 1.4 \times 10^{-10}$ sec, its width is
$\Gamma = \frac{\hbar}{T} = \frac{6.6 \times 10^{-16}\ \text{eV-sec}}{1.4 \times 10^{-10}\ \text{sec}} = 4.7 \times 10^{-6}$ eV
Clearly, the γ ray emitted by the decay from the first excited state of the ${}_{77}\mathrm{Ir}^{191}$ source nucleus cannot excite a ${}_{77}\mathrm{Ir}^{191}$ absorber nucleus from its ground state to its first excited state. The nuclear recoil shift of the γ ray is larger by a factor of $10^4$ than the width of the state it is supposed to excite. So the γ ray is thrown completely out of resonance, and the resonant absorption is destroyed. (If there actually were an absorption, there would be two sources of the total recoil shift, one due to recoil of the emitting nucleus and the other due to recoil of the absorbing nucleus. This is because to be absorbed by a free nucleus, the γ ray must have an energy that is greater than the energy difference of the nuclear states by the amount ΔE = +K. There would also be two sources of the total width of the resonance, one due to the width of the state emitting the γ ray and the other due to the width of the state absorbing it.)

Figure 16-23  Resonant absorption, the basis of the Mössbauer effect. (Panels: γ decay of a nucleus Z, A; γ excitation of a nucleus Z, A.)

If the emitter nucleus is bound in a solid, the solid recoils as a whole: the momentum of the solid is equal in magnitude and opposite in direction to the momentum of the emitted photon. Because the mass of the solid is so large, the kinetic energy of recoil is extremely small and can be neglected. An estimate of the recoil energy can be obtained by substituting a mass of a few grams into the equation for ΔE developed in the preceding example for a single nucleus.

That the recoil energy is so small does not necessarily mean that the photon energy is the same as the energy difference of the excited and ground states of the nucleus. The emitting nucleus interacts with atoms of the solid and participates in the lattice vibrations. As explained in Section 11-9, lattice vibrational energy is quantized in units of $h\nu_p$, called phonons. Here h is Planck's constant and $\nu_p$ is the frequency of vibration. Upon emission of a photon, a phonon may also be emitted or absorbed and, in these cases, the photon energy is greater than or less than the energy difference of the nuclear states by $h\nu_p$. It is of prime importance that some photon emission events occur without the emission or absorption of a phonon. This is the Mössbauer effect.

A typical emission spectrum might look like that shown in Figure 16-24(a). There is a distribution of photon energies, on the order of a few tenths of an electron volt wide, because, in different events, phonons with different energies are created. This is called the phonon wing. The zero phonon or Mössbauer peak is sharp. These are the events for which no phonons are created and the photon energy is the same as the energy difference of the nuclear states. The peak does have width, given by $\hbar/T$, where T is the lifetime of the excited state, but it cannot be seen on the scale of the drawing. There is also a small number of events for which a phonon is absorbed and the photon energy is greater than the energy difference of the nuclear states. A typical absorption spectrum is shown in Figure 16-24(b). Again there is a sharp peak at the energy corresponding to the nuclear transition energy and at higher energy there is a phonon wing. For photon energies in this range, a phonon is created during the absorption process and the photon must have a correspondingly higher energy to be absorbed. Note that the emission and absorption spectra overlap only for the Mössbauer events and for a few events involving low energy phonons. Without this overlap the photons emitted would not be absorbed and the absorber levels could not be used as a photon detector.

Figure 16-24  (a) Emission spectrum and (b) absorption spectrum for a nucleus bound in a solid. The quantity E is the photon energy and $E_0$ is the energy difference of the nuclear states. (Vertical axis: number of photons.)

The fraction of events which occur without phonon emission or absorption depends on the temperature. At high temperatures there are fewer such events and the Mössbauer peak becomes indistinguishable from the phonon wing. Most Mössbauer experiments are performed at the temperature of liquid helium.

The Mössbauer peak can be scanned by placing the emitter and absorber in different solids and moving them relative to each other. Since the relative velocity v is much less than the velocity of light c, the photon energy in the reference frame in which the absorber is at rest is given by
$E = E_0\left(1 + \frac{v}{c}\right)$
a result which follows from the Doppler shift of the photon frequency (see Example 2-7). Here $E_0$ is the energy in the frame in which the emitter is at rest and the relative velocity is positive if the absorber and emitter are moving toward each other.
This is equivalent to shifting the emission spectrum shown in Figure 16-24(a) to the right by $\Delta E = E_0 v/c$. Photons which pass through without being absorbed are counted and the fraction absorbed is displayed as a function of the relative velocity. Relative motion is usually obtained by mechanically driving the emitter toward and away from the absorber with a variable velocity. The motion is repeated many times to obtain a large number of counts. A typical result is shown in Figure 16-25. The central region is due to the overlap of the Doppler shifted Mössbauer emission peak and the Mössbauer absorption peak. The tails of the curve show some emission and absorption in the phonon wings. More of the phonon wings can be seen if higher relative velocities are used but, for most applications, it is the Mössbauer peak itself which is important. Note that the peak occurs for v = 0, indicating that the nuclear states in the emitter and absorber have the same energy difference. Its full width at half maximum is about $10 \times 10^{-6}$ eV. This agrees well with the expectation that it should be twice the width Γ of the two nuclear states involved, since their measured lifetimes of $T = 1.4 \times 10^{-10}$ sec yield $\Gamma = 4.7 \times 10^{-6}$ eV. The agreement also verifies (16-31), used to calculate Γ from T, and therefore verifies the energy-time uncertainty principle!

Figure 16-25  The Mössbauer effect in ${}_{77}\mathrm{Ir}^{191}$ at 88 °K. Note the extremely low source speeds and extremely small resulting Doppler shifts which are sufficient to eliminate the resonant absorption. (Horizontal axes: source speed in m/sec and Doppler shift in $10^{-6}$ eV.)

Example 16-8. For ${}_{77}\mathrm{Ir}^{191}$ what range of emitter speeds must be used to scan the Mössbauer peak?

At the half intensity points ΔE is the sum of the emission and absorption widths, or $2 \times 4.7 \times 10^{-6}$ eV $= 9.4 \times 10^{-6}$ eV. The emitter speed is given by $v/c = \Delta E/E_0 = 9.4 \times 10^{-6}\ \text{eV} / 0.129 \times 10^{6}\ \text{eV} = 7.3 \times 10^{-11}$, or v = 0.022 m/sec. So the velocity must range from −0.022 m/sec to +0.022 m/sec, as can be seen in Figure 16-25.

Most applications of the Mössbauer effect deal with situations for which the emitter and absorber are in different environments, so that the emission and absorption peaks do not occur at precisely the same energy. The relative velocity required to obtain maximum absorption is measured and the results used to study the environment of the emitter or absorber. For example, the position of the peak on the Mössbauer curve depends on the electronic configuration around the emitter and absorber nuclei. Wave functions for electrons in s subshells do not vanish at the nucleus and there is a probability that such an electron is inside the nucleus, where it interacts strongly with the protons and changes the nuclear energy levels. The shift in energy is proportional to $\rho\,(r^2)_{av}$, where ρ is the electron probability density at the nucleus and $(r^2)_{av}$ is the mean square radius of the proton distribution. Both the excited and ground states are shifted and if the proton distribution radii are different, the energy of the Mössbauer peak is changed by $A\rho\,[(r_1^2)_{av} - (r_0^2)_{av}]$. Here the subscript 0 refers to the ground state, the subscript 1 refers to the excited state, and the quantity A is a constant of proportionality. Furthermore the emitter and absorber can be placed in different host solids for which the electron probability density is different. Then the Mössbauer peaks for emission and absorption differ by
$\Delta E = A(\rho_e - \rho_a)\,[(r_1^2)_{av} - (r_0^2)_{av}]$
where $\rho_e$ refers to the emitter and $\rho_a$ to the absorber. To match the Mössbauer absorption peak the frequency of the photon must be Doppler shifted and the peak of the Mössbauer curve occurs for relative velocity $v = c\,\Delta E/E_0$, not for v = 0.
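The numbers in Examples 16-7 and 16-8, together with the relation $v = c\,\Delta E/E_0$ just stated, can be verified with a short Python sketch. It takes ħ and c from the book's table of constants and, like Example 16-7, approximates the nuclear rest mass energy as the mass number times 931 MeV. All function and variable names are illustrative.

```python
C_M_PER_SEC = 2.998e8        # speed of light, from the book's constants table
HBAR_EV_SEC = 0.6582e-15     # hbar in eV-sec, from the same table
U_C2_MEV = 931.0             # rest energy of one atomic mass unit, as used in Example 16-7


def recoil_shift_ev(gamma_energy_mev: float, mass_number: int) -> float:
    """Energy shift of the gamma ray from free-nucleus recoil, -E^2 / (2 M c^2), in eV."""
    mc2_mev = mass_number * U_C2_MEV   # approximation used in the example
    return -(gamma_energy_mev ** 2) / (2.0 * mc2_mev) * 1e6


def width_ev(lifetime_sec: float) -> float:
    """Width of the state, Gamma = hbar / T, in eV."""
    return HBAR_EV_SEC / lifetime_sec


def doppler_speed_m_per_sec(delta_e_ev: float, e0_ev: float) -> float:
    """Relative speed giving a Doppler shift delta_e, v = c * delta_e / E0."""
    return C_M_PER_SEC * delta_e_ev / e0_ev


E0_MEV = 0.129               # first excited state of Ir-191
shift = recoil_shift_ev(E0_MEV, 191)
gamma = width_ev(1.4e-10)
speed = doppler_speed_m_per_sec(2.0 * gamma, E0_MEV * 1e6)

print(f"recoil shift ~ {shift:.2e} eV")
print(f"width ~ {gamma:.2e} eV")
print(f"|recoil shift| / width ~ {abs(shift) / gamma:.0e}")
print(f"scan speed ~ {speed:.3f} m/sec")
```

The same `doppler_speed_m_per_sec` function applies to a chemical shift: given a measured velocity of maximum absorption, inverting it gives ΔE.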
A measurement of the relative velocity which tunes the system to maximum absorption can be used to investigate either $\rho_e - \rho_a$ or $(r_1^2)_{av} - (r_0^2)_{av}$, provided the other quantity is known. The first quantity is of interest to solid state physicists and chemists who want information about the electron distribution in a solid while the second is of interest to nuclear physicists who want to know if the proton distribution changes when a nucleus is excited. The change in the position of the Mössbauer peak is known as the chemical (or sometimes isomer) shift. By placing the emitter in various solids and measuring the chemical shift for each situation, it is possible to obtain information about the charge state of an ion and about changes in the electron distribution brought about by changes in bonding. Even if it is chiefly the distribution of electrons in p and d subshells which change, as in covalent or partially covalent bonds, these influence the s subshell electron distribution and the chemical shift.

Mössbauer experiments are also used to study the internal magnetic fields of solids. For this purpose, one of the most widely used nuclei is ${}_{26}\mathrm{Fe}^{57}$. Unstable ${}_{27}\mathrm{Co}^{57}$ nuclei, implanted in the sample, decay by means of electron capture to the first excited state of ${}_{26}\mathrm{Fe}^{57}$ and many of the iron nuclei decay to the ground state by γ emission. The two ${}_{26}\mathrm{Fe}^{57}$ states of interest are separated in energy by 14.4 keV and the width of the excited state is on the order of $10^{-9}$ eV. The nuclear ground state has spin $i_0 = 1/2$ and the first excited state has spin $i_1 = 3/2$. In a magnetic field B a nuclear Zeeman effect occurs, with the result that the ground state splits into 2 levels and the excited state splits into 4 levels. The splitting is proportional to $\boldsymbol{\mu} \cdot \mathbf{B}$, where μ is the magnetic dipole moment of the nucleus. The magnetic dipole moment may be different for the ground and excited states.
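The level counting for the nuclear Zeeman effect can be made concrete with a short enumeration. The Python sketch below lists the 2i + 1 sublevels for each spin and pairs them up. Keeping only the pairs whose magnetic quantum numbers differ by at most one unit is an assumption of the sketch, and the names `sublevels` and `allowed_lines` are illustrative.

```python
from fractions import Fraction


def sublevels(spin):
    """Return the 2i + 1 magnetic quantum numbers -i, -i+1, ..., +i."""
    spin = Fraction(spin)
    two_i = 2 * spin
    if two_i.denominator != 1 or two_i < 0:
        raise ValueError("spin must be a non-negative integer or half-integer")
    return [-spin + k for k in range(int(two_i) + 1)]


ground = sublevels(Fraction(1, 2))    # ground state of Fe-57, spin 1/2
excited = sublevels(Fraction(3, 2))   # first excited state, spin 3/2

# Assumption of this sketch: keep only pairs with |delta m| <= 1.
allowed_lines = [
    (m_excited, m_ground)
    for m_excited in excited
    for m_ground in ground
    if abs(m_excited - m_ground) <= 1
]

print(len(ground), "ground sublevels;", len(excited), "excited sublevels")
print(len(allowed_lines), "of", len(ground) * len(excited), "pairs kept")
```

Of the eight possible pairs, two change the magnetic quantum number by two units and are dropped, leaving six distinct transitions.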
Since $\Delta m_i = \pm 2$ transitions are very slow, γ rays with 6 different values of energy are produced, in different events. For splitting to occur it is necessary that the magnetic field remain constant over periods which are longer than the precession period of the magnetic dipole moment and it is usual to place the absorber in a host for which the internal field fluctuates rapidly. The absorber then has a single narrow Mössbauer absorption peak, which is used to scan the 6 peaks of the emission spectrum. Both the local magnetic field at the site of the nucleus and the ratio of the magnetic dipole moments of the excited and ground states can be calculated from the positions of the Mössbauer peaks. The Mössbauer effect is particularly useful for the study of the magnetic field in ferromagnetic materials. For example, the transition to a paramagnetic state can be investigated. The effect is also used to study the environment of iron atoms in biological materials.

Splitting of nuclear levels also occurs if the nucleus has an electric quadrupole moment and is situated in a spatially varying electric field. Then measurements of the Mössbauer peak separation can be used to obtain information about the electric field gradient at the nucleus. This information, in turn, provides knowledge of the distribution of charge around the nucleus. Mössbauer studies have been used to determine the number of bonds formed by atoms in solids, for example.

One important use of the Mössbauer effect has been to verify the prediction of relativity theory that the frequency of electromagnetic radiation is dependent on the strength of the gravitational field. Suppose the emitter is a distance d above the absorber in a uniform gravitational field. When it is in the ground state, the mass of the nucleus is $m = E_0/c^2$. Compared to the absorber it has an additional potential energy $mgd = E_0 g d/c^2$, where g is the acceleration due to gravity.
Similarly, when it is in the excited state the nucleus has an additional potential energy $E_1 g d/c^2$. The energy difference of the emitter states is now
$E_1\left(1 + \frac{gd}{c^2}\right) - E_0\left(1 + \frac{gd}{c^2}\right) = \Delta E\left(1 + \frac{gd}{c^2}\right)$
where ΔE is the energy difference of the absorber states (or of the emitter states in the absence of a gravitational field). The photon energy is now given by
$h\nu = h\nu_0\left(1 + \frac{gd}{c^2}\right)$
where $h\nu_0$ is the energy of a photon which will cause a transition in the absorber. The photon energy is greater than the energy of the absorption peak and the absorber must move away from the emitter for absorption to occur. If the emitter is below the absorber, the energy of the photon is less and the absorber must move toward the emitter. The experiments were first carried out by Pound and Rebka, around 1960, and excellent agreement with theory was obtained. Mössbauer was awarded a Nobel prize in 1961.

16-7 NUCLEAR REACTIONS

We turn now from nuclear decay to nuclear reactions. One important reason why nuclear reactions are studied is that they provide information about the excited states of nuclei which supplements that provided by the study of nuclear decay. Other important reasons will become apparent when we discuss nuclear fission and fusion in subsequent sections. And, of course, the energy balance in nuclear reactions is studied with real justification because it tells about the masses of the participants in the reactions.

In our treatment in Section 15-4 of the energy balance in nuclear reactions we have already considered the application of the total relativistic energy, linear momentum, and charge conservation laws to the initial and final states of a reaction. By way of summary, we shall list these conservation laws and also others that apply to any reaction, and then use them in an example.

In any nuclear reaction the following quantities must be conserved: (1) total relativistic energy, (2) linear momentum, (3) angular momentum, (4) charge, (5) parity, and (6) the number of nucleons. In all the reactions we discussed before the number of nucleons was conserved, i.e., the total number of nucleons present before the reaction equals the total number present after. It is found that this is true of any nuclear reaction. We did not consider the conservation of angular momentum or parity at all in Section 15-4 because these quantities do not affect the energy balance. But they do affect the rates, or cross sections, for the reactions, as we shall indicate later. It is clear that angular momentum must be conserved in a nuclear reaction. Parity is conserved because the interaction involved in a nuclear reaction is the strong parity conserving nuclear interaction, not the weak parity nonconserving β-decay interaction.

Example 16-9. When 50.0 MeV protons in the external beam of a cyclotron strike a beryllium target, it is found that copious numbers of high-energy neutrons are emitted from the target. The highest energy neutrons are emitted in the same direction as the incident protons, and their energy is 48.1 MeV. In order to increase the number of neutrons produced, so that they can be more easily used in other experiments, it is decided to put the beryllium target inside the cyclotron where it will be bombarded by the much more intense internal beam. In this configuration neutrons produced at 30° to the direction of the bombarding protons will have a clear path out past the external parts of the cyclotron. (a) Use the conservation laws to find the residual nucleus in the reaction in which a proton ${}_{1}\mathrm{H}^{1}$ is the bombarding particle, a neutron ${}_{0}\mathrm{n}^{1}$ is the product particle, and ${}_{4}\mathrm{Be}^{9}$ is the target nucleus. (b) Then apply the conservation laws to predict the maximum energy neutrons produced at 30° to the direction of the 50.0 MeV bombarding protons.

(a) The reaction is
${}_{1}\mathrm{H}^{1} + {}_{4}\mathrm{Be}^{9} \rightarrow {}_{Z}\mathrm{X}^{A} + {}_{0}\mathrm{n}^{1}$
where ${}_{Z}\mathrm{X}^{A}$ represents the unknown residual nucleus. Conservation of charge requires that the sum of the Z values on the left side of the reaction formula equal the sum of the Z values on the right side. That is
$1 + 4 = Z + 0$
or
$Z = 5$
Conservation of the number of nucleons requires that the sum of the A values on the left side equal the sum of the A values on the right side. Therefore
$1 + 9 = A + 1$
or
$A = 9$
Thus we have identified the residual nucleus as ${}_{5}\mathrm{B}^{9}$, and the reaction is
${}_{1}\mathrm{H}^{1} + {}_{4}\mathrm{Be}^{9} \rightarrow {}_{5}\mathrm{B}^{9} + {}_{0}\mathrm{n}^{1}$

(b) To calculate the energies of neutrons emitted at various angles, we use the conservation of total relativistic energy and linear momentum, combined in the form of the Q-value formula of (15-16)
$Q = K_b\left(1 + \frac{m_b}{m_B}\right) - K_a\left(1 - \frac{m_a}{m_B}\right) - \frac{2}{m_B}\,(K_a K_b m_a m_b)^{1/2} \cos\theta$
where $K_a$ and $m_a$ are the kinetic energy and mass of the proton, $K_b$ and $m_b$ are the kinetic energy and mass of the neutron, $m_B$ is the mass of ${}_{5}\mathrm{B}^{9}$, and θ is the angle of emission of the neutron relative to the direction of the proton. Since we are always dealing with the maximum energy neutrons emitted, the Q value always pertains to a situation in which the residual nucleus is in its ground state. First we determine the Q value by setting $K_a = 50.0$, $K_b = 48.1$, and θ = 0, where we use MeV for the unit of energy. Since to a very good approximation $m_a/m_B = m_b/m_B = 1/9$, we have
$Q = 48.1 \times \frac{10}{9} - 50.0 \times \frac{8}{9} - 2\sqrt{50.0 \times 48.1 \times \frac{1}{9} \times \frac{1}{9}} = 53.4 - 44.4 - 10.9 = -1.9$
or
$Q = -1.9$ MeV
Note that Q is just equal to $K_b - K_a$.
But this is only true when $m_a = m_b$, $\theta = 0$, and $|Q|$ is small compared to $K_a$. Knowing the Q value, we find $K_b$ when $\theta = 30°$ by again using (15-16). We have, since cos 30° = 0.866

$$ -1.9 = K_b \times \frac{10}{9} - 50.0 \times \frac{8}{9} - \frac{2}{9}\sqrt{50.0} \times 0.866\,\sqrt{K_b} $$

We write this as

$$ 1.11\left(\sqrt{K_b}\right)^2 - 1.36\sqrt{K_b} - 42.5 = 0 $$

to make it easier to apply the standard solution of a quadratic equation in the unknown $\sqrt{K_b}$. This gives

$$ \sqrt{K_b} = \frac{1.36 \pm \sqrt{(1.36)^2 + 4 \times 1.11 \times 42.5}}{2 \times 1.11} = \frac{1.36 \pm 13.79}{2.22} $$

The equation is not a quadratic in $K_b$, and has only one valid solution. We may easily show that it is obtained for the plus sign, since $\sqrt{K_b}$ must be positive. Using that sign, we find

$$ \sqrt{K_b} = 6.82 \quad \text{or} \quad K_b = 46.5 $$

Thus the maximum neutron energy produced at 30° is

$$ K_b = 46.5\ \text{MeV} $$

The subject of nuclear reactions is a vast one because there are so many different types of reactions. Any stable nuclear particle can be the bombarding particle; any stable nucleus can be the target nucleus; and a wide variety of particles can be emitted from the reaction as product particles. The residual nucleus can be either stable or radioactive. Typically it will be stable if the reaction does not change the Z-to-A ratio of the residual nucleus very much from the stable Z-to-A ratio that the target nucleus has. An example of a reaction that often leads to a stable residual nucleus is (d,α), where the notation means that a deuteron, ₁H², is the bombarding particle and an α particle, ₂He⁴, is the product particle. If the reaction significantly decreases the Z-to-A ratio of the residual nucleus, it is usually radioactive and decays by electron emission to raise its Z-to-A ratio back to a stable value. An example of a reaction that often leads to an electron emitting residual nucleus is (n,p), in which there is a bombarding neutron, ₀n¹, and a product proton, ₁H¹.
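As a numerical check on Example 16-9, the following is a minimal sketch of the Q-value formula (15-16) and of its solution for the product energy, treated as a quadratic in the square root of K_b. The function names are hypothetical, and the mass ratios are taken as 1 : 1 : 9, the same approximation the example uses. Because the sketch carries unrounded intermediate values, its result at 30° can differ from the quoted 46.5 MeV by a few tenths of an MeV.

```python
import math

def q_value(K_a, K_b, theta_deg, m_a, m_b, m_B):
    """Q-value formula (15-16). Energies in MeV; masses in any common unit."""
    cos_t = math.cos(math.radians(theta_deg))
    return (K_b * (1 + m_b / m_B)
            - K_a * (1 - m_a / m_B)
            - (2 / m_B) * math.sqrt(K_a * K_b * m_a * m_b) * cos_t)

def product_energy(Q, K_a, theta_deg, m_a, m_b, m_B):
    """Solve (15-16) for K_b by treating it as a quadratic in x = sqrt(K_b)."""
    cos_t = math.cos(math.radians(theta_deg))
    a = 1 + m_b / m_B                                        # coefficient of x^2
    b = -(2 / m_B) * math.sqrt(K_a * m_a * m_b) * cos_t      # coefficient of x
    c = -K_a * (1 - m_a / m_B) - Q                           # constant term
    disc = b * b - 4 * a * c
    x = (-b + math.sqrt(disc)) / (2 * a)   # plus sign: x = sqrt(K_b) must be positive
    return x * x

# Assumed mass ratios m_a : m_b : m_B = 1 : 1 : 9, as in the example.
m_a, m_b, m_B = 1.0, 1.0, 9.0
Q = q_value(50.0, 48.1, 0.0, m_a, m_b, m_B)           # close to -1.9 MeV
K_b_30 = product_energy(Q, 50.0, 30.0, m_a, m_b, m_B)  # neutron energy at 30 degrees
print(f"Q = {Q:.2f} MeV, K_b(30 deg) = {K_b_30:.1f} MeV")
```

Substituting the computed K_b back into `q_value` at 30° should reproduce the same Q, which is a convenient consistency check on the choice of root.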
Reactions such as (p,n) frequently lead to radioactive residual nuclei which are positron emitters or electron capturers, since the reaction raises the Z-to-A ratio of the residual nucleus over the stable value that this ratio has for the target nucleus. Thus nuclear reactors, which produce intense fluxes of neutrons, are usually employed to produce radioactive nuclei for diagnostic work in medicine, and other fields, as "tracers," if the required nuclei are electron emitters. Cyclotrons, which produce intense fluxes of protons or more highly charged particles, are usually the sources of radioactive tracers that are positron emitters or electron capturers.

We present in this section examples of the most important types of nuclear reactions by discussing the processes that can occur when a 50-MeV proton from a cyclotron beam is incident on a target nucleus, of average characteristics, contained in a foil placed in the beam. We describe what happens during these processes, and not just what the situation is like before and after, as we have done in our earlier considerations of the mass-energy balance in nuclear reactions.

First we shall give a quick summary of the processes that can occur. The proton, of representative energy 50 MeV, will be scattered away from the typical target nucleus by the Coulomb potential, unless it happens to be traveling almost in the direction of the nuclear center. It can also be scattered by the nuclear potential, if it approaches close enough to feel this potential. If it enters the nucleus, it will probably collide with a nucleon in the nucleus after traveling part way through. Either it or the struck nucleon may escape immediately, in a so-called direct interaction, taking away most of the energy it carries (as in the reaction treated in Example 16-9).
But at least one of these nucleons will probably be reflected back into the nucleus by the change in nuclear potential at the surface in much the same way a light wave would be internally reflected by a change in refractive index. (See the discussion connected with (6-53).) This nucleon will collide with another nucleon, each of them will make further collisions, etc., forming a cascade of collisions. Before long, the energy is shared among the excitation of many nucleons in what is called the compound nucleus. At this point, no nucleon has enough excitation to allow it to escape its 8 MeV binding to the nuclear potential. After some time, a fluctuation in the energy sharing will make energetically possible the escape of a nucleon. This will happen, if internal reflection at the nuclear surface does not make it necessary to wait for another fluctuation. Eventually, several nucleons are "evaporated," and their binding energies are largely responsible for removing most of the excitation energy of the compound nucleus. They will almost always be neutrons, since the Coulomb barrier acts to retain the protons. When the excitation energy is below the neutron binding energy, the relatively slow process of γ decay takes over and allows the system to finally end up in its ground state.

We begin a more detailed discussion of these processes by pointing out that the de Broglie wavelength of a 50 MeV proton moving through a 50 MeV deep nuclear potential is 3 F, and the range of nuclear forces is a little smaller. Since both are about one-third of a typical nuclear diameter, in a crude first approximation we may think of the proton as traveling a fairly well-defined trajectory, and not interacting at a distance. Thus the behavior of the proton is something like that of a classical billiard ball. To an even lesser extent, this approximation also applies to the nucleons that the proton collides with.
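The 3 F wavelength quoted above can be checked with a short sketch. It assumes that a 50 MeV proton inside a 50 MeV deep potential has a kinetic energy of about 100 MeV, and it uses the relativistic momentum-energy relation. The function name and the value of hc in MeV-F are assumptions introduced for illustration; the proton rest energy is the 938.3 MeV of the constants table.

```python
import math

HC_MEV_F = 1239.84    # h*c in MeV-F (assumed value, equivalent to 1239.84 eV-nm)
M_P_C2 = 938.3        # proton rest energy in MeV

def de_broglie_wavelength_F(kinetic_MeV, rest_MeV=M_P_C2):
    """de Broglie wavelength in fermis from (pc)^2 = K^2 + 2*K*m*c^2."""
    pc = math.sqrt(kinetic_MeV ** 2 + 2.0 * kinetic_MeV * rest_MeV)
    return HC_MEV_F / pc

# Assumption: kinetic energy inside the well = incident energy + well depth.
lam_inside = de_broglie_wavelength_F(50.0 + 50.0)
print(f"wavelength inside the nucleus is about {lam_inside:.1f} F")
```

With a nuclear diameter of roughly 10 F, a wavelength near 3 F is indeed about one-third of the diameter, which is the basis of the billiard ball picture.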
Of course, the wavelike aspects of these particles will make important corrections to the approximation.

Since Coulomb scattering has been discussed at length in Chapter 4 and Appendix E, there is little we need to say about it here, except to comment that the differential scattering cross section $d\sigma/d\Omega$ of (4-9), obtained from Rutherford's classical theory of the scattering by a Coulomb potential, is identical with the $d\sigma/d\Omega$ obtained from quantum mechanics for that potential. This remarkable situation is true only for a potential corresponding to an inverse square law of force, and it arises in the following way. From dimensional analysis it can be shown that if the force exerted on a particle varies according to $r^n$, then the probability of scattering must vary according to $h^{4+2n}$. For the inverse square law $n = -2$, the scattering probability is independent of the value of Planck's constant $h$, and this requires that the quantum mechanical and classical calculations lead to the same results.

Figure 16-26 shows the probability of elastic scattering (scattering without energy loss except to the recoil of the residual nucleus), as a function of scattering angle $\theta$, for a 50 MeV proton incident on a typical nucleus. At small scattering angles, the differential cross section follows the rapid but smooth decrease in proportion to $1/\sin^4(\theta/2)$ of Coulomb, or Rutherford, scattering. The reason is that these angles correspond to collisions in which the proton passes through the Coulomb potential, but misses the nuclear potential. At large scattering angles, the scattering probability shows a diffractionlike structure superimposed on a continued decreasing trend. The reason is that protons scattering at these angles make close enough collisions to feel the abrupt onset of the nuclear potential.
The diffraction structure of this so-called nuclear potential scattering arises from the interferences between the incident wave function and the various parts of the wave function reflected from various regions of the nuclear potential.

[Figure 16-26: plot of the differential cross section on a logarithmic scale against scattering angle θ from 0° to 180°; the small-angle region is labeled "Coulomb scattering" and the large-angle region "Nuclear potential scattering".]

Figure 16-26 The differential cross section for the elastic scattering of 50 MeV protons from a hypothetical nucleus of typical properties. The cross section unit is the barn; 1 bn = 10⁻²⁴ cm².

A quantum mechanical analysis of the elastic scattering measurements can be used to determine the nuclear potential acting on the high-energy scattered nucleon. The potential is found to be essentially the same as the shell model potential acting on a nucleon in the ground state of the target nucleus, with one important exception. The potential acting on an unbound nucleon, called the optical model potential, is partly absorptive. The absorption represents the fact that such a nucleon has enough energy to collide with a nucleon in the nucleus, and thus be absorbed from the incident beam. (It is absorbed in the sense that it no longer has the same energy, or de Broglie wavelength, so there can be no interferences between its wave function and the wave function for the incident nucleon.) Collisions are possible since the exclusion principle does not have its usual inhibiting effect if the incident nucleon brings in enough energy that both it, and the struck nucleon, can easily find unfilled states to occupy. The incident nucleon can, of course, also scatter from the more familiar nonabsorptive part of the potential. (That is, it can also interact with the nucleus as a whole, represented by the usual attractive potential, without colliding with an individual nucleon of the nucleus.)
The optical model is essentially a generalization of the shell model which applies to nucleons of any energy, not just to nucleons of energy such that they are bound in a nucleus.

If the scattering probability is measured as a function of the energy of the incident particle, very broad maxima are sometimes seen at certain energies. These are called size resonances, or single particle states. As the two names imply, they can be thought of in two different ways: (1) constructive interferences between the part of the incident particle wave function scattered from the front surface of the nuclear potential and the part scattered from the back; (2) energy levels of the incident particle in the nuclear potential. The first point of view is related to one developed in our discussion of the Ramsauer effect in Section 6-5, but here we shall find the second point of view more useful. The maxima are broad because the single particle states are very wide. If we evaluate the time required for a 50 MeV nucleon to travel a typical nuclear diameter, we find $T = D/v \sim 10^{-14}\ \text{m}/10^{8}\ \text{m-sec}^{-1} = 10^{-22}$ sec. Since this time also characterizes the duration of the nuclear potential scattering process, or the lifetime of the particle in the single particle state, the width $\Gamma$ of the state is, typically, $\Gamma = \hbar/T \sim 10^{-15}\ \text{eV-sec}/10^{-22}\ \text{sec} = 10^{7}\ \text{eV} = 10$ MeV. Note that the width of a typical high-energy single particle state is some 12 orders of magnitude greater than the width of a typical low-energy γ-decaying state considered at the end of Section 16-5.

[Figure 16-27: spectrum of emitted protons plotted against the energy of emitted protons (MeV), from 0 to 50, with the elastic group and the highest energy inelastic group marked.]

Figure 16-27 The energy spectrum of protons emitted at a forward angle when 50 MeV protons are incident in the bombardment of a hypothetical nucleus of typical properties. The low-lying energy levels of the residual nucleus show up in the high-energy inelastic groups.
As these levels fuse into a continuum, so does the inelastic spectrum. The cutoff in the spectrum at about 10 MeV represents the effects of internal reflection and of the Coulomb barrier in preventing the escape of protons. Now we reconsider the collisions between the incident proton and nucleons of the nucleus. Before colliding, the linear momentum of the proton is approximately in the direction of the beam, and it is of much larger magnitude than that of any nucleon in the nucleus. Linear momentum conservation thus demands that after the first collision both the nucleons tend to move off in the general direction of the beam, and this is particularly so of a nucleon if it happens to be carrying most of the incident momentum or energy. A higher energy nucleon is the one most likely to escape internal reflection at the nuclear surface, and be emitted in what is called a direct interaction. It will preserve its tendency to move in the general direction of the incident beam, even though it is refracted somewhat in passing through the surface. Figure 16-27 shows the spectrum of high-energy protons emitted, at some fixed angle, from a typical nucleus. The group of highest energy contains the elastically scattered protons. They have the same energy as the incident protons (except for the small amount of energy lost to the recoil of the residual nucleus), and they are the result of Coulomb and nuclear potential scattering. The group of next highest energy contains inelastically scattered protons, which come from direct interactions. When a proton is emitted in this group, the residual nucleus remaining is in its first excited state. When a proton is emitted in the group of next lowest energy, that nucleus is in its second excited state, etc. Thus the energy spectrum gives immediately the locations of the excited states of the nucleus. 
Figure 16-28 The differential cross section $d\sigma/d\Omega$ for the highest energy group in the inelastic scattering of 50 MeV protons from a hypothetical nucleus of typical properties. The general preference for forward angles of emission is characteristic of the direct interaction process, but $d\sigma/d\Omega$ is suppressed at very small angles if orbital angular momentum is transferred to the nucleus in the reaction. The figure represents $d\sigma/d\Omega$ for a reaction in which the state excited has orbital angular momentum one unit higher than the ground state.

Figure 16-29 Illustrating the relation between the linear and orbital angular momenta transferred to a nucleus in a direct interaction inelastic scattering leading to its first excited state. The linear momentum of the incident nucleon is $\mathbf{p}_i$. It leaves the nucleus at angle $\theta$ with linear momentum $\mathbf{p}_f$. Since it is emitted with almost as much energy as it had when incident, $p_f \simeq p_i \simeq p$, and the momentum $\Delta\mathbf{p} = \mathbf{p}_i - \mathbf{p}_f$ is transferred to the nucleus primarily because the direction of $\mathbf{p}_f$ differs from the direction of $\mathbf{p}_i$. The figure shows the interaction occurring near the edge of the nucleus of radius $r'$, where it will be most effective in transferring angular momentum $\Delta\mathbf{L}$ to the nucleus. Since $\Delta\mathbf{L} = \mathbf{r}' \times \Delta\mathbf{p}$, we have $\Delta L = r'\,\Delta p \sin\alpha \simeq r'\,\Delta p\,\alpha$, because the angle $\alpha = \theta/2$ defined in the figure tends to be small in a direct interaction. The figure shows that $\Delta p \simeq 2p_i\alpha \simeq 2p\alpha$. So $\Delta L \simeq 2r'p\alpha^2$. For a case in which one unit of orbital angular momentum is given to the nucleus, we have $\Delta L = \sqrt{1(1+1)}\,\hbar = 1.4\hbar$. Thus we obtain $\alpha^2 \simeq 1.4\hbar/2r'p = 1.4\hbar/2r'(h/\lambda) = 1.4/4\pi(r'/\lambda)$, where $\lambda$ is the de Broglie wavelength of the proton. As indicated in the text, $r'/\lambda \simeq 5/3$ for a 50 MeV proton moving through the 50 MeV deep potential of a nucleus of typical radius $r' = 5$ F. So $\alpha^2 \simeq 1.4/4\pi(5/3) \simeq 6 \times 10^{-2}$, or $\alpha \simeq 2.5 \times 10^{-1}$ rad $\simeq 15°$. Thus the emission angle $\theta$, that this semiclassical calculation predicts would lead to a transfer of one unit of orbital angular momentum, is $\theta = 2\alpha \simeq 30°$. For angles much smaller than this the reaction would not be possible. If an even larger orbital angular momentum must be transferred to the nucleus, because of the difference between the spins of its ground and first excited states, an even larger angle of emission is required.

The general tendency for small angles of emission of the higher energy nucleons coming from direct interactions is shown in Figure 16-28. This represents the differential cross section $d\sigma/d\Omega$ for the protons emitted in the highest energy inelastically scattered group, for the typical case of the previous figure. Also indicated in the figure is the tendency for $d\sigma/d\Omega$ to be suppressed at very small angles, if orbital angular momentum must be transferred to the nucleus from the incident proton in the reaction because the state excited has orbital angular momentum different from that of the ground state. The semiclassical argument of Figure 16-29 shows that this tendency reflects the fact that it is difficult for a particle, which experiences only a very small decrease in the magnitude of its linear momentum in interacting with a target of restricted radius, to transfer orbital angular momentum to the target unless it changes its direction of motion enough to produce a sufficient change in the vector describing its linear momentum. Of course the billiard ball arguments, which predict the general trends, fail to predict the oscillations about them seen in Figure 16-28. These arise from interferences between parts of the emitted nucleon wave function that originate in different regions of the nucleus.
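The semiclassical estimate of Figure 16-29 can be sketched numerically as follows. The function name is hypothetical, and the sketch keeps the small-angle approximation of the caption (sin α replaced by α), so it becomes less trustworthy as the transferred angular momentum grows. The ratio r'/λ = 5/3 is the value quoted in the caption.

```python
import math

def emission_angle_deg(l_transfer, r_over_lambda):
    """Emission angle theta = 2*alpha from alpha^2 = sqrt(l(l+1)) / (4*pi*(r'/lambda)).

    Small-angle approximation, as in the semiclassical argument of Figure 16-29.
    """
    delta_L = math.sqrt(l_transfer * (l_transfer + 1))   # in units of hbar
    alpha = math.sqrt(delta_L / (4.0 * math.pi * r_over_lambda))
    return math.degrees(2.0 * alpha)

theta_1 = emission_angle_deg(1, 5.0 / 3.0)   # one unit transferred, roughly 30 degrees
theta_2 = emission_angle_deg(2, 5.0 / 3.0)   # two units require a larger angle
print(f"l = 1: {theta_1:.0f} deg, l = 2: {theta_2:.0f} deg")
```

The second call illustrates the closing remark of the caption: a larger angular momentum transfer pushes the favored emission angle outward.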
The structure of the differential cross section curve can be analyzed to yield information about the nuclear spin and parity of the state of the residual nucleus that is excited in the emission of the inelastically scattered group. The procedures used in the analysis are a little too complicated to go into here, but it should be said that they also confirm that parity is conserved in the nuclear interaction.

Although an incident proton has about a 90% chance of making a collision with a nucleon in traversing the nucleus, in only about 10% of these events will there be a direct interaction nucleon emitted. Usually, both the incident proton and the nucleon it hits are trapped in the nucleus by internal reflection. In about 1% of the events, both the incident proton and the struck nucleon escape. If their linear momenta are measured, valuable information can be obtained about the initial momentum of the struck nucleon when it was in the nucleus (after correcting for refraction and absorption as the protons leave the nuclear optical potential). This has become an important research technique.

The time required for the first collision is $\sim 10^{-22}$ sec, since this is how long it takes for a nucleon of typical velocity to travel a distance equal to a typical nuclear diameter. The subsequent steps in the cascade of collisions occur at intervals of roughly the same time. In the first two or three steps, there is a chance that one of the nucleons that has collided will escape, but the chance diminishes rapidly because the collisions lead to a sharing of energy. Internal reflection in the nuclear potential becomes more likely as the energies of the individual nucleons decrease, and soon an even stronger inhibition sets in because the excitation energies of the nucleons become less than their binding energies. After perhaps 10 steps of the cascade, which takes $\sim 10^{-21}$ sec, the energy is well distributed over all the nucleons of the nucleus. None of these nucleons has enough energy to escape; instead they exchange energy in a kind of thermal equilibrium. This equilibrium system is called the compound nucleus.

Because the equilibrium system does not contain a very large number of particles ($A \sim 100$), big fluctuations in the energy sharing can occasionally happen. If some nucleon accumulates about ten times as much excitation energy as it has on the average it will have the equivalent of its binding energy, and it can try to escape. Typically, this takes about $10^{-16}$ sec, and typically the nucleon will not succeed because it is internally reflected. But eventually a nucleon will escape, carrying away a little more than its binding energy. The elapsed time at this point is something like $10^{-15}$ sec, on the average. After several nucleons have escaped, there is no longer enough excitation energy in the nucleus to provide the $\sim 8$ MeV required to emit another nucleon. As we have mentioned, γ decay is used to dissipate the final few MeV of excitation energy, and as we have also mentioned, almost all of the nucleons that are evaporated in fluctuations from equilibrium are neutrons. Protons generally cannot accumulate enough energy to overcome the Coulomb barrier acting on them.

In a compound nucleus the excitation is distributed over many particles. The excited states of the nucleus are consequently called many particle states. In contrast to the very broad single particle states, the many particle states are fairly narrow. Since it takes the compound nucleus $T \simeq 10^{-15}$ sec to decay by neutron emission, the width $\Gamma$ of a typical one of its states is given in terms of this lifetime by

$$ \Gamma = \frac{\hbar}{T} \sim \frac{10^{-15}\ \text{eV-sec}}{10^{-15}\ \text{sec}} = 1\ \text{eV} $$

These narrow states can be observed by measuring as a function of the nucleon energy the probability, or total cross section defined in (2-18), that an incident nucleon will form a compound nucleus. As the separation between the many particle states rapidly decreases, and their width increases, with increasing excitation energy, it is easiest to see them if an incident nucleon of the lowest possible energy is used. Figure 16-30 is an example of the many particle states, or compound nucleus resonances, observed when very low-energy neutrons are incident on a typical nucleus.

[Figure 16-30: total reaction cross section plotted against bombarding neutron energy (eV), from 0 to 200, showing a series of narrow resonance peaks.]

Figure 16-30 The total cross section for an incident neutron of very low energy to undergo any reaction other than elastic scattering with a hypothetical nucleus of typical properties. The many particle states of the compound nucleus of excitation energy about 8 MeV (the binding energy brought in by the incident neutron) are seen directly in such data.

The shape of any individual cross-section resonance in Figure 16-30 is given by the Breit-Wigner formula

$$ \sigma_r(E) = \pi\left(\frac{\lambda}{2\pi}\right)^2 \frac{\Gamma_n \Gamma_r}{(E - E_i)^2 + \Gamma^2/4} \qquad (16\text{-}32) $$

where the total reaction cross section $\sigma_r(E)$ is the cross section for the formation of a compound nucleus which decays by any process other than emission of a neutron of the same energy as the incident one; $E$ is the energy of that neutron and $\lambda$ is the corresponding de Broglie wavelength; $E_i$ is the resonance energy; $\Gamma$ is the full width at half-maximum of the resonance; and $\Gamma_n$, or $\Gamma_r$, is $\Gamma$ times the ratio of the probability of decay of the compound nucleus by emitting a neutron of the same energy as the incident neutron, or by any other process, to the total probability of decay by all processes. The same formula, with $\Gamma_r$ replaced by $\Gamma_n$, gives the total cross section for the formation of a compound nucleus which subsequently decays by emitting a neutron of the same energy as the incident neutron, i.e., the compound nucleus elastic scattering cross section $\sigma(E)$. A similar formula describes the shape of the γ-ray resonances in Figure 16-22 and Figure 16-25.
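A minimal sketch of (16-32) is given here to show the size of the factor π(λ/2π)² for a slow neutron. The resonance energy and partial widths are hypothetical values chosen only for illustration, the function names are assumptions, and the total width is taken as the sum of the two partial widths, which follows from the way they are defined above. The nuclear radius of 5 F is the typical value used in Figure 16-29.

```python
import math

H_PLANCK = 6.626e-34       # joule-sec
M_NEUTRON = 1.675e-27      # kg
JOULE_PER_EV = 1.602e-19
BARN = 1.0e-28             # m^2

def neutron_wavelength(energy_eV):
    """Nonrelativistic de Broglie wavelength (m) of a neutron of kinetic energy energy_eV."""
    return H_PLANCK / math.sqrt(2.0 * M_NEUTRON * energy_eV * JOULE_PER_EV)

def reaction_cross_section(energy_eV, e_res_eV, gamma_n_eV, gamma_r_eV):
    """Breit-Wigner form (16-32), in barns, with Gamma = Gamma_n + Gamma_r."""
    gamma = gamma_n_eV + gamma_r_eV
    lam_bar = neutron_wavelength(energy_eV) / (2.0 * math.pi)
    shape = gamma_n_eV * gamma_r_eV / ((energy_eV - e_res_eV) ** 2 + gamma ** 2 / 4.0)
    return math.pi * lam_bar ** 2 * shape / BARN

# Hypothetical resonance at 1 eV with equal partial widths (illustration only).
E_RES, GAMMA_N, GAMMA_R = 1.0, 0.05, 0.05
peak = reaction_cross_section(E_RES, E_RES, GAMMA_N, GAMMA_R)
off_peak = reaction_cross_section(1.5, E_RES, GAMMA_N, GAMMA_R)
geometric = math.pi * (5.0e-15) ** 2 / BARN    # pi * r'^2 for r' = 5 F
print(f"peak = {peak:.3g} bn, off peak = {off_peak:.3g} bn, geometric = {geometric:.3g} bn")
```

With equal partial widths the Lorentzian factor equals one at the peak, so the peak value is just π(λ/2π)², which for a 1 eV neutron exceeds the geometrical cross section by several orders of magnitude.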
In fact, the same basic form is found for the resonance curve in any type of damped wave or oscillatory motion. The student may have seen a derivation of it in the case of a damped pendulum or a resistive resonant circuit.

A very interesting feature of (16-32) that is particular to the case of low-energy neutron resonances is the factor $\pi(\lambda/2\pi)^2$, which determines the maximum possible value of the total neutron cross sections at the peak of a resonance. It is the area of a circle of radius equal to the neutron de Broglie wavelength $\lambda$ divided by $2\pi$, and not the area of a circle of nuclear radius $r'$. Since $\lambda \gg r'$ for sufficiently low-energy neutrons, the total reaction, or scattering, cross section at a resonance peak can be very much larger than the projected geometrical cross section, $\pi r'^2$, of the nucleus. This is possible because the low-energy neutron acts like a wave, not a classical particle, and at resonance it can interact with the target nucleus whenever the expectation value of its position passes within a distance of about $\lambda/2\pi$ of the nucleus. Later we shall see that this property is very important in the operation of a nuclear reactor.

Another characteristic of a compound nucleus is that in its relatively long lifetime it forgets the details of how it was formed. For instance, since the original linear momentum of the incident particle becomes distributed over the many particles that are excited in the compound nucleus, there cannot be a preference for the neutrons to be emitted in the beam direction. Figure 16-31 shows an example of the isotropic differential cross section for emission that characterizes the low-energy neutrons produced in nuclear reactions. These are the neutrons evaporated from compound nuclei.

[Figure 16-31: differential cross section plotted against emission angle θ from 0° to 180°; the curve is flat.]

Figure 16-31 The differential cross section for the compound nucleus evaporation of low-energy neutrons following the 50 MeV bombardment of a hypothetical nucleus of typical properties. The lack of a preferred direction of emission is characteristic of the compound nucleus process.

Example 16-10. The measured differential cross section for the emission at 40° of the highest energy inelastically scattered proton group from ₂₆Fe⁵⁴ bombarded by 60 MeV protons is $d\sigma/d\Omega = 1.3 \times 10^{-3}$ bn per unit solid angle. These inelastic protons leave the ₂₆Fe⁵⁴ residual nucleus in its first excited state at 1.42 MeV. Calculate how many events per second are recorded in a measurement of the inelastically scattered protons made with a detector of area $10^{-5}$ m² located $10^{-1}$ m from a pure ₂₆Fe⁵⁴ foil, of mass per unit area $10^{-1}$ kg/m², which is bombarded by a $10^{-7}$ amp proton beam. (In nuclear physics, the unit of area for cross sections is called the barn, written bn; 1 bn = $10^{-28}$ m².)

The number $n$ of nuclei, or atoms, contained in a unit area of the target is the mass per unit area of the target divided by the mass of a ₂₆Fe⁵⁴ atom. Since this is almost exactly 54 times the mass of a ₁H¹ atom, we have

$$ n = \frac{10^{-1}\ \text{kg/m}^2}{54 \times 1.66 \times 10^{-27}\ \text{kg/nucleus}} = 1.1 \times 10^{24}\ \text{nuclei/m}^2 $$

The solid angle $d\Omega$ subtended by the detector at the target is its area divided by the square of its distance from the target. So

$$ d\Omega = \frac{10^{-5}\ \text{m}^2}{(10^{-1}\ \text{m})^2} = 10^{-3}\ \text{sr} $$

(A unit solid angle is called a steradian, written sr; 1 sr = solid angle subtended by 1 m² at 1 m.) The product of the differential cross section $d\sigma/d\Omega$ for the events of interest times the solid angle $d\Omega$ subtended by the detector gives an area per nucleus that is effective in leading to the detected events. This effective area per nucleus $d\sigma$ is

$$ d\sigma = 1.3 \times 10^{-3}\ \frac{\text{bn}}{\text{sr-nucleus}} \times 10^{-3}\ \text{sr} = 1.3 \times 10^{-6}\ \text{bn/nucleus} = 1.3 \times 10^{-34}\ \text{m}^2/\text{nucleus} $$

The product of the effective area per nucleus, $d\sigma$, times the number of nuclei per unit area, $n$, equals the probability that one incident proton will produce a detected event.
This probability $P$ is

$$ P = d\sigma\, n = 1.3 \times 10^{-34}\ \text{m}^2/\text{nucleus} \times 1.1 \times 10^{24}\ \text{nuclei/m}^2 = 1.4 \times 10^{-10} $$

That is

$$ P = 1.4 \times 10^{-10}\ \text{event/proton} $$

The number of protons per second $I$ in the incident beam is the charge per second in the beam divided by the charge per proton, or

$$ I = \frac{10^{-7}\ \text{coul/sec}}{1.6 \times 10^{-19}\ \text{coul/proton}} = 6.2 \times 10^{11}\ \text{proton/sec} $$

Multiplying the number of protons per second $I$ by the probability $P$ that a proton will produce a detected event, we obtain the number of events detected per second. This is

$$ dN = IP = 6.2 \times 10^{11}\ \text{proton/sec} \times 1.4 \times 10^{-10}\ \text{event/proton} = 87\ \text{event/sec} $$

Note that the preceding equation can be written as

$$ dN = IP = I\, d\sigma\, n = \frac{d\sigma}{d\Omega}\, I n\, d\Omega $$

in agreement with (4-8), the definition of a differential cross section.

16-8 EXCITED STATES OF NUCLEI

[Figure 16-32: energy-level diagram with vertical axis E, showing the ground state, the first excited state, a region of γ emission up to about 8 MeV, a region of nucleon emission above it, and a continuum of unbound states at the top.]

Figure 16-32 An over-all view of the excited states of a typical nucleus.

[Figure 16-33, level labels appearing at this point in the source: (3/2, even), (3/2, odd), (7/2, odd) for ₈O¹⁷.]

Figure 16-32 reviews information about the excited states of nuclei obtained from the study of nuclear decays and nuclear reactions. The energy-level diagram represents energy states of the entire nucleus, and not of individual nucleons in the nucleus. Up to an excitation of $\sim 8$ MeV, the states γ decay to the ground state. Above $\sim 8$ MeV, nucleon emission becomes energetically possible, and this process soon becomes the dominant decay mode since it has a much shorter lifetime or much higher transition rate. This is the region of the many particle states. They are very closely spaced because there are a large number of different divisions of energy between the many particles of the nucleus that lead to almost the same total nuclear excitation energy. The spacing decreases with increasing A because more divisions are possible.
It also decreases as more excitation energy becomes available to divide among the particles. Thus the many particle states soon fuse together into a continuum of allowed nuclear energy states, but the continuum maintains some structure since the many particle states tend to group together into the very wide single particle states through which they have been excited. Each many particle state in a group has the same angular momentum and parity as the original single particle state.

Now let us look more carefully at the low-lying excited states. The simplest case is for a nucleus whose ground state consists of a core of filled magic number subshells, plus one nucleon. In the first excited state, the extra nucleon jumps to the next highest energy subshell, and the core remains undisturbed. Figure 16-33 shows, as an example, the low-lying excited states of ₈O¹⁷. The spin and parity of the first excited state agree with the predictions of Figure 15-18 of the shell model, but its energy is not predicted by the model. If the ground state of a nucleus consists of a core of filled magic number subshells, plus one hole, its first excited state is the shell model state of the hole. But in both these cases, usually even the second excited state has unpredicted spin and parity.

Between magic numbers, the first few excited states of nuclei often show regularities expected from the collective model. An example is the even-even nucleus ₉₂U²³⁸ illustrated in Figure 16-34. On the right are the observed energy levels, and on the left are the predictions of the quantum mechanical formula

$$ E = \frac{i(i+1)\hbar^2}{2\mathscr{I}} \qquad i = 0, 2, 4, 6, \ldots \qquad (16\text{-}33) $$

for the allowed values of total energy $E$ of rotation of a symmetrical rotator, such as an ellipsoid rotating with rotational inertia, or moment of inertia, $\mathscr{I}$, about an axis perpendicular to its symmetry axis.
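To give a feeling for the energy scale that (16-33) implies, here is a minimal sketch that evaluates the formula for even values of i. It uses the rotational inertia of 2940 u-F² quoted in the caption of Figure 16-34 for ₉₂U²³⁸. The function name is hypothetical, and the value of ħc in MeV-F is an assumption obtained from the constants table (ħ in eV-sec times c).

```python
HBAR_C = 197.33     # hbar*c in MeV-F (assumed, from hbar = 0.6582e-15 eV-sec and c)
U_C2 = 931.5        # atomic mass unit rest energy in MeV

def rotational_energy_MeV(i, inertia_u_F2):
    """Rotational energy i(i+1)*hbar^2 / (2*inertia), with the inertia given in u-F^2."""
    return i * (i + 1) * HBAR_C ** 2 / (2.0 * inertia_u_F2 * U_C2)

# Even values of i only, as required for a symmetrical rotator.
levels = {i: rotational_energy_MeV(i, 2940.0) for i in range(0, 14, 2)}
for i, energy in levels.items():
    print(f"i = {i:2d}   E = {energy:.3f} MeV")
```

The spacing pattern is fixed by the factor i(i+1) alone, so the ratio of the i = 4 to the i = 2 energy is 20/6 regardless of the value chosen for the rotational inertia.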
Equation (16-33) is the same as (12-1) that we derived while treating the rotational spectra of diatomic molecules, except that (1) the quantum number we must use here is $i$, instead of $r$; (2) we therefore avoid confusion by using the symbol $\mathscr{I}$, instead of $I$, for the rotational inertia; and (3) since we deal with a symmetrical rotator, only even values of the rotational quantum number $i$ will arise. The reason for the last statement is that the rotational eigenfunction for the system has the parity of $(-1)^i$, and thus will be odd if $i$ is odd, and even if $i$ is even. It can be shown to follow from the symmetry of the rotator that it can have no angular momentum in the direction of its symmetry axis, and that all of its states must have the same parity. Since an even-Z even-N nucleus has an even parity ground state, we therefore see that its excited states must also have even parity. Thus the odd values of $i$ must be deleted in (16-33).

[Figure 16-33: energy-level diagram for ₈O¹⁷; level labels appearing at this point in the source include (5/2, even), (1/2, even), and (1/2, odd).]

Figure 16-33 The low-lying excited states of ₈O¹⁷. Excitation energies, spins, and parities are shown. The spin and parity of the first excited state are correctly predicted by the shell model as are, of course, the spin and parity of the ground state (see Figure 15-18). The energy of the first excited state is not predicted by the model, nor are any of the characteristics of the higher excited states.

[Figure 16-34: two energy-level diagrams side by side on a vertical scale from 0 to 1.0, one labeled "Symmetrical rotor" and one labeled ₉₂U²³⁸, each with levels (0, even), (2, even), (4, even), (6, even), (8, even), (10, even), and (12, even).]

Figure 16-34 The low-lying excited states of ₉₂U²³⁸. Right: The data. Left: The predictions for the rotational states of a symmetrical ellipsoid of rotational inertia $\mathscr{I}$. The value of $\mathscr{I}$ was chosen to give the best fit to the experimental energies, the value being 2940 u-F². The average discrepancy in the fit is only 0.0204 MeV, which indicates the success of the model. Most of this discrepancy is in the form of very small downward displacements of the higher rotational states from the predicted values. It can be understood as a small increase of $\mathscr{I}$ in these states due to centrifugal effects.

Inspection of the excellent agreement between (16-33) and the low-lying states of ₉₂U²³⁸, shown in Figure 16-34, makes it clear that collective effects in that nucleus deform it into an ellipsoidal shape. In particular, the evidence is that it has essentially the same shape in all of these states, including the ground state, because the predictions of (16-33) are obtained by using a constant value of the nuclear rotational inertia $\mathscr{I}$. Of course, we already know, from the discussion of the collective model and nuclear electric quadrupole moments in Section 15-10, that even-N, odd-Z or odd-N, even-Z nuclei, with N and Z between the magic numbers, are usually ellipsoidal in shape. The tendency for an ellipsoidal shape is particularly strong for such nuclei in the region of the rare earth elements (the lanthanides), and it is fairly strong for nuclei in the region of uranium and the elements just above it in the periodic table (the actinides), since in these regions both N and Z are far from magic numbers. What is new here is the evidence for the ellipsoidal shape of the even-N, even-Z nucleus ₉₂U²³⁸. Recall that in Section 15-2 we concluded that if a nucleus has zero nuclear spin in its ground state, as is the case for ₉₂U²³⁸ and all other even-N, even-Z nuclei, then it would not be possible to observe an ellipsoidal shape in its ground state, even if it actually has such a shape, in averaged measurements like the hyperfine splitting determinations of the electric quadrupole moment.
The measurements on nuclear decay and nuclear reactions that lead to the $_{92}\mathrm{U}^{238}$ energy levels of Figure 16-34 are sensitive to the actual shape of the nucleus, not just to the average over all possible orientations of the shape, as is true of the hyperfine splitting measurements on zero spin nuclei. These more sensitive measurements show that the nucleus is ellipsoidal. Similar measurements show that this is generally true of all nuclei, no matter whether N and Z are even or odd. The only exceptions are nuclei with N and Z at or very near the magic numbers, where collective effects are insignificant. Such nuclei are truly spherical.

Since the deformation of nuclear shapes from spherical to ellipsoidal is a consequence of collective effects, nuclei where these effects are strong because both N and Z are far from magic numbers have, in their low-lying energy states, relatively large and essentially rigid deformations, like $_{92}\mathrm{U}^{238}$. These states consist of the various rotations allowed by quantum mechanics. Nuclei in which N and/or Z are not very far from magic numbers have deformations that are not very large, and that are not rigid. The low-lying states of such a nucleus involve vibrations of its shape back and forth between an ellipsoid elongated in the direction of its symmetry axis and an ellipsoid shortened in that direction. The motion is further complicated by the fact that the nucleus can also rotate. Nevertheless, the first few energy levels of nuclei of this type are rather evenly spaced, like the energy levels of a simple harmonic oscillator. An example is found in the low-lying excited states of $_{78}\mathrm{Pt}^{192}$, shown in Figure 16-35.

Figure 16-35 The low-lying excited states of $_{78}\mathrm{Pt}^{192}$. For these states the nuclear shape is both vibrating and rotating.
Note that the lowest collective states of ellipsoidal nuclei, whether rotational, vibrational, or a combination of both, have much smaller excitation energies than the lowest shell model states of spherical nuclei. This can be seen by comparing Figures 16-34 and 16-35 with Figure 16-33.

Another regularity of low-lying excited states is found in comparing these states in certain pairs of nuclei whose shell model descriptions are identical, except that the neutrons and protons are interchanged. An example of such a so-called mirror pair of nuclei is $_{1}\mathrm{H}^{3}$ and $_{2}\mathrm{He}^{3}$, whose ground state shell model descriptions were shown in Figure 16-14. Another example is $_{3}\mathrm{Li}^{7}$ and $_{4}\mathrm{Be}^{7}$. In general, two nuclei form a mirror pair if they contain the same number of nucleons, and if the number of protons in one equals the number of neutrons in the other. We have found that mirror pairs play an important role in allowing the experimental determination of the β-decay coupling constant. The reason is that, since the charge independent nuclear forces do not distinguish between neutrons and protons, their ground state eigenfunctions are identical, except for the effect of the small difference in the relatively weak Coulomb forces in these very low-Z nuclei. For the same reason, their ground state eigenvalues are almost identical. That is, their ground state energies, or masses, are very nearly the same. Furthermore, the eigenfunctions and eigenvalues of the low-lying excited states of a mirror pair should be essentially the same if nuclear forces are charge independent. Thus there should be a close correspondence between the spins, parities, and energies of these states in the two members of a mirror pair. This is found to be the case. An example is shown in Figure 16-36, which presents the low-lying excited states of $_{3}\mathrm{Li}^{7}$ and $_{4}\mathrm{Be}^{7}$.
More complicated relations are found between the lower excited states of mirror triads, such as $_{5}\mathrm{B}^{12}$, $_{6}\mathrm{C}^{12}$, $_{7}\mathrm{N}^{12}$, and of even larger sets of isobars (nuclei with common values of A). These relations will be discussed briefly in the following chapter in the section titled Isospin.

Figure 16-36 The low-lying excited states of the mirror pair $_{3}\mathrm{Li}^{7}$ and $_{4}\mathrm{Be}^{7}$. The ground state energy of $_{4}\mathrm{Be}^{7}$ is actually about 0.5 MeV above the ground state of $_{3}\mathrm{Li}^{7}$ due to the extra Coulomb repulsion energy in the former.

16-9 FISSION AND REACTORS

Fission was discovered by Hahn and Strassmann in 1939. Using chemical techniques, they found that the bombardment of uranium by neutrons produces elements in the middle of the periodic table. It was immediately realized that a very large amount of binding energy would be released in the fission of a nucleus of large Z into two nuclei of intermediate Z, because of the consequent reduction in the positive Coulomb energy. Measurements soon showed that an energy of around 200 MeV per fission was released, and carried away largely by the kinetic energy of the two fission fragments. Measurements also showed that two or three neutrons were emitted in each fission. This suggested to several people the possibility of using these neutrons to induce other uranium nuclei to fission, using the neutrons that would be emitted from those fissions in the same way, and so forth, in a chain reaction. A trivial calculation showed that if all the nuclei in a block of uranium could be made to fission in a chain reaction, the energy liberated would be $\sim 10^{6}$ times larger than in burning a block of coal, or exploding a block of dynamite, of the same mass. (This is the usual factor of $10^{6}$ obtained when comparing nuclear to atomic, or molecular, energies.)
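The factor of $10^{6}$ can be checked with a rough sketch. It uses the 200 MeV per fission quoted above and constants from the book's table; the heat of combustion of coal is an assumed typical figure that does not appear in the text, and the names are illustrative.

```python
# Energy released per kilogram: complete fission of uranium versus burning coal.
MEV_TO_JOULE = 1.602e-13       # from 1 eV = 1.602e-19 joule
AVOGADRO = 6.023e23            # per mole, as in the book's table of constants
FISSION_ENERGY_MEV = 200.0     # energy per fission quoted in the text
URANIUM_KG_PER_MOLE = 0.238    # molar mass of U-238
COAL_JOULE_PER_KG = 3.0e7      # ASSUMED typical heat of combustion of coal


def fission_joule_per_kg():
    """Energy in joules if every nucleus in 1 kg of uranium fissions."""
    nuclei_per_kg = AVOGADRO / URANIUM_KG_PER_MOLE
    return nuclei_per_kg * FISSION_ENERGY_MEV * MEV_TO_JOULE


ratio = fission_joule_per_kg() / COAL_JOULE_PER_KG
print(f"fission: {fission_joule_per_kg():.2e} joule/kg")
print(f"ratio to coal: {ratio:.1e}")
```

Under these assumptions the ratio comes out at a few times $10^{6}$, consistent with the order of magnitude stated in the text.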
Because of the extremely short time scale characterizing nuclear processes, the energy would be expected to be released much more rapidly than in a chemical explosion. The potentialities as a weapon were obvious, particularly because of the imminence of World War II. The events that followed dominate the history of this century, but here we shall be concerned with the peaceful applications of fission.

In a nuclear reactor, fission proceeds at a carefully controlled rate. A continuous source of power is obtained from the thermal energy produced when the fission fragments come to rest in the materials of the reactor. After many years of technological development, nuclear reactors have become sources of power which are very competitive, economically, with coal or oil. They are also important sources of unstable isotopes, not normally found in nature, that are used as tracers for diagnosing the operation of a variety of processes of interest to medicine, biology, chemistry, and engineering, or used for radiation therapy. The isotopes are produced in nuclear reactions induced by the intense flux of neutrons present in a reactor.

Fission occurs in nuclei of large Z because the total Coulomb repulsion energy of the protons in a nucleus is considerably decreased if the nucleus splits into two smaller nuclei. The nuclear surface energy increases in the process, but its magnitude is much smaller than the magnitude of the Coulomb energy, so the increase in surface energy does not alter the fact that it is energetically favorable for a large Z nucleus to fission. The Coulomb energy is minimized if the nucleus splits into two fission fragments that contain equal numbers of protons, but usually the splitting is not completely symmetrical because of the preference for magic numbers. In Example 15-6 we used the binding energy data to show that the energy associated with fission of $_{92}\mathrm{U}^{238}$ is close to 200 MeV.
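The competition between Coulomb and surface energy can be illustrated with a liquid drop estimate. This is only a sketch: the coefficients `A_SURFACE` and `A_COULOMB` are assumed typical semiempirical values, the Coulomb term is taken as proportional to $Z^{2}/A^{1/3}$, and the split is taken to be exactly symmetrical, none of which is specified in this section.

```python
# Liquid drop estimate of the energy balance in symmetrical fission of U-238.
A_SURFACE = 17.8   # MeV, ASSUMED surface energy coefficient
A_COULOMB = 0.71   # MeV, ASSUMED Coulomb energy coefficient


def surface_energy(mass_number):
    """Surface energy in MeV, proportional to A^(2/3)."""
    return A_SURFACE * mass_number ** (2.0 / 3.0)


def coulomb_energy(charge, mass_number):
    """Coulomb energy in MeV, proportional to Z^2 / A^(1/3)."""
    return A_COULOMB * charge ** 2 / mass_number ** (1.0 / 3.0)


# Parent nucleus (Z = 92, A = 238) and two identical fragments (Z = 46, A = 119).
coulomb_decrease = coulomb_energy(92, 238) - 2.0 * coulomb_energy(46, 119)
surface_increase = 2.0 * surface_energy(119) - surface_energy(238)
energy_release = coulomb_decrease - surface_increase

print(f"Coulomb energy decrease: {coulomb_decrease:.0f} MeV")
print(f"surface energy increase: {surface_increase:.0f} MeV")
print(f"net energy release:      {energy_release:.0f} MeV")
```

With these assumed coefficients the Coulomb energy drops by roughly 360 MeV while the surface energy rises by roughly 180 MeV, leaving a net release of the same order as the 200 MeV obtained from binding energy data.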
This value is also fairly typical of the fission energy for other isotopes of uranium.

The steps involved in fission are indicated schematically by the set of drawings in Figure 16-37. These define a parameter $s$ which characterizes the progress of the fission by specifying (somewhat imprecisely) the elongation of the fissioning nucleus, and then the separation of the two fission fragments. Figure 16-38 is a schematic plot of $V(s)$, which is the part of the energy of the system that depends on $s$.

Figure 16-37 A schematic representation of the steps involved in the process of nuclear fission.

Figure 16-38 An energy diagram for a fissionable nucleus.

Starting at small $s$, there is relatively little change in the Coulomb repulsion energy with increasing $s$, but the surface area of the nucleus increases rapidly. According to the liquid drop model, the increase in surface area produces an increase in the surface energy. Thus $V(s)$ increases with increasing $s$, for small $s$. As $s$ continues to increase, a surface tension effect produced by the surface energy causes the nucleus to assume the form of two regions connected by a narrow neck. And eventually the nucleus splits. After it splits, the surface energy no longer changes with $s$, and $V(s)$ decreases with increasing $s$, following the decrease in the Coulomb repulsion energy of the two fission fragments. Since $V(s)$ first goes up and later comes down, it necessarily must pass through a maximum. Calculations, based on the liquid drop model, show that for a typical nucleus of large Z this maximum is about 6 MeV above $V(0)$. We already know that $V(0)$ is about 200 MeV above $V(\infty)$. Thus we see that nuclei are normally stable to decay by fission since they are sitting, with total energy $E = V(0)$, at the bottom of the depression in the potential $V(s)$.
The process can take place by barrier penetration but, because the mass entering in the exponent of (6-55) for the barrier penetrability is very large, the probability of barrier penetration is extremely small. If $_{92}\mathrm{U}^{238}$ decayed only by this spontaneous fission process, its lifetime would be $\sim 10^{16}$ yr.

A process of much more importance is induced fission. Usually this is brought about by the nucleus capturing a low-energy neutron. As the binding energy $E_{n}$ of the last neutron in a nucleus of large Z is around 6 MeV, in favorable cases the capturing nucleus receives enough energy to put it over the top of the fission barrier. Very often this high excitation energy actually does go into collective vibrations in which the nucleus becomes sufficiently elongated to fission. It is like a highly excited compound nucleus, with most of its excitation energy in the form of violent vibrations. Induced fission is perhaps the best example of the collective motions that are implied by the liquid drop model, and that form the basis of the collective model. The process is indicated in terms of an energy diagram in Figure 16-39. As we saw in Example 15-7, for $_{92}\mathrm{U}^{235}$ the neutron binding energy $E_{n}$, made available when a neutron is captured, is about 6.5 MeV, so that fission can take place even if the neutron brings in no kinetic energy. This is also true for $_{92}\mathrm{U}^{233}$. But when $_{92}\mathrm{U}^{238}$ captures a neutron only about 5 MeV of binding energy is made available, so a neutron must have a kinetic energy of about 1 MeV to cause fission in this nucleus. The difference between the behavior of these isotopes arises from the difference in the pairing energy, as explained in Example 15-7.

We have oversimplified our discussion of fission by speaking as if the fissioning nucleus is spherical in its ground state. In fact we saw in Section 16-8 that uranium nuclei are ellipsoidal in their ground states. Even before receiving any excitation energy the nucleus is somewhat elongated.
When it receives about 6 MeV of excitation from capturing a neutron, it further elongates, goes over the top of the fission barrier, and then fissions.

Figure 16-39 An energy diagram illustrating induced fission.

Evidence has been accumulated which indicates that the fission barrier $V(s)$ shown in Figures 16-38 and 16-39 is probably also an oversimplification, and that the barrier actually has a double hump something like that shown in Figure 16-40. In its ground state the nucleus is very near the bottom of the deeper depression with its ground state elongation $s'$, and stable except for the highly improbable process of barrier penetration. Calculations based on the collective model, i.e., on a combination of the liquid drop and shell models, predict that there is a second shallower depression in $V(s)$ at the larger elongation $s''$. At this elongation the nucleus would also be stable, except for barrier penetration, if it had no excess energy. One prediction of these calculations is that it should be possible to put a fissionable nucleus into a state with the elongation $s''$, where it would remain for a long time. Some spontaneous fission experiments give strong indication that this is true. Because these calculations are also the ones that lead to the prediction of the Z = 114 magic number, mentioned at the end of Section 16-2, the spontaneous fission experiments have made physicists take the prediction concerning Z = 114 seriously. As far as induced fission is concerned, the presence of the shallower depression in $V(s)$ would probably not make very much difference.

The possibility of using fission to produce power in a chain reaction arises from the fact that two or three neutrons are emitted in each fission process. An idea of why it happens can be obtained by considering Figure 16-41.
The figure shows the Z and N values of the nuclei which are the most stable for each value of A (as in Figure 15-11). These nuclei are represented by the curve of stability. The large dot indicates the fissioning nucleus, and the two small dots indicate the fission fragments. The fragments are usually not symmetrical. Instead one of the fragments has Z and N values near the magic numbers 50 and 82, presumably because this is favored energetically. But both fragments have nearly the same Z/N ratio as the fissioning nucleus. Since their A values are much smaller, their Z/N ratios are smaller than those of stable nuclei with these A values. The fission fragments tend to have relatively too many neutrons. Most of the necessary readjustment slowly takes place by the fission fragments going through a succession of β decays, but some of the readjustment is achieved promptly at the time of fission. Part of the decay of the fissioning compound nucleus takes place through the evaporation of two or three neutrons, of several MeV kinetic energy. Figure 16-42 provides more information about the asymmetry of the fission fragments, by plotting the distribution of their A values.

Figure 16-40 A double hump fission barrier.

Figure 16-41 Illustrating that fission fragments tend to have relatively too many neutrons. (Z is plotted against N = A − Z; the curve of stability, the fissioning nucleus, and the fission fragments are marked.)

Figure 16-42 The mass spectra of fragments produced in the low-energy neutron induced fission of $_{92}\mathrm{U}^{233}$, $_{92}\mathrm{U}^{235}$, and $_{94}\mathrm{Pu}^{239}$. (Horizontal axis: mass number A; the position 132 = 50 + 82 is marked.)

Another process leading to the emission of neutrons, which is of small probability ($\sim 1\%$ of the probability for the prompt emission of neutrons by evaporation from the excited compound nucleus) but of great importance in making it easier to control a reactor, is that of delayed neutron emission. As an example, consider the electron emitting fission fragment $_{35}\mathrm{Br}^{87}$. Because of the β-decay selection rules, this nucleus occasionally decays to a state of its daughter $_{36}\mathrm{Kr}^{87}$ that is sufficiently excited to allow it to emit a neutron, leaving the stable nucleus $_{36}\mathrm{Kr}^{86}$. Neutrons are emitted in this process, with a delay characteristic of the 55 sec half-life of $_{35}\mathrm{Br}^{87}$. Another important example involves delayed neutron emission from $_{54}\mathrm{Xe}^{137}$. For $_{36}\mathrm{Kr}^{87}$ or $_{54}\mathrm{Xe}^{137}$ the neutron number N equals a magic number, 50 or 82, plus one. Thus the process depends on the unusually small neutron binding energy that the shell model would predict in such cases.

In a reactor, the chances for the neutrons emitted in one generation of fission ultimately inducing the next generation of fission are enhanced because the neutrons scatter from low mass nuclei in the moderator surrounding the pieces of uranium. They rapidly lose energy to the recoil of these nuclei, and they are no longer able to induce fission in $_{92}\mathrm{U}^{238}$. But they are not lost to nonfission $_{92}\mathrm{U}^{238}$ capture since moderation occurs outside the uranium pieces. The moderator is usually $_{6}\mathrm{C}^{12}$, in the form of graphite, or $_{1}\mathrm{H}^{2}$, in the form of deuterium oxide (heavy water). It is possible to use $_{1}\mathrm{H}^{1}$, but only if the uranium is highly enriched in $_{92}\mathrm{U}^{235}$. The reason is that $_{1}\mathrm{H}^{1}$ has a large cross section for capturing neutrons to form $_{1}\mathrm{H}^{2}$, and these neutrons are lost from the chain reaction. The purpose of the moderator is to reduce the velocities of the neutrons to the lowest values possible, so that their de Broglie wavelengths $\lambda$ will be as long as possible. Because of the wavelike properties of neutrons, their cross section for capture by a nucleus of radius $r'$ is limited by the value of $\lambda$, and not by the value of $r'$ (see (16-32)).
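To see how much moderation lengthens the de Broglie wavelength, the sketch below compares a thermal neutron with a 1 MeV neutron and with a nuclear radius. It assumes a moderator temperature of 300 °K, a kinetic energy of order $kT$, and the radius estimate $r' = r_{0}A^{1/3}$ with $r_{0} = 1.2$ F; these choices and the function names are illustrative and are not taken from the text.

```python
import math

# Constants from the book's table of constants (SI units).
PLANCK_H = 6.626e-34        # joule-sec
BOLTZMANN_K = 1.381e-23     # joule/K
NEUTRON_MASS = 1.675e-27    # kg
EV = 1.602e-19              # joule


def de_broglie_wavelength_m(kinetic_energy_joule):
    """Nonrelativistic de Broglie wavelength, lambda = h / sqrt(2 m K)."""
    return PLANCK_H / math.sqrt(2.0 * NEUTRON_MASS * kinetic_energy_joule)


def nuclear_radius_m(mass_number, r0_m=1.2e-15):
    """ASSUMED radius estimate r' = r0 * A^(1/3)."""
    return r0_m * mass_number ** (1.0 / 3.0)


thermal = de_broglie_wavelength_m(BOLTZMANN_K * 300.0)   # ASSUMED T = 300 K
fast = de_broglie_wavelength_m(1.0e6 * EV)               # 1 MeV neutron
radius = nuclear_radius_m(235)

print(f"thermal neutron wavelength: {thermal:.2e} m")
print(f"1 MeV neutron wavelength:   {fast:.2e} m")
print(f"nuclear radius (A = 235):   {radius:.2e} m")
```

Under these assumptions the thermal wavelength is of the order of an angstrom, some four orders of magnitude larger than the nuclear radius, while the wavelength of an unmoderated 1 MeV neutron is only a few times the radius.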
The moderator brings the neutrons into thermal equilibrium at the operating temperature of the reactor, which makes $\lambda \gg r'$ and thereby maximizes the $_{92}\mathrm{U}^{235}$ capture cross section for neutrons diffusing back into the uranium pieces. The cross section must be large enough that the probability of one of the two or three neutrons from each fission subsequently inducing another fission be at least equal to 1. When the reactor is starting up, this probability is made to be slightly bigger than 1. It is gradually reduced to be precisely 1 when the reactor attains equilibrium at its operating level. Adjustments are made by varying the lengths of control rods inserted into the reactor. These contain nonfissionable nuclei like $_{48}\mathrm{Cd}^{113}$, which have extremely large capture cross sections for thermal energy neutrons, because of fortuitously located compound nucleus resonances. The delayed neutrons facilitate the control of a reactor by introducing some neutrons in the chain reaction that are emitted with a reasonably long time constant.

The kinetic energy given to the fission fragments in the fission process is converted into thermal energy as these fragments come to rest in the materials of the reactor. Typically, this heat is used to make steam which drives turbines that operate generators producing electrical power.

Breeder reactors utilize the 99% abundant $_{92}\mathrm{U}^{238}$. These nuclei capture low-energy neutrons. They cannot fission in low-energy neutron capture, but the resulting unstable $_{92}\mathrm{U}^{239}$ nuclei undergo two successive β decays, turning into the long-lived nuclei $_{94}\mathrm{Pu}^{239}$. This end product has the same ability to fission in low-energy neutron capture as does $_{92}\mathrm{U}^{235}$.

Example 16-11. The average time lapse between the emission of a prompt neutron in a fission taking place in a nuclear reactor, and the capture of that neutron to induce the next generation of the chain reaction, is of the order of $10^{-3}$ sec. (Most of the time is required by the moderator to bring the neutron into thermal equilibrium.) Use this figure to estimate the number of free neutrons present in a reactor operating at a power level of $10^{8}$ W.

In Example 15-6 we found that the energy release in the fission produced by one neutron is about

$$E \simeq 200\ \frac{\text{MeV}}{\text{neutron}} \times 1.6 \times 10^{-13}\ \frac{\text{joule}}{\text{MeV}} \sim 10^{-11}\ \text{joule/neutron}$$

If one free neutron has a lifetime before capture of $\sim 10^{-3}$ sec, and if on capture it produces a fission energy of $\sim 10^{-11}$ joule, one free neutron produces a power of

$$p \sim \frac{10^{-11}\ \text{joule/neutron}}{10^{-3}\ \text{sec}} = 10^{-8}\ \text{W/neutron}$$

So if the power level of the reactor is $P = 10^{8}$ W, the number of free neutrons is

$$N = \frac{P}{p} \sim \frac{10^{8}\ \text{W}}{10^{-8}\ \text{W/neutron}} = 10^{16}\ \text{neutrons}$$

The large number, or flux, of free neutrons present in a reactor makes the device very useful for producing unstable isotopes on the low-Z side of the curve of stability (electron emitters). This is done by placing probes containing appropriately chosen stable isotopes into the interior of the reactor. The unstable isotopes are formed when the isotopes in the probes capture neutrons.

16-10 FUSION AND THE ORIGIN OF THE ELEMENTS

We close our study of nuclear physics with a discussion of nuclear fusion, and its part in the production of stellar energy and of the chemical elements. Fusion involves two nuclei of very low A amalgamating to form a more stable nucleus. The increased stability arises because the A value of the nucleus formed is nearer the value $A \simeq 60$ where the binding energy per nucleon maximizes (see Figure 15-10). From the point of view of the liquid drop model, the situation would be explained by saying that nuclei of very low A have too much surface, relative to their volume, for maximum stability. The Coulomb energy increases in fusion, but its magnitude is too small to prevent the process from happening because nuclei of low A also have low Z.

It is fair to say that fusion is the most important phenomenon in nature. Fusion of low-A nuclei in thermal motion is the source of energy of the sun. So it is ultimately the source of energy for all the natural physical and biological processes on the earth. And there is reason to hope that some day fusion will be usable directly on earth to produce energy in a fusion reactor. Because much of the earth is covered by seas containing the hydrogen isotopes $_{1}\mathrm{H}^{1}$ and $_{1}\mathrm{H}^{2}$, the fuel supply of low-A nuclei would be almost inexhaustible. One of the several potentially useful reactions for a thermal fusion reactor is

$$_{1}\mathrm{H}^{2} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{2}\mathrm{He}^{3} + {}_{0}n^{1} + 3.2\ \text{MeV} \tag{16-34}$$

where the energy is the Q value of the reaction. But it is much more difficult to build a fusion reactor than to build a fission reactor. The problem lies in the repulsive Coulomb barrier acting between two nuclei, which must be overcome, or at least penetrated, before they can get close enough to allow the short range nuclear forces to come into play and fuse them together. Figure 16-43 plots the cross section for the reaction of (16-34), as a function of the kinetic energy of the bombarding particle. The cross section does not attain a measurable value until the kinetic energy exceeds $\sim 10^{4}$ eV. And even at that energy the cross section is very small because the reaction takes place by penetration of the Coulomb barrier acting between the nuclei, which is $\sim 10^{6}$ eV high. Unless the kinetic energy is appreciably higher than $\sim 10^{4}$ eV, the cross section, and therefore the rate of the reaction, is much too small to be of practical use in a fusion reactor.

Figure 16-43 The cross section for the reaction in which two deuterons fuse to form $_{2}\mathrm{He}^{3}$ plus a neutron. (Horizontal axis: deuteron energy in eV, from $10^{4}$ to $10^{7}$.)

In the interior of the sun similar reactions do occur, with the kinetic energy of the bombarding particles coming from their thermal energy. This energy is $\sim kT$, where $k$ is Boltzmann's constant, $\sim 10^{-4}$ eV/°K, and $T$ is the interior temperature of the sun, $\sim 10^{7}$ °K. Thus the thermal or kinetic energy at the temperature of the interior of the sun is only $\sim 10^{3}$ eV, and fusion reactions proceed there at an extremely slow rate. Of course, the sun produces large amounts of energy, but only because it is so large that it makes up for the very slow rate of the individual reactions. An efficient thermal fusion reactor of dimensions possible on the earth would have to have a much higher rate for the individual reactions. Thus its temperature would have to be higher, at least an order of magnitude higher than the internal temperature of the sun! There are ways of achieving such a temperature, if ways can be found to produce a container that would not be destroyed by the temperature. The sun is so massive that gravitational fields provide a container automatically. On earth, it might be done by using magnetic fields acting on the charged nuclei to contain them. Attempts have been made to build such a container, fill it with hydrogen, and then heat the contents by, for instance, firing in a laser beam. There have even been some indications of success, but only for very short times before the container fails. Another approach is to use extremely powerful lasers to add enough thermal energy to small pellets of fusible material to cause them to react.
In such a procedure, energy would be produced in a sequence of miniature explosions, and it would be absorbed within a very strong metallic container that would be heated as a consequence. Obtaining thermal fusion for energy production on earth remains one of the great challenges to science and engineering.

There are no difficulties in obtaining fusion on earth by nonthermal means. It can be done with ease by using a cyclotron, or other accelerator, to give the bombarding nucleus enough energy to overcome the repulsive Coulomb barrier it sees surrounding the target nucleus; but the amount of energy liberated in the relatively few fusions that can be produced in this way is very small, and microscopic compared to the energy that goes into running the accelerator. So there seems to be no hope of using nonthermal fusion as an efficient energy source.

Efficient thermal fusion has, however, been taking place for a long time in the stars. It is responsible for the energy produced in all stars, and also for the production in the stars of all the elements through iron. It is believed that stars are initially formed from the extremely low-density ($\sim 1$ atom/cm$^{3}$) gas that is known to be distributed throughout interstellar space. The gas is primarily hydrogen, but it contains also about 10% helium that is thought to have been made by fusion from hydrogen in the "big bang" that occurred when the universe started some $10^{10}$ years ago, plus small amounts of higher Z elements present in certain regions for reasons that will be explained later. In the well-accepted big-bang theory, the electrically neutral universe would have started from a region containing neutrons compressed to an extremely high density.
In the first few moments, the following set of processes would take place

$$_{0}n^{1} \rightarrow {}_{1}\mathrm{H}^{1} + e + \bar{\nu}$$
$$_{1}\mathrm{H}^{1} + \bar{\nu} \rightarrow {}_{0}n^{1} + \bar{e}$$
$$e + \bar{e} \rightarrow \gamma + \gamma$$
$$\gamma + \gamma \rightarrow e + \bar{e}$$

and there was an equilibrium, at very high temperatures, between neutrons, protons, electrons, positrons, antineutrinos, and γ radiation. The radiation, "cooled" by repeated Doppler shifts in the subsequent expansion of the system, would now constitute the isotropic 3°K blackbody radiation whose recent detection provides some of the experimental evidence for the validity of the big-bang theory (see Section 1-5). In the high-density equilibrium distribution that existed for a short time before the system blew itself apart, helium would be formed by the reactions

$$_{1}\mathrm{H}^{1} + {}_{0}n^{1} \rightarrow {}_{1}\mathrm{H}^{2} + \gamma$$
$$_{1}\mathrm{H}^{2} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{2}\mathrm{He}^{3} + {}_{0}n^{1}$$
$$_{2}\mathrm{He}^{3} + {}_{0}n^{1} \rightarrow {}_{1}\mathrm{H}^{3} + {}_{1}\mathrm{H}^{1}$$
$$_{1}\mathrm{H}^{3} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{2}\mathrm{He}^{4} + {}_{0}n^{1}$$

Detailed calculations, involving the cross sections for all the reactions in both sets, show that enough helium could be formed to account for the approximately 10% abundance now observed in interstellar space. The remaining 90% of the matter there would, in agreement with observation, essentially all be in the form of hydrogen, most of the protons being formed from the β decay of the neutrons that found themselves in free space after the big bang.

According to our present understanding, the first stage in the formation of a star from the very tenuous gaseous material of interstellar space involves some sort of upward fluctuation in density over a very large region. In such a fluctuation, the gas collects into a cluster. If it is large enough it stabilizes itself because of the gravitational attractions between the atoms it contains, and it begins to grow by attracting more atoms. As a cluster grows, the increasing strength of the gravitational attractions causes the interior pressure, and therefore the interior temperature, to build up. When the temperature in the core exceeds about $10^{5}$ °K the hydrogen atoms in that region are completely ionized into a plasma of protons and electrons. And when the temperature exceeds about $10^{7}$ °K the protons have enough kinetic energy due to their thermal motion to have a small probability of penetrating the repulsive Coulomb barriers that tend to keep them apart. (The 10% helium present does not participate at this stage because the temperature is too low for penetration of the higher Coulomb barriers surrounding these nuclei.) Then two protons can fuse together and form a deuteron, according to the reaction

$$_{1}\mathrm{H}^{1} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{1}\mathrm{H}^{2} + \bar{e} + \nu + 0.42\ \text{MeV}$$

where the energy is the energy liberated in the process. Since the process requires both barrier penetration and the weak β-decay interaction, it occurs at an extremely low rate. The necessity of β decay arises from the fact that nuclear forces are not able to make the system $_{2}\mathrm{He}^{2}$ (the diproton) be bound, for reasons that will be explained in the next chapter.

Although the rate for the deuteron forming reaction is very low, when enough deuterons are present large concentrations of helium can be formed by processes that have relatively high rates because they involve the strong nuclear interaction. Helium is formed in a star in a cycle of reactions, called the proton-proton cycle, consisting of two of the preceding reactions, followed by two of the reactions

$$_{1}\mathrm{H}^{2} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{2}\mathrm{He}^{3} + \gamma + 5.49\ \text{MeV}$$

and then by one reaction in which the two $_{2}\mathrm{He}^{3}$ nuclei that have been formed fuse as follows

$$_{2}\mathrm{He}^{3} + {}_{2}\mathrm{He}^{3} \rightarrow {}_{2}\mathrm{He}^{4} + {}_{1}\mathrm{H}^{1} + {}_{1}\mathrm{H}^{1} + 12.86\ \text{MeV}$$

Counting the 1.02 MeV liberated each time one of the two positrons annihilates with an electron, the total energy liberated in one cycle is 26.72 MeV. But a little more than 1% of this energy is carried completely away from the star by two neutrinos.
The remainder, plus gravitational contraction, continues to heat the core.

When the density of helium (including the helium initially present) in the core of the cluster that has turned into a star becomes high enough, carbon can be formed. What happens is that two $_{2}\mathrm{He}^{4}$ nuclei combine to form $_{4}\mathrm{Be}^{8}$. This nucleus can then combine with another $_{2}\mathrm{He}^{4}$ nucleus, to form $_{6}\mathrm{C}^{12}$, providing it does it almost immediately. The point is that $_{4}\mathrm{Be}^{8}$ is not stable, and it will decay back into two $_{2}\mathrm{He}^{4}$ in about $10^{-16}$ sec if it does not capture the third $_{2}\mathrm{He}^{4}$ nucleus. The rate for this improbable sounding reaction would be essentially zero if it were not for the existence of an excited state in $_{6}\mathrm{C}^{12}$ at an energy of about 7.65 MeV. When the temperature is $\sim 10^{8}$ °K, there is a resonance in the reaction, which makes its cross section reasonably large, because the kinetic energies of the three combining $_{2}\mathrm{He}^{4}$ nuclei plus the Q value equal the energy of the excited state in $_{6}\mathrm{C}^{12}$. Straightforward processes involving the successive addition of nucleons to $_{2}\mathrm{He}^{4}$ could not be used to form elements with A greater than 4 because such processes are blocked by the complete instability of nuclei with A = 5.

When enough carbon has been formed in the core of the star, the principal source of energy production is through the carbon cycle, in which carbon plays the role of a catalyst (i.e., it reappears at the end of the cycle) to aid in the fusion of four $_{1}\mathrm{H}^{1}$ into one $_{2}\mathrm{He}^{4}$, plus assorted positrons, neutrinos, and γ rays. The carbon cycle consists of the set of reactions

$$_{6}\mathrm{C}^{12} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{7}\mathrm{N}^{13} + \gamma + 1.94\ \text{MeV}$$
$$_{7}\mathrm{N}^{13} \rightarrow {}_{6}\mathrm{C}^{13} + \bar{e} + \nu + 1.20\ \text{MeV}$$
$$_{6}\mathrm{C}^{13} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{7}\mathrm{N}^{14} + \gamma + 7.55\ \text{MeV}$$
$$_{7}\mathrm{N}^{14} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{8}\mathrm{O}^{15} + \gamma + 7.29\ \text{MeV}$$
$$_{8}\mathrm{O}^{15} \rightarrow {}_{7}\mathrm{N}^{15} + \bar{e} + \nu + 1.73\ \text{MeV}$$
$$_{7}\mathrm{N}^{15} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{6}\mathrm{C}^{12} + {}_{2}\mathrm{He}^{4} + 4.96\ \text{MeV}$$

Counting the energy liberated in the annihilation of the two positrons, the total energy liberated in one cycle is 26.72 MeV, just as in one proton-proton cycle.
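The energy bookkeeping for the two cycles can be verified directly from the quoted reaction energies. The sketch below also checks that each step of the carbon cycle conserves charge and nucleon number. The representation of each particle as a (charge, nucleon number) pair and all names are illustrative choices, not notation from the text.

```python
# Each particle is a (charge Z, nucleon number A) pair.
H1, HE4 = (1, 1), (2, 4)
C12, C13 = (6, 12), (6, 13)
N13, N14, N15 = (7, 13), (7, 14), (7, 15)
O15 = (8, 15)
POSITRON, NEUTRINO, GAMMA = (1, 0), (0, 0), (0, 0)

ANNIHILATION_MEV = 1.02   # energy liberated per positron annihilation (from the text)

# Carbon cycle: (reactants, products, energy liberated in MeV), as listed in the text.
CARBON_CYCLE = [
    ([C12, H1], [N13, GAMMA], 1.94),
    ([N13], [C13, POSITRON, NEUTRINO], 1.20),
    ([C13, H1], [N14, GAMMA], 7.55),
    ([N14, H1], [O15, GAMMA], 7.29),
    ([O15], [N15, POSITRON, NEUTRINO], 1.73),
    ([N15, H1], [C12, HE4], 4.96),
]


def totals(particles):
    """Return (total charge, total nucleon number) of a list of particles."""
    return (sum(z for z, _ in particles), sum(a for _, a in particles))


def is_balanced(reactants, products):
    """True if charge and nucleon number are the same on both sides."""
    return totals(reactants) == totals(products)


all_balanced = all(is_balanced(r, p) for r, p, _ in CARBON_CYCLE)
carbon_total = sum(q for _, _, q in CARBON_CYCLE) + 2 * ANNIHILATION_MEV

# Proton-proton cycle: two deuteron formations, two He-3 formations, one He-3 + He-3 fusion.
proton_total = 2 * 0.42 + 2 * ANNIHILATION_MEV + 2 * 5.49 + 12.86

print(f"carbon cycle balanced at every step: {all_balanced}")
print(f"carbon cycle total:        {carbon_total:.2f} MeV")
print(f"proton-proton cycle total: {proton_total:.2f} MeV")
```

The two totals agree to within the 0.01 MeV rounding of the individual reaction energies, as they must, since both cycles convert four $_{1}\mathrm{H}^{1}$ into one $_{2}\mathrm{He}^{4}$.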
In the carbon cycle a little more than 5% of the energy is lost from the star by the two neutrinos emitted in the higher energy β decays. The rate at which the carbon cycle occurs is much higher than the rate for the proton-proton cycle, because no step in the carbon cycle is anywhere near as slow as the first step in the proton-proton cycle. The sun has not yet reached the stage in its development where the carbon cycle dominates the energy production, although there is some carbon cycle going on. In a star with a mass greater than about two sun masses, the gravitational contraction is very rapid and the core temperature rapidly reaches the value $\sim 10^{8}$ °K required for carbon formation and the carbon cycle.

As the contraction of the stellar core continues, its temperature increases and elements heavier than carbon are formed. At first this is done by the successive captures of ${}_{2}\mathrm{He}^{4}$ by ${}_{6}\mathrm{C}^{12}$, forming ${}_{8}\mathrm{O}^{16}$, then ${}_{10}\mathrm{Ne}^{20}$, and then ${}_{12}\mathrm{Mg}^{24}$. But when the temperature is $\sim 10^{9}$ °K these nuclei have enough thermal energy to penetrate their Coulomb barriers, directly forming nuclei of even A through ${}_{26}\mathrm{Fe}^{56}$. Nuclei of comparable but odd values of A can be formed if the even-A nuclei are forced by turbulence out of the stellar core into the surrounding cooler zone where the proton-proton cycle is still going on. In this zone reactions can occur such as

$${}_{10}\mathrm{Ne}^{20} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{11}\mathrm{Na}^{21} + \gamma$$
$${}_{11}\mathrm{Na}^{21} \rightarrow {}_{10}\mathrm{Ne}^{21} + e^{+} + \nu$$

Some of these odd-A nuclei can then participate in reactions which lead to the production of neutrons. An example is

$${}_{10}\mathrm{Ne}^{21} + {}_{2}\mathrm{He}^{4} \rightarrow {}_{12}\mathrm{Mg}^{24} + {}_{0}n^{1}$$

The elements heavier than iron are not formed by fusion because the A values exceed the value A ≈ 60 where the binding energy per nucleon maximizes; beyond A ≈ 60 the Coulomb repulsion of the protons becomes so large that it is no longer energetically favored for a nucleus to capture another nucleus. However, it is certainly favored for a nucleus to capture a neutron since this releases the neutron binding energy of about 6 MeV. Nuclei through ${}_{83}\mathrm{Bi}^{209}$ are formed by a succession of neutron captures and β decays, starting from ${}_{26}\mathrm{Fe}^{56}$. The neutrons come from reactions such as the example given in the preceding paragraph, and the β decays take place when necessary to adjust the Z-to-A ratio of a nucleus to a stable value. The abundances of the nuclei that are built up in the succession of neutron captures are inversely proportional to their neutron capture cross sections, averaged over the very high temperature thermal distribution of neutron energies. This is true since, if a nucleus has a large neutron capture cross section, there is only a small chance that it will not capture a neutron and be converted into some other nucleus.

The abundance of elements in the solar system is inferred primarily from the composition of the sun seen in atomic spectra measurements, and from solar produced cosmic rays intercepted on the earth. Data are also obtained from meteorites, and from the composition of the earth itself. The abundance curve from iron to bismuth was presented in Figure 15-1. It is very nearly the reciprocal of the neutron capture cross-section curve. On the average, the cross sections increase (and the abundances decrease) as the A value of the nucleus increases, simply because the nucleus becomes larger. But there are some pronounced departures from the average due to the effect of filled subshells on neutron affinities and binding energies which, in turn, affect the neutron capture cross sections.

The heaviest element that can be formed in the neutron capture processes discussed here is bismuth. The reason is that when ${}_{83}\mathrm{Bi}^{209}$ captures a neutron it becomes ${}_{83}\mathrm{Bi}^{210}$, which α decays into ${}_{81}\mathrm{Tl}^{206}$ with a half-life of only five days. This decay is so rapid that it takes place before there is time for further neutron capture by ${}_{83}\mathrm{Bi}^{210}$ in the moderate flux of neutrons that normally exists in a star.

When some stars come to the end of their life because they have almost depleted their supply of hydrogen, not enough "nuclear heat" is generated in the core to prevent very rapid gravitational collapse. They then explode in a matter of a few seconds with tremendous violence, and they produce a tremendous flux of neutrons. The most spectacular example in recorded history of such a supernova is a star that was observed in 1054 A.D. to flare up to a brightness that allowed it to be seen for a short time in full daylight. Its remnants are now called the Crab nebula. The elements heavier than bismuth are believed to be made in successive neutron captures, starting from ${}_{83}\mathrm{Bi}^{209}$, and using the intense neutron flux present in a supernova. The process happens so rapidly that the α decay of ${}_{83}\mathrm{Bi}^{210}$ is of no consequence.

The preceding discussion of the life history of a star assumed that its original composition was purely the primordial 90% hydrogen plus 10% helium mixture. There are many examples of such "first-generation" stars. And there are also many examples of "second-" or "third-generation" stars, which are thought to have been originally composed partly of supernova remnants; the sun is one example. In these stars heavy elements will be present, and in fact reasonably abundant, even before the stage is reached where the carbon cycle is the dominant source of energy.

QUESTIONS

1. Give a qualitative explanation of why an α particle can penetrate a Coulomb barrier.
2. What would be the effect on the α-decay lifetimes, and thus on the terrestrial abundances, of the elements between A = 200 and A = 260 if there were no magic numbers so that the α-decay energies of Figure 16-1 followed the general trend predicted by the semiempirical mass formula?
3. Is there a 4n + 4 radioactive series?
4. Where would be a likely place to look for traces of the predicted superheavy element Z = 110, A = 294?
5.
Construct a figure illustrating a case in which there are three β-stable nuclei with the same even-A value.
6. Explain why the emission of a particle, with the properties postulated by Pauli, removes the difficulties with angular momentum in β decay. What about linear momentum?
7. Just how do neutrinos and antineutrinos differ from photons, which also have no charge or rest mass?
8. How do you justify the fact that electrons are emitted from nuclei in β decay, when in Example 6-6 we showed that electrons cannot be contained in nuclei?
9. In the Wu experiment, what is the direction of the magnetic field applied to align the nuclei, from the normal point of view, and as seen in the mirror? What about the direction of the current flow in the windings of the magnet that produces the field?
10. Consider viewing the Wu experiment in a mirror located below the nucleus (the mirror being horizontal) instead of in a mirror located to one side of the nucleus (the mirror being vertical). Explain how the arguments in the text would be modified, but in such a way as to lead to the same conclusions.
11. Sugar molecules have a definite helicity. What do you think is responsible?
12. Consider the electric and magnetic monopole, dipole, and quadrupole moments of a nucleus. Is each of these ever found with a constant, nonzero value? With an oscillatory value? Explain why some of these cases do not occur, and what the nucleons are doing in cases that do occur.
13. Electric dipole radiation is emitted with a characteristic spatial pattern (see Appendix B). Does this suggest an experimental technique for determining the type of radiation emitted in a γ decay? What would be the difficulty in using such a technique?
14. In γ decays from states of excitation energy around 1 MeV, or less, to ground states, electric dipole radiation is almost never observed. Use the shell model to explain this.
15. Predict, from the shell model, the regions of the periodic table in which the first excited states of nuclei have particularly long lifetimes for γ decay.
16. A hyperfine splitting measurement tells you that the ground state spin of a nucleus is i = 3/2. What are the possible l values of the subshell occupied by the nucleon responsible for the spin? What other information would tell you which of these is the actual value? What could you measure to obtain this information?
17. Explain exactly why the optical model potential which a nucleus exerts on a bombarding nucleon of energy 50 MeV is different from the shell model potential which it exerts on one of its own nucleons. What would you expect the optical model potential to be like for a bombarding nucleon of energy 5 MeV?
18. Why is it easier for an incident nucleon to enter a nucleus than it is for either of the nucleons, resulting from its first collision, to escape?
19. What are the differences between single particle states and many particle states? How are they related? What about γ-decaying states?
20. If the compound nucleus ${}_{30}\mathrm{Zn}^{64}$ forgets the details of how it was formed, it should make no difference if it were excited by bombarding ${}_{29}\mathrm{Cu}^{63}$ with protons, or ${}_{28}\mathrm{Ni}^{60}$ with α particles, providing the same many-particle states are excited. Devise an experiment to test this prediction.
21. What difference (if any) is there between a permanent nuclear ellipsoidal deformation, as seen in the ground and low-lying states of many even-Z, even-N nuclei, and a nuclear electric quadrupole moment?
22. Why is it reasonable to expect that the space distribution of protons in a nucleus is approximately the same as the space distribution of neutrons?
23. Nuclear reactors are particularly suited to power submarines. Give reasons why this is so.
24. Can you devise a configuration of magnetic fields that could, at least from a naive point of view, contain nuclei in a thermal fusion reactor?
25. Why is it impossible for two protons to fuse, as in the first step of the proton-proton cycle, without a β decay simultaneously taking place?
26. What happens to the γ rays that are emitted in stellar nuclear reactions of the proton-proton or carbon cycle?
27. How would it be possible to use a neutrino detector on the earth to tell whether the dominant reactions in the center of the sun are in the proton-proton cycle or in the carbon cycle?

PROBLEMS

1. (a) Use the semiempirical mass formula to predict the α-decay energy of ${}_{83}\mathrm{Bi}^{210}$. (Hint: Take the atomic mass of ${}_{2}\mathrm{He}^{4}$ directly from Table 15-1.) (b) Compare your results with the α-decay energy shown in Figure 16-1.
2. Derive (16-4), relating lifetime to decay rate.
3. Derive (16-5), relating lifetime to half-life.
4. Unstable nuclei, of decay rate R, are being produced at a constant rate I in nuclear reactions caused by a cyclotron bombardment. If the production process commences at t = 0, calculate the number of these nuclei that will be present at t = t'. (Hint: The equation to be solved is obtained by rewriting (16-2) in the form $dN/dt = -NR$, and then adding I to the right side. Can you justify this?)
5. Prove the validity of (16-6), the relation between the numbers of decaying nuclei and their decay rates, in radioactive equilibrium. (Hint: Write a set of equations comparable to (16-2). The first of the set is exactly like it, and the others contain two similar terms on the right side. Then show immediately that (16-6) is a solution to these equations providing the decay rate of the parent is very small compared to the decay rates of the daughters.)
6. ${}_{90}\mathrm{Th}^{232}$ α decays to its first daughter ${}_{88}\mathrm{Ra}^{228}$. It is observed that a very thin foil containing 1.0 g of ${}_{90}\mathrm{Th}^{232}$ emits α particles from this decay at the rate of 4100/sec. Use these data to show that the half-life of ${}_{90}\mathrm{Th}^{232}$ is $1.4 \times 10^{10}$ yr.
7. ${}_{82}\mathrm{Pb}^{208}$ is the stable final daughter of the radioactive series whose parent is ${}_{90}\mathrm{Th}^{232}$ (see Figure 16-5). The half-life of the parent is $1.4 \times 10^{10}$ yr. A piece of thorium ore containing 1 kg of ${}_{90}\mathrm{Th}^{232}$ is found to also contain 200 g of ${}_{82}\mathrm{Pb}^{208}$. (a) Assuming that all of the ${}_{82}\mathrm{Pb}^{208}$ in the rock came from the decay of ${}_{90}\mathrm{Th}^{232}$, and that none of it has been lost, calculate the age of the rock; that is, calculate how many years have passed since thorium was concentrated in the minerals in the rock and the equilibrium decay began. (b) There are a total of six α particles emitted in the decay of the radioactive series. Assuming that a negligible number of them could have escaped from the rock because it is so thick, calculate how much helium originating from the α decays should be in the rock. (c) The first daughter of the series, ${}_{88}\mathrm{Ra}^{228}$, decays with half-life 5.7 yr into the second daughter, ${}_{89}\mathrm{Ac}^{228}$. Calculate how much ${}_{88}\mathrm{Ra}^{228}$ should be in the rock.
8. For a three-atom decay sequence A → B → C with C stable, show that, assuming an initially pure sample of A atoms, the number of B atoms at any subsequent time is given by

$$N_B = \frac{N_A^{0}\,\lambda_A}{\lambda_B - \lambda_A}\left[e^{-\lambda_A t} - e^{-\lambda_B t}\right]$$

9. ${}_{92}\mathrm{U}^{234}$ decays to ${}_{90}\mathrm{Th}^{230}$ which in turn decays to ${}_{88}\mathrm{Ra}^{226}$. The half-life of this uranium isotope is $24.7 \times 10^{4}$ years, and of the thorium isotope $8 \times 10^{4}$ years. (a) How many grams of ${}_{92}\mathrm{U}^{234}$ and (b) how many grams of ${}_{90}\mathrm{Th}^{230}$ will be present after a 20 g sample of pure ${}_{92}\mathrm{U}^{234}$ has decayed for $15 \times 10^{4}$ years?
10. (a) Use the semiempirical mass formula to evaluate the points on the A = 27 mass parabola for the only three values of Z that are found with this value of A, namely Z = 12, 13, 14. (Hint: It is only necessary to evaluate the terms of the formula that depend explicitly on Z.) (b) Which value of Z corresponds to the stable nucleus? (c) Find the types of decay, and the decay energies, for the β decays of the two unstable nuclei.
11.
Example 16-3 showed that the β decay of ${}_{4}\mathrm{Be}^{7}$ to ${}_{3}\mathrm{Li}^{7}$ proceeds only through electron capture because the atomic mass difference is 0.00093u, which is less than two electron rest masses. Consider a ${}_{4}\mathrm{Be}^{7}$ nucleus, initially at rest, that captures a K electron and emits a neutrino. (a) Estimate the recoil velocity of the nucleus after the process is completed. (Hint: The recoil energy of the nucleus is negligibly small.) (b) Suggest a technique for detecting electron capture.
12. The table here lists three points of the measured momentum spectrum, $R(p_e)$, of electrons emitted in the β decay of a nucleus of small Z.

    p_e/mc     2.8    4.9    6.9
    R(p_e)     375    500    250

(a) Make a Kurie plot of these points. (b) Then extrapolate to find the end point $K_e^{\max}$ of the spectrum, and so determine the decay energy E.
13. Several examples of the initial and final nuclei in β decays, and their ground state spins and parities, are listed here. For each decay between ground states, determine if it is allowed by the Fermi or Gamow-Teller selection rules. If it is forbidden, estimate roughly the factor suppressing the decay rate. (a) ${}_{2}\mathrm{He}^{6}$ (0, even) → ${}_{3}\mathrm{Li}^{6}$ (1, even); (b) ${}_{4}\mathrm{Be}^{10}$ (0, even) → ${}_{5}\mathrm{B}^{10}$ (3, even); (c) ${}_{16}\mathrm{S}^{35}$ (3/2, even) → ${}_{17}\mathrm{Cl}^{35}$ (3/2, even); (d) ${}_{39}\mathrm{Y}^{91}$ (1/2, odd) → ${}_{40}\mathrm{Zr}^{91}$ (5/2, even).
14. (a) By using the information given after (16-16), which represents the β decay of the neutron, calculate the FT value for the decay. (b) Compare with the value calculated in Example 16-4.
15. (a) Use the FT value obtained in Problem 14 to estimate the value of the β-decay coupling constant. (b) Compare with the estimate obtained in Example 16-5. (c) What justification is there for assuming that the nuclear matrix element is essentially equal to one for the β decay of the neutron?
16. Consider a set of positive charges moving in a confined region, like protons in a nucleus, and interacting with an external field of electromagnetic radiation. The charge density is ρ, so the current density is $\sim \rho v$, where v is the characteristic velocity of the moving charges. Show that the energy of interaction between the magnetic dipole moment of the charges and the external magnetic field is smaller by a factor of $\sim v/c$ than the energy of interaction between the electric dipole moment and the external electric field. Since the values of the matrix elements for magnetic dipole and electric dipole radiation are proportional to these interaction energies, and since the transition rates are proportional to the "squares" of the matrix elements, the magnetic dipole transition rate is smaller than the electric dipole transition rate by a factor of $\sim (v/c)^{2}$. (Hint: (i) Show that the ratio of the interaction energies equals the product of the ratio of magnetic to electric dipole moments times the ratio of the magnetic to electric field strengths. (ii) Argue that the ratio of the magnetic to electric dipole moments equals the ratio of the current density to the charge density. (iii) Evaluate the ratio of the magnetic to electric field strengths for electromagnetic radiation in a vacuum.)
17. Consider a set of positive charges q moving in a region of linear dimensions $\sim r'$, and interacting with the electric part of an external field of electromagnetic radiation of wavelength λ. Show that the energy of interaction between the electric quadrupole moment of the charges and the external electric field is smaller by a factor of $\sim r'/\lambda$ than the energy of interaction between the electric dipole moment and the external electric field. For the reasons explained in Problem 16, this leads to the conclusion that the electric quadrupole transition rate is smaller than the electric dipole transition rate by a factor of $\sim (r'/\lambda)^{2}$. (Hint: (i) Consider a sinusoidal electric field $E = E_0 \sin 2\pi(x/\lambda - \nu t)$. (ii) The energy of the electric dipole is E times its dipole moment $\sim qr'$. (iii) The energy of the electric quadrupole moment is $\partial E/\partial x$ times its quadrupole moment $\sim qr'^{2}$.)
18. The spins and parities of the ground state, first excited state, and second excited state of ${}_{62}\mathrm{Sm}^{152}$ are (0, even), (2, even), and (1, odd). Determine the types of radiation emitted in the γ decays between these states.
19. Verify that the parts of the γ-decay selection rules relating L to the nuclear spins represent angular momentum conservation requirements. Use the fact that a γ ray from a transition of multipolarity L carries L units of angular momentum.
20. Prove that the integrals in (16-26) and (16-27), which represent components of the electric quadrupole and magnetic dipole matrix elements, yield zero unless the initial and final nuclear states have the same parity.
21. Consider carrying out a resonance absorption experiment with the source and absorber not at a low temperature, using the transitions between the first excited state and the ground state of ${}_{77}\mathrm{Ir}^{191}$ considered in Example 16-7. (a) Calculate how much velocity would have to be given to the source to obtain enough Doppler shift to compensate for the recoil of the source and absorber nuclei, so that resonant absorption would be obtained. (b) Would it be possible to get the required velocity by mounting the source on the rim of a centrifuge? (c) Would an extremely sharp resonance be obtained in this manner?
22. A series of Mössbauer experiments is performed with the same emitter and absorber but with the emitter placed in various host materials. The absorber is always in the same host. (a) Show that the chemical shift (the absorber velocity corresponding to the center of the spectrum) is a linear function of the electron probability density ρ at the site of the emitter and so is given by $v = a\rho + b$, where a and b do not depend on the sample in which the emitter is placed. (b) The following data were recorded for four samples: $v_1 = 1.42$ mm/sec, $v_2 = 0.23$ mm/sec, $v_3 = 0.37$ mm/sec, and $v_4 = 0.95$ mm/sec. For the first two samples ρ was found using other experimental data, with the results $\rho_1 = 8.0248 \times 10^{34}\ \mathrm{m}^{-3}$ and $\rho_2 = 8.0286 \times 10^{34}\ \mathrm{m}^{-3}$, respectively. Find the values of a and b, then find the electron probability densities for samples 3 and 4.
23. ${}_{26}\mathrm{Fe}^{57}$, in a ferromagnetic iron sample, is used as an emitter in a Mössbauer experiment. The absorber is in stainless steel and has a single narrow Mössbauer peak in its absorption spectrum. The emitter is in a steady magnetic field so the first excited state splits into 4 levels, identified by $m_e = -3/2, -1/2, +1/2,$ or $+3/2$, while the ground state splits into 2 levels, identified by $m_g = -1/2$ or $+1/2$. The energies of the excited states are given by $E_e + 2\mu_e B m_e/3$ and those for the ground states are given by $E_g - 2\mu_g B m_g$, where $E_e$ and $E_g$ are the energies in the absence of a magnetic field. The magnetic dipole moments of the states are $\mu_e$ and $\mu_g$, respectively. The signs in the energy equations are different because the moments are in opposite directions for the excited and ground states. (a) Neglect any chemical shift and show that the Mössbauer peaks occur for absorber velocities given by

$$v(m_e, m_g) = \frac{cB}{E_e - E_g}\left(\frac{2}{3}\mu_e m_e + 2\mu_g m_g\right)$$

(b) Show that the ratio of the magnetic dipole moments is

$$\frac{\mu_e}{\mu_g} = 3\,\frac{v(3/2 \rightarrow 1/2) - v(1/2 \rightarrow 1/2)}{v(1/2 \rightarrow 1/2) - v(1/2 \rightarrow -1/2)}$$

(c) Once the chemical shift is subtracted, typical experimental values are $v(3/2 \rightarrow 1/2) = -5.57$ mm/sec, $v(1/2 \rightarrow 1/2) = -3.14$ mm/sec, and $v(1/2 \rightarrow -1/2) = +1.04$ mm/sec. Calculate the magnetic dipole moment ratio and the magnetic field at the site of the emitter. Take $\mu_g = 4.56 \times 10^{-28}$ joule/tesla.
24. The reaction ${}_{1}\mathrm{H}^{1} + {}_{3}\mathrm{Li}^{7} \rightarrow {}_{4}\mathrm{Be}^{7} + {}_{0}n^{1}$ is sometimes used to produce monoenergetic neutrons from a source of monoenergetic protons. The Q value of the reaction is −1.64 MeV. If a ${}_{3}\mathrm{Li}^{7}$ target is bombarded by a beam of 5 MeV protons, at what angle to the beam are 2.5 MeV neutrons emitted?
25. Use the Q values of the three reactions listed as follows to calculate the energy available for the β decay of ${}_{14}\mathrm{Si}^{31}$.

$${}_{15}\mathrm{P}^{31} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{14}\mathrm{Si}^{29} + {}_{2}\mathrm{He}^{4} \qquad Q = 8.158\ \mathrm{MeV}$$
$${}_{14}\mathrm{Si}^{29} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{14}\mathrm{Si}^{30} + {}_{1}\mathrm{H}^{1} \qquad Q = 8.388\ \mathrm{MeV}$$
$${}_{14}\mathrm{Si}^{30} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{14}\mathrm{Si}^{31} + {}_{1}\mathrm{H}^{1} \qquad Q = 4.364\ \mathrm{MeV}$$

26. Consider a one-dimensional traveling wave eigenfunction $\psi(x) = e^{ikx}$ where $k = \sqrt{2m(E - V)}/\hbar$. Take the potential energy V to be complex, so that it can be written $V = V_R + iV_I$. (a) Show that k becomes complex and can be written $k = k_R + ik_I$. (b) Then show that the amplitude of the traveling wave is a decreasing exponential function of x. Eigenfunctions such as this are used to describe the absorption of particles traveling through the complex optical model potential. (c) In what distance would the associated probability density decrease by a factor of 1/e?
27. The total cross section for fission of ${}_{92}\mathrm{U}^{235}$ by incident neutrons of energy 1 MeV is about 1 bn. If such a neutron passes through a uniform slab of ${}_{92}\mathrm{U}^{235}$ of mass per unit area $10^{-1}\ \mathrm{kg/m^2}$, what is the probability that it will produce a fission?
28. When a $10^{-8}$ amp beam of 17 MeV protons is incident on a ${}_{29}\mathrm{Cu}^{63}$ target foil of mass per unit area $10^{-2}\ \mathrm{kg/m^2}$, it is observed that a counter of area $10^{-5}\ \mathrm{m^2}$ at 1 m from the target detects 240 elastically scattered protons per minute if it is placed at an angle of 30° to the incident beam. Determine the value of the differential cross section.
29. In considering the effects of radiation on the human body, it is necessary to define units for the amount of radiation absorbed.
One of these is the rad (radiation absorbed dose): 1 rad indicates an average of 0.01 joule of absorbed energy per kg of body tissue, regardless of which part of the body actually was exposed. A 75 kg worker at a hospital radiology lab inadvertently swallows a capsule containing 5 mg of ${}_{88}\mathrm{Ra}^{226}$ (half-life = 1600 years). This isotope of radium undergoes α decay, each α particle carrying an energy of 4.87 MeV. If 90% of these particles are stopped inside the man's body, what radiation dose does he receive in 12 hours?
30. There is a resonance in the cross section for neutrons incident on ${}_{92}\mathrm{U}^{235}$ with the following set of measured Breit-Wigner parameters: $E_1 = 0.29$ eV; $\Gamma = 0.140$ eV; $\Gamma_n = 0.005$ eV. (a) Show that $\Gamma = \Gamma_n + \Gamma_r$, and then evaluate $\Gamma_r$. (b) Calculate the total reaction cross section at the peak of the resonance, $\sigma_r(E_1)$. Measurement shows that about 75% of $\sigma_r(E_1)$ goes into fission. (c) Calculate the lifetime of the compound nucleus formed in this resonance.
31. The energies and spins of the first four excited states of ${}_{72}\mathrm{Hf}^{180}$ are: 0.093 MeV, i = 2; 0.309 MeV, i = 4; 0.641 MeV, i = 6; 1.085 MeV, i = 8. (a) How well do the ratios of these energies agree with the predictions of (16-33)? (b) Use that equation to evaluate the rotational inertia of the nucleus.
32. (a) Use (15-16) with Q = 0 to calculate the energy lost by a 1 MeV fission neutron to the recoil of ${}_{6}\mathrm{C}^{12}$, if it scatters elastically at the typical angle 90° from such a nucleus in the moderator of a nuclear reactor. (b) How much energy does it lose in a 90° scattering if its energy has been reduced to 0.001 MeV? (c) How much energy does it have, on the average, if it is in thermal equilibrium at an operating temperature of 500°K? (d) Estimate the number of scatterings required to bring the neutron into thermal equilibrium.
33. Compare the energy release, per kilogram of fuel consumed, in the thermal fusion reaction of (16-34) to the same figure of merit for the fission of ${}_{92}\mathrm{U}^{235}$.
34. A hypothetical H-bomb with the explosive power of 50 Megatons of TNT uses the reaction

$${}_{1}\mathrm{H}^{2} + {}_{1}\mathrm{H}^{2} \rightarrow {}_{2}\mathrm{He}^{3} + {}_{0}n^{1}$$

(Atomic masses are: $\mathrm{H}^{2}$, 2.014102u; $\mathrm{He}^{3}$, 3.016029u.) The required A-bomb "trigger" is rated at 2 Megatons (included in the 50 above). One ton of TNT produces $2.6 \times 10^{22}$ MeV of energy. (a) How much energy does each fusion produce? (b) How much hydrogen does the bomb contain?

17
INTRODUCTION TO ELEMENTARY PARTICLES

17-1 INTRODUCTION
nucleon forces as an interface between nuclear and particle physics; probing microstructure of matter

17-2 NUCLEON FORCES
review of previously considered information; deuteron ground state and asymmetry of potential; spin dependence; charge independence; charge exchange scattering and Serber's potential; repulsive core; spin-orbit term; approximate description of nucleon interaction

17-3 ISOSPIN
two nucleon systems correlated by isospin; single nucleon isospin; isobaric analogue levels; isospin conservation in nucleon interaction

17-4 PIONS
pion fields; exchange origin of nucleon interaction; uncertainty principle argument for pion mass; Yukawa potential and Klein-Gordon equation; assignment of spin, parity, and isospin quantum numbers; baryon number; decay; muons and muonic neutrinos; weak and strong interactions

17-5 LEPTONS
three families of leptons; quantum number assignments; lepton number conservation; intermediate boson

17-6 STRANGENESS
strangeness quantum number; other quantum number assignments; strange particle production and decay; role in weak decay parity violation; hyperons; particles decaying by the strong interaction

17-7 FAMILIES OF ELEMENTARY PARTICLES
summary of particle properties; hadrons; baryons and mesons; short-lived mesons; vector mesons

17-8 OBSERVED INTERACTIONS AND CONSERVATION LAWS
summary of strengths, field quantum properties, ranges, and signs of observed interactions; discussion of gravitational interaction; summary of quantities conserved in various interactions; conservation laws, invariance principles, and symmetries; charge conservation and gauge invariance; charge conjugation, CP violation, and time reversal; CPT theorem; decay of $K^{0}\bar{K}^{0}$ system, CP and time reversal violation

QUESTIONS

PROBLEMS

17-1 INTRODUCTION

This chapter begins with a qualitative, but rather complete, discussion of the nuclear forces that act between two nucleons. The subject is at the border between the fields of nuclear physics and elementary particle physics, and its study will lead us in a natural way into the study of all the elementary particles. Along the route we shall also obtain a comprehensive view of the basic properties of, and interrelations between, the fundamental interactions and conservation laws of nature.

The history of quantum physics can be viewed as a sequence of probings, with ever increasing resolution, into the microscopic structure of matter. The first step was the discovery that matter is composed of about 90 different atoms. At that time atoms were considered to be the elementary particles. (The word is from the Greek atomos = indivisible.) Then it was found that atoms are composed of nuclei and electrons. Later it was discovered that nuclei consist of neutrons and protons. At this stage there was a very satisfactory situation: all matter appeared to be composed of various combinations of a small number of elementary particles, namely the neutron, the proton, and the electron. But then it was found that there are also muons and π mesons. Their discovery was followed by the discovery of many other related mesons, and an even larger number of particles related to neutrons and protons themselves. The number of such particles became so large again that it was likely that they could be composed of various combinations of a small set of more elementary ones, as was the case for atoms. We will take up that even finer division of matter in the next chapter.
17-2 NUCLEON FORCES

In our study of nuclei we have obtained some information about the nuclear forces acting between nucleons, which we shall call nucleon forces. Since nuclei are studied in terms of models, and since models do not involve the detailed behavior of these forces, we have learned only about certain of their general features. These are:

1. Nucleon forces are strong. The energy associated with the force is larger than that associated with electromagnetism by about 2 orders of magnitude, larger than that associated with $\beta$ decay by about 14 orders of magnitude, and larger than that associated with gravitation by about 40 orders of magnitude. More complete discussions of the meaning of these comparisons will be given later.

2. Nucleon forces are short range. They cut off in a distance of about 2 F, so that two nucleons passing each other at a larger distance do not interact by the nucleon force.

3. Nucleon forces are attractive in their over-all effect. Otherwise nuclei would not exist since the nucleons would not bind together.

4. Nucleon forces are charge independent. That is, they make no distinction between protons and neutrons. Evidence for this is seen in the tendency of small-$Z$ nuclei to have $N = Z$, and in the similarities of the low-lying energy levels of pairs of mirror nuclei.

5. Nucleon forces saturate. The term describes the fact that a nucleon in a typical nucleus experiences attractive interactions only with a limited number of the many other nucleons. This must be true since otherwise the average binding energy per nucleon, $\Delta E/A$, would be proportional to $A$ instead of being approximately independent of $A$.

Most of the information about nucleon forces that can be obtained from the study of systems as complicated as a typical nucleus is listed above. More detailed information is obtained by studying simpler systems containing only two nucleons, where the nucleon forces have their most directly observable effects. The simplest of these systems is the ground state of the deuterium nucleus ${}^{1}\mathrm{H}^{2}$, or deuteron, consisting of a neutron and a proton bound together by the nucleon force. In this section we shall study this system, and other systems containing two unbound nucleons. To avoid complicated quantum mechanical calculations, we shall keep the discussion largely qualitative. But we shall, nevertheless, be able to see how the analyses of certain critical experiments have been used to determine the properties of nucleon forces. At the end of the section we summarize by presenting a quantitative description of the most important of these properties. In a subsequent section we consider the meson theory of the origin of nucleon forces.

The ground state of the deuteron is characterized by the following measured quantities:

Binding energy: $\Delta E = 2.22$ MeV
Nuclear spin: $i = 1$
Nuclear parity: even
Magnetic dipole moment: $\mu = +0.857\,\mu_n$
Electric quadrupole moment: $q = +2.7 \times 10^{-31}$ m$^2$
Charge distribution half-value radius: $a = 2.1$ F

The fact that the deuteron has an electric quadrupole moment $q$ means that its probability density function is not spherically symmetrical. This immediately tells us that the nucleon potential, which specifies the force acting between the two nucleons, is, itself, not spherically symmetrical. The point is that all spherically symmetrical potentials have $l = 0$ eigenfunctions for their ground states, and the probability density functions for such eigenfunctions are all spherically symmetrical (an example is the Coulomb potential and the spherically symmetrical ground state of a one-electron atom). But the observed departure from spherical symmetry is not large. A measure of the departure is the quantity $q/r'^2$ (see Figure 15-20), which has a value of about 6% if we take $r'$ equal to the charge distribution half-value radius $a$. Calculations show that the measured electric quadrupole moment is obtained if the ground state of the deuteron is a mixture in which 96% is an $l = 0$ state and 4% is an $l = 2$ state.

Such a mixed state will also have the measured even parity, since for both of its component states $l$ is even. Since the ground state nuclear spin is measured to be 1, both component states must have $j = 1$. The vector addition diagrams of Figure 17-1 illustrate the relations between the $l$ and $j$ quantum numbers in both states, and they show that, for both, the intrinsic spins of the proton and neutron are essentially parallel and the quantum number specifying the total intrinsic spin angular momentum is $s = 1$. In spectroscopic notation, the dominant state is $^3S_1$ and the less probable state is $^3D_1$. (The superscript gives the value of $2s + 1$; the letter gives the value of $l$, with $S$ meaning $l = 0$, $P$ meaning $l = 1$, $D$ meaning $l = 2$, etc.; the subscript gives the value of $j$.)

Figure 17-1 Vector addition diagrams showing the spin, orbital, and total angular momentum quantum numbers $s$, $l$, and $j$ in the two component states of the deuteron. In the dominant state, $l = 0$. Since $j = 1$ it is necessary that $s = 1$ in this state which, in spectroscopic notation, is designated $^3S_1$. In the less probable state, $l = 2$. Since $j = 1$, it is also necessary in this state that $s = 1$. The state is designated $^3D_1$.

Calculations also show that this mixture of states leads to the measured magnetic dipole moment $\mu = +0.857\,\mu_n$. The value differs by about 3% from what would be obtained if the deuteron were in a pure $^3S_1$ state, with the proton and neutron intrinsic spins essentially parallel and no orbital motion, since in that state $\mu$ would be just the sum of the proton and neutron magnetic dipole moments, $+2.7896\,\mu_n - 1.9103\,\mu_n = +0.8793\,\mu_n$.
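The two percentage figures quoted for the deuteron (the roughly 6% asymmetry measure $q/a^2$ and the roughly 3% shift in the magnetic dipole moment) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the constant names are invented here, and the exponent of the quadrupole moment is taken as $10^{-31}$ m$^2$, an assumption chosen because it reproduces the 6% figure stated in the text.

```python
# Consistency check of the deuteron numbers quoted in the text.
# All names are illustrative; moments are in units of the nuclear magneton.
MU_PROTON = 2.7896
MU_NEUTRON = -1.9103
MU_DEUTERON = 0.857           # measured value quoted in the text

Q_DEUTERON = 2.7e-31          # m^2 (exponent assumed, see lead-in)
HALF_VALUE_RADIUS = 2.1e-15   # m, i.e. 2.1 F

# Moment expected for a pure 3S1 state: spins parallel, no orbital motion.
mu_pure_s = MU_PROTON + MU_NEUTRON

# Fractional difference between the pure-S prediction and the measurement.
mu_shift = (mu_pure_s - MU_DEUTERON) / MU_DEUTERON

# Dimensionless measure of the departure from spherical symmetry.
asymmetry = Q_DEUTERON / HALF_VALUE_RADIUS**2

print(f"pure 3S1 moment : {mu_pure_s:.4f} nuclear magnetons")
print(f"relative shift  : {100 * mu_shift:.1f} %")
print(f"q / a^2         : {100 * asymmetry:.1f} %")
```

Both results come out at the few-percent level, which is the sense in which the text calls the admixture of the $l = 2$ state small.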
We conclude from all these considerations that the nucleon potential is not precisely spherically symmetrical, since it does not lead to a pure $S$ ground state for the deuteron. But since the amount of $D$ state it mixes in is small, the asymmetry of the potential must be small. For most purposes the asymmetry can be ignored. Thus we consider the deuteron as a system in which the nucleons are bound in a $^3S_1$ state of a spherically symmetrical nucleon potential $V(r)$, where $r$ is the distance between their centers. This potential specifies the force acting between the two nucleons. Some information about it is obtained by demanding that the energy of its ground state yield a binding energy equal to the measured value $\Delta E = 2.22$ MeV. Additional information is obtained by demanding also that the ground state eigenfunction yield a charge distribution half-value radius equal to the measured value $a = 2.1$ F. These two pieces of data are not enough to determine the form of the nucleon potential, i.e., the radial dependence of the function $V(r)$. However, if $V(r)$ is assumed for simplicity to have the form of a square well as in Figure 17-2, then the radius $r'$ and depth $V_0$ are determined to be about 2 F and 40 MeV. Precise numbers will be quoted later after we have introduced additional experimental information that does determine something about the form of the potential. It can also be determined that a potential which fits the measured values of both $\Delta E$ and $a$ has the property that its ground state is its only bound state, as indicated by the single bound energy level in Figure 17-2. This agrees with the fact that the deuteron is observed to have no bound excited states.

Now the spins of the proton and neutron are essentially parallel in a $^3S_1$ bound state of the deuteron. We know that there are no bound deuterons with nucleon spins essentially antiparallel, i.e., in a $^1S_0$ state, since none is ever found with the nuclear spin 0 that would be obtained in such a state.
What is the reason for the absence of a bound $^1S_0$ state? An explanation is that the nucleon potential is spin dependent, being appreciably weaker when two nucleons interact with essentially antiparallel spins (in a singlet state). If the potential is sufficiently weak to prevent the nucleons from binding stably together, the absence of the $^1S_0$ bound state is explained. (A one-dimensional potential has at least one bound state, no matter how weak the potential, because the eigenfunction can extend very far into the classically excluded regions on both sides of the binding region. But due to the different geometry of the eigenfunction, a three-dimensional potential can only have a bound state if it is sufficiently strong. This can be seen by inspecting the form $rR(r)$ for the lowest $S$ state of a three-dimensional square well, displayed in Figure 15-17. Since $rR(r) = 0$ at $r = 0$, that function must have enough curvature within the binding region to allow it to match on to a decreasing exponential in the excluded region. This, in turn, requires that for a given breadth the binding region be sufficiently deep.)

Figure 17-2 A square well potential of radius $r'$ and depth $V_0$, and its ground state eigenvalue of binding energy $\Delta E$. For the deuteron this state is the only bound state of the potential.

Figure 17-3 Measured values of the total cross section $\sigma$ for the scattering of neutrons by protons as a function of the energy of the incident neutron. (Horizontal axis: energy in eV, from 1 to $10^6$.)

Additional qualitative evidence in support of the idea of spin dependence of the nucleon potential is found in the absence of a bound state for a system of two protons or, particularly, a system of two neutrons. In both systems the exclusion principle would require it to be a $^1S_0$ state, where the spins of the two identical nucleons are essentially antiparallel. In this state the potential is, presumably, too weak to lead to binding.
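The statement that a three-dimensional well must be sufficiently deep to bind, and the square well depth of about 40 MeV quoted for $r' = 2$ F, can both be illustrated numerically. The following is a minimal sketch under stated assumptions, not the calculation used by the authors: it treats a plain attractive square well with no further structure, uses the reduced mass of two nucleons of average rest energy 938.9 MeV, and solves the standard $l = 0$ matching condition $k \cot kr' = -\kappa$ by bisection. The function and constant names are invented for this example.

```python
import math

HBAR_C = 197.327            # MeV*F
NUCLEON_REST = 938.9        # MeV, average of proton and neutron (assumed)
REDUCED = NUCLEON_REST / 2  # reduced mass of the two-nucleon system, MeV


def threshold_depth(radius):
    """Smallest depth (MeV) at which a square well of this radius (F)
    holds an l = 0 bound state: the interior phase k*r' must reach pi/2."""
    return (math.pi * HBAR_C) ** 2 / (8 * REDUCED * radius**2)


def well_depth(radius, binding):
    """Depth (MeV) of a square well of this radius (F) whose ground
    state has the given binding energy (MeV).

    Interior solution sin(k r), exterior exp(-kappa r); matching the
    logarithmic derivative at r' gives x*cos(x) + kappa*r'*sin(x) = 0
    with x = k r' lying between pi/2 and pi.
    """
    kappa_r = math.sqrt(2 * REDUCED * binding) * radius / HBAR_C
    lo, hi = math.pi / 2, math.pi        # function is > 0 at lo, < 0 at hi
    for _ in range(60):                  # bisection
        mid = 0.5 * (lo + hi)
        if mid * math.cos(mid) + kappa_r * math.sin(mid) > 0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return binding + (x * HBAR_C / radius) ** 2 / (2 * REDUCED)


v_min = threshold_depth(2.0)
v_deuteron = well_depth(2.0, 2.22)
print(f"minimum depth for binding : {v_min:.1f} MeV")
print(f"depth giving 2.22 MeV     : {v_deuteron:.1f} MeV")
```

Under these assumptions the depth that reproduces the deuteron binding energy lies only modestly above the minimum depth needed for any bound state, which is consistent with the remark that a somewhat weaker potential would fail to bind at all.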
Quantitative evidence for the spin dependence of the nucleon potential is obtained from the analysis of the scattering of unbound neutrons from protons. The total cross section for scattering, $\sigma$, which is proportional to the total probability that a neutron is scattered by a proton, is shown in Figure 17-3. This cross section is made up of a fixed mixture of neutron-proton interactions in the $^1S_0$ and $^3S_1$ states. If the orientations of the spins of the neutrons in the incident beam and the protons in the scattering target are random, then the four possible spin states of the two-nucleon system will be equally probable. There are three $^3S_1$ states, the triplet states in which the nucleon spins are essentially parallel, and the total spin of the two-nucleon system can have three different $z$ components: $-\hbar$, $0$, $+\hbar$. One time out of four the nucleons will interact in the $^1S_0$ state, the singlet state in which the nucleon spins are essentially antiparallel, and the total spin can have only a single $z$ component equal to 0. Because of the fixed 3:1 ratio of the $^3S_1$ and $^1S_0$ interactions, the relative strengths of each cannot be determined from the total cross section.

To separate the contributions of $^3S_1$ and $^1S_0$ scattering, very low-energy neutrons (much lower than shown in Figure 17-3) are scattered from ortho- and parahydrogen. An orthohydrogen molecule has total proton spin of 1, whereas a parahydrogen molecule has total proton spin of 0. The slow neutron has a de Broglie wavelength which is much larger than the distance between the protons in the H$_2$ molecule, so that in one interaction the scattering of the neutron from the two protons is coherent and the amplitudes add. Since the scatterings from ortho- and parahydrogen have different mixtures of $^3S_1$ and $^1S_0$ interactions, the strengths of the two spin states can be separated by comparing the two measurements. These data show that the singlet state potential is about 40% weaker than the triplet state potential.
That is, if both are square wells of the same radius, the depth of the potential is about 40% less in the singlet state. Hence we conclude that the nucleon potential really does depend on the relative orientation of the spins of the two interacting nucleons.

This quantitative information about the spin dependence is confirmed by analyzing the scattering of low-energy protons from protons. And that analysis also provides additional evidence that the nucleon potential is charge independent; i.e., it makes no distinction between protons and neutrons. The evidence is that a nucleon potential which agrees with the measured neutron-proton scattering cross section also agrees with the measured proton-proton scattering cross section. This does not mean that the cross sections are the same. In proton-proton scattering, the Coulomb potential, which is present in addition to the nucleon potential, affects the small angle scatterings, and the exclusion principle affects all the scattering by suppressing certain quantum states.

The scattering of a low-energy nucleon from a nucleon does not give information about the form of the nucleon potential. As measured in a frame of reference in which the center of mass of the system is stationary, the scattering is independent of angle, or isotropic. Thus the differential cross section for scattering, $d\sigma/d\Omega$, which is proportional to the probability for scattering at various angles, is the same at all angles in this reference frame. The constant differential cross section provides only one piece of experimental data: the measured value of $d\sigma/d\Omega$. This single measured quantity can be used to determine only a single theoretical quantity. The quantity determined is the strength of the potential. (This is $V_0 r'^2$ for a square well potential.)
The reason why the scattering is isotropic in the so-called center-of-mass frame of reference is that at low energies the de Broglie wavelength $\lambda$ of the wave, which describes the nucleon scattering, is very large compared to the radius $r'$ of the potential, which describes the forces which produce the scattering. If $\lambda \gg r'$, then the separation in the scattering angle between adjacent minima in the diffraction pattern is, according to (15-4), $\theta \simeq \lambda/r' \gg 1$. Since the entire range of scattering angle is only $\pi$, the inequality is essentially telling us that there are no minima. In other words, the potential looks to the wave like a point, which can only scatter it isotropically. But if the energy of the scattered nucleon is high enough for $\lambda$ to be smaller than $r'$, then $\theta < 1$. The scattering pattern has structure in these circumstances, and $d\sigma/d\Omega$ contains information about the form of the potential that causes the scattering. Thus, only high-energy nucleons have enough resolving power to be effective as probes in studying the form of the nucleon potential. We shall show in Example 17-2 that if the radius of the potential is taken as 2 F, the differential cross section for scattering, $d\sigma/d\Omega$, can be expected to depart from isotropy when the kinetic energy of the incident nucleon exceeds about 40 MeV.

The first high-energy neutron-proton scattering experiments were performed at an incident neutron kinetic energy of 90 MeV. It was expected that they would provide information about the radial dependence of the nucleon potential, but, as we shall see, they actually taught us about a different aspect of the form of the nucleon potential. It was also expected that the differential cross section for scattering, $d\sigma/d\Omega$, would have the shape of a rudimentary diffraction pattern, with $d\sigma/d\Omega$ generally increasing for decreasing scattering angle.
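The comparison between the de Broglie wavelength and the radius of the potential can be made concrete with a short calculation. The sketch below is illustrative only: it assumes nonrelativistic kinematics, an average nucleon rest energy of 938.9 MeV, and the 2 F radius used in the text; the function name is invented for the example.

```python
import math

HBAR_C = 197.327       # MeV*F
NUCLEON_REST = 938.9   # MeV, average nucleon rest energy (assumed)
RADIUS = 2.0           # F, radius r' of the nucleon potential


def de_broglie_wavelength(kinetic_energy):
    """Nonrelativistic wavelength h/p (in F) of a nucleon with the
    given kinetic energy (in MeV)."""
    momentum_c = math.sqrt(2 * NUCLEON_REST * kinetic_energy)  # p*c in MeV
    return 2 * math.pi * HBAR_C / momentum_c


# Center-of-mass kinetic energy per nucleon, in MeV.
for energy in (0.1, 1.0, 10.0, 100.0):
    ratio = de_broglie_wavelength(energy) / RADIUS
    print(f"K = {energy:6.1f} MeV   lambda / r' = {ratio:6.2f}")
```

The ratio $\lambda/r'$ falls as the square root of the kinetic energy, so only at tens of MeV and above does the wavelength become comparable to the size of the potential.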
The reason why it was thought there would be a preference for scattering at small angles into forward directions is indicated in Figure 17-4. If the depth of the nucleon potential $V(r)$ is significantly smaller than the kinetic energy of the incident neutron, the maximum momentum that the potential can transfer to the neutron has a magnitude which is significantly smaller than the magnitude of its initial momentum. (This can be seen from the following order-of-magnitude calculation, which uses the impulse-momentum and work-potential energy relations:

$$\frac{\Delta p}{p} \sim \frac{F\,\Delta t}{p} \sim \frac{F(r'/v)}{mv} \sim \frac{V_0}{mv^2} \sim \frac{V_0}{K}$$

Here $p$, $m$, $v$, and $K$ stand for the neutron's momentum, mass, speed, and kinetic energy; $F$ is the force exerted on it for time $\Delta t$ as it passes through the nucleon potential of width $r'$ and depth $V_0$.) In these circumstances, a large change in the direction of the neutron momentum would not be possible.

Figure 17-4 Illustrating why the scattering angle should be small if a nucleon is scattered by a potential that can transfer to the nucleon only a momentum of magnitude small compared to the magnitude of its initial momentum. This is the situation that would be expected if the kinetic energy of the nucleon is large compared to the depth of the potential.

Figure 17-5 shows the measured $d\sigma/d\Omega$ for 90 MeV neutron-proton scattering. Following convention, these results are expressed in a frame of reference in which the center of mass of the neutron-proton system is stationary. The top part of Figure 17-6 indicates that in this center-of-mass frame of reference the argument we have just gone through leads to the expectation of a preference for small scattering angles. But the measurements show that $d\sigma/d\Omega$ for neutron-proton scattering is approximately symmetric about a scattering angle of $\theta_{n,\mathrm{CM}} \simeq 90^\circ$. Thus there is an equally pronounced preference for large scattering angles.

Figure 17-5 Measured values of the differential cross section $d\sigma/d\Omega$ for scattering of neutrons of incident energy 90 MeV by protons. The data are actually obtained in a frame of reference where the target proton is initially stationary. Here they have been transformed to a frame of reference in which the center of mass of the system is stationary. The quantity $\theta_{n,\mathrm{CM}}$ is the neutron scattering angle in that system.

Figure 17-6 Top: Neutron-proton scattering as seen in a frame of reference in which the center of mass of the system is stationary. If the kinetic energies of the nucleons are large compared to the depth of the nucleon potential, the momentum transfers are small and the neutron and proton scattering angles are small as well. Bottom: The same, for a scattering in which the neutron changes into a proton and vice versa when they interact. Although the momentum transfers are still small, because of the exchange the scattering angles are large.

The bottom part of Figure 17-6 represents the physical interpretation of the origin of the observed preference for large scattering angles. In approximately half the scatterings, the neutron changes into a proton and the proton changes into a neutron, when the two nucleons are very close. Although the momentum transfer in every scattering is small, when the exchange occurs it has the effect of producing a large angle scattering. In a later section we shall see that a neutron can change into a proton by emitting a charged meson, and a proton can change into a neutron by absorbing that meson.

A more formal interpretation of the results of the neutron-proton scattering experiments is that the nucleon potential $V$ that produces the scattering has a form which can be written approximately as

$$V \simeq \frac{V(r) + V(r)P}{2} \tag{17-1}$$

where $P$ is an exchange operator that changes a proton into a neutron and a neutron into a proton, and $V(r)$ is the ordinary nucleon potential we have previously discussed. Now the nucleon potential $V$ enters expressions for the scattering cross section through the matrix element

$$\int \psi_f^* V \psi_i \, d\tau$$

where $\psi_i$ is the eigenfunction for the initial neutron-proton system (before scattering), and $\psi_f^*$ is the complex conjugate of the eigenfunction for the final neutron-proton system (after scattering). Thus it is of interest to consider the quantity

$$\frac{V(r) + V(r)P}{2}\,\psi_i = \frac{V(r)}{2}\,\psi_i + \frac{V(r)}{2}\,P\psi_i$$

We write this as

$$V\psi_l \simeq \frac{V(r)}{2}\,\psi_l + \frac{V(r)}{2}\,P\psi_l \tag{17-2}$$

using the quantum number $l$ to label the orbital angular momentum of the initial system. Since an exchange of the equal mass neutron and proton is equivalent to an exchange of the signs of the coordinates specifying their locations relative to an origin at their center of mass halfway between them, the exchange operation is equivalent in these particular circumstances to the parity operation. Therefore the usual relation between the orbital angular momentum quantum number and parity, (8-47), is applicable, and tells us that

$$P\psi_l = (-1)^l\,\psi_l$$

That is, the parity of an eigenfunction of a spherically symmetrical potential, $\psi_l$, is even if $l$ is even and odd if $l$ is odd. Thus the parity (or exchange) operator leaves the eigenfunction unchanged in the second term on the right side of (17-2) if $l$ is even, and multiplies it by minus one if $l$ is odd.
So we have

$$\frac{V(r)}{2}\,\psi_l + \frac{V(r)}{2}\,P\psi_l = \frac{1 + (-1)^l}{2}\,V(r)\,\psi_l$$

From this result we can see that the nucleon potential may be written approximately, without using the exchange operator, in a form called the Serber potential

$$V \simeq \frac{1 + (-1)^l}{2}\,V(r) \tag{17-3}$$

Note that $V \simeq 0$ if $l$ is odd. We conclude that the nucleon potential depends strongly on the orbital angular momentum of the two interacting nucleons, relative to their center of mass. The potential is approximately zero when the orbital angular momentum quantum number $l$ has an odd value. (Later we shall see that $V \simeq 0$ for an odd $l$ only if its effect is averaged over all the quantum states for that value of $l$, as is the case in most situations.)

Figure 17-7 Two nucleons, each with linear momentum of magnitude $p$, passing each other at a distance $r'$. Each has an orbital angular momentum $pr'/2$ in magnitude relative to the center of mass. The magnitude of the orbital angular momentum of the two nucleon system is $L = pr'$.

A classical argument, illustrated in Figure 17-7 in the center-of-mass frame of reference, shows that there is a relation between the maximum possible value of the orbital angular momentum $L$ for a system of two interacting nucleons and their linear momenta $p$. The relation is $L \simeq pr'$, where $r'$ is the maximum separation at which the nucleons can interact, which is the range of the nucleon force or the radius of the nucleon potential. Since $L$ is related to the quantum number $l$ by the equation $L = \sqrt{l(l+1)}\,\hbar$, it is easy to estimate, for an assumed value of $r'$, the maximum possible value $l_{\max}$ of the quantum number in terms of the momenta or kinetic energies of the nucleons.

Example 17-1. Two nucleons interact with nucleon force of range $r' = 2.0$ F, in a state in which the angular momentum quantum number assumes its maximum possible value. If this value is $l_{\max} = 1$, what must be the kinetic energy of each nucleon in the center-of-mass frame of reference? The total kinetic energy in that frame of reference? The kinetic energy of the incident nucleon (in a beam) in a frame of reference where the nucleon with which it interacts is initially stationary (in a target)?

We have

$$L = \sqrt{l(l+1)}\,\hbar$$

with $l = l_{\max} = 1$. So

$$L = \sqrt{1(1+1)}\,\hbar = \sqrt{2}\,\hbar$$

Also $L \simeq pr'$, or

$$p \simeq \frac{L}{r'} = \frac{\sqrt{2}\,\hbar}{r'}$$

Thus the kinetic energy of each nucleon in the center-of-mass (CM) frame is

$$K = \frac{p^2}{2M} \simeq \frac{2\hbar^2}{2Mr'^2} = \frac{(1.05 \times 10^{-34}\ \text{joule-sec})^2}{1.7 \times 10^{-27}\ \text{kg} \times (2.0 \times 10^{-15}\ \text{m})^2} = 1.6 \times 10^{-12}\ \text{joule} \simeq 10\ \text{MeV}$$

The total kinetic energy in that frame of reference is just

$$K_{\text{total CM}} = 2K \simeq 20\ \text{MeV}$$

It is easy to show that, because the two interacting particles have the same mass, the kinetic energy of the moving one, in a frame of reference in which the other one is initially stationary, is twice the total kinetic energy in the center-of-mass frame of reference. Thus the kinetic energy of the incident nucleon is

$$K_{\text{incident}} = 2K_{\text{total CM}} \simeq 40\ \text{MeV}$$

Example 17-2. Show that the condition $l_{\max} = 0$ is equivalent to the condition $\theta \simeq \lambda/r' \gg 1$, which requires the differential scattering cross section $d\sigma/d\Omega$ to be isotropic.

Referring to the calculation in Example 17-1, note that if the kinetic energy $K$ of each nucleon in the center-of-mass frame is less than about 10 MeV, then each will have a momentum $p$ which is

$$p < \frac{\sqrt{2}\,\hbar}{r'} = \frac{\sqrt{2}\,h}{2\pi r'}$$

or

$$\frac{h}{pr'} > \sqrt{2}\,\pi$$

Using the de Broglie relation to evaluate $\lambda$, the nucleons' wavelength, from their momenta $p$, we obtain

$$\frac{\lambda}{r'} > \sqrt{2}\,\pi$$

or

$$\frac{\lambda}{r'} \gg 1$$

According to (15-4), or Appendix L, the separation between adjacent minima in the scattering pattern is $\theta \simeq \lambda/r'$, so we have

$$\theta \gg 1$$

As we mentioned several pages ago, this inequality means that there are no minima, and the differential scattering cross section $d\sigma/d\Omega$ is isotropic. But we saw in Example 17-1 that $K \simeq 10$ MeV is the condition for having $l_{\max} = 1$ (assuming the range of nucleon forces is $r' = 2$ F). So for $K < 10$ MeV, we can have only $l_{\max} = 0$.
Thus we have shown that $l_{\max} = 0$ is equivalent to $\theta \simeq \lambda/r' \gg 1$. We concluded in Example 17-1 that when the kinetic energy of each nucleon in the center-of-mass frame is about 10 MeV the kinetic energy of the incident nucleon, in the frame in which the target nucleon is initially at rest, has a value of about 40 MeV. So we can also conclude that $d\sigma/d\Omega$ can be expected to depart from isotropy only when the kinetic energy of the incident nucleon equals, or exceeds, about 40 MeV.

Example 17-1 shows that, for a nucleon potential of radius $r' = 2$ F, we have $l_{\max} = 0$ unless the kinetic energy of each nucleon of an interacting pair exceeds about 10 MeV in the center-of-mass frame of reference. Similar calculations show that $l_{\max} = 1$ unless these energies exceed about 30 MeV, and $l_{\max} = 2$ unless they exceed about 60 MeV. (All these figures are only approximations since they are obtained from a semiclassical argument.) Now, if we consider a pair of nucleons in a nucleus, their kinetic energies in a frame of reference fixed to their center of mass generally do not exceed 30 MeV. Thus they can usually interact with each other only in $l = 0$ and $l = 1$ states. But the Serber potential, (17-3), is approximately zero for $l = 1$. So the nucleons in a nucleus actually interact strongly with each other in only half of the quantum states that angular momentum considerations (and exclusion principle considerations if they are of the same species) would otherwise allow to contribute to the total interactions.

This property of the nucleon potential helps make nucleon forces saturate by suppressing the attractive nucleon forces in half of the interactions; but it is not enough. To obtain saturation, a feature that we indicated at the beginning of this section is responsible for one of the most basic properties of nuclei, it is necessary that some of the nucleon forces be repulsive. That is, there must be a repulsive part in the nucleon potential.
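The semiclassical thresholds of about 10, 30, and 60 MeV quoted above follow directly from $L \simeq pr'$ and $L = \sqrt{l(l+1)}\,\hbar$. The sketch below reproduces them and also evaluates the Serber factor of (17-3). It is a hedged illustration: the function names are invented here, the nucleon rest energy is taken as an average 938.9 MeV, and the kinematics are nonrelativistic, with the incident energy in the target frame taken as four times the center-of-mass energy per nucleon as in Example 17-1.

```python
HBAR_C = 197.327       # MeV*F
NUCLEON_REST = 938.9   # MeV, average nucleon rest energy (assumed)


def threshold_energy(l, radius=2.0):
    """Center-of-mass kinetic energy per nucleon (MeV) at which
    p*r' equals sqrt(l(l+1))*hbar for a potential of the given radius (F)."""
    return l * (l + 1) * HBAR_C**2 / (2 * NUCLEON_REST * radius**2)


def l_max(kinetic_energy_cm, radius=2.0):
    """Largest l whose threshold does not exceed the given
    center-of-mass kinetic energy per nucleon (MeV)."""
    l = 0
    while threshold_energy(l + 1, radius) <= kinetic_energy_cm:
        l += 1
    return l


def serber_factor(l):
    """Factor [1 + (-1)^l] / 2 multiplying V(r) in the Serber potential."""
    return (1 + (-1) ** l) / 2


for l in (1, 2, 3):
    print(f"l = {l}: threshold about {threshold_energy(l):.0f} MeV per nucleon (CM)")

incident = 330.0                      # MeV, target frame
per_nucleon_cm = incident / 4.0       # equal masses, nonrelativistic
top = l_max(per_nucleon_cm)
active = [l for l in range(top + 1) if serber_factor(l) != 0]
print(f"incident {incident:.0f} MeV: l_max = {top}, Serber-active l = {active}")
```

Because the argument is semiclassical, these values are order-of-magnitude guides rather than sharp cutoffs, as the text itself cautions.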
The study of proton-proton scattering at high energies showed that the radial dependence of the nucleon potential is such that it has a repulsive region in its center. Figure 17-8 gives the measured center-of-mass reference frame differential cross section, $d\sigma/d\Omega$, for scattering of incident protons of kinetic energy 330 MeV from a target of protons. Only scattering angles from 0° to 90° are plotted. The symmetry of the two proton system demands that $d\sigma/d\Omega$ be symmetric about 90°, no matter what the form of the nucleon potential, because if one proton is scattered at the angle $\theta$ the other one must be scattered at the angle $180^\circ - \theta$. At angles smaller than about 10°, $d\sigma/d\Omega$ has the very rapid angular dependence of Coulomb scattering. In this angular range the distance of closest approach in the scatterings is greater than the range of nucleon forces. At larger angles, the scatterings involve close collisions in which nucleon forces dominate, and $d\sigma/d\Omega$ for proton-proton scattering is found to be essentially isotropic.

Figure 17-8 Measured values of the center-of-mass differential cross section $d\sigma/d\Omega$ for proton-proton scattering. The energy of the incident protons is 330 MeV.

The surprising isotropy of high-energy proton-proton scattering was shown by Jastrow to imply that there is a strong repulsive core in the nucleon potential. That is, the potential has a radial dependence something like that indicated in Figure 17-9. It is not difficult to understand qualitatively the essential points in Jastrow's argument. At an incident kinetic energy of 330 MeV the kinetic energy of each of the protons in their center-of-mass frame is 82 MeV, and $l_{\max} = 3$. Thus the two protons in the scattering can interact only in states of orbital angular momentum given by $l = 0, 1, 2, 3$. But since the Serber potential is approximately zero for $l = 1$ and 3, significant interactions can occur only in $l = 0$ and 2 states.
If only the $l = 0$ state were involved, $d\sigma/d\Omega$ would indeed be isotropic because the scattering would be the same as if we had $l_{\max} = 0$, which means $\theta \simeq \lambda/r' \gg 1$. However, in this case the magnitude of $d\sigma/d\Omega$ could be only about half as large as the magnitude actually observed. In fact, the isotropy of $d\sigma/d\Omega$ is a result of a destructive interference between waves scattered in an $l = 0$ state interaction and waves scattered in an $l = 2$ state interaction. The interference suppresses the tendency, discussed above, for $d\sigma/d\Omega$ to be large at small angles.

Figure 17-9 A nucleon potential with an infinitely strong repulsive core inside an attractive square well.

Figure 17-10 The effect of a repulsive core potential on the radial dependence of the radial coordinate, $r$, times the radial part of the eigenfunction, $R(r)$, for the $l = 0$ state eigenfunction for high-energy proton-proton scattering. The solid curve shows $rR(r)$ in the presence of the potential and, for comparison, the dashed curve shows what it would be like in the absence of the potential. Because the energy of the incident proton is large compared to the depth of the attractive region of the potential, the effect of the repulsive core dominates and $rR(r)$ is pushed out.

Figure 17-10 indicates how a potential with a repulsive core, of height which is very much larger than the kinetic energy of the incident proton, affects the $l = 0$ state eigenfunction. The repulsive region "pushes out" the eigenfunction as at the edge of an infinite well, and the attractive region "pulls in" the eigenfunction because it increases the curvature. If the incident proton energy is large compared to the depth of the attractive region, the effect of this region is small and the net result is that the $l = 0$ state eigenfunction is pushed out. Figure 17-11 shows what the potential does to the $l = 2$ state eigenfunction.
Since for small $r$ all these eigenfunctions have the $r^l$ behavior given by (7-32), the $l = 2$ eigenfunction has such a small value throughout the repulsive region near $r = 0$ that the repulsive region can have practically no effect on it. This eigenfunction is very small for small $r$ whether or not the repulsive region is present. Consequently, the attractive region is the only one that has much effect on the $l = 2$ state eigenfunction, and so the eigenfunction is pulled in by the potential. The destructive interference leading to the isotropic $d\sigma/d\Omega$ is due to the $l = 0$ state eigenfunction being pushed out while the $l = 2$ state eigenfunction is pulled in. If the nucleon potential were purely attractive, both eigenfunctions could only be pulled in.

Figure 17-11 The effect of a repulsive core potential on $rR(r)$ for the $l = 2$ state eigenfunction for high-energy proton-proton scattering. The solid curve shows $rR(r)$ in the presence of the potential, and the dashed curve shows what it would be like in the absence of the potential. Since $rR(r)$ is negligibly small at the core radius even in the absence of the potential because $R(r) \propto r^l$, the effect of the repulsive core is negligible. Thus the attractive region dominates and $rR(r)$ is pulled in.

Experiments on the scattering of high-energy electrons from deuterons provide completely independent evidence of the existence of a strong repulsive core in the nucleon potential. The experiments show that there is a hole in the center of the deuteron charge distribution. This means that the proton avoids the center of the deuteron, presumably because of the very strong repulsion it feels if it tries to get too close to the neutron. Analysis of both the electron-deuteron and proton-proton scattering experiments indicates that the radius of the repulsive core is about 0.5 F.

The repulsive core in the nucleon potential is the most important factor responsible for the saturation of nucleon forces. In a nucleus, the cores in the nucleon potentials add large positive contributions to the total energy if the nucleons are too closely packed. This is why the nucleons maintain an average center-to-center spacing, given by the measured nucleon mass density, of about 1.2 F. At this spacing, any one nucleon can interact only with a limited number of other nucleons, since the range of nucleon forces is about 2 F, and so the nucleon forces saturate. If there were no repulsive region in the nucleon potentials, the attractive regions would cause the nucleus to collapse until its linear dimensions were about equal to the range of nucleon forces. Then each nucleon would interact with all the other nucleons, and the binding energy per nucleon, $\Delta E/A$, would be approximately proportional to $A$.

We found that the nucleon potential depends on the quantum number $s$ specifying the spin angular momentum of a system of two nucleons (i.e., whether they are in a singlet or triplet state), and that it also depends on the quantum number $l$ specifying the orbital angular momentum of the system. Certain experiments show that the potential even depends on the quantum number $j$ specifying the total angular momentum of the system. Another way of saying this is that the potential depends not only on the spin angular momentum $\mathbf{S}$ and on the orbital angular momentum $\mathbf{L}$, but also on their dot product $\mathbf{S} \cdot \mathbf{L}$, which determines the magnitude of the total angular momentum $\mathbf{J}$. Thus the nucleon potential contains a spin-orbit term, proportional to $\mathbf{S} \cdot \mathbf{L}$. The term makes the nucleon potential more attractive if $\mathbf{S} \cdot \mathbf{L}$ is positive, and more repulsive if it is negative, just as is the case for the spin-orbit term of the shell model nuclear potential. The experiments referred to basically involve scattering a beam of nucleons with aligned spins from a target of nucleons with aligned spins. This allows the interactions in different quantum states, with different spin, orbital, and total angular momenta, to be investigated separately.

The spin-orbit term in the nuclear potential, which plays such an important role in the shell model, has its origin in the spin-orbit term of the nucleon potential. To understand what happens, first focus interest on a nucleon moving through the interior of a nucleus. Every time it passes near another nucleon it experiences a spin-orbit interaction. When the nucleon it passes is on a particular side of its trajectory, the orbital angular momentum of the two interacting nucleons about their center of mass will have a particular orientation. When the nucleon of interest passes near another nucleon on the opposite side of its trajectory, this orbital angular momentum will have the opposite orientation. Since on the average it will pass an equal number of nucleons on each side of its trajectory, because it is in the interior of the nucleus, there will be a cancellation and it will experience no net spin-orbit interaction. However, if the nucleon of interest is moving near the surface of the nucleus, then most of the nucleons it passes will be on the same side of its trajectory, and so most of the time the orbital angular momentum of the two interacting nucleons will have the same orientation. The individual spin-orbit interactions will therefore combine to produce a net spin-orbit interaction on the nucleon of interest. The sign of this spin-orbit interaction is evidently the same as that of the individual spin-orbit interactions, in accord with the sign required in the shell model. And calculation shows that its magnitude is in reasonable agreement with that used in the shell model.

We conclude this section by summarizing what is known about nucleon forces. Certainly the first thing to say is that they are very complicated.
When a nucleon of, say, 200 MeV kinetic energy interacts with another nucleon, the system can be in any one of the following quantum states: $^1S_0$, $^3S_1$, $^1P_1$, $^3P_0$, $^3P_1$, $^3P_2$, $^1D_2$, $^3D_1$, $^3D_2$, $^3D_3$. The nucleon potential is different in each of these states, and in each, its form involves a fairly complicated radial dependence, as well as departures from spherical symmetry. The only simplifications are:

1. The nucleon potential is charge independent, so it does not depend on the species of the interacting nucleons.

2. The exclusion principle prohibits interaction in certain quantum states between nucleons of the same species. In particular, the $^3S_1$, $^1P_1$, $^3D_1$, $^3D_2$, $^3D_3$ states are excluded from the list just quoted in the neutron-neutron or proton-proton interactions. The reason is that if the space eigenfunction for a system of two identical nucleons is symmetric in a label exchange (even $l$), then the spin eigenfunction must be antisymmetric in such an exchange (singlet); and if the space eigenfunction is antisymmetric (odd $l$), the spin eigenfunction must be symmetric (triplet).

3. The net effect of all the $P$ state interactions is very small. But the aligned spin experiments show this is partly due to destructive interferences in the interactions from the different $P$ states, and that the interactions in individual $P$ states are not so small.

If we are content to describe approximately only their most important properties, however, nucleon forces are not too complicated. Figures 17-12 and 17-13 give quantitatively the radial dependences of nucleon potentials for even-$l$ quantum states. The first figure shows the potential for singlet states (nucleon spins essentially antiparallel), and the second shows the stronger potential for triplet states (nucleon spins essentially parallel).
With these two potentials, and zero potential for all quantum states with odd $l$, results are obtained in reasonable agreement with all the properties of the deuteron (except its electric quadrupole moment) and all the nucleon scattering data up to several hundred MeV (except the aligned spin data). Figure 17-13 shows also the eigenvalue and the radial dependence of the eigenfunction for the only bound state of the triplet potential, i.e., the deuteron. Note that the attractive region is just barely strong enough to overcome the effect of the repulsive core and lead to binding. As a consequence, there is a high probability that the two nucleons in the deuteron have a separation larger than the range of nucleon forces.

Figure 17-12 The radial dependence of a singlet even-$l$ nucleon potential in reasonable agreement with experiment. (Axes: $V(r)$ versus $r$ in F.)

Figure 17-13 The radial dependence of a triplet even-$l$ nucleon potential in reasonable agreement with experiment. Also shown are the eigenvalue and the quantity $rR(r)$ for the eigenfunction of the single bound state of the potential at $-2.22$ MeV. This state, which is the deuteron, is just barely bound and $rR(r)$ just barely reaches a maximum inside the attractive region (compare with Figure 17-10). The square of $rR(r)$ is $r^2R^*(r)R(r)$, which is proportional to the radial probability density that specifies the probability of finding the two nucleons in the deuteron with a separation in the vicinity of $r$. (Axes: $V(r)$ and $rR(r)$ versus $r$ in F.)

Of course, the nucleon potentials in nature cannot have the abrupt radial dependence of the simplified potentials displayed in Figures 17-12 and 17-13. In a subsequent section we shall see that meson theory predicts something about the behavior of the potentials for relatively large radii, and that it shows that the onset of the attractive region should actually be fairly gradual.
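The exclusion-principle rule in simplification 2 of the summary above can be checked by brute-force enumeration. The following is a minimal Python sketch, not part of the original text: the function names are hypothetical, and the cutoff at $l = 2$ is chosen only to match the ten states quoted for a 200 MeV nucleon.

```python
# Sketch: list the two-nucleon states (2s+1)L_j for l <= 2 and apply the
# label-exchange rule to nucleons of the same species (n-n or p-p).
# All names here are illustrative; none come from the text.

L_LETTERS = "SPD"  # spectroscopic letters for l = 0, 1, 2


def two_nucleon_states(l_max=2):
    """Return (label, l, s) for every state allowed by angular momentum addition."""
    states = []
    for l in range(l_max + 1):
        for s in (0, 1):  # singlet (s = 0) and triplet (s = 1)
            for j in range(abs(l - s), l + s + 1):
                states.append((f"{2 * s + 1}{L_LETTERS[l]}{j}", l, s))
    return states


def allowed_for_identical_nucleons(l, s):
    """Even l pairs with the singlet and odd l with the triplet, so l + s is even."""
    return (l + s) % 2 == 0


all_states = two_nucleon_states()
allowed = [lab for lab, l, s in all_states if allowed_for_identical_nucleons(l, s)]
excluded = [lab for lab, l, s in all_states if not allowed_for_identical_nucleons(l, s)]

print("all states:", ", ".join(lab for lab, _, _ in all_states))
print("allowed for n-n or p-p:", ", ".join(allowed))
print("excluded for n-n or p-p:", ", ".join(excluded))
```

The excluded list produced this way is 3S1, 1P1, 3D1, 3D2, 3D3, the same five states named in the summary.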
17-3 ISOSPIN

Figure 17-14 shows schematically the lowest energy levels for the three possible two-nucleon systems: the dineutron ${}_0n^2$; the deuteron ${}_1\mathrm{H}^2$; and the diproton ${}_2\mathrm{He}^2$. The exclusion principle allows only the deuteron to have a triplet spin level, labeled $s = 1$, and because of the spin dependence of the nucleon force only this level is at a low enough energy to be bound. But all three systems have a slightly unbound singlet spin level, labeled $s = 0$. Because of the charge independence of the nucleon force, the $s = 0$ level is at the same energy in all of the systems, except for the small effect of the Coulomb repulsion energy that is present in the diproton only.

Figure 17-14 Illustrating the pattern formed by the lowest energy levels of the three possible two-nucleon systems, ${}_0n^2$, ${}_1\mathrm{H}^2$, and ${}_2\mathrm{He}^2$. (Level labels: $s = 0$, $T = 1$ and $s = 1$, $T = 0$.)

The symmetry that is apparent in this set of energy-level diagrams, and that is even more apparent in other sets we shall consider later, can be described in a very convenient way by means of the concept of isospin, $T$. As its name implies, isospin has mathematical properties that are similar to those we have become familiar with in dealing with spin. But it has no direct physical relationship to spin. It is used to identify related energy levels, or quantum states, in sets of isobars; i.e., in sets of systems that all have the same number $A$ of nucleons. For the set shown in Figure 17-14, the lowest level is said to be an isospin singlet, labeled $T = 0$, and the three related levels are said to form an isospin triplet, labeled $T = 1$. The word triplet is appropriate because there are three related levels, and because associated with $T$ is a component, written $T_z$, that can assume the three values $T_z = -1, 0, +1$ when $T = 1$.
The component $T_z$ is used to identify a particular level of an isospin multiplet by specifying the relation between the number $Z$ of protons and the number $N$ of neutrons for the particular isobar that the level belongs to. The relation is

$$T_z = \frac{Z - N}{2} \tag{17-4}$$

In Figure 17-14 the three $T = 1$ levels are labeled by $T_z = (0 - 2)/2 = -1$ for the dineutron, $T_z = (1 - 1)/2 = 0$ for the deuteron, and $T_z = (2 - 0)/2 = +1$ for the diproton. For the isospin singlet level, $T = 0$, there is only one possible value of $T_z$, namely the value $T_z = 0$ corresponding to the deuteron. In general, the relation between the value of $T$ and the possible values of $T_z$ is

$$T_z = -T, -T + 1, \ldots, +T - 1, +T \tag{17-5}$$

This is, of course, very analogous to the mathematical relation between the quantum number describing any angular momentum vector, including the spin vector, and the possible values of the quantum number describing its $z$ component. It should be emphasized, however, that isospin is not a vector in any physical space, with a component along a coordinate axis of that space. Instead it is a mathematical construct that exists only in some imagined space. It is, nevertheless, very useful in describing the symmetrical properties of systems containing the same number of nucleons, which result from the symmetrical way the exclusion principle treats identical nucleons of either species, and the symmetrical way the charge independent nucleon force treats all nucleons.

A system containing a single nucleon has $T = 1/2$, with the two possible values of $T_z$ being $T_z = -1/2, +1/2$. According to (17-4) the first possibility describes the neutron, for which $(Z - N)/2 = (0 - 1)/2 = -1/2$, and the second describes the proton, for which $(Z - N)/2 = (1 - 0)/2 = +1/2$. Thus isospin allows us to speak of the neutron and proton as two related manifestations of the same particle, the $T = 1/2$ nucleon. In one, called the neutron, $T_z = -1/2$; in the other, called the proton, $T_z = +1/2$.
This is like saying that a proton with spin "up" is the $m_s = +1/2$ manifestation of the $s = 1/2$ proton, and the proton with spin "down" is the $m_s = -1/2$ manifestation of that particle.

From this point of view the quantum mechanical label exchange properties of a system containing several nucleons may be expressed in a very general way by saying that if the total eigenfunction for the system is a product of a space eigenfunction, a spin eigenfunction, and an isospin eigenfunction, the symmetry of each in an exchange of any two particle labels must be such as to make the total eigenfunction antisymmetric, because nucleons are fermions. As applied to the two-nucleon system levels of Figure 17-14, since for all of these levels $l = 0$, all of the corresponding states have symmetric space eigenfunctions. So for each of them a symmetric spin eigenfunction must be associated with an antisymmetric isospin eigenfunction, or vice versa. Because of their analogous mathematical properties, for both spin and isospin a singlet state is described by an antisymmetric eigenfunction and a triplet state is described by a symmetric eigenfunction. Thus levels of singlet spin ($s = 0$) should have triplet isospin ($T = 1$), and the level of triplet spin ($s = 1$) should have singlet isospin ($T = 0$), as inspection of the figure will demonstrate to be the case.

Figure 17-15 The low-lying energy levels of the $A = 14$ isobars. Note that the positions of the ground state energy levels trace out the parabolas, for the ground state masses of the $A = 14$ nuclei, that are discussed in connection with $\beta$ decay. (Level labels: $T = 0$, $T = 1$, $T = 2$.)

The power of isospin in identifying related quantum states in sets of systems containing a large number of nucleons is shown in Figure 17-15. The figure shows schematically some low-lying energy levels of the set of isobars ${}_5\mathrm{B}^{14}$, ${}_6\mathrm{C}^{14}$, ${}_7\mathrm{N}^{14}$, ${}_8\mathrm{O}^{14}$, and ${}_9\mathrm{F}^{14}$.
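As an illustration of (17-4) and (17-5), the short Python sketch below evaluates $T_z$ for the $A = 14$ isobars just listed, and for the one- and two-nucleon systems discussed earlier. The helper names `isospin_z` and `tz_values` are hypothetical. The bound $T \geq |T_z|$ printed for each isobar follows from (17-5) and is offered only as a reading aid for Figure 17-15.

```python
from fractions import Fraction


def isospin_z(Z, N):
    """T_z = (Z - N)/2, equation (17-4)."""
    return Fraction(Z - N, 2)


def tz_values(T):
    """Possible T_z for a given T, equation (17-5): -T, -T+1, ..., +T."""
    T = Fraction(T)
    return [-T + k for k in range(int(2 * T) + 1)]


# (Z, N) for the A = 14 isobars named in the text
isobars = {"B14": (5, 9), "C14": (6, 8), "N14": (7, 7), "O14": (8, 6), "F14": (9, 5)}

for name, (Z, N) in isobars.items():
    tz = isospin_z(Z, N)
    # By (17-5), a level of isospin T can appear in this nucleus only if T >= |T_z|.
    print(f"{name}: T_z = {tz}, levels must have T >= {abs(tz)}")

print("nucleon (T = 1/2):", [str(v) for v in tz_values(Fraction(1, 2))])
print("two-nucleon triplet (T = 1):", [str(v) for v in tz_values(1)])
```

Under these definitions only ${}_7\mathrm{N}^{14}$ has $T_z = 0$, so it is the only one of the five isobars that can have $T = 0$ levels.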
The so-called isobaric analogue levels of a particular isospin multiplet are labeled by $T$ and $T_z$ as before. Except for the small systematic increase in their energies with increasing $T_z$, due to the increase in the Coulomb repulsion energy with increasing $Z$, all isobaric analogue levels have the same energy. The reason is that the corresponding total eigenfunctions of each system are all identical solutions (if we ignore Coulomb effects) to a Schroedinger equation for the same nucleon forces, since the nucleon force does not depend on $T_z$ as it is charge independent. But the nucleon force does depend on $T$ as it is spin dependent. We first learned of this as a dependence on the spin; we now realize that the label exchange requirements mean it is also an isospin dependence. The nature of the spin dependence is such as to make the state of lowest $T$ have the lowest possible energy level for the set of systems. This can be seen in both Figure 17-15 and Figure 17-14.

The statement that energies resulting from the nucleon force, or interaction, do not depend on $T_z$ but only on $T$ is consistent with the statement that the isospin $T$ is conserved in processes involving this interaction. To see this, compare the statement that the total angular momentum $J$ is conserved in processes involving a spherically symmetrical interaction $V(r)$, with the statement that energies resulting from this interaction do not depend on its component $J_z$ but only on its magnitude $J$. However, the conclusion that isospin is conserved in the nucleon interaction is of greater generality than the conclusion, based on the charge independence experiments, that the nucleon interaction depends on $T$ but not $T_z$. So it requires additional experimental verification. Evidence from nuclear physics is found, for example, in the reaction

$${}_1\mathrm{H}^2 + {}_8\mathrm{O}^{16} \rightarrow {}_7\mathrm{N}^{14} + {}_2\mathrm{He}^4$$

In all experimental situations, the incident and target nuclei ${}_1\mathrm{H}^2$ and ${}_8\mathrm{O}^{16}$ are in their ground states.
If the bombarding energy of the incident nucleus is not too high, the product nucleus ${}_2\mathrm{He}^4$ must also be in its ground state because its first excited state lies at an energy above 20 MeV. All three of these nuclei have $T_z = 0$ in all states, and in their ground states they have the lowest value of $T$ consistent with this $T_z$, namely $T = 0$. The same is true for the ground state of the residual nucleus ${}_7\mathrm{N}^{14}$. But, as we see in Figure 17-15, the first excited state of ${}_7\mathrm{N}^{14}$ has $T = 1$. As far as the conservation of energy, angular momentum, or parity is concerned, the reaction could produce ${}_7\mathrm{N}^{14}$ in either its ground or its first excited state. The experimental observation that it is produced only in the ground state provides strong evidence for the conclusion that the nucleon interaction conserves the isospin $T$. This statement tells us something new about the nucleon interaction, whereas the fact that the nucleon interaction also conserves $T_z$ is simply a consequence of charge conservation, as can be seen from (17-4).

We shall see that particle physics provides much verifying evidence for the conservation of isospin. We have noted already the assignment of isospin to the nucleon, and we shall learn shortly about its assignment to other strongly interacting particles. In the application to particles we shall find that isospin takes on a broader significance than its use in the classification of nuclear states. Finally, in the next chapter we shall understand the basis of isospin and why it is conserved.

17-4 PIONS

In preceding sections we presented a description of properties of nucleon forces that are observed in experiment. Although theory was used in the description, it was used essentially to correlate the experimental observations, and not to explain their basic origin. But there is a theory that is successful in explaining how certain properties of nucleon forces arise from more fundamental attributes of nature.
This is the meson theory, which originated with the work of Yukawa in 1935.

Yukawa proposed that a nucleon frequently emits a particle with an appreciable rest mass, now called a $\pi$ meson or pion. This particle hovers near the nucleon in the so-called $\pi$-meson field for a very short time, and then is absorbed by the nucleon. During the process the nucleon maintains its normal rest mass, and so while it is happening there is a violation of the law of mass-energy conservation because there is more rest mass present than there is before the $\pi$ meson is emitted or after it is absorbed. The energy-time uncertainty principle shows, however, that such a violation is not impossible if it lasts for a sufficiently short time. Of course, the $\pi$ meson cannot permanently escape the nucleon because that would permanently violate the mass-energy conservation law. Such a pion is called a virtual particle because it has a very short existence limited by its violation of mass-energy conservation.

If two nucleons are close enough for their meson fields to overlap, it is possible for a $\pi$ meson to leave one field and join the other, without permanently changing the total energy of the system of two nucleons. Such an interaction between the fields is pictured crudely in Figure 17-16. In the interaction, the momentum carried by the $\pi$ meson is transferred from one field to the other, and therefore from one nucleon to the other. But if momentum is transferred, the effect is the same as if a force is acting between the nucleons. Thus the exchange of a virtual pion between two nucleons leads to the nucleon force acting between them, according to Yukawa. (We came across a similar idea before when discussing, in Section 14-1, the exchange of a phonon between two electrons in a Cooper pair.)

In making his proposal, Yukawa was guided by two analogies available to him at the time. One is the covalent binding in the $\mathrm{H}_2$ molecule and in organic molecules (discussed in Section 12-3).
In this process, a force arises from the sharing, or exchange, of an electron between two atoms. An even closer analogy is the Coulomb force acting between two charged particles.

Figure 17-16 A very crude representation of the exchange of a $\pi$ meson between the fields of two interacting nucleons, showing the fields before and after the exchange.

Example 17-3. Use energy conservation, as modified by the energy-time uncertainty principle, to establish a relation between the range $r'$ of the nucleon force and the rest mass $m_\pi$ of the $\pi$ meson whose exchange produces the force. Then use the relation to estimate the value of $m_\pi$, assuming $r' = 2$ F.

The range of the nucleon force is of the order of the radius $r'$ of the $\pi$-meson field surrounding a nucleon, since two nucleons experience that force only when their meson fields overlap. To estimate the radius of the field, consider a process in which a nucleon emits a meson of rest mass $m_\pi$, which travels out to the limits of the field, and then returns to the nucleon where it is absorbed. In this process, the $\pi$ meson travels a distance of the order of $r'$. While it is happening there is a violation of the conservation of mass-energy. The reason is that the total energy of the system equals one nucleon rest mass energy before and after the process, and one nucleon rest mass energy plus at least one $\pi$-meson rest mass energy during the process. But the energy-time uncertainty principle shows that a violation of energy conservation by an amount $\Delta E \simeq m_\pi c^2$ is not impossible if it does not happen for a time longer than $\Delta t$, where

$$\Delta E \, \Delta t \simeq \hbar$$

The reason is that such a violation could not be detected because the energy cannot be measured in a time $\Delta t$ more accurately than $\Delta E$.
Since the speed of the pion can be no greater than $c$, the time required for it to travel a distance of the order of $r'$ is at least

$$\Delta t \simeq \frac{r'}{c}$$

These three relations give

$$m_\pi c^2 \simeq \frac{\hbar}{\Delta t} \simeq \frac{\hbar c}{r'}$$

or

$$m_\pi \simeq \frac{\hbar}{r'c} \tag{17-6}$$

If we take $r' = 2$ F, (17-6) gives us an estimate of the $\pi$-meson rest mass

$$m_\pi \simeq \frac{\hbar}{r'c} \simeq \frac{1 \times 10^{-34}\ \text{joule-sec}}{2 \times 10^{-15}\ \text{m} \times 3 \times 10^{8}\ \text{m/sec}} \simeq 2 \times 10^{-28}\ \text{kg}$$

This can also be written

$$m_\pi \simeq 200\,m \simeq 100\ \text{MeV}/c^2$$

where $m$ is the rest mass of an electron, which has the value $m = 0.511$ MeV/$c^2$.

According to the very successful theory of quantum electrodynamics (mentioned in Section 8-7), surrounding each charge is a field of photons, and the Coulomb force acting between two charged particles actually results from an exchange of a virtual photon between the fields. Quantum electrodynamics shows that the long range of the Coulomb force is a consequence of the fact that photons have zero rest mass. Yukawa adapted the theory to the case of two nucleons, interacting with a short range nucleon force, by assuming that the particle exchanged has a nonzero rest mass. When he made his proposal, pions had not yet been detected, but Yukawa was able to estimate the rest mass that would lead to the observed range by performing a calculation similar to the one in Example 17-3.

It is worthwhile restating the argument used in Example 17-3. A meson of rest mass $m_\pi \simeq \hbar/r'c$ leads to a nucleon force of range $\simeq r'$ because the nucleons could not exchange the meson if they were separated by a much larger distance, since its flight time would be so long that the uncertainty principle would allow an accurate enough determination of the total energy of the system to make the violation of energy conservation detectable. This argument also explains how the Coulomb force can have a long range. Since a photon has zero rest mass, there is no lower limit to the total energy it can carry.
When two charged particles are separated by a very large distance, they can exchange a photon of very low kinetic energy without violating the energy-time uncertainty principle. Of course, such a photon will carry very low linear momentum. Therefore, the force it produces is very weak, in agreement with the well-known decrease in the strength of the Coulomb force as the separation of the charged particles increases.

At the time of Yukawa's proposal, there were no known particles of rest mass between the electron rest mass, 0.5 MeV/$c^2$, and the proton rest mass, which equals 938 MeV/$c^2$. The $\pi^+$ mesons, which have a positive charge equal in magnitude to that of the electron, and the $\pi^-$ mesons, which have a negative charge of the same magnitude, were first detected in 1947 by Powell and collaborators. They were found as a component of the cosmic radiation, which is constantly bombarding the earth. Shortly after, the charged $\pi$ mesons were produced artificially at a large cyclotron in collisions between nucleons of very high energy and nucleons in a target. Cosmic radiation mesons are also initially produced in high-energy collisions. Measurements show that the $\pi^+$ and $\pi^-$ mesons have the same rest mass

$$m_{\pi^\pm} = 140\ \text{MeV}/c^2 \tag{17-7}$$

This is certainly close enough to Yukawa's prediction $m_\pi \simeq 100$ MeV/$c^2$. Neutral $\pi^0$ mesons were first observed by Moyer and coworkers in 1950, as products of high-energy collisions. Their rest mass is found to be

$$m_{\pi^0} = 135\ \text{MeV}/c^2 \tag{17-8}$$

The free $\pi$ mesons, which are observed in these experiments, are liberated from the $\pi$-meson fields surrounding the colliding nucleons by the energy made available in the collision. They are the same particles as the mesons discussed in the meson theory of nucleon forces. The only difference is that, before the nucleons interact, Yukawa's mesons are bound by the requirements of energy conservation. That is, the free pions are not virtual particles.
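The comparison between the estimate of Example 17-3 and the measured masses (17-7) and (17-8) can be reproduced numerically. The Python sketch below is illustrative only. It assumes the standard value $\hbar c \approx 197.3$ MeV·F, which is not quoted in the text, and the function name `exchanged_mass_mev` is hypothetical.

```python
# Sketch of the estimate m_pi ~ hbar / (r' c) from (17-6), worked in MeV and F.
# HBAR_C_MEV_F is an assumed standard value, not a number quoted in the text.

HBAR_C_MEV_F = 197.327       # hbar * c in MeV * F
ELECTRON_MASS_MEV = 0.511    # electron rest mass energy, as quoted in Example 17-3
MEASURED_MEV = {"charged pion (17-7)": 140.0, "neutral pion (17-8)": 135.0}


def exchanged_mass_mev(range_f):
    """Rest mass energy (MeV) of the exchanged particle for a force of range r' (F)."""
    return HBAR_C_MEV_F / range_f


estimate = exchanged_mass_mev(2.0)                 # r' = 2 F, as in Example 17-3
in_electron_masses = estimate / ELECTRON_MASS_MEV

print(f"estimate: {estimate:.0f} MeV/c^2, about {in_electron_masses:.0f} electron masses")
for label, mass in MEASURED_MEV.items():
    print(f"{label}: measured/estimated = {mass / estimate:.2f}")
```

With these inputs the estimate comes out just under 100 MeV/$c^2$, or a little under 200 electron masses. Both measured masses exceed it by less than a factor of 1.5, which is the sense in which the prediction is "close enough" for an order-of-magnitude argument.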
As is obviously true of the virtual pions that produce the strong force between two nucleons, the interaction of free pions with nucleons is strong. This was indicated in various ways in the early experiments with cosmic ray and cyclotron pions, which showed that the cross section for interaction of a short de Broglie wavelength pion with a nucleus is close to its maximum possible value, the projected geometrical cross-sectional area $\pi r'^2$, the quantity $r'$ being the nuclear radius. The interaction is also particularly violent; when a pion enters a nucleus most of its rest mass energy goes into splitting the nucleus into fragments which fly apart energetically. Of course, the detection of free pions provided a striking verification of the validity of the meson theory.

Experimental evidence for the exchange of pions between two interacting nucleons is found in neutron-proton scattering. As we discussed in a preceding section, the approximate symmetry about 90° of the scattering differential cross section implies that in about half the scatterings the neutron changes into a proton and the proton changes into a neutron, when the nucleons interact. One way this can happen is indicated by the set of reactions

$$n \rightarrow p + \pi^- \quad \text{then} \quad \pi^- + p \rightarrow n$$

That is, the neutron emits a negatively charged $\pi^-$ meson into its field, becoming a proton. Then the $\pi^-$ meson joins the field of the proton, and it is absorbed by the proton, which becomes a neutron. The scattering process can also happen through the set of reactions

$$p \rightarrow n + \pi^+ \quad \text{then} \quad \pi^+ + n \rightarrow p$$

In this case the proton emits a positively charged $\pi^+$ meson, which is subsequently absorbed by the neutron. Thus, in about half the neutron-proton scatterings a meson transfers charge as well as momentum between the two interacting nucleons.

Because the neutron-proton scattering differential cross section is approximately symmetric about 90°, in about half the scatterings the neutron and proton do not exchange identities when they interact.
But they still must exchange a meson which carries the transferred momentum. The two sets of reactions which occur are

$$n \rightarrow n + \pi^0 \quad \text{then} \quad \pi^0 + p \rightarrow p$$

and

$$p \rightarrow p + \pi^0 \quad \text{then} \quad \pi^0 + n \rightarrow n$$

The neutral $\pi^0$ meson transfers momentum, but no charge, between the interacting nucleons.

This interpretation implies that an isolated proton should be surrounded by a meson field which will sometimes contain a $\pi^0$ meson and sometimes contain a $\pi^+$ meson. The reactions that take place when the meson is emitted by the nucleon are

$$p \rightarrow p + \pi^0 \quad \text{or} \quad p \rightarrow n + \pi^+$$

Of course the nucleon must absorb the meson it has emitted within a very short time, but then it can emit another one. The meson field surrounding an isolated neutron should sometimes contain a $\pi^0$ meson and sometimes contain a $\pi^-$ meson, which are emitted through the reactions

$$n \rightarrow n + \pi^0 \quad \text{or} \quad n \rightarrow p + \pi^-$$

But the proton field cannot contain a $\pi^-$ meson and the neutron field cannot contain a $\pi^+$ meson.

Very direct experimental verification of these predictions is provided by electron scattering measurements of the charge distribution of the proton and of the neutron. Figure 17-17 shows the radial dependence of the charge densities of the two species of nucleons. The charge density of the proton is everywhere positive, and extends out to a distance $r$ of about 2 F. At the larger $r$ within this limit (in the field) the charge is carried by a $\pi^+$ meson. The neutron charge density is not everywhere zero. At smaller $r$ (near the center, where the $p$ from the $p + \pi^-$ dissociation would be) it is positive, and at larger $r$ (in the field, where the $\pi^-$ would be) it is negative. The volume integral of the charge density is, however, zero, since the neutron is neutral and so has no net charge.

Meson theory also provides an explanation of how the neutron can have an intrinsic magnetic dipole moment, even though its net charge is zero. It sometimes becomes a proton plus a $\pi^-$.
The proton has an intrinsic magnetic dipole moment, and the $\pi^-$ meson can produce a current which makes an additional contribution to the magnetic dipole moment.

Figure 17-17 The radial dependence of the charge density of the proton and of the neutron. (Horizontal axis: $r$ in F.)

At values of $r$ approaching 2 F, the nucleon charge densities are proportional to some measure of the intensity of their meson fields. Both are decreasing fairly gradually as $r$ increases. The nucleon force, which acts between two nucleons when their meson fields overlap, also therefore decreases fairly gradually as their separation increases. Thus the onset of the attractive part of the nucleon potential, describing the nucleon force acting when the two nucleons are beginning to get close enough to interact, is fairly gradual. It is not abrupt as in the simplified nucleon potential of Figures 17-12 and 17-13. In fact, we shall indicate in Example 17-4 that for large values of the separation distance $r$ the nucleon potential should follow the Yukawa potential

$$V(r) = -g^2 \frac{e^{-r/r'}}{r} \tag{17-9}$$

where

$$r' = \frac{\hbar}{m_\pi c} \simeq 1.5\ \text{F}$$

The range $r'$ of the potential is specified by the theory to have a value which agrees with the simple argument of Example 17-3, and with experiment. The over-all strength of the potential depends on the constant $g^2$, whose value is not determined by the theory but can be determined by finding the value of $g^2$ that gives best agreement with experiment. In terms of the dimensionless quantity $g^2/\hbar c$, the value so determined is

$$g^2/\hbar c \simeq 15 \tag{17-10}$$

Figure 17-18 plots the Yukawa potential. Note that $V(r) \propto e^{-r/r'}/r$ decreases in magnitude with increasing $r$ fairly gradually, but the decrease is very much more rapid than that of the long range Coulomb potential $V(r) \propto 1/r$. At values of $r$ small compared to 2 F, the nucleon potential deviates markedly from the Yukawa potential. In fact, we know it becomes repulsive at $\simeq 0.5$ F.
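To make the contrast between the Yukawa and Coulomb radial dependences concrete, the Python sketch below evaluates the range $r' = \hbar/m_\pi c$ and the factor $e^{-r/r'}$ by which the Yukawa form $e^{-r/r'}/r$ falls below a pure $1/r$ form. It is an illustration under stated assumptions: $\hbar c \approx 197.3$ MeV·F is a standard value not given in the text, the charged-pion mass of (17-7) is used, and the function names are hypothetical. Only the shape is compared, not the absolute strength.

```python
import math

HBAR_C_MEV_F = 197.327   # hbar * c in MeV * F (assumed standard value)
M_PION_MEV = 140.0       # charged pion rest mass energy, from (17-7)


def yukawa_range_f(mass_mev=M_PION_MEV):
    """Range r' = hbar / (m c) in F for an exchanged particle of the given mass."""
    return HBAR_C_MEV_F / mass_mev


def suppression_vs_coulomb(r_f, mass_mev=M_PION_MEV):
    """Ratio of the Yukawa shape exp(-r/r')/r to the Coulomb shape 1/r."""
    return math.exp(-r_f / yukawa_range_f(mass_mev))


r_prime = yukawa_range_f()
print(f"r' = {r_prime:.2f} F")
for r in (1.0, 2.0, 4.0, 8.0):
    print(f"r = {r:.0f} F: Yukawa/Coulomb shape ratio = {suppression_vs_coulomb(r):.4f}")
```

With these inputs $r'$ comes out near 1.4 F, consistent with the rounded value quoted with (17-9). The ratio falls by the same factor for every additional $r'$ of separation, which is the exponential cutoff that makes the nucleon force short ranged.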
The repulsive core of the potential may arise from the exchange of mesons that we shall meet later, whose rest masses are considerably larger than that of the $\pi$ meson. But there are other competing explanations for the origin of the repulsive core.

Figure 17-18 The Yukawa potential. For $r$ comparable to or greater than $r' = \hbar/m_\pi c \simeq 1.5$ F, the nucleon potential should have this form. (Axes: $V(r)$ versus $r$, with $r'$ marked.)

Example 17-4. Write a relativistic wave equation for $\pi$ mesons, and then show how the Yukawa potential, (17-9), can be obtained from that equation.

A relativistic wave equation for $\pi$ mesons can be obtained by writing the relativistic energy equation

$$E^2 = c^2p^2 + m_\pi^2c^4$$

where $p^2 = p_x^2 + p_y^2 + p_z^2$, replacing the total energy and the momentum components by the associated operators of (5-32)

$$p_x \rightarrow -i\hbar \frac{\partial}{\partial x} \qquad p_y \rightarrow -i\hbar \frac{\partial}{\partial y} \qquad p_z \rightarrow -i\hbar \frac{\partial}{\partial z} \qquad E \rightarrow i\hbar \frac{\partial}{\partial t}$$

and then allowing the operator equation thereby obtained to operate on the function $\Psi$. The result is

$$-\hbar^2 \frac{\partial^2 \Psi}{\partial t^2} = -c^2\hbar^2 \left( \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} + \frac{\partial^2 \Psi}{\partial z^2} \right) + m_\pi^2c^4 \Psi$$

or

$$\nabla^2 \Psi - \frac{m_\pi^2c^2}{\hbar^2} \Psi = \frac{1}{c^2} \frac{\partial^2 \Psi}{\partial t^2}$$

which is called the Klein-Gordon equation. It plays an important role in the quantum electrodynamics of bosons. For instance, for $m_\pi = 0$ it reduces to the classical wave equation

$$\nabla^2 \Psi = \frac{1}{c^2} \frac{\partial^2 \Psi}{\partial t^2}$$

for photons, the so-called quanta of the electromagnetic field. The classical wave equation has a static solution of the form

$$\Psi = -\frac{e^2}{4\pi\epsilon_0} \frac{1}{r} \qquad r > 0$$

as can easily be verified by substitution, using the relation

$$\nabla^2 \Psi = \frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{d\Psi}{dr} \right)$$

for $\Psi = \Psi(r)$. For $m_\pi \neq 0$ the Klein-Gordon equation has a static solution of the form

$$\Psi = -g^2 \frac{e^{-r/r'}}{r} \qquad r > 0$$

where

$$r' = \frac{\hbar}{m_\pi c}$$

as can also easily be verified by substitution. Since the solution to the wave equation for zero rest mass quanta (photons) gives the Coulomb interaction potential for the electromagnetic field, the solution for nonzero rest mass quanta (pions) is assumed to be the interaction potential for the meson field, that is, the Yukawa potential of (17-9).
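The substitution that Example 17-4 describes as easily verified can be written out explicitly. The steps below are a worked sketch using only the radial form of $\nabla^2$ and the static Klein-Gordon equation (the time derivative set to zero); no symbols beyond those of the example are introduced.

```latex
\begin{aligned}
\Psi &= -g^2\,\frac{e^{-r/r'}}{r}, \qquad r > 0 \\[4pt]
\frac{d\Psi}{dr} &= g^2\,e^{-r/r'}\left(\frac{1}{r\,r'} + \frac{1}{r^2}\right) \\[4pt]
r^2\,\frac{d\Psi}{dr} &= g^2\,e^{-r/r'}\left(\frac{r}{r'} + 1\right) \\[4pt]
\frac{d}{dr}\!\left(r^2\,\frac{d\Psi}{dr}\right)
  &= g^2\,e^{-r/r'}\left[-\frac{1}{r'}\left(\frac{r}{r'} + 1\right) + \frac{1}{r'}\right]
   = -g^2\,e^{-r/r'}\,\frac{r}{r'^2} \\[4pt]
\nabla^2\Psi &= \frac{1}{r^2}\,\frac{d}{dr}\!\left(r^2\,\frac{d\Psi}{dr}\right)
   = -\frac{g^2}{r'^2}\,\frac{e^{-r/r'}}{r} = \frac{\Psi}{r'^2} \\[4pt]
\nabla^2\Psi - \frac{m_\pi^2 c^2}{\hbar^2}\,\Psi &= 0
   \quad\text{provided}\quad \frac{1}{r'^2} = \frac{m_\pi^2 c^2}{\hbar^2},
   \ \text{i.e.}\ r' = \frac{\hbar}{m_\pi c}
\end{aligned}
```

Letting $r' \rightarrow \infty$ (that is, $m_\pi \rightarrow 0$) in the same steps gives $\nabla^2 \Psi = 0$ for $\Psi \propto 1/r$, which is the Coulomb case quoted in the example.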
The constant $g^2$ determines the strength of the Yukawa potential, just as the constant $e^2$ (the square of the electron charge) determines the strength of the Coulomb potential. Note that the dimensionless quantity $g^2/\hbar c$ has the value $\simeq 15$, whereas the dimensionless quantity $e^2/4\pi\epsilon_0\hbar c$ (the fine-structure constant) has the value $\simeq 1/137$. This is an indication of the strength of the nucleon force.

Single free pions can be created in high-energy collisions between nucleons, e.g.

$$p + p \rightarrow \pi^+ + d \tag{17-11}$$

where $d$ is the deuteron, or destroyed in collisions between pions and nucleons, e.g.

$$\pi^+ + d \rightarrow p + p \tag{17-12}$$

From this we can immediately conclude that pions cannot be fermions. The reason is that the number of fermions in an isolated system always remains constant, in the sense that if a fermion is produced, or destroyed, it always happens in conjunction with the production, or destruction, of an antifermion. Examples are electron pair production, or annihilation. Pions are bosons, just as photons are bosons, that can be emitted or absorbed singly.

As bosons, pions must have integral spin; that is, $s = 0$, or 1, or 2, .... Measurements show that for all three cases, $\pi^-$, $\pi^0$, and $\pi^+$, the pion spin is 0. The first of these measurements involved applying the principle of detailed balancing (see the discussion of (11-4)) to the observed ratio of the cross sections for the forward and backward reactions of (17-11) and (17-12). The value of the $\pi^+$ spin influences the cross section for the forward reaction because the reaction rate is proportional to the density of states that can be populated, and this is proportional to the spin degeneracy factor $(2s + 1)$. The cross-section ratio showed that $s = 0$.

A very interesting property of pions is that pions have odd intrinsic parity.
The initial evidence came from the reaction

$$\pi^- + d \rightarrow n + n \tag{17-13}$$

The negatively charged pion is captured by the deuteron after dropping through a sequence of electron-like atomic states to the $l = 0$ state, where its wave function has a large overlap with the deuteron. Thus the total angular momentum on the left of (17-13) is that of the spin 1 ground state of the deuteron. So angular momentum conservation allows the two neutrons to be emitted with total orbital angular momentum $l = 0$ or 2 and "parallel" spins, or with $l = 1$ and either "parallel" or "antiparallel" spins. The exclusion principle rules out the $l = 0$ and $l = 2$ possibilities, as well as $l = 1$ with "antiparallel" spins, because each would result in a symmetric total eigenfunction for the system of two identical fermions. Therefore the neutrons are emitted in a state in which the total orbital angular momentum is $l = 1$. The parity of such a state is odd, according to the usual rule that parity is governed by $(-1)^l$. Therefore, since parity is conserved by the nuclear, or nucleon, interaction, the parity of the system $\pi^- + d$ must be odd. Since it has even orbital angular momentum, the parity of the ground state of the deuteron is even, and the $(-1)^l$ rule also says that the parity associated with the $l = 0$ motion of the captured $\pi^-$ is even. Thus the $\pi^-$ meson must have an intrinsic parity which is odd. The same is true of the other pions. As the number of nucleons present is unchanged in the reaction, the intrinsic parity of the nucleon is undetermined. The number of nucleons is unchanged because single fermions cannot be created or destroyed, and this also makes it impossible to determine the nucleon parity. By convention, the nucleon intrinsic parity is taken as positive.

The three pions have similar masses and identical quantum numbers, and they participate equally in the nucleon interaction.
It is therefore natural to say that the pion is an isospin $T = 1$ particle, that has a $T_z = -1$ manifestation called the $\pi^-$, a $T_z = 0$ manifestation, the $\pi^0$, and a $T_z = +1$ manifestation, the $\pi^+$. In so doing we are generalizing the relation between $T_z$ and electric charge. The form that we originally used for nucleons, (17-4), is equivalent to the relation

$$Q = T_z + 1/2 \quad \text{(nucleons)} \tag{17-14a}$$

where $Q$ is the charge in units of the magnitude of the electron charge. For example, this yields $Q = 0$ for the $T_z = -1/2$ neutron and $Q = 1$ for the $T_z = +1/2$ proton, as before. For pions the relation is different, since

$$Q = T_z \quad \text{(pions)} \tag{17-14b}$$

However, we may incorporate both of these relations into one form by writing

$$Q = T_z + B/2 \quad \text{(nucleons and pions)} \tag{17-15}$$

where $B$, called the baryon number, has the value 1 for a nucleon and 0 for a pion. A baryon is a fermion that participates in the strong interaction.

The quantity $B$, introduced here to generalize the relation between charge and isospin, is quite important because it is a conserved quantity. For instance, the proton ($p$), antiproton ($\bar{p}$) pair production reaction

$$p + p \to p + p + p + \bar{p} \tag{17-16}$$

is a very good example of the baryon number conservation law

$$B = \text{const} \tag{17-17}$$

where the baryon number $B$ has the value $+1$ for a nucleon and $-1$ for an antinucleon. We already know that the total number of fermions in an isolated system will remain constant. But (17-17) tells us something more. It says that the number of fermions of a particular type, called baryons, will remain constant and that, for example, a proton will not turn into an electron. Other baryons will be introduced soon, displaying the further importance of this conservation law.

Before leaving the topic, note that reaction (17-16), the form of which is forced by (17-17), also tells us that $T_z$, which is $+1/2$ for the proton, must be $-1/2$ for the antiproton in order to conserve isospin. It is generally the case that $T_z$ for an antiparticle must be opposite to $T_z$ for the corresponding particle.
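The charge relation (17-15) and the baryon number bookkeeping for reaction (17-16) can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original text: the dictionary `PARTICLES` and the helper names `charge` and `baryon_number` are illustrative choices, and the antiproton entry assumes the rule stated above that $T_z$ and $B$ reverse sign for an antiparticle.

```python
from fractions import Fraction as F

# (T_z, B) assignments quoted in the text. The "pbar" (antiproton) entry
# assumes T_z and B are opposite to those of the proton.
PARTICLES = {
    "p":    (F(1, 2), 1),
    "n":    (F(-1, 2), 1),
    "pbar": (F(-1, 2), -1),
    "pi+":  (F(1), 0),
    "pi0":  (F(0), 0),
    "pi-":  (F(-1), 0),
}


def charge(name):
    """Charge in units of e from Q = T_z + B/2, equation (17-15)."""
    t_z, b = PARTICLES[name]
    return t_z + F(b, 2)


def baryon_number(names):
    """Total baryon number of a list of particles."""
    return sum(PARTICLES[name][1] for name in names)


if __name__ == "__main__":
    for name in PARTICLES:
        print(f"{name:5s} Q = {charge(name)}")
    # Reaction (17-16): p + p -> p + p + p + pbar
    print("B before:", baryon_number(["p", "p"]))
    print("B after: ", baryon_number(["p", "p", "p", "pbar"]))
```

Exact fractions are used so that the half-integer values of $T_z$ combine without rounding. The relation applies only to nucleons, antinucleons, and pions; strange particles need the extended form introduced later in the chapter.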
Notice that we have already encountered this for the pion, since the particle, $\pi^+$, has $T_z = +1$, and the antiparticle, $\pi^-$, has $T_z = -1$. The $\pi^0$, having $T_z = 0$, is its own antiparticle. Such particles, which have no quantum number that could distinguish particle from antiparticle, are said to be self-conjugate.

Another property of the pion is its instability. The $\pi^0$ decays spontaneously by an electromagnetic interaction with a lifetime of about $10^{-16}$ sec into two high-energy photons

$$\pi^0 \to \gamma + \gamma \tag{17-18}$$

or else, rarely, into an electron-positron pair and one photon. Although this sounds like a very short decay time, it should be compared to the time $10^{-23}$ sec that would characterize the decay if it took place through the strong nucleon (or nuclear) interaction. The value $10^{-23}$ sec is just the time that particles moving with relative velocity $c \sim 10^8$ m/sec would overlap within a distance of the range of nucleon forces $r' \sim 10^{-15}$ m. The facts first used to identify the electromagnetic nature of the $\pi^0$ decay are that photons participate only in the electromagnetic interaction and that the decay lifetime is much longer than the time $10^{-23}$ sec that would suffice if it could go by the stronger interaction.

The other pions do not decay in the same ways as the neutral pion. Instead, the $\pi^+$ decays with the even longer lifetime of about $10^{-8}$ sec, according to the scheme

$$\pi^+ \to \mu^+ + \nu_\mu \tag{17-19}$$

where $\mu^+$ represents the positively charged muon, and $\nu_\mu$ is the muonic neutrino. The $\pi^-$ decays with the same lifetime according to the scheme

$$\pi^- \to \mu^- + \bar{\nu}_\mu \tag{17-20}$$

where $\mu^-$ is the negatively charged muon, and $\bar{\nu}_\mu$ is the muonic antineutrino. The positive muon is the antiparticle of the negative muon, just as the positron is the antiparticle of the electron. In fact, in essentially every regard, except for their higher rest mass, muons are like electrons. The charged pion decays involve an interaction which is related to the $\beta$-decay interaction of nuclear physics.
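The characteristic strong-interaction time quoted above follows from a one-line order-of-magnitude estimate. Written out, using only the rounded values given in the text:

```latex
t_{\text{strong}} \sim \frac{r'}{c}
  \sim \frac{10^{-15}\ \text{m}}{10^{8}\ \text{m/sec}}
  = 10^{-23}\ \text{sec},
\qquad
\frac{t_{\pi^0}}{t_{\text{strong}}} \sim \frac{10^{-16}\ \text{sec}}{10^{-23}\ \text{sec}} = 10^{7},
\qquad
\frac{t_{\pi^\pm}}{t_{\text{strong}}} \sim \frac{10^{-8}\ \text{sec}}{10^{-23}\ \text{sec}} = 10^{15}
```

These ratios are rough, since each lifetime is quoted only to the nearest power of ten, but they show how far both the neutral and the charged pion decays fall from the strong-interaction time scale.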
The fact that the lifetime of charged pion decay is much longer than for electromagnetic decay of the neutral pion is a reflection of the fact that the interaction involved in the decay is much weaker than the electromagnetic interaction. The student will recall that we made a similar comparison in the case of $\beta$ decay. For these reasons, both the decay of a neutron into a proton plus an electron and (what we now call) an electronic antineutrino, and the decay of a positive or negative pion into a positive or negative muon and a muonic neutrino or antineutrino, are said to take place via the weak interaction. This terminology leads to the nucleon interaction being called the strong interaction. Particularly in particle physics, the terms strong interaction and weak interaction are used to identify what are usually called the nucleon (or nuclear) interaction and the $\beta$-decay interaction in nuclear physics.

17-5 LEPTONS

Muons have no part in Yukawa's theory of the origin of the strong interaction, although this was not appreciated until some time after their discovery in 1936 by Anderson and Neddermeyer. These investigators found the particles as components of the cosmic radiation, and they showed that their rest mass is intermediate between the rest mass of an electron and the rest mass of a proton. We now know that they are produced in cosmic radiation mainly from the decay of pions. But, in 1936, pions had not been discovered, and it was naturally assumed that the $\mu^+$ and $\mu^-$ were Yukawa's mesons (in fact they were originally called $\mu$ mesons). An ever increasing accumulation of evidence showed, however, that the interaction of muons with matter is very weak. For instance, the muons in cosmic radiation can penetrate great thicknesses of solid matter with little attenuation, since they can be detected in deep mines.
This being the case, muons can hardly be the particles responsible for the strong interaction, despite the fact that their rest mass

$$m_{\mu^+} = m_{\mu^-} = 106\ \text{MeV}/c^2 \tag{17-21}$$

is quite close to the value predicted by Yukawa. This situation was the source of considerable confusion in the ten years before the discovery of pions, but, after their discovery, it was immediately assumed that pions are Yukawa's mesons since the early evidence indicated that their interaction with matter is strong.

Thus pions are closely associated with nucleons and interact via the strong interaction. Muons are closely associated with electrons and interact via the weak interaction. The muon and electron, the muonic and electronic neutrinos, and the antiparticles of each, are collectively called leptons. One of the pieces of evidence for the association between the negative muon and the electron is that both are fermions, both have charge $-e$ and spin 1/2, and both have magnetic dipole moments corresponding to a spin $g$ factor of 2. Their antiparticles, the positive muon and the positron, have charges and magnetic dipole moments of reversed signs. Muonic and electronic neutrinos are also spin 1/2 fermions, but they are uncharged and presumably have no magnetic dipole moments. They are distinguished physically from their antiparticles by their helicities (see Section 16-4), which are left handed for neutrinos and right handed for antineutrinos. It is not appropriate to define either an intrinsic parity or the usual isospin for any of these particles which participate in the weak interaction. The reason is that parity is not conserved in that interaction, as we saw in Section 16-4, and isospin is also not conserved in the weak interaction, as we shall see in a subsequent section.

A new family of leptons, the tauons, was discovered in 1975. The quite massive (1784 MeV/$c^2$) $\tau^+$ and $\tau^-$ are presumably accompanied by a tauonic neutrino and antineutrino.
This family has all the characteristics given above for the electron and muon families.

The electron is stable because there are no less massive particles into which the conservation laws allow it to decay. But muons do decay via the weak interaction, according to the following schemes

$$\mu^+ \to e^+ + \nu_e + \bar{\nu}_\mu \tag{17-22}$$

$$\mu^- \to e^- + \bar{\nu}_e + \nu_\mu \tag{17-23}$$

where we use $e^+$ for the positron and $e^-$ for the electron. The lifetime for both decays is the same, and it has a value of about $10^{-6}$ sec. The need for a distinction between the electronic neutrino $\nu_e$ and the muonic neutrino $\nu_\mu$ was demonstrated experimentally in 1962 by showing that the muonic neutrinos obtained from pion decay, (17-19) and (17-20), will produce muons but not electrons.

Because of their much greater masses, charged tauons can have a variety of decays. For instance, they have purely leptonic decays into electrons and neutrinos like (17-22) and (17-23), or corresponding decays into muons like

$$\tau^+ \to \mu^+ + \nu_\mu + \bar{\nu}_\tau \tag{17-24}$$

Tauons can also have semileptonic decays into leptons and strongly interacting particles, as for example

$$\tau^- \to \pi^- + \nu_\tau \tag{17-25}$$

With its large mass and many possible decay modes, the $\tau$ has quite a short lifetime, being about $10^{-13}$ sec.

Since leptons are fermions, they are created or destroyed in particle, antiparticle pairs. Consequently, the number present in an isolated system will remain constant, if each particle makes a positive contribution to the count and each antiparticle makes a negative contribution. Because of the distinction among electronic, muonic and tauonic leptons, each type separately satisfies a lepton number conservation law. These can be written

$$\sum L_e = \text{const} \tag{17-26}$$

$$\sum L_\mu = \text{const} \tag{17-27}$$

$$\sum L_\tau = \text{const} \tag{17-28}$$

The electronic lepton number $L_e$ is $+1$ for an electron and $-1$ for the positron; it is $+1$ for an electronic neutrino and $-1$ for an electronic antineutrino. The muonic lepton number $L_\mu$ and the tauonic lepton number $L_\tau$ are similarly defined so that the lepton number is $+1$ for a particle and $-1$ for its antiparticle. The student should note that the muon and tauon decay schemes of (17-22) through (17-25), as well as the electronic beta decays discussed in Chapter 16, all satisfy these conservation laws. It will also be noted that these laws are of the same form as (17-17) for baryon number conservation, because baryon and the various lepton numbers are, like charge, additive quantum numbers. However, parity is a multiplicative quantum number. That is, the parities in an initial state are multiplied and, if parity is conserved, the product is equated to the product of the parities in the final state.

The existence of these separate lepton numbers and the mass differences among the $e$, $\mu$, and $\tau$ are the only distinctions we know among these otherwise identical leptons. We also know from experiments that, unlike the strongly interacting particles (nucleon, $\pi$, etc.), they have no spatial extent down to at least $10^{-18}$ m ($10^{-3}$ F!). With no structure to distinguish them, the point-like leptons are now considered to be truly fundamental particles. In the next chapter we shall see how these fundamental particles may relate to the strongly interacting particles discussed in this chapter.

And more will be said about the nature of the weak interaction in the next chapter. But here it is desirable to mention at least that like the electromagnetic and strong interactions as manifested in nuclei, the weak interaction should be carried by a field quantum. This field quantum, or intermediate boson, is actually expected to appear in three forms, the $W^+$, $W^-$, and $Z^0$. Indeed, in 1983 evidence was obtained for the $W$'s, as well as for the $Z^0$. These spin 1 particles are quite massive, with the $W$'s having a mass of about $80 \times 10^3$ MeV/$c^2$ = 80 GeV/$c^2$ and the $Z^0$ having a mass of about 90 GeV/$c^2$. Just as we saw that the massless photon gives the electromagnetic interaction a long range, and the $\sim 140$ MeV/$c^2$ pion gives the strong interaction a short range, so we see that the massive intermediate bosons give the weak interaction an extremely short range. In fact, the weak interaction is not intrinsically weak; it is the very large mass of its field quanta which makes it appear so.

17-6 STRANGENESS

In the same year, 1947, that the pion was discovered in cosmic rays, some peculiar cosmic ray events were seen giving V-shaped tracks in cloud chambers. Because the initial work had to be done only with cosmic rays, it took time to learn about these particles. But it was clear that they were produced by strong interactions, since the process had a large cross section, and yet they decayed by weak interactions because their lifetimes were long. For example, a typical observation was

$$\pi^- + p \;\text{(strong)} \to V^0 \to \pi^- + p \;\text{(weak)} \tag{17-29}$$

The $V^0$'s measured lifetime of $10^{-10}$ sec is to be compared with the expected lifetime of $10^{-23}$ sec, if the decay process involves the strong interaction just as the production process does. Except for the lifetime, the production reaction appears to be just the reverse of the decay reaction and hence, by detailed balancing, if the production is strong the decay ought to be also. Instead, the decay rate is $10^{-13}$ of the production rate. This is why the $V^0$'s were called "strange" particles.

It was not until 1953 when they could be produced in an accelerator, the Brookhaven Cosmotron, that it was proved that two of these particles were produced in association with each other, and the idea of Gell-Mann and Nishijima was borne out that their behavior could be understood in terms of a new additive quantum number. The point is illustrated by a typical reaction for producing strange particles

$$\pi^- + p \to \Lambda^0 + K^0 \tag{17-30}$$

where $\Lambda^0$ and $K^0$ are symbols now used for two of the strange particles.
If we assign the new additive quantum number, called the "strangeness" $S$, values such that $S = 0$ for the ordinary particles $\pi$ and $p$, but $S = +1$ for the $K^0$ and $S = -1$ for the $\Lambda^0$, then $S$ will be conserved in this strong interaction. On the other hand, the typical decay, which is really that of (17-29) in modern notation

$$\Lambda^0 \to \pi^- + p \tag{17-31}$$

will not conserve $S$. Hence it cannot occur by the strong interaction, but must involve the weak interaction. To recapitulate, $\Lambda^0$ and $K^0$ particles are produced in association at a high rate (large cross section) in processes involving the strong interaction. They each decay independently, because they have flown apart, in processes involving the weak interaction. The decays occur at a low rate (long lifetime) because changing $S$ requires the interaction to be weak.

Because of the long decay lifetimes and also because of some neutral decay modes, both strange particles in one interaction were not seen in the original cosmic ray observations that used small gas cloud chambers. A more modern visualization of the production reaction (17-30) is shown in Figure 17-19, which is a photograph of tracks in a large liquid hydrogen bubble chamber. An incident $\pi^-$ strikes a $p$ in the hydrogen, producing a $\Lambda^0$ and $K^0$, with the $\Lambda^0$ decaying into a $p$ and $\pi^-$, as in (17-31), and the $K^0$ decaying into a $\pi^+$ and $\pi^-$, as we shall discuss later.

What is now called the $\Lambda^0$ particle, since that is a V (the appearance of its decay mode in a cloud chamber) upside down, has the rest mass

$$m_\Lambda = 1116\ \text{MeV}/c^2 \tag{17-32}$$

This may be compared with the neutron and proton rest masses of 940 and 938 MeV/$c^2$. The value of this mass, as well as the need to conserve baryon number in the reaction (17-30), suggests that the $\Lambda^0$ is a strange version of the nucleon; i.e., a baryon. Experiment has shown that like the neutron, the $\Lambda^0$ particle is a neutral spin 1/2 fermion.
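The bookkeeping behind the statement that (17-30) conserves strangeness while (17-31) does not can be made explicit by summing the additive quantum numbers on each side. The sketch below is illustrative only: the names `QUANTUM_NUMBERS`, `totals`, and `changes` are hypothetical, and the $(Q, B, S)$ values are the assignments given in the text.

```python
# (Q, B, S) for each particle, as assigned in the text.
QUANTUM_NUMBERS = {
    "pi-":     (-1, 0, 0),
    "pi+":     (1, 0, 0),
    "p":       (1, 1, 0),
    "n":       (0, 1, 0),
    "K0":      (0, 0, 1),
    "Lambda0": (0, 1, -1),
}


def totals(names):
    """Sum Q, B, and S over a list of particle names."""
    return tuple(sum(QUANTUM_NUMBERS[n][i] for n in names) for i in range(3))


def changes(initial, final):
    """Return the change (final minus initial) in Q, B, and S."""
    before, after = totals(initial), totals(final)
    return {label: a - b for label, b, a in zip("QBS", before, after)}


# Associated production (17-30): pi- + p -> Lambda0 + K0
production = changes(["pi-", "p"], ["Lambda0", "K0"])
# Decay (17-31): Lambda0 -> pi- + p
decay = changes(["Lambda0"], ["pi-", "p"])

if __name__ == "__main__":
    print("production (17-30):", production)
    print("decay (17-31):     ", decay)
```

In the production reaction all three changes vanish, consistent with a strong process. In the decay, $Q$ and $B$ are unchanged but $S$ changes by one unit, which is the signature of the weak interaction discussed above.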
Also like the neutron, the $\Lambda^0$ parity is taken to be positive by convention, since $S$-conservation prevents determining the relative neutron-$\Lambda^0$ parity. Because there is no other particle of similar rest mass, the $\Lambda^0$ is the only member of an isospin singlet, i.e., the $\Lambda^0$ has $T = 0$ and $T_z = 0$.

Having discussed the baryon $\Lambda^0$, we turn to its associatively produced $K$ meson. Experiments have shown that there are four $K$ mesons, the positively and negatively charged $K^+$ and $K^-$, and the neutral $K^0$ and $\bar{K}^0$. Like the $\pi$ mesons, the $K$ mesons are all spin 0 bosons of odd intrinsic parity, where the parity has been measured relative to that of the $\Lambda^0$. Their rest masses are

$$m_{K^+} = m_{K^-} = 494\ \text{MeV}/c^2 \tag{17-33}$$

and

$$m_{K^0} = m_{\bar{K}^0} = 498\ \text{MeV}/c^2 \tag{17-34}$$

Assuming that, as in nuclear physics, isospin and its $z$ component are conserved in strong interactions involving strange particles, we can use the production reaction (17-30) to assign quantum numbers to the $K$ mesons. Since $T = 1$ for the $\pi^-$, $T = 1/2$ for the proton $p$, and $T = 0$ for the $\Lambda^0$, the only possibilities for the $K^0$ are $T = 1/2$ or $T = 3/2$. If the latter were true, there would be a quartet of $T_z$ values and the $K$ meson family would have to span a range of four different electric charge states. But, in fact, there are only three charge states: $Q = -1, 0, +1$. Therefore $T = 1/2$ for the $K^0$ and the other $K$ mesons.

Figure 17-19 The associated production of a $\Lambda^0$ and a $K^0$ in a hydrogen bubble chamber. An incident $\pi^-$ interacts with a $p$ of the liquid hydrogen filling the chamber. The $K^0$ decays into a $\pi^+$ and a $\pi^-$. The $\Lambda^0$ decays into a $p$ and a $\pi^-$. The production takes place through the strong interaction, but the decays each utilize the weak interaction. The curvature of each particle in the applied magnetic field is used to identify the particle. (Courtesy Lawrence Berkeley Laboratory)
Note also that since $T_z$ has the values $-1$ for the $\pi^-$, $+1/2$ for the $p$, and 0 for the $\Lambda^0$, it must have the value $T_z = -1/2$ for the $K^0$. In consideration of the way $Q$ depends on $T_z$ in other situations, we naturally say that the $K$ meson with $T = 1/2$, $T_z = +1/2$ is the $K^+$. The $K^-$ is the antiparticle of the $K^+$, and the $\bar{K}^0$ is the antiparticle of the $K^0$, so the $K^-$ and $\bar{K}^0$ have values of $T_z$ opposite to those of the $K^+$ and $K^0$, respectively. Thus the $K^+$ and $K^0$ form one isospin doublet and the $K^-$ and $\bar{K}^0$ form another. Note that, unlike the $\pi^0$ which is identical to $\bar{\pi}^0$, the $K^0$ and $\bar{K}^0$ are quite different particles. The reason is that the value of $S$ for an antiparticle is the negative of its value for the particle, just as is the case for the quantum numbers $B$, $L_e$, $L_\mu$, and $L_\tau$. Thus $S = +1$ for the $K^+$ and $K^0$ and $S = -1$ for the $K^-$ and $\bar{K}^0$. This difference in the value of $S$ has many experimental consequences. For example, the reaction

$$\bar{K}^0 + p \to \Lambda^0 + \pi^+ \tag{17-35}$$

is possible (i.e., it conserves $Q$, $B$, $T$, and $S$), but no similar reaction can take place with a $K^0$.

Notice that the nucleon, which is a baryon, has half-integer isospin and the pion, which is a meson, has integer isospin; but the baryon $\Lambda$ has integer isospin and the meson $K$ has half-integer isospin. Clearly the existence of strangeness changes the relationship (17-15) among charge, baryon number, and isospin. If we are to include all particles introduced so far, (17-15) now becomes

$$Q = T_z + \frac{B + S}{2} \tag{17-36}$$

For pions and nucleons, which have $S = 0$, this reduces to (17-15). But for the $\Lambda^0$ it properly tells us that $T_z$ is 0, while for the $K$'s it predicts correctly that $T_z$ is either $+1/2$ or $-1/2$.

Example 17-5. Verify the statements made immediately above about the $\Lambda^0$ particle and the $K$ mesons, and determine the value of $T_z$ for each $K$.

■ The $\Lambda^0$ has $Q = 0$, $B = +1$, and $S = -1$.
Hence (17-36) becomes

$$0 = T_z + \frac{1 - 1}{2} \quad \text{or} \quad T_z = 0$$

For the $K^+$, these values are $Q = +1$, $B = 0$, and $S = +1$. So

$$1 = T_z + \frac{1}{2} \quad \text{or} \quad T_z = +1/2$$

For the $K^-$ they are $Q = -1$, $B = 0$, and $S = -1$, giving

$$-1 = T_z + \frac{-1}{2} \quad \text{or} \quad T_z = -1/2$$

The $K^0$ has $Q = 0$, $B = 0$, and $S = +1$. Hence

$$0 = T_z + \frac{1}{2} \quad \text{yielding} \quad T_z = -1/2$$

Finally, the $\bar{K}^0$ has $Q = 0$, $B = 0$, and $S = -1$. Thus

$$0 = T_z + \frac{-1}{2} \quad \text{and} \quad T_z = +1/2$$

The Gell-Mann-Nishijima relation (17-36) also tells us that in deciding whether a strong interaction takes place we need to check only three out of four quantities, $Q$, $T_z$, $B$, and $S$, since all must be conserved but are related through (17-36). For example, once we know the $T_z$ assignments for particles of given $Q$ and $B$, we do not have to be concerned about $S$.

Applying this to particle decays, we have seen from (17-31), or the closely related decay

$$\Lambda^0 \to n + \pi^0 \tag{17-37}$$

that $S$ is not conserved in weak processes. Rather, in weak interactions when $S$ is nonzero, it changes by one unit, so that $\Delta S = 1$. This rule could also be expressed as $\Delta T_z = 1/2$ in weak strange particle decays. That is, the $\Lambda^0$ with $T_z = 0$ decays into a neutron $n$ with $T_z = -1/2$ and a $\pi^0$ with $T_z = 0$, corresponding to a total change of the $z$ component of isospin of one-half. Not only are $S$ and $T_z$ not conserved in the weak decay, but $T$ is not conserved either. In (17-37), $T = 0$ initially and $T = 1/2$ or 3/2 in the final state since $T = 1/2$ for the nucleon and $T = 1$ for the pion. Detailed consideration of the decay rates shows that the predominant decay occurs for $\Delta T = 1/2$, so in this case the pion-nucleon system is formed in the $T = 1/2$ state.

The same rules of course apply to $K$ decays. For example

$$K^0 \to \pi^- + \pi^+ \tag{17-38}$$

has $\Delta S = 1$, $\Delta T_z = 1/2$ (i.e., $-1/2 \to -1 + 1$), and $\Delta T = 1/2$ ($1/2 \to 0$ or 1, but not 2). Notice that the strange particle decays we have discussed so far, (17-31), (17-37), and (17-38), are unlike any previous weak decay in that only strongly interacting particles are involved. These nonleptonic processes are weak decays because strangeness is not conserved, and they do not have to involve leptons because the particle decaying does not possess lepton number. However, strange particles also have semileptonic decays, such as

$$K^+ \to \pi^0 + e^+ + \nu_e \tag{17-39}$$

This again displays $\Delta S = 1$, $\Delta T_z = 1/2$ ($1/2 \to 0$), and $\Delta T = 1/2$ ($1/2 \to 1$), since only the $K^+$ and $\pi^0$ have nonzero $T$. There are also purely leptonic decays, like

$$K^+ \to \mu^+ + \nu_\mu \tag{17-40}$$

We shall discuss the $K^0$ lifetime separately, since it is unusual. But the decay of (17-38) has a lifetime of about $10^{-10}$ sec, while the $K^+$ or $K^-$ has a lifetime close to that of the pion, about $10^{-8}$ sec, quite a lot longer than the $10^{-10}$ sec lifetime of the $\Lambda^0$ or $K^0$. The reason the decay

$$K^+ \to \pi^+ + \pi^0 \tag{17-41}$$

has a lifetime two orders of magnitude longer than the decay (17-38) is that in (17-41) $\Delta T = 1/2$ is not possible. Note that the $\pi^+\pi^0$ state has $T_z = +1$ so it cannot have $T = 0$. And since the $\pi$'s have spin 0, the spin part of the eigenfunction is symmetric, as is also the $(-1)^l$ space part, because the zero spin of the $K$ forces the orbital angular momentum to be zero. Thus the isospin part of the eigenfunction must be symmetric too, and this is the $T = 2$ state because two parallel vectors are symmetric with respect to label interchange. Since the decay involves $T = 1/2 \to T = 2$, it is inhibited because $\Delta T = 3/2$.

In the decays (17-38) or (17-41), we have noted that since both the $K$ and $\pi$ mesons have zero spin, angular momentum conservation requires that the two $\pi$'s be emitted in a state of zero orbital angular momentum. Thus the parity of the final state is the product of the negative intrinsic parities of the two pions times the orbital factor of $(-1)^l$, where $l = 0$, giving an overall positive parity. As the $K$ meson was discovered prior to 1957 when parity violation in the weak interaction became known, it was thought that it, then called the $\theta$, had positive intrinsic parity. On the other hand, a particle of similar mass and lifetime, then called the $\tau$ (not to be confused with the much more recently discovered lepton which now goes by that symbol), was observed to decay into three pions. Now if the $\tau$ like the $\theta$ has zero spin, for which there was evidence, then it must have negative intrinsic parity to be a different particle from the $\theta$, if parity is conserved.

To understand why the $\tau$ would have negative intrinsic parity requires a more detailed explanation. That the product of the intrinsic parities of the three pions in the final state would give an overall negative parity is clear; the question is how to handle the possible orbital angular momenta of the three particles. Consider, for the sake of definiteness, the $\tau^+$ in the reference frame in which it is at rest. Whatever motion the three particles into which it decays have in its rest frame, their orbital angular momenta can be broken up into that (call it $L$) of the $\pi^+\pi^+$ system and that (call it $l$) of the $\pi^-$ about the center of mass of the two $\pi^+$'s. The overall parity of the final state is then

$$(-1)^3(-1)^l(-1)^L = -(-1)^{2l} = -1 \tag{17-42}$$

The first equality depends on the fact that the vector sum of $\mathbf{l}$ and $\mathbf{L}$ must add to zero to conserve angular momentum, so their magnitudes must be equal and $l = L$.

As the properties of the $\tau$ and $\theta$ were found experimentally to be more and more alike, it became ever more difficult to believe they were not the same particle. Inspired by this, Lee and Yang analyzed past experiments and found that there was no compelling evidence for parity conservation in weak interactions. They proposed tests, one of which is discussed in Section 16-4, which proved that indeed parity is not conserved in these interactions. The $\tau$ and $\theta$ then became the same particle, called the $K$ meson.
Since weak interactions do not conserve parity, strong interactions must always be employed in determinations of particle intrinsic parities. However, because of strangeness conservation, no strong interaction will involve just a single strange particle. Therefore, it is impossible to determine the parity of a strange particle relative to nonstrange particles. Thus the intrinsic parity of the $\Lambda$ is defined to be even and, with respect to that definition, the parity of the $K$ is odd.

While the $K$ and $\Lambda$ were the first strange particles observed, a large number are now known. We shall discuss just a few of these, starting with those which are strongly interacting fermions (i.e., baryons) that decay via the electromagnetic or weak interactions. Any baryon possessing strangeness is also called a hyperon. The hyperons can be classified according to their strangeness, with values of $S = -1$, $-2$, and $-3$ being possible. Like the $\Lambda^0$, the $\Sigma$ hyperon has $S = -1$. But instead of being an isospin singlet, it is an isospin triplet with the $\Sigma^-$, $\Sigma^0$, and $\Sigma^+$ having $T = 1$ and $T_z = -1$, 0, and $+1$, respectively. The three $\Sigma$ particles have nearly the same mass

$$m_\Sigma \simeq 1190\ \text{MeV}/c^2 \tag{17-43}$$

and spin 1/2 with even parity. The $S = -2$ hyperons constitute an isospin doublet whose members are called $\Xi$ particles. The $\Xi^0$ with $T_z = +1/2$ and the $\Xi^-$ with $T_z = -1/2$ have roughly the same mass

$$m_\Xi \simeq 1320\ \text{MeV}/c^2 \tag{17-44}$$

and spin 1/2 with even intrinsic parity. Finally, there is an $S = -3$ isospin singlet, the $\Omega^-$ particle of rest mass

$$m_{\Omega^-} \simeq 1670\ \text{MeV}/c^2 \tag{17-45}$$

This $T = 0$, $T_z = 0$ particle has spin 3/2 and even intrinsic parity.

Each of the $\Sigma$, $\Xi$, and $\Omega$ particles is produced in a high-energy collision through the strong interaction in association with other particles in such a way as to conserve strangeness. For instance, the $\Xi^-$ with $S = -2$, which was first discovered in cosmic rays, is produced in association with two $K$ mesons that both have $S = +1$.
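The hyperon assignments just listed can be checked against the Gell-Mann-Nishijima relation (17-36), in the same way that Example 17-5 treats the $\Lambda^0$ and the $K$ mesons. The following is a small sketch under stated assumptions: the table `ASSIGNMENTS` and the function `t_z_from_gmn` are illustrative names, the charge of each particle is read from its superscript, and $B = +1$ is assumed for every hyperon.

```python
from fractions import Fraction as F

# name: (Q, B, S, T_z quoted in the text)
ASSIGNMENTS = {
    "Lambda0": (0, 1, -1, F(0)),
    "Sigma+":  (1, 1, -1, F(1)),
    "Sigma0":  (0, 1, -1, F(0)),
    "Sigma-":  (-1, 1, -1, F(-1)),
    "Xi0":     (0, 1, -2, F(1, 2)),
    "Xi-":     (-1, 1, -2, F(-1, 2)),
    "Omega-":  (-1, 1, -3, F(0)),
    "K+":      (1, 0, 1, F(1, 2)),
    "K0":      (0, 0, 1, F(-1, 2)),
    "K0bar":   (0, 0, -1, F(1, 2)),
    "K-":      (-1, 0, -1, F(-1, 2)),
}


def t_z_from_gmn(q, b, s):
    """Solve Q = T_z + (B + S)/2, equation (17-36), for T_z."""
    return F(q) - F(b + s, 2)


def check_all():
    """Return the names whose quoted T_z disagrees with (17-36)."""
    return [
        name
        for name, (q, b, s, t_z) in ASSIGNMENTS.items()
        if t_z_from_gmn(q, b, s) != t_z
    ]


if __name__ == "__main__":
    for name, (q, b, s, t_z) in ASSIGNMENTS.items():
        print(f"{name:8s} T_z from (17-36) = {t_z_from_gmn(q, b, s)}")
    print("mismatches:", check_all())
```

An empty mismatch list means every quoted $T_z$ agrees with (17-36). The check is only a consistency test of the assignments; it does not by itself establish the isospin $T$ of each multiplet.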
With one exception, these hyperons decay by the weak interaction. As an example, the $\Xi$ decay

$$\Xi^- \to \Lambda^0 + \pi^- \tag{17-46}$$

has a lifetime of about $10^{-10}$ sec, which we have seen is typical of a weak interaction. Because of the sequential decays, (17-46) followed by especially (17-31), the $\Xi$ is often called the cascade hyperon. The exceptional hyperon decay is that of the $\Sigma^0$, which decays electromagnetically according to the scheme

$$\Sigma^0 \to \Lambda^0 + \gamma \tag{17-47}$$

with a lifetime of about $10^{-19}$ sec. Note that in this electromagnetic interaction the $z$ component of isospin is conserved since $T_z = 0$ for the photon, the $\Sigma^0$, and the $\Lambda^0$. But this is required by the Gell-Mann-Nishijima relation, (17-36), and the obvious conservation of strangeness in the decay (17-47). It is generally true that $T_z$ and hence $S$ are conserved in the electromagnetic interaction. It is $T_z$ or $S$ conservation and the values of the masses (i.e., $K$'s cannot be produced to carry off $S$) which prevent the $\Xi$ or $\Omega$ decays from proceeding relatively rapidly by the electromagnetic interaction. Unlike the strong interaction, however, the electromagnetic interaction does not conserve $T$. Recall that isospin conservation in the strong interaction is a way of expressing its charge independence. Since the electromagnetic interaction is obviously not charge independent, it cannot conserve isospin. Another way of saying this which will be useful later is that although the photon has $T_z = 0$, it is a mixture of $T = 0$ and $T = 1$, so in interactions involving a photon $T$ does not have a definite value.

In addition to the $\Lambda$, $\Sigma$, $\Xi$, and $\Omega$ hyperons which decay via the weak or electromagnetic interactions, there are known a large number of strange particles, both mesons and hyperons, which decay via the strong interaction. Although these particles exist only very briefly ($\sim 10^{-23}$ sec), they are in every other way equivalent to the baryons and mesons we have discussed. It is just an accident of their higher mass, permitting them to decay into other strongly interacting particles, which makes them seem so different. There are many nonstrange particles which decay strongly also. We shall discuss them further in the next section, but here we shall simply mention the classes of short-lived strange particles. At the time of writing, about 14 $\Lambda$-like particles ($S = -1$, $T = 0$) were known, ranging in mass up to about 2600 MeV/$c^2$. There were about 12 $\Sigma$-like particles ($S = -1$, $T = 1$) going up to about the same mass. While only the one $S = -3$ particle was known, there were at least four $\Xi$-like particles ($S = -2$, $T = 1/2$). In addition to these hyperons, there were about 7 $K$-like mesons ($S = +1$, $T = 1/2$) which decay strongly, with masses ranging up to about 1800 MeV/$c^2$. As an example of a strong decay involving strange particles, consider the $K^*$ meson which has a mass of about 890 MeV/$c^2$, allowing the decay

$$K^{*+} \to K^+ + \pi^0 \tag{17-48}$$

with lifetime $\sim 10^{-23}$ sec because $S$ (or $T_z$), $T$, and $Q$ are conserved.

17-7 FAMILIES OF ELEMENTARY PARTICLES

Table 17-1 lists the particles we have discussed that are stable, or else decay only by the electromagnetic or weak interactions. Related particles are grouped into families: the photon, the leptons, the mesons, and the baryons. Both the leptons and the baryons are fermions, and both the photon and the mesons are bosons. The mesons and baryons, i.e., the particles that participate in the strong interaction, are called collectively hadrons, and this term is widely used. The entries in the table are: family name; particle symbol; rest mass; lifetime; charge $Q$; intrinsic spin $s$; lepton number $L_e$, or $L_\mu$, or $L_\tau$; baryon number $B$; and, where appropriate, intrinsic parity $P$; isospin $T$; isospin $z$ component $T_z$; strangeness $S$.

Table 17-1.
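Particles that are Stable or Decay either Weakly or Electromagnetically

| Generic Name | Particle Symbol | Rest Mass (MeV/$c^2$) | Lifetime (sec) | Charge $Q$ | Intrinsic Spin $s$ | Lepton Number $L_e$, $L_\mu$, or $L_\tau$ | Baryon Number $B$ | Intrinsic Parity $P$ | Isospin $T$ | Isospin $z$ component $T_z$ | Strangeness $S$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Photon | $\gamma$ | 0 | stable | 0 | 1 | 0 | 0 | Odd | 0, 1 | 0 | 0 |
| Leptons | $\nu_e$ | 0 | stable | 0 | 1/2 | +1 | 0 | | | | |
| | $\nu_\mu$ | 0 | stable | 0 | 1/2 | +1 | 0 | | | | |
| | $\nu_\tau$ | 0 | stable | 0 | 1/2 | +1 | 0 | | | | |
| | $e^-$ | 0.511 | stable | -1 | 1/2 | +1 | 0 | | | | |
| | $\mu^-$ | 105.7 | $2.2 \times 10^{-6}$ | -1 | 1/2 | +1 | 0 | | | | |
| | $\tau^-$ | 1784 | $5 \times 10^{-13}$ | -1 | 1/2 | +1 | 0 | | | | |
| Mesons | $\pi^+$ | 139.6 | $2.6 \times 10^{-8}$ | +1 | 0 | 0 | 0 | Odd | 1 | +1 | 0 |
| | $\pi^0$ | 135.0 | $8 \times 10^{-17}$ | 0 | 0 | 0 | 0 | Odd | 1 | 0 | 0 |
| | $\pi^-$ | 139.6 | $2.6 \times 10^{-8}$ | -1 | 0 | 0 | 0 | Odd | 1 | -1 | 0 |
| | $K^+$ | 493.8 | $1.2 \times 10^{-8}$ | +1 | 0 | 0 | 0 | Odd | 1/2 | +1/2 | +1 |
| | $K^0$ | 497.8 | $8.9 \times 10^{-11}$ and $5.2 \times 10^{-8}$ | 0 | 0 | 0 | 0 | Odd | 1/2 | -1/2 | +1 |
| | $\bar{K}^0$ | 497.8 | $8.9 \times 10^{-11}$ and $5.2 \times 10^{-8}$ | 0 | 0 | 0 | 0 | Odd | 1/2 | +1/2 | -1 |
| | $K^-$ | 493.8 | $1.2 \times 10^{-8}$ | -1 | 0 | 0 | 0 | Odd | 1/2 | -1/2 | -1 |
| | $\eta^0$ | 549 | $8 \times 10^{-19}$ | 0 | 0 | 0 | 0 | Odd | 0 | 0 | 0 |
| | $\eta'$ | 958 | $2 \times 10^{-21}$ | 0 | 0 | 0 | 0 | Odd | 0 | 0 | 0 |
| Baryons | $p$ | 938.3 | stable | +1 | 1/2 | 0 | +1 | Even | 1/2 | +1/2 | 0 |
| | $n$ | 939.6 | 925 | 0 | 1/2 | 0 | +1 | Even | 1/2 | -1/2 | 0 |
| | $\Lambda^0$ | 1116 | $2.6 \times 10^{-10}$ | 0 | 1/2 | 0 | +1 | Even | 0 | 0 | -1 |
| | $\Sigma^+$ | 1189 | $8.0 \times 10^{-11}$ | +1 | 1/2 | 0 | +1 | Even | 1 | +1 | -1 |
| | $\Sigma^0$ | 1192 | $6 \times 10^{-20}$ | 0 | 1/2 | 0 | +1 | Even | 1 | 0 | -1 |
| | $\Sigma^-$ | 1197 | $1.5 \times 10^{-10}$ | -1 | 1/2 | 0 | +1 | Even | 1 | -1 | -1 |
| | $\Xi^0$ | 1315 | $2.9 \times 10^{-10}$ | 0 | 1/2 | 0 | +1 | Even | 1/2 | +1/2 | -2 |
| | $\Xi^-$ | 1321 | $1.6 \times 10^{-10}$ | -1 | 1/2 | 0 | +1 | Even | 1/2 | -1/2 | -2 |
| | $\Omega^-$ | 1672 | $8.2 \times 10^{-11}$ | -1 | 3/2 | 0 | +1 | Even | 0 | 0 | -3 |

The leptons and baryons all have antiparticles, although they are not shown in the table. Compared to a lepton or a baryon, the "quantum numbers" of its antiparticle have values with: opposite $Q$; same $s$; opposite $L_e$, or $L_\mu$, or $L_\tau$, or $B$; and, for baryons, opposite $P$; same $T$; opposite $T_z$; opposite $S$. An antiparticle has the same rest mass, and also the same lifetime, as the particle. The reason for these two equalities will be discussed in the next section.

The antiparticles of the mesons are shown in the table. We have already discussed the fact that the $K^-$ and $\bar{K}^0$ are, respectively, the antiparticles of the $K^+$ and $K^0$.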
Inspection of the table will confirm that the relations between the quantum numbers of the K⁺ and K⁻, and of the K⁰ and K̄⁰, agree with the particle-antiparticle rules quoted earlier for leptons and baryons, except that the intrinsic parity does not change in the K, anti-K case. The predicted (and experimentally confirmed) particle-antiparticle parity rules reflect the facts that mesons are bosons, and that baryons are fermions. Similarly, the π⁺ and π⁻ are particle and antiparticle, while the π⁰ is its own antiparticle, as we have already discussed. Two entries in the table have not been mentioned yet; they are the η⁰ and η′ mesons. Like the π⁰, these nonstrange mesons decay electromagnetically and are their own antiparticles. They are very like the π⁰, except that they have T = 0 and greater masses. The main decay of the η⁰ is, again like the π⁰, into two photons. But its larger mass gives the η⁰ a much shorter lifetime. Since the η′ is even more massive, it has a still shorter lifetime. However, its large mass makes the decay into an η⁰ and two π's more favorable than the decay into photons. Omitted from the table are the graviton, W⁺, W⁻, Z⁰, and the extremely numerous particles which decay via strong interactions. It should be emphasized again that the short-lived particles are in every way equivalent to the other particles, except for their lifetimes; they are excluded only to avoid making the table too long. But a few of the short-lived particles need to be discussed, since they are quite important. The first short-lived particle found was not immediately recognized as such. In pion-nucleus scattering experiments performed by Fermi and others in 1952 it was found that there is a strong resonance in the cross section at a pion bombarding energy of 195 MeV.
Figure 17-20 shows the π⁺p elastic scattering cross section as a function of the quantity s, the square of the total center-of-mass energy of the system including the pion and nucleon rest masses. Since the π⁺ has T = 1, T_z = +1 and the p has T = 1/2, T_z = +1/2, the system is in the T = 3/2, T_z = +3/2 state. (The π⁻p system in the T = 3/2, T_z = −1/2 state shows the same kind of cross-section resonance at the same energy, providing thereby additional evidence for the conclusion that, while the strong interaction depends on T, it does not depend on T_z.) The full width at half-maximum, Γ, of the resonance, whose peak occurs at a total energy of 1232 MeV, is about 120 MeV. This means that the pion and proton must temporarily form a composite entity that holds together for a time t ≈ ħ/Γ ~ 10⁻¹⁵ eV-sec/10⁸ eV ~ 10⁻²³ sec. If moving at a characteristic velocity of c/3, the entity would maintain its existence over a distance d ~ ct/3 ~ 10⁸ m/sec × 10⁻²³ sec ~ 10⁻¹⁵ m, which is the range of the strong interaction. It is therefore not unreasonable to speak of a pion and a proton forming a very short-lived particle, which is called the Δ(1232). It has a definite set of quantum numbers: s = 3/2, B = 1, P = even, T = 3/2, S = 0. But its mass is not definite, and it would be best expressed as 1232 ± 60 MeV/c². The indefiniteness of the mass is just what would be expected from the uncertainty principle, the energy uncertainty of 120 MeV corresponding to the time uncertainty of ~10⁻²³ sec. Many more pion-nucleon resonances were later found. Some, like the Δ(1232), have T = 3/2 and some, like the N(1440), have T = 1/2 just as does the nucleon. At the time of writing about 13 of the T = 3/2 particles, called Δ's, were known, ranging in mass up to about 3200 MeV/c² and in spin up to at least 11/2.
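As a rough numerical check of the estimates just made, the following sketch converts the 195 MeV pion bombarding energy into a total center-of-mass energy, and turns the 120 MeV width into a lifetime and a range. It is an illustration only: the function names are ours, the relation s = m_π² + m_p² + 2m_pE_π assumes a proton target at rest in the laboratory (an assumption not stated explicitly in the text), and the constants are the rounded values quoted in Table 17-1 and in the list of useful constants.

```python
import math

HBAR_EV_SEC = 0.6582e-15   # hbar in eV-sec
C_M_PER_SEC = 2.998e8      # speed of light in m/sec
M_PION = 139.6             # charged pion rest mass, MeV/c^2 (Table 17-1)
M_PROTON = 938.3           # proton rest mass, MeV/c^2 (Table 17-1)


def cm_energy(kinetic_energy_mev, m_beam=M_PION, m_target=M_PROTON):
    """Total center-of-mass energy sqrt(s) in MeV, target assumed at rest."""
    total_beam_energy = kinetic_energy_mev + m_beam
    s = m_beam**2 + m_target**2 + 2.0 * m_target * total_beam_energy
    return math.sqrt(s)


def lifetime_from_width(width_mev):
    """Uncertainty-principle estimate t ~ hbar / Gamma, in sec."""
    return HBAR_EV_SEC / (width_mev * 1.0e6)


def range_from_lifetime(t_sec, speed_fraction=1.0 / 3.0):
    """Distance travelled in time t at speed_fraction * c, in m."""
    return speed_fraction * C_M_PER_SEC * t_sec


sqrt_s = cm_energy(195.0)               # pion bombarding energy from the text
t_delta = lifetime_from_width(120.0)    # full width of the resonance
d_delta = range_from_lifetime(t_delta)  # characteristic velocity c/3

print(f"sqrt(s) = {sqrt_s:.0f} MeV")
print(f"t = {t_delta:.1e} sec")
print(f"d = {d_delta:.1e} m")
```

Under these assumptions the peak energy comes out within a few MeV of 1232 MeV, and the lifetime and range come out somewhat below, but within an order of magnitude of, the 10⁻²³ sec and 10⁻¹⁵ m quoted above, which is all that an uncertainty-principle estimate can be expected to give.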
Above the nucleon in mass, and going up to about 3000 MeV/c², there were around 17 known T = 1/2 particles, the N's, with spins again as large as 11/2.

Figure 17-20 The elastic scattering cross section for π⁺ mesons on protons, as a function of the square of the total center-of-mass energy of the system, s (in units of 10⁶ MeV²). Note the peaks in the cross section, which are the pion-nucleon resonances (or short-lived baryons) described in the text.

Just as in the strange particle case, there are short-lived mesons as well as baryons. One particularly important class is the vector mesons. They are so called because they have spin 1, which has three components just as does any spatial vector that has the three components x, y, and z. The first short-lived meson found was the ρ meson. It could be seen as a resonance in π-π scattering, although this required some interpretation since one pion is not free but is in the field around a nucleon. The existence of a particle such as the ρ can be more directly inferred just by measuring the momenta of its decay products and reconstructing from that information the mass of the parent particle. In the case of the ρ this can be viewed as a two-step process

$$\pi^{-} + p \rightarrow \rho^{0} + n \rightarrow \pi^{+} + \pi^{-} + n \qquad (17\text{-}49)$$

all of which takes place very rapidly. The momenta and rest masses of the two pions give a ρ rest mass of 769 ± 77 MeV/c². Thus the mass uncertainty, or mass width, is about the same as for the Δ(1232), and hence the lifetimes are also about the same. The quantum numbers of the ρ meson are s = 1, B = 0, P = odd, T = 1, S = 0. Another short-lived meson, the ω, has the same quantum numbers except that T = 0. Its rest mass is 783 ± 5 MeV/c², and it decays mainly into three pions. Yet another vector meson, which has quantum numbers identical to the ω, is the φ, with a mass of 1020 ± 2 MeV/c². The φ decays predominantly into two K mesons of the opposite strangeness.
Since there is barely enough energy for that decay to occur, there is very little volume in phase space available; that is, there are very few final states which the decay can populate. This reduces the decay rate and so makes the width narrower. The reason the φ does not decay into pions will be discussed in the next chapter. There are still heavier vector mesons, but the ρ, ω, and φ are the most important. One importance is the role the vector mesons, especially the ω, are believed to play in producing the short-range repulsive core in the potential. Another interesting way in which vector mesons appear is in the high-energy interaction of photons. Except for having T = 0 and 1, photons have exactly the same quantum numbers as the vector mesons. Thus photons can become vector mesons for times short enough to satisfy the uncertainty principle, just like the pions which are emitted and absorbed by nucleons in the manner described in Section 17-4. Since the vector mesons interact strongly, this is the predominant way in which a high-energy photon interacts. In this sense, the electromagnetic interaction becomes like the strong interaction at high energy. But photon cross sections for interaction with nucleons are still only about 1/200 that of pion cross sections because the photon infrequently turns into a vector meson. There are many other strongly decaying mesons which we shall not discuss, such as those having spin 2. Table 17-1 does not list them or other strongly decaying particles. And there are even weakly decaying particles not listed there. Many of these will be discussed in the next chapter, where we will learn that some particles can have strangeness-like quantum numbers which we have not encountered yet. With so many particles existing, it is not surprising that they cannot all be considered elementary; that subject will also be taken up in the next chapter.

17-8 OBSERVED INTERACTIONS AND CONSERVATION LAWS

Particles which decay by the strong, electromagnetic, and weak interactions have been introduced, and many of their properties have been discussed. These three interactions, plus the gravitational interaction, constitute the four interactions observed in nature as we normally perceive them. (In the next chapter the true character of these interactions will be introduced.) Table 17-2 summarizes the properties of the four observed interactions. In the table, the intrinsic strength comparison depends to a certain extent on the choice of exactly what attribute of the strength is to be compared; the numbers quoted are obtained from comparisons made in the manner of Section 16-4.

Table 17-2. The Observed Interactions

Name | Intrinsic strength | Field quantum: name | Field quantum: rest mass | Field quantum: spin | Range | Sign
Strong (nuclear) | 1 | Pion | ~10² MeV/c² (with heavier mesons for repulsive core) | 0 | ~10⁻¹⁵ m (with smaller repulsive core) | Attractive overall (but with repulsive core)
Electromagnetic | 10⁻² | Photon | 0 | 1 | Long (∝ 1/r) | Attractive or repulsive
Weak (β decay) | 10⁻¹⁴ | Intermediate boson | ~10⁵ MeV/c² | 1 | ~10⁻¹⁸ m | Not applicable
Gravitational | 10⁻⁴⁰ | Graviton | 0 | 2 | Long (∝ 1/r) | Always attractive

All of the entries in the table have been discussed previously, except for the characteristics of the quantum of the gravitational field. The gravitational field quantum is called the graviton. Its rest mass must be zero since the gravitational interaction has the same long range as the electromagnetic interaction, whose quantum is the zero rest mass photon. The spin of the graviton is known to be 2. The reason is the absence of negative gravitational mass, which prevents the existence of the oscillating gravitational dipole that would be required to radiate a spin 1 graviton.
The lowest possible multipolarity oscillating gravitational source is a quadrupole (a distribution of mass oscillating between a prolate and oblate ellipsoidal shape), and a quadrupole source emits a spin 2 quantum. This is essentially the same argument as the one we used in Section 16-5 to conclude that a photon has spin 1 because there are no oscillating electromagnetic monopoles. While there is indirect astronomical evidence for gravitons, the laboratory searches have not yet yielded direct proof of their existence. These are extremely difficult experiments because the effects that can be studied on a laboratory scale are so small. However, the gravitational interaction is the only one of the four that is both long range and always of the same sign. Therefore its effects are cumulative so that, despite its intrinsic weakness, gravity is by far the most obvious of the interactions on the scale of the macroscopic world. Table 17-3 lists the three interactions of the microscopic world, i.e., of quantum physics, and all of the quantities that are conserved in certain interactions. The entry yes, or no, means that a quantity is, or is not, conserved. We have discussed all of the entries in this table, except those referring to charge conjugation and time reversal, which will be discussed shortly. However, the basis for some of the other entries will be taken up first. The conservation of energy, linear momentum, angular momentum, and parity all relate to symmetries of space and time. Each of these conservation laws implies an invariance principle, which results from a symmetry. For example, conservation of linear momentum comes from the invariance of the system to a spatial translation, and that invariance is a result of the homogeneity of space. That is, if one part of space is like another, then it does not matter where in space the system is located. If that is true, momentum will be conserved since there are no external forces. 
Similarly, angular momentum conservation occurs when there is invariance to the rotation of the system, which will be the case if space is isotropic. Energy is conserved if there is invariance to translation in time, which will occur if time is homogeneous.

Table 17-3. Applicability of the Conservation Laws to the Observed Interactions ("yes" Means Conserved; "no" Means Not Conserved)

Conserved quantity | Strong | Electromagnetic | Weak
Energy | yes | yes | yes
Linear momentum | yes | yes | yes
Angular momentum | yes | yes | yes
Charge | yes | yes | yes
Electronic lepton number | yes | yes | yes
Muonic lepton number | yes | yes | yes
Tauonic lepton number | yes | yes | yes
Baryon number | yes | yes | yes
Isospin magnitude | yes | no | no (ΔT = 1/2 for nonleptonic)
Isospin z component | yes | yes | no (ΔT_z = 1/2 for nonleptonic)
Strangeness | yes | yes | no (ΔS = 1)
Parity | yes | yes | no
Charge conjugation | yes | yes | no
Time reversal (or CP) | yes | yes | yes (but 10⁻³ violation in K⁰ decay)

All three of these relations among conservation laws, invariance principles, and symmetries can be proved classically or quantum mechanically. Parity conservation, which generally is a useful concept only for quantum mechanical systems, results from reflection invariance arising from a symmetry between left and right. The familiar conservation of charge results from a different kind of invariance principle, called gauge invariance. While the student may be familiar with gauge invariance from the study of electromagnetism, he probably will not have learned of its relation to charge conservation, to be explained next. In its simplest application, gauge invariance means that only differences of electric potential can have physical significance, and that a unique value cannot be assigned to a single potential. Wigner has given a simple demonstration of the relationship between gauge invariance in this sense and the conservation of charge.
He supposes that charge is not conserved and that a charge creating and destroying device exists at a potential V which creates a charge Q, requiring an amount of work W to do so. Next the charge and the device are transferred some distance to a place where the potential is V', with V' < V. The charge and the device gain an amount of energy Q(V − V') in this transfer. At the new position the device is used to destroy the charge, regaining the energy W expended in its creation. This is possible because regaining W is independent of the particular value of the potential as a consequence of gauge invariance. Now the chargeless device can be brought back to the initial position where the potential is V without doing any work against the electric field associated with the potential difference between the two positions. In this cycle there has been a net gain in energy of Q(V − V'). Thus if gauge invariance and the nonconservation of charge are assumed, energy conservation is violated. The various lepton numbers and baryon numbers are chargelike quantum numbers. However, there is no known gauge principle which assures their conservation and hence lepton and baryon number conservation may not be absolute conservation laws, but only extremely good approximations. This issue is taken up in the next chapter. Also in that chapter is a discussion of the reasons for the conservation of isospin and strangeness and the introduction of other strangeness-like conservation laws. Concerning the new entries in Table 17-3, charge conjugation is the process of changing every particle of a system into its antiparticle. As an example, the charge conjugate of the ground state deuterium atom contains a nucleus with an antineutron and an antiproton, and an atomic positron. All available experimental evidence is consistent with the conclusion that the operation of both the strong and electromagnetic interactions is unaffected by, or invariant to, charge conjugation.
For instance, such invariance is found experimentally in the strong interaction annihilation of a proton and an antiproton into the particle-antiparticle pair K⁺K⁻, plus other particles, and is also found in measurements of the electromagnetic decay of the π⁰ meson. Therefore, we believe that the nucleus of the antideuterium atom (whose behavior is governed by the strong interaction) and also the positron (whose behavior is governed by the electromagnetic interaction) would act in the same way, because they are in the same quantum state at the same energy as the nucleus and the electron in the normal deuterium atom. So we may say, as indicated by the "yes" symbols in the table, that charge conjugation is conserved in the strong and electromagnetic interactions because the description of a system governed by either of these interactions is invariant to the operation. This is parallel to the terminology we use when we say by the "no" symbols in the table, and elsewhere, that parity is not conserved in the weak interaction because a description of a system whose behavior it governs is not invariant to the parity operation. In fact, the experimental evidence for the "no" symbol in the table that indicates charge conjugation is not conserved in the weak interaction, i.e., that the weak interaction does distinguish between a system and its charge conjugate, is the same as the experimental evidence for parity nonconservation in that interaction. This can be understood quite simply from the pion decay of (17-19) or (17-20), which is shown schematically in Figure 17-21 for a frame in which the pion is at rest. In that frame the μ and ν go off in opposite directions with equal magnitude of momentum p. Because the π has zero spin, the spin-1/2 μ and ν must have their spins essentially parallel or antiparallel to their directions of motion so that the two spin angular momenta add to zero.
The parallel case (#1) is shown above a mirror and the antiparallel case (#2) is shown below the mirror. Each is a mirror reflection of the other. This is true because in such a reflection (a parity operation) the linear momenta reverse direction but the angular momenta do not, because they describe circulations which do not reverse their sense. (Compare the situation here with the one illustrated in Figure 16-15, being sure to take into account the difference in orientation of the mirrors in the two figures.) Since parity is not conserved in this weak decay, #1 or #2 will be observed, but not both equally.

Figure 17-21 The decay π → μ + ν in the rest frame of the π. The directions of the linear momentum of the μ and of the ν are indicated by arrows labeled by the signs of their z components, such as p_z > 0. The directions of their angular momenta are indicated by straight arrows labeled with the values of the z components, such as S_z = +1/2, and also by curved arrows showing the senses of the corresponding circulations. Since reflection in a mirror whose plane is parallel to the plane of circulation does not change its sense, the reflection does not change the directions of the angular momentum vectors. But reflection in a mirror whose plane is perpendicular to the direction of motion reverses that direction. Therefore the linear momentum vectors are reversed by the reflection. The two possible cases which conserve both linear and angular momentum in the decay are shown, and labeled #1 and #2. Each is the parity inversion (mirror reflection) of the other. For π⁻ → μ⁻ + ν̄_μ, only case #1 is seen in nature, while for π⁺ → μ⁺ + ν_μ, only case #2 is seen. These observations show that neither parity nor charge conjugation is conserved in the decay.
If charge conjugation were conserved and parity not conserved, whichever of the decays #1 or #2 dominated, the same one would have to dominate if the system (say, π⁺ → μ⁺ + ν_μ) were charge conjugated (to π⁻ → μ⁻ + ν̄_μ). That is not observed. Instead, #1 dominates for π⁻ decay and #2 for π⁺ decay, showing that neither parity P nor charge conjugation C is conserved. Thus the entries for both of these should, in fact, be the "no" symbols shown in the weak interaction column of Table 17-3. The combination of P and C violation can be expressed by saying that particles are left-handed. That is, the ν has its momentum and angular momentum antiparallel, as would a left-handed screw, while the antiparticle ν̄ is like a right-handed screw with its momentum and angular momentum parallel. This handedness, or helicity (which was introduced in Section 16-4), is at or near a maximum for the ν or ν̄ because these particles are traveling at or near the velocity of light since their mass is zero or close to it. Angular momentum conservation forces the μ⁺ or μ⁻ to have helicity opposite to what it would like to have (i.e., the particle μ⁻ is naturally left-handed and the antiparticle μ⁺ naturally right-handed), and this suppresses the rate of π decay by a factor of 10⁵. But π decay occurs at all only because the μ has mass and is traveling at v < c. It is possible to have a reference frame traveling faster than a particle of finite rest mass. In such a frame the helicity is reversed since the spin is unchanged but the particle appears to be moving in the opposite direction. A zero rest mass particle travels at v = c, and it is not possible to have a more rapidly traveling reference frame. So the helicity cannot be reversed unless the rest mass is nonzero.
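The frame argument at the end of this paragraph can be sketched in a few lines, using the relativistic velocity transformation v' = (v − u)/(1 − uv/c²) for motion along a single axis. This is only a one-dimensional illustration under stated assumptions: the function names and the observer speed of 0.95c are hypothetical choices of ours, and the spin component along the direction of the boost is simply taken to be unchanged, as the text asserts.

```python
C = 2.998e8  # speed of light, m/sec


def velocity_in_moving_frame(v, u):
    """Velocity of a particle as seen from a frame moving at velocity u.

    Both velocities lie along the same (z) axis.
    """
    return (v - u) / (1.0 - u * v / C**2)


def helicity_sign(v, spin_z_sign):
    """+1 if spin and momentum are parallel, -1 if they are antiparallel."""
    direction = 1 if v > 0 else -1
    return spin_z_sign * direction


spin = -1            # spin along -z, assumed unchanged by a boost along z
v_massive = 0.90 * C  # a particle of finite rest mass
v_massless = C        # a zero rest mass particle
u_frame = 0.95 * C    # hypothetical observer, faster than the massive particle

for label, v in (("massive", v_massive), ("massless", v_massless)):
    v_seen = velocity_in_moving_frame(v, u_frame)
    before = helicity_sign(v, spin)
    after = helicity_sign(v_seen, spin)
    print(f"{label}: helicity {before:+d} -> {after:+d}")
```

For the massive particle the observer overtakes it, the velocity changes sign, and the helicity flips; for the particle moving at c the transformed velocity is still c, so no choice of u < c can reverse its helicity.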
When P and C were found to be not conserved in the weak interaction, the hope was that the combined operation CP (that is, performing in sequence each of the two operations) would leave invariant the description of a system governed by this interaction. For example, if such CP conservation were valid it would require that if decay #1 in Figure 17-21 occurs for the π⁻ then decay #2 would occur for the π⁺. This is just what is observed. Indeed, experimental tests show that CP is conserved to at least the 1% level in weak interactions. We shall see shortly that CP is closely related to time reversal. Time reversal is the process of changing the time variables describing the evolution of a microscopic system into their negatives. In other words, it changes the direction of flow of time, like running a motion picture backwards. Application of time reversal to Figure 17-21 is not interesting because it leads to a description of the improbable situation in which a μ and a ν collide to form a π. It is worthwhile noting, however, that time reversal preserves helicity. To see this, take the ν as an example. Time reversal reverses the direction of the vector describing the linear motion; but it also reverses the sense of circulation so that the spin vector reverses as well, keeping the particle left-handed. Time reversal invariance cannot be tested by measuring the rates for forward and backward weak interactions because one of the rates would be too small to measure. But that method has been used for the strong and electromagnetic interactions. An example is p + p ⇄ π⁺ + d, which can be observed in both directions as was discussed in Section 17-4.
Another example of a time reversal experiment for the strong interaction is a comparison of the cross section for a reaction such as

$${}_{12}\mathrm{Mg}^{24} + {}_{2}\mathrm{He}^{4} \rightarrow {}_{13}\mathrm{Al}^{27} + {}_{1}\mathrm{H}^{1}$$

and the cross section for its inverse

$${}_{13}\mathrm{Al}^{27} + {}_{1}\mathrm{H}^{1} \rightarrow {}_{12}\mathrm{Mg}^{24} + {}_{2}\mathrm{He}^{4}$$

with the momentum vectors of the bombarding and target nuclei in the second reaction adjusted to be equal but opposite to those of the product and residual nuclei of the first reaction. Time reversal T (not to be confused with isospin) is found by such experiments to be a good symmetry for strong and electromagnetic interactions. In somewhat more complicated experiments (involving trying to observe processes described by an odd number of momenta and angular momenta vectors which would change sign under the time-reversal operation), invariance to T is found in weak interactions to the 1% level. Although testing time-reversal invariance directly to a high degree of accuracy for the weak interaction is difficult, a sensitive indirect test is available by using the so-called CPT theorem. This is a very general theorem of relativistic quantum mechanics which shows that, for any system governed by any interaction conforming to the relativistic requirement that cause must precede effect, the result of successively carrying out the charge conjugation operation C, the parity operation P, and the time-reversal operation T is to leave the essential description of the behavior of the system unchanged. As a consequence of the CPT theorem, the observed violation of P in the weak interaction requires that C and/or T be violated as well. Direct experiments show that C is violated, as was discussed above. If T is also violated then the CPT theorem demands that CP be violated.
Hence if the CPT theorem is correct (and not only would its failure destroy the basic theoretical structure of much of physics, but it has also been tested extensively by experiment), then a test of CP is also a test of T. As this is being written there is only one particle known whose properties provide sufficient sensitivity to test for small effects of the nonconservation of CP or T, and that is the K⁰. In a rather amazing demonstration of quantum mechanics, the K⁰ is produced by the strong interaction and subsequently decays by the weak interaction. The particle produced by the strong interaction, which conserves strangeness, must be described by an eigenfunction of a strangeness operator whose eigenvalue is one or the other of the two possibilities for the K, namely +1 or −1. That is, either the K⁰ with S = +1 or the K̄⁰ with S = −1 is produced, depending on the reaction in which it is produced. But since strangeness is not conserved by the weak interaction responsible for the decay of the K, the particle that decays is not required to be described by an eigenfunction of the strangeness operator. Now the neutral K is observed to decay into π⁺ + π⁻, a system described by an eigenfunction of the CP operator with eigenvalue +1. This can be seen simply from Figure 17-22, where the parity operation interchanges the π⁺ and π⁻ and the charge conjugation operation changes them back again. The result is to leave the π⁺π⁻ system just as it was in the beginning; in other words, the eigenvalue of the operator CP for the eigenfunction describing the decay has the value +1.

Figure 17-22 The diagram on the top represents a π⁺ and a π⁻ of zero angular momentum from K⁰ decay. They are located on the x axis on each side of its origin and at equal distances from it. When the parity operation P is carried out by interchanging the signs of the coordinates of the two pions, the diagram in the center is obtained. When the charge conjugation operation C is carried out on the center diagram by interchanging the signs of the charges of the two pions, the diagram on the bottom is obtained. Since it is identical to the diagram on the top, the combined effect of the two operations is to make no change in the system.

Now if CP is conserved by the weak interaction, then the neutral K which decays to π⁺ + π⁻ must also have eigenvalue +1. However, neither the K⁰ nor the K̄⁰ is described by an eigenfunction of the CP operator, because charge conjugation of the K⁰ gives the K̄⁰, and vice versa, a change which cannot be undone by the subsequent parity operation. Since the same state is not obtained after the CP operation on a neutral K, the state cannot be an eigenfunction of CP. How can we create eigenfunctions of CP in the neutral K system? First we note that the CPT theorem requires that particle and antiparticle have the same mass. Thus the K⁰ and K̄⁰ are degenerate in energy. But if these degenerate states suffer a small perturbation then we can consider them to be linear combinations of perturbed states which do not have quite the same energy. (See Appendix J.) The extremely small perturbation comes about through the process

$$K^{0} \rightleftarrows 2\pi \rightleftarrows \bar{K}^{0} \qquad (17\text{-}50)$$

which has a particularly low rate because it involves two successive weak interactions. The process gives the perturbed states, called K₁ and K₂, slightly different masses. The K₁ and K₂ are then described by eigenfunctions of CP, constructed as follows

$$K_{1} = \frac{1}{\sqrt{2}}\left[K^{0} + CP(K^{0})\right] = \frac{1}{\sqrt{2}}\left(K^{0} + \bar{K}^{0}\right) \qquad (17\text{-}51)$$

$$K_{2} = \frac{1}{\sqrt{2}}\left[K^{0} - CP(K^{0})\right] = \frac{1}{\sqrt{2}}\left(K^{0} - \bar{K}^{0}\right) \qquad (17\text{-}52)$$

where the symbols represent the eigenfunctions for the corresponding particles and 1/√2 gives the correct normalization. By applying CP to (17-51), the student will see that this operator gives the same eigenfunction back again, so that the corresponding eigenvalue is +1. In the same way he can see that (17-52) is an eigenfunction of CP with an eigenvalue of −1. (The careful student may note that these statements seem apparent when using just C, as did Gell-Mann and Pais when they first investigated this subject, but that P introduces a bothersome minus sign. However, the charge conjugation operation has an undetermined phase which can be taken to be −1, so the original Gell-Mann-Pais convention can be retained.) Thus to conserve CP it is necessary to have

$$K_{1} \rightarrow \pi^{+} + \pi^{-} \quad \text{but} \quad K_{2} \nrightarrow \pi^{+} + \pi^{-} \qquad (17\text{-}53)$$

The K₁ can decay into a π⁺ and π⁻, but the K₂ cannot. Since the π⁰ is its own antiparticle, under charge conjugation it goes into itself and has a C eigenvalue of +1. Its P eigenvalue is −1. Hence a system of three π⁰'s has a CP eigenvalue of (+1)³(−1)³ = −1. Therefore

$$K_{2} \rightarrow \pi^{0} + \pi^{0} + \pi^{0} \quad \text{but} \quad K_{1} \nrightarrow \pi^{0} + \pi^{0} + \pi^{0} \qquad (17\text{-}54)$$

All of the possible decays of the K₂ have at least three particles in the final state. This means that the volume occupied in phase space is small, making the decay much slower than that for the K₁. Thus the K₁ has a lifetime of about 10⁻¹⁰ sec, while the K₂ has a lifetime of about 5 × 10⁻⁸ sec, which is why it was not observed in the early cosmic ray experiments. Note that if (17-51) and (17-52) are added or subtracted the result is

$$K^{0} = \frac{1}{\sqrt{2}}\left(K_{1} + K_{2}\right) \qquad (17\text{-}55)$$

or

$$\bar{K}^{0} = \frac{1}{\sqrt{2}}\left(K_{1} - K_{2}\right) \qquad (17\text{-}56)$$

Thus, if a K⁰ or K̄⁰ is produced, half of the decays will occur through the short-lifetime mode K₁ and half through the long-lifetime mode K₂. A casual glance at (17-51), (17-52) and (17-55), (17-56) gives us an interesting, if somewhat oversimplified, view of the time evolution of the K⁰. Say a K⁰ is produced. It corresponds to an eigenfunction of the S operator, but not of the CP operator, being half K₁ and half K₂.
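The algebra behind (17-51), (17-52), (17-55), and (17-56) can be checked with a very small sketch in which a neutral-K state is represented by its two amplitudes. The representation, the helper names, and the phase convention CP(K⁰) = K̄⁰ (the Gell-Mann-Pais choice mentioned in the text) are assumptions made for illustration; nothing here models the decay dynamics.

```python
import math

# A neutral-K state is stored as (amplitude of K0, amplitude of anti-K0).
K0 = (1.0, 0.0)
K0_BAR = (0.0, 1.0)


def cp(state):
    """CP with the assumed phase convention CP(K0) = anti-K0."""
    a, b = state
    return (b, a)


def combine(c1, s1, c2, s2):
    """Linear combination c1*s1 + c2*s2 of two states."""
    return (c1 * s1[0] + c2 * s2[0], c1 * s1[1] + c2 * s2[1])


def scale(c, s):
    """Multiply a state by the number c."""
    return (c * s[0], c * s[1])


def close(s1, s2, tol=1e-12):
    """True if two states agree amplitude by amplitude."""
    return all(abs(x - y) < tol for x, y in zip(s1, s2))


NORM = 1.0 / math.sqrt(2.0)
K1 = combine(NORM, K0, NORM, cp(K0))    # as in (17-51)
K2 = combine(NORM, K0, -NORM, cp(K0))   # as in (17-52)

print(close(cp(K1), K1))                             # K1 has CP eigenvalue +1
print(close(cp(K2), scale(-1.0, K2)))                # K2 has CP eigenvalue -1
print(close(combine(NORM, K1, NORM, K2), K0))        # as in (17-55)
print(close(combine(NORM, K1, -NORM, K2), K0_BAR))   # as in (17-56)
```

Each line should print True. The squared amplitudes of K⁰ in the K₁, K₂ basis are each 1/2, which is the sense in which a freshly produced K⁰ is "half K₁ and half K₂".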
The K₁ component decays quickly, however, leaving just the K₂, which corresponds to an eigenfunction of CP but not of S, consisting of half K⁰ and half K̄⁰. Now suppose the K₂ goes through matter. Because the K̄⁰ has S = −1, just as do the hyperons, there are many reactions it can undergo, as we have already noted in connection with (17-35). Hence the K̄⁰ component can be absorbed out, leaving just the K⁰ with S = +1. The process is called regeneration. We see that either allowing the system to evolve in time, or to pass through matter, changes the nature of the particle. This means that, if the S eigenvalue is measured, information on CP is lost, and vice versa. The situation is analogous to determining the components of angular momentum in a Stern-Gerlach experiment.

The above description of the time evolution is not quite accurate because the small mass difference between the K₁ and K₂ causes the relative phase of the two corresponding wave functions to change with time, changing the K⁰-K̄⁰ mixture. This actually produces oscillations in the amount of K⁰ and K̄⁰ present. By measuring the wavelength of these oscillations, the K₁-K₂ mass difference Δm can be found. Since Δm arises from the process (17-50), we might expect by the uncertainty principle to have

$$\Delta E\,\Delta t = (\Delta m\,c^{2})(\Delta t_{1}) \sim \hbar \qquad (17\text{-}57)$$

where Δt₁ is the K₁ lifetime, or Δm ~ ħ/Δt₁c². Measurements give about half this value, or about 4 × 10⁻⁶ eV/c². Since Δm has been measured to better than 1%, and since the value of m is about 5 × 10² MeV/c², the inaccuracy in the mass difference is smaller than the mass itself by 16 orders of magnitude! The discussion of the K⁰ began with the question of CP conservation. Clearly (17-53) and (17-54) test its validity. In 1964 Christenson, Cronin, Fitch and Turlay found, at such a distance from the K⁰ production point that the K₁'s had all decayed, that about 0.1% of the K₂'s decayed by the CP-violating 2π decay.
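The arithmetic behind the mass-difference estimate of (17-57) and the "16 orders of magnitude" remark is easy to reproduce. The sketch below uses only numbers quoted in the text, in Table 17-1, and in the list of useful constants; the variable names are ours, and the K₁ lifetime is taken to be the shorter of the two neutral-K lifetimes listed in the table.

```python
import math

HBAR_EV_SEC = 0.6582e-15    # hbar in eV-sec
K1_LIFETIME_SEC = 8.9e-11   # shorter neutral-K lifetime from Table 17-1
K_MASS_EV = 5.0e8           # neutral-K rest energy, about 5 x 10^2 MeV

# Uncertainty-principle estimate (17-57): delta_m * c^2 ~ hbar / delta_t1
delta_m_estimate_ev = HBAR_EV_SEC / K1_LIFETIME_SEC

# Values quoted in the text
delta_m_measured_ev = 4.0e-6   # measured mass difference, in eV/c^2
relative_precision = 0.01      # "better than 1%"

ratio = delta_m_measured_ev / delta_m_estimate_ev
inaccuracy_ev = relative_precision * delta_m_measured_ev
orders_of_magnitude = math.log10(K_MASS_EV / inaccuracy_ev)

print(f"estimate: {delta_m_estimate_ev:.1e} eV/c^2")
print(f"measured / estimate: {ratio:.2f}")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

With these inputs the estimate is a little over 7 × 10⁻⁶ eV/c², the measured value is roughly half of it, and the inaccuracy in Δm is smaller than the K mass by just over 16 orders of magnitude, in agreement with the statements above.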
Thus to this minuscule degree CP conservation and hence, by the CPT theorem, T invariance are violated. Other experiments on the $K^0$ system have shown directly that it is T, and not CPT, which is not conserved along with CP. That is, there is evidence that through the rare mode in the weak interaction decay of the long-lived component of the $K^0\bar{K}^0$ system nature can distinguish at a microscopic level the direction of flow of time. This startling result would seem to be of great significance. In the next chapter we shall return to the issue while discussing gauge theories of particle interactions.

Example 17-6. Discuss each of the following reactions in terms of the conservation laws listed in Table 17-3 and the particle quantum numbers listed in Table 17-1.

(a) i +K ^
This reaction is impossible because it requires a strangeness change of 2.

(b) $K^- + p \to \Omega^- + K^+ + K^0$
This is the reaction in which the $\Omega^-$, which has $S = -3$, was first produced. It is strangeness conserving since $S = +1$ for the $K^+$ and $K^0$, while $S = -1$ for the $K^-$. Charge and baryon number are conserved. So are angular momentum and parity because the final state can have one unit of orbital angular momentum. (Recall that the parity associated with orbital angular momentum is given by $(-1)^l$.) Since isospin and its z component are also conserved, we see that the reaction can proceed via the strong interaction. If this were not the case, the cross section would be too small for it to be observable.

(c) $\Omega^- \to \Xi^0 + \pi^-$
Here charge and baryon number are conserved. Angular momentum and parity are also conserved by the final state containing one unit of orbital angular momentum. Since the values of $T$ are 0 for the $\Omega^-$, 1/2 for the $\Xi^0$, and 1 for the $\pi^-$, we see that there must be an isospin change of at least $\Delta T = 1/2$. Also, $T_z$ is 0 for the $\Omega^-$, $+1/2$ for the $\Xi^0$, and $-1$ for the $\pi^-$, so the z component of isospin changes by $\Delta T_z = 1/2$, which is equivalent to $\Delta S = 1$. These quantum number changes do allow the decay to proceed by the weak interaction, but they prohibit it from proceeding more rapidly by the electromagnetic or strong interactions.

(d) $\pi^+ + p \to p + p + \bar{n}$
First we must determine the quantum numbers of the antineutron $\bar{n}$. Applying the quoted rules to the table, we find: $Q = 0$, $s = 1/2$, $B = -1$, $P$ = odd, $T = 1/2$, $T_z = +1/2$, $S = 0$. Inspection demonstrates that all quantum numbers are conserved by the reaction, so it can take place by the strong interaction.

(e) $\bar{n} \to \bar{p} + e^+ + \nu_e$
If this goes at all, it must be by the weak interaction since $\nu_e$ does not participate in any of the others. Charge is conserved since $Q = -1$ for the $\bar{p}$. The total baryon number equals $-1$ before and after, so it is conserved also. Electronic lepton number is conserved because it has the values $-1$ for the $e^+$ and $+1$ for the $\nu_e$. Angular momentum can be conserved. Parity is not defined for leptons, but parity is not a significant consideration for a weak interaction involving leptons. The same is true for isospin and strangeness. So the reaction can take place by the weak interaction. Note that it is just the charge conjugate of the $\beta$ decay of the neutron.

(f) $\Lambda^0 \to n + \gamma$
This reaction, if it can occur, obviously must be electromagnetic. Since $T_z = 0$ for the $\Lambda^0$ and $\gamma$, while $T_z = -1/2$ for the $n$, we see that it cannot occur because $T_z$ is conserved in the electromagnetic interaction. This conclusion agrees with experiment, and it is one of the reasons why $T_z = 0$ is assigned to the photon. The same conclusion could be reached by considering $S$; the student should do so.

QUESTIONS

1. Why is $^3P_1$ not a component of the ground state of the deuteron? What about $^1S_0$?

2. What experiments can be performed to test for the existence of a stable system of two protons? Of two neutrons?

3. In the center-of-mass frame of reference the differential cross section for neutron-proton scattering is isotropic at low energies. Describe qualitatively the behavior of the differential cross section in a frame of reference in which the target proton is initially stationary.

4. In considering the quantum mechanical behavior of a system of two identical particles, we talk of exchange of the labels of the particles. In considering neutron-proton scattering, we talk of exchange of the particles. What is the reason for this difference?

5. Why is the proton-proton scattering differential cross section necessarily symmetric about 90° in the center-of-mass frame of reference?

6. Explain why the scattering differential cross section is isotropic if only the $l = 0$ state participates in the interaction that produces the scattering.

7. A very large part of what we know about the forces acting in atoms is obtained from the study of the bound states of the simplest atom, hydrogen. Why is only a small part of what we know about the forces acting in nuclei obtained from the study of the bound states of the simplest nucleus, deuterium?

8. Why is the name isospin an appropriate one to use for the concept discussed in Section 17-3?

9. Can the exclusion principle be expressed in terms of isospin? See Figure 17-14.

10. Is there a physical picture of how the momentum of a $\pi$ meson transferred between the fields of two nucleons leads to an attractive force between them? From the point of view of the position-momentum uncertainty principle, is it realistic to expect to be able to construct such a picture?

11. What species of $\pi$ mesons are exchanged in proton-proton scattering? In neutron-neutron scattering?

12. What particle would remain if a proton emitted a $\pi^-$ meson? If a neutron emitted a $\pi^+$ meson? Why is it that the proton field cannot contain only a $\pi^-$ meson, and the neutron field cannot contain only a $\pi^+$ meson?

13. Why is it believed that the repulsive core of the nucleon potential arises from the exchange of mesons heavier than the pion?

14. What examples have been considered in earlier chapters of the conservation of the number of fermions, and the nonconservation of the number of bosons, in an isolated system?

15. Exactly what is meant by the statement that a pion has odd intrinsic parity?

16. Comparison of the decay rate of cosmic ray muons in flight with the decay rate of muons at rest provided the first experimental verification of relativistic time dilation. What would be a possible way to carry out such a comparison?

17. Cosmic ray muons have been used in an attempt to discover hidden burial chambers in Egyptian pyramids, in much the same way that x rays are used to discover internal imperfections in a metal casting caused by gas bubbles. Why were muons used?

18. Are there any particles other than neutrinos and antineutrinos which have definite helicities? Explain.

19. Why must all field quanta be bosons?

20. There are four distinctly different K mesons. Why do we not assign to them the isospin quantum number $T = 3/2$ so that they would constitute an isospin quartet?

21. Exactly what does the strangeness quantum number $S$ specify?

22. Why is the copious production of $\Lambda^0$ and K particles very difficult to reconcile with their slow decay, without the concept of strangeness? How does strangeness provide a reconciliation?

23. Is there a conflict between the statement that isospin magnitude is not conserved in the electromagnetic interaction, and the statement that isospin z component is conserved in that interaction?

24. Consider viewing the $\beta$-decay experiment illustrated in Figure 16-15 in a mirror located below the nucleus (the mirror being horizontal) instead of in a mirror located to one side of the nucleus (the mirror being vertical). Explain how the arguments in the text concerning the appearance of the mirror image of the charge conjugate would be modified, but in such a way as to lead to the same conclusion.

25. Give an example of a macroscopic system whose behavior is invariant to time reversal, and of a macroscopic system whose behavior is not invariant to this operation.

26. Why can we say that the $\pi^0$ meson is its own antiparticle? Do all particles have antiparticles? What about the photon?

27. Does it seem reasonable to you to say that a meson or baryon resonance is an elementary particle? Just what is an elementary particle?

28. Suppose a virtual particle and a real particle that decays by the strong interaction have about the same lifetime. What is the difference between them? To what mass or energy does their lifetime relate (through the uncertainty principle) in each case?

PROBLEMS

1. Consult the discussion of the centrifugal potential in Section 15-8, and then: (a) Write the equation which determines the radial dependence $R(r)$ of the deuteron eigenfunction, by evaluating (7-17) for $l = 0$. (b) Show that it can also be written

$$-\frac{\hbar^2}{2\mu}\frac{d^2u(r)}{dr^2} + V(r)u(r) = Eu(r)$$

where $u(r) = rR(r)$. (c) Compare this with the time-independent Schroedinger equation for one-dimensional problems. (d) Give a physical interpretation of $u^*(r)u(r)$. (e) Evaluate, and give a physical interpretation of, the reduced mass $\mu$.

2. (a) In the equation obtained in Problem 1, take the nucleon potential $V(r)$ to be a square well of radius $r'$ and depth $V_0$, as in Figure 17-2. (b) Show by substitution that the general solution to the equation obtained is

$$u(r) = A\sin k_1 r + B\cos k_1 r \qquad r < r'$$
$$u(r) = Ce^{-k_2 r} + De^{k_2 r} \qquad r > r'$$

(c) Evaluate $k_1$ and $k_2$ in terms of $\mu$, $V_0$, and the deuteron binding energy $\Delta E$.

3. (a) Apply to the general solution obtained in Problem 2 the conditions that $R(r)$, and therefore $u(r)$, must be finite, continuous, and single valued, and have first derivatives with the same properties. (b) Show that the application of these conditions at $r = 0$, $r = r'$, and $r \to \infty$ leads to the relation

$$\sqrt{2\mu(V_0 - \Delta E)}\,\cot\left[\frac{\sqrt{2\mu(V_0 - \Delta E)}}{\hbar}\,r'\right] = -\sqrt{2\mu\,\Delta E}$$

4.
Show, by substitution, that the relation obtained in Problem 3 has a solution with $\Delta E = 2.2$ MeV, the observed deuteron binding energy, when the potential has a radius and depth of $r' = 2.0$ F and $V_0 = 36$ MeV.

5. (a) Use the calculations in Problems 1 through 4 to evaluate the radial dependence of the eigenfunction for the ground state of the deuteron in a potential of radius 2.0 F and depth 36 MeV. (b) Sketch the potential $V(r)$ and the function $u(r) = rR(r)$. (c) Also sketch the radial probability density $P(r)$.

6. A nucleon is incident on a nucleon which is initially stationary. Its kinetic energy, which is also the total kinetic energy of the system in that frame of reference, is $K$. Show that the total kinetic energy of the system, in a frame of reference in which the center of mass of the system is stationary, is $K/2$.

7. (a) Show that, for a nucleon potential of radius $r' = 2$ F, the maximum value of the orbital angular momentum quantum number is $l_{max} = 1$ unless the kinetic energy of each nucleon exceeds about 30 MeV in the center-of-mass frame of reference. (b) Also show that $l_{max} = 2$ unless the kinetic energies exceed about 60 MeV.

8. (a) Calculate the value of $l_{max}$ for a 50 MeV proton incident on a nucleus of atomic weight $A = 100$. Take the radius $r'$ of the optical model potential acting on the proton as the sum of the half-value charge distribution radius $a = 1.07A^{1/3}$ F and the range of nucleon forces 2.0 F. (b) Also evaluate $\theta \simeq \lambda/r'$, and compare with the angle between adjacent minima in the differential scattering cross section shown in Figure 16-26.

9. (a) Use the results of the electron scattering measurements, presented in Figure 15-6, to calculate the total number of nucleons per unit volume in the interior of a typical nucleus. (b) Then calculate the average center-to-center spacing of the nucleons. (c) Compare this with the radius of the repulsive core of the nucleon potential, and with the range of the nucleon force.

10.
The position-momentum uncertainty principle produces an effect which tends to prevent the collapse of a nucleus that would occur if the nucleon potentials had no repulsive regions. (a) Show that this principle demands the kinetic energy of a typical nucleon confined to a nucleus of radius $r'$ must be at least $K$, where

$$K \propto +\frac{1}{r'^2}$$

(b) Although $K$ becomes more positive as $r'$ decreases, the potential energy $V$ of the typical nucleon becomes more negative if the nucleon potentials are purely attractive and the nucleus is sufficiently collapsed to make the separation between all pairs of nucleons less than the range of the nucleon potential. Show that, in these circumstances

$$V \propto -\frac{1}{r'^3}$$

(c) Then show that the total energy of the typical nucleon, $E = K + V$, would become more negative as $r'$ decreases further so that the nucleus would continue to collapse, despite the uncertainty principle, if the nucleon potentials had no repulsive regions.

11. Use information contained in Figures 16-14 and 16-36 to assign values of $T$ and $T_z$ to the isobaric analogue ground state levels of: (a) $_1\mathrm{H}^3$ and $_2\mathrm{He}^3$; (b) $_3\mathrm{Li}^7$ and $_4\mathrm{Be}^7$.

12. (a) Estimate the maximum time that a $\pi$ meson can exist in the field of an isolated nucleon before it is absorbed by that nucleon. (b) Estimate how many $\pi$ mesons there can be at any instant in the field at distances from the nucleon about equal to the range of the nucleon force, 2 F. (c) Estimate how many there can be at distances about equal to the radius of the repulsive core, 0.5 F.

13. The $\pi^0$ lifetime has been determined by studying the decay from rest of the $K^+$ meson in the mode $K^+ \to \pi^0 + \pi^+$. The average distance traveled by the $\pi^0$ in a block of photographic emulsion before it decays in the easily observable mode $\pi^0 \to e^+ + e^- + \gamma$ is measured, and from the calculated velocity of flight of the $\pi^0$ its lifetime is obtained. Given that the lifetime is $0.8 \times 10^{-16}$ sec, predict the average distance traveled by a $\pi^0$ before it decays.

14. In the laboratory (LAB) frame of reference, particle 1 is at rest with total relativistic energy $E_1$, and particle 2 is moving to the right with total relativistic energy $E_2$ and momentum $p_2$. (a) Use the relativistic momentum-energy transformation equations

$$p'_x = \frac{1}{\sqrt{1 - v^2/c^2}}\left(p_x - \frac{vE}{c^2}\right) \qquad p'_y = p_y \qquad p'_z = p_z \qquad E' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(E - vp_x)$$

to show that the frame in which the center of the relativistic masses of the system is at rest is moving to the right with velocity

$$v = c\,\frac{cp_2}{E_1 + E_2}$$

relative to the laboratory frame, and show that the total momentum of the system is zero in this center-of-mass (CM) frame. (b) Now let the two particles have the same rest mass $m_0$, and let the total relativistic energy of the system in the laboratory frame be $E_{LAB}$. Evaluate $E_{CM}$, the total relativistic energy of the system in the center-of-mass frame, and show that

$$E_{CM} = \sqrt{2m_0c^2E_{LAB}}$$

15. Use the relation quoted in Problem 14b to evaluate the kinetic energy in the laboratory frame of the bombarding proton at which the proton, antiproton pair production process, (17-16), becomes energetically possible.

16. (a) Estimate the cross section for a 1 MeV electronic antineutrino incident on a proton to produce the reaction

$$\bar{\nu}_e + p \to n + e^+$$

(Hint: (i) Assume there is some probability of the reaction occurring when the distance between the $\bar{\nu}_e$ and $p$ is within the $\bar{\nu}_e$ de Broglie wavelength $\lambda$. Then estimate the time interval during which they can be that close. (ii) Estimate the probability $P$ as the ratio of that time interval to the characteristic time $\sim 10^3$ sec for the reaction. (It is the inverse of $n + e^+ \to p + \bar{\nu}_e$, which is an alternative to $n \to p + e^- + \bar{\nu}_e$; detailed balancing requires that all three have the same characteristic time which, we see, is just the neutron $\beta$-decay lifetime.) (iii) Take the cross section to be $\sim P\lambda^2$.) (b) Use the estimate to evaluate the mean free path of a 1 MeV $\bar{\nu}_e$ in lead, by justifying the assumption that the cross section for its interaction with a lead nucleus is $\sim 10^2$ times larger than the cross section for its interaction with a proton.

17. (a) Why is the $\rho^0$ meson not allowed to decay into two $\pi^0$ mesons? (b) Assuming that the incident deuteron has sufficient energy, why is the reaction $d + d \to {}_2\mathrm{He}^4 + \pi^0$ not allowed? (c) Why is the decay of a $\pi^+$ meson into an $e^+$ and a $\gamma$ not possible? (d) What prevents the reaction $n \to p + e^- + \bar{\nu}_e$ from taking place when the neutron is part of a deuteron?

18. For each of the following reactions state the fastest interaction through which the conservation laws allow it to proceed. If the reaction is forbidden by all interactions, state why.
(a) $p \to \pi^+ + e^+ + e$
(b) $\Lambda^0 \to p + e$
(c) $p \to e + \nu_e + \nu_\mu$
(d) $n + p \to E^+ + \Lambda^0$
(e) $p + p \to \gamma + \gamma$
(f) $p + p \to n + E^0 + K^0$
(g) $K^0 \to \pi^+ + \pi^- + \pi^0 + \pi^0$

18 MORE ELEMENTARY PARTICLES

18-1 INTRODUCTION
particles that are more elementary; the new strong interaction; unification of electromagnetic and weak interactions

18-2 EVIDENCE FOR PARTONS
partons, or pointlike constituents of hadrons; evidence from neutrino-nucleus scattering and electron-proton deep inelastic scattering

18-3 UNITARY SYMMETRY AND QUARKS
composite particles on the basis of isospin, or SU(2); including strangeness or hypercharge to make SU(3); quarks from SU(3); u, d, and s quark properties and multiplets; basis of isospin and strangeness conservation

18-4 EXTENSIONS OF SU(3): MORE QUARKS
a fourth quark flavor, c; $e^+e^-$ colliding beam production of $c\bar{c}$ states; Zweig-forbidden decays of quark-antiquark states; charmonium spectrum; particles with charm; c, b, and t quark properties; the $\Upsilon$ states of $b\bar{b}$; quark masses; evidence for new quarks from $\sigma(e^+ + e^- \to \text{hadrons})/\sigma(e^+ + e^- \to \mu^+ + \mu^-)$

18-5 COLOR AND THE COLOR INTERACTION
necessity for the color quantum number; evidence for color; color charge as the
source of the true strong interaction; gluons; interquark gluon potential; asymptotic freedom and color confinement; gluon flux tube and hadronic energy density; magnitude of the color force

18-6 INTRODUCTION TO GAUGE THEORIES
gauge theories for all the fundamental interactions; converting a global gauge symmetry into a local one in classical electromagnetism; electromagnetic gauge invariance in quantum mechanics; application to relativistic quantum mechanics; Yang-Mills gauge theory; Abelian and non-Abelian theories

18-7 QUANTUM CHROMODYNAMICS
SU(3) of color; changing global color symmetry to local color symmetry; properties of gluons; evidence for gluons; gluon couplings to give quark-antiquark and three-quark binding; gluon masslessness and confinement; running coupling constant and antiscreening

18-8 ELECTROWEAK THEORY
from Yang-Mills theory to electroweak theory; renormalization; spontaneous symmetry breaking; Goldstone and Higgs mechanisms; weak isospin; gauge bosons; Higgs particle; role of the $W^\pm$ and $Z^0$; neutral currents; Cabibbo quark mixing; GIM mechanism; lepton-quark symmetry; masses and discovery of the $W^\pm$ and $Z^0$; relation between weak and electromagnetic interactions; apparent weakness of the weak interaction

18-9 GRAND UNIFICATION OF THE FUNDAMENTAL INTERACTIONS
unification of the coupling constants; SU(5) unification of strong, electromagnetic, and weak interactions; experimental tests of unification (proton and double beta decays); neutrino mass searches; other unification schemes; cosmological consequences (dark mass, baryon-antibaryon ratio)

QUESTIONS

PROBLEMS

18-1 INTRODUCTION

In the previous chapter a large number of particles have been introduced, and the existence of many more has been mentioned. As ever increasing numbers of particles were discovered, it became more and more apparent that all of these could not be elementary.
Once again by probing with finer resolution, which means higher energy, it was possible to discover particles which were more elementary. However, this time these constituent particles could not be separated and studied directly, so their discovery and the elucidation of their hidden properties makes an impressive detective story. This in turn has led to a completely new understanding of the strong, electromagnetic, and weak interactions. The strong interaction is not at all what it has seemed to be, and the electromagnetic and weak interactions are closely related to each other. Further unification of all the fundamental interactions appears likely. The 1970's produced a true revolution in fundamental physics, and it is the purpose of this chapter to present in an introductory way the consequences of that revolution. 18-2 EVIDENCE FOR PARTONS The proliferation of particles led to the general feeling that most, if not all, must be composites of other, more elementary particles. In addition, some theoretical models (of which the most important will be discussed in Section 18-3) suggested this composite nature. Additional impetus for this belief then came from experiments. In this section two of these experimental results will be discussed and their interpretation in terms of the parton model will be introduced. Parton is the name given to whatever are the constituents of hadrons such as the proton. Partons are pointlike (i.e., having no detected size), quasi-free constituents, only some of which will turn out to be the quarks discussed in the next section. One demonstration that hadrons have pointlike constituents is provided by the total cross section, as a function of energy, for neutrino-nuclear scattering. This statement requires considerable explanation. But first the utilization of neutrinos should be explained because the neutrino seems an unlikely particle to use for this purpose. 
It has only the weak interaction, which means for example that neutrinos from beta decays in the sun have about one chance in a million of interacting with anything even if they pass through the earth along a diameter. Thus doing experiments with these particles requires large numbers of them and very massive detectors.

To produce neutrino beams, protons from a high-energy particle accelerator strike a nuclear target, creating $\pi$ and K mesons. These mesons are focused by magnetic fields so as to create beams that go long distances, allowing decay, principally into muons and neutrinos (see (17-19), (17-20), and (17-40)). While the muons are also weakly interacting particles, they possess charge and hence undergo electromagnetic interactions, enabling them to lose energy by collisions with electrons in matter. By interposing sufficient shielding material (often iron or earth), the muons can be stopped. Those mesons which have not decayed interact strongly in this shielding material, and hence only neutrinos are left to enter the detector. Figure 18-1 shows a large electronic neutrino detector. Such a detector can identify each neutrino interaction and hence determine a total cross section. By changing the incident meson beam energy, the total neutrino cross section $\sigma$ as a function of neutrino energy $E$ can be determined, and typical results are shown in Figure 18-2. It is seen that the cross section has the behavior $\sigma \propto E$. This proportionality is the result expected if the apparently complicated process of the neutrino interaction, which produces many hadrons as well as a muon, is basically just the elastic scattering of a neutrino by a single pointlike particle.

Figure 18-1 Electronic neutrino detectors (of the CDHS and CHARM groups) at the CERN laboratory, Geneva, Switzerland. This illustrates the massiveness of detectors required to measure the scattering of these weakly interacting particles.
The promised explanation of this last statement will now be given. If the pointlike neutrino and a pointlike constituent of the nucleon undergo an elastic scattering, the probability or cross section for this contact interaction would depend only on the strength of that interaction (given by $\beta$, the weak interaction coupling constant; cf. Section 16-4) and on the volume in phase space available for the process. That is, $\beta$ determines the rate for a transition to any particular final state, and the phase space volume determines the number of possible final states. Since the interaction occurs at a point, the coordinates are unique, and hence momentum space is the same as phase space. The phase space volume thus depends just on the momentum, $p$, of the two particles in the center of mass system. In momentum space, $p$ is the length of a radius vector, and the volume available with a momentum between $p$ and $p + \Delta p$ is a spherical shell $4\pi p^2\,\Delta p$. Thus $\sigma \propto p^2$.

Figure 18-2 Total neutrino cross section $\sigma$ on nucleons as a function of neutrino laboratory energy $E$ (GeV) from experiments at CERN (Switzerland), Fermilab (U.S.A.), and Serpukhov (U.S.S.R.). The linear dependence of $\sigma$ on $E$ over two orders of magnitude in $E$ is a demonstration of pointlike constituents (partons) inside the nucleon. The measurement errors are shown only for a few points at the higher energies, but these are typical percentage errors, so they would not be visible at lower energies.

Now a relativistic calculation shows that $p^2 = mE/2$, where $E$ is the laboratory energy of the neutrino scattering elastically from a pointlike target at rest of mass $m$. Therefore the experimental result that $\sigma \propto E$ is to be expected for a contact interaction between a neutrino and a parton.
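The relation $p^2 = mE/2$ is quoted above without derivation. As a rough check, and only as a sketch under stated assumptions, the exact center-of-mass momentum for a massless neutrino striking a target at rest, $p^2 = m^2E^2/(m^2 + 2mE)$ in units with $c = 1$, can be compared numerically with the quoted high-energy form. The function names and the target mass `M_TARGET` (set to the nucleon mass purely for illustration, not as a parton mass) are assumptions and do not come from the text.

```python
def cm_momentum_squared(E_lab, m):
    """Exact squared center-of-mass momentum, in GeV^2 with c = 1, for a
    massless neutrino of laboratory energy E_lab (GeV) striking a target of
    mass m (GeV) that is initially at rest."""
    s = m * m + 2.0 * m * E_lab        # square of the total CM energy
    return (m * E_lab) ** 2 / s


def high_energy_limit(E_lab, m):
    """Limiting form p^2 = m E / 2, valid when E is much larger than m."""
    return m * E_lab / 2.0


M_TARGET = 0.938  # GeV; illustrative assumption (nucleon mass), not a parton mass

if __name__ == "__main__":
    for E in (2.0, 20.0, 200.0):
        exact = cm_momentum_squared(E, M_TARGET)
        approx = high_energy_limit(E, M_TARGET)
        print(f"E = {E:6.1f} GeV   exact p^2 = {exact:8.3f}   "
              f"mE/2 = {approx:8.3f}   ratio = {exact / approx:.4f}")
```

The ratio of the two expressions is $1/(1 + m/2E)$, so the linear dependence of $p^2$ on $E$, and with it $\sigma \propto E$, holds once $E$ is large compared with the target rest energy.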
While evidence for the existence of partons accumulated from different neutrino experiments over a period of time, evidence for partons came from even the first experiment on deep inelastic electron-nucleon scattering at the Stanford Linear Accelerator Center (SLAC) in 1968. The term deep inelastic scattering needs to be explained. In Section 17-4 the charge distributions of the proton and neutron as determined by elastic electron-nucleon scattering experiments were shown. These displayed the existence of the pion cloud and the nucleon core. To explore the latter in more detail required higher electron energy to get a smaller de Broglie wavelength. However, the elastic cross section drops rapidly with energy, making the measurements much more difficult. Furthermore, elastic scattering implies that the nucleon recoils as a whole object, whereas exploring its structure indicates breaking it apart. Thus inelastic electron-nucleon scattering, in which other hadrons are produced from the nucleon, proved to be the way to find the parton structure of the nucleon. The adjective "deep" implies that the collision is highly inelastic.

We illustrate the difference between elastic and inelastic electron-proton scattering by the Feynman diagrams of Figure 18-3. These diagrams are actually prescriptions for making calculations of rates or cross sections. But they have become the language of particle physics, and hence we introduce them in this present simple application. As drawn here, time increases along the abscissa and space, represented by a single coordinate, is the ordinate. Thus the electron and proton are pictured as coming together in their center of mass system, and then interacting electromagnetically by the exchange of a virtual photon. After the interaction the electron goes off in one direction and, for elastic scattering, the proton in the opposite direction. For inelastic scattering, the proton breaks up, producing other particles which recoil oppositely to the electron. In the original experiments, only the final state electron was measured; hence the nature of the other particles was not important.

Figure 18-3 Feynman diagrams for (a) elastic and (b) inelastic electron-proton scattering. The space coordinate is the ordinate and time is the abscissa. The $e$ and $p$ approach each other and exchange a virtual photon, after which the $e$ goes off in one direction, and either (a) the $p$ or (b) some group of particles of mass $W$ goes off in the other direction. In the inelastic case, the electron's energy changes in the interaction from $E$ to $E'$, with the virtual photon carrying off the difference, $E - E'$.

To provide a little more familiarity with Feynman diagrams, we shall interrupt the discussion of electron-proton scattering to give another example of using these diagrams.

Example 18-1. Construct a Feynman diagram for neutron-proton scattering resulting from the exchange of a $\pi$ meson.

Initially the neutron and proton approach each other, so lines are needed which start apart and converge. It does not matter whether the neutron or the proton is at the top, nor does it matter what kind of line is used. However, solid lines are most frequently used for baryons, just as the wavy line shown in Figure 18-3 is almost always used for the photon. For the exchanged pion a dashed line will be used to distinguish it more clearly from the baryons. Now we have a choice. The first possibility is that the proton emits a $\pi^+$, turning into a neutron, and the $\pi^+$ is absorbed by the initial neutron, turning it into a proton, as shown in Figure 18-4a. The second possibility is that the neutron emits a $\pi^-$, turning into a proton, and the $\pi^-$ is absorbed by the initial proton, turning it into a neutron, as shown in Figure 18-4b. Note that the dashed lines for the pions have appropriately different slopes in the two cases, indicating the two different origins and time progressions. However, these two diagrams are completely equivalent. The reason is that these virtual pions exist for too short a time to permit, even in principle, any measurement which could distinguish Figure 18-4a from 18-4b. Since the $\pi^+$ and $\pi^-$ are antiparticles of each other, this illustrates the principle that an antiparticle is equivalent to a particle going backward in time. (That is, the emission of an antiparticle is equivalent to the absorption of a particle.) Because the distinction between (a) and (b) is meaningless, we shall frequently draw vertical lines for the extremely short-lived exchanged virtual particle. (The infinite slope of a vertical line does not imply that the particle travels with infinite speed.)

Figure 18-4 Feynman diagrams for proton-neutron scattering through the exchange of a virtual $\pi$ meson. In (a) the proton emits a $\pi^+$ meson, becoming a neutron, and the neutron absorbs the $\pi^+$, becoming a proton. In (b) exactly the same process is described as the neutron emitting a $\pi^-$ to become a proton, and the proton absorbing the $\pi^-$ to become a neutron.

We now return to inelastic electron-proton scattering. A qualitative result of the inelastic electron-proton scattering is that there was an excess of electrons scattered at large angles, reminiscent of the Rutherford scattering of $\alpha$ particles which indicated the existence of the nucleus, as explained in Sections 4-1 and 4-2. Thus the electrons appeared to be hitting small, hard objects. A more quantitative analogy can be drawn between the inelastic electron-proton scattering and the inelastic proton-nucleus scattering discussed in Section 16-7.
As discussed there and shown in Figure 16-27, the energy spectrum of protons emitted at a forward angle shows an elastic peak at high energy, followed by inelastic peaks at lower energy, corresponding to low-lying levels of the residual nucleus, and at still lower proton energy there is a continuum. The same features are shown in Figure 18-5a for electron scattering from a nucleus at forward angles, which means small momentum transfers from the electron to the nucleus. In terms of a diagram like Figure 18-3, the interaction is one in which the virtual photon transfers a small relativistic momentum. If the momentum transfer becomes large, as shown in the larger angle case of Figure 18-5b, the scattered electron spectrum becomes different. The elastic and inelastic peaks shrink, while the continuum becomes more important, being dominated by a broad bump. This bump is due to elastic scattering of the electrons from individual nucleons in the nucleus. It is not a sharp peak because the nucleons are in rapid motion due to their confinement in the nucleus. From the uncertainty principle, $\Delta x\,\Delta p_x \simeq \hbar$, if the nucleon is confined to a small region $\Delta x$ it will have a large spread in momentum, $\Delta p_x$. Sometimes this momentum, called the Fermi momentum, is directed toward the incident electron, giving a higher energy collision, and sometimes it is directed away from the electron, giving a lower energy collision. The result is an appreciable broadening of the elastic peak.

For electron-proton scattering, we see in Figure 18-5c much the same features as in the electron-nucleus case. The proton elastic peak is followed at lower electron energy by inelastic peaks and then by a continuum. The inelastic peaks are due to the production of the short-lived nucleon-like N and $\Delta$ states (or pion-nucleon resonances) which were discussed in Section 17-7. Their masses, $W$, can be read off a scale antiparallel to that of the scattered electron energies.
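The size of the Fermi momentum responsible for this broadening can be estimated directly from $\Delta x\,\Delta p_x \simeq \hbar$. The sketch below is only an order-of-magnitude illustration: the constants are the rounded values from the book's table of useful constants, while the confinement lengths tried (1, 2, and 4 F) and the function name are assumptions chosen for illustration.

```python
HBAR_EV_S = 0.6582e-15   # eV-sec, from the table of useful constants
C_M_PER_S = 2.998e8      # m/sec
FERMI_IN_M = 1.0e-15     # 1 F = 1e-15 m

# hbar * c expressed in MeV-F (1 MeV = 1e6 eV)
HBAR_C_MEV_F = HBAR_EV_S * C_M_PER_S * 1.0e-6 / FERMI_IN_M


def momentum_spread_mev_per_c(delta_x_fermi):
    """Order-of-magnitude momentum spread, in MeV/c, of a particle confined
    to a region of size delta_x_fermi (in F), using delta_p ~ hbar/delta_x."""
    return HBAR_C_MEV_F / delta_x_fermi


if __name__ == "__main__":
    # Confinement lengths below are illustrative assumptions.
    for dx in (1.0, 2.0, 4.0):
        print(f"delta_x = {dx:3.1f} F   delta_p ~ "
              f"{momentum_spread_mev_per_c(dx):6.1f} MeV/c")
```

A confinement length of a few F thus corresponds to a momentum spread of order 100 MeV/c, which is not small compared with the momentum transfers involved and so accounts for a visibly broadened peak.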
The most interesting part of the spectrum is the continuum. It corresponds to elastic scattering from the charged partons, which we shall identify as quarks in the next section. In this case the "bump" is too broad to be distinguished as such because the mass of the quark is about equal SNOla `dd1:103 3 0N 3 aU13 ^ MO RE ELEMENTARY PARTICLES Elastic Inelastic peaks from excited states E E' (a) E E' (b) Elastic E'(GeV) (c) 2.2 2.0 1.8 1.6 E 1.4 1.2 1.0 W (GeV/c2 ) Approximate representation of the spectrum of energies E' of a scattered electron of initial energy E for scattering at (a) a forward angle from a nucleus, (b) a larger angle from a nucleus, and (c) a relatively large angle from a proton. In case (c) inelastic peaks are seen at mass W of 1.24, 1.51, and 1.69 GeV/c 2 , and the quark elastic peak is smeared over the continuum by Fermi momentum. Figure 18 5 - to its Fermi momentum divided by c, resulting in a considerable spreading of the peak. In addition, the scattered electron energy E' is not the most appropriate kinematic variable to use to see the effect. Unlike the elastic and inelastic peaks, this continuum remains large as the momentum transfer is increased, which is characteristic of scattering from a pointlike object. Thus using both neutrinos and electrons, which are pointlike probes, to scatter from nucleons, it became increasingly evident in the late 1960s that the nucleons were 18 3 UNITARY SYMMETRY AND QUARKS - The experimental evidence for partons was obtained in a climate in which there had been proposed numerous theoretical models for composite particles. The first attempt along these lines was by Fermi and Yang in 1949. Although their model was not correct, it has in simplified form important features of a later successful model and hence will serve as a good introduction to that more complicated theory. 
If it is suspected that particles are composites, it is natural to assume that of the known particles a few are elementary and the rest are made up of combinations of those few. Taking this point of view, Fermi and Yang noted that the pion (the only other hadronic particle then established) could be considered a composite of the nucleon and the antinucleon. Another way of saying this, in terms of isospin T and its z component $T_z$, is that a particle of isospin 1/2 (the proton p or neutron n) can be combined with an antiparticle of isospin 1/2 (the antiproton $\bar{p}$ or antineutron $\bar{n}$) to form a particle (the pion $\pi$) of isospin 1. Recalling that p and $\bar{n}$ have $T_z = +1/2$ and $\bar{p}$ and n have $T_z = -1/2$, we have the triplet combinations which are just like those for spin in (9-18):

$T_z = +1$ from $(+1/2, +1/2)$ is equivalent to $p\bar{n}$, which makes $\pi^+$
$T_z = 0$ from $[(+1/2, -1/2) + (-1/2, +1/2)]/\sqrt{2}$ is equivalent to $(p\bar{p} + n\bar{n})/\sqrt{2}$, which makes $\pi^0$
$T_z = -1$ from $(-1/2, -1/2)$ is equivalent to $n\bar{p}$, which makes $\pi^-$

The $\pi^0$ is the symmetric combination of isospins (ignoring charge conjugation sign conventions which are irrelevant here), with $1/\sqrt{2}$ for correct normalization. The antisymmetric combination, $(p\bar{p} - n\bar{n})/\sqrt{2}$, would have T = 0. This singlet could be associated with the $\eta^0$ meson, but that particle was not known in 1949. Note that if the nucleon and antinucleon have spins that are essentially antiparallel, the spin of the $\pi$ is correctly 0 and its parity properly odd, since nucleon and antinucleon have opposite parities.

To prepare for the more interesting and complicated model to be discussed shortly, we shall put the above results into the language of group theory, without actually using any group theory, which the student is not expected to know. Isospin plays a central role in making the particle combinations. Just as angular momentum conservation comes from rotational invariance in real space, so isospin conservation arises from isospin invariance in charge or isospin space.
Now the rotational transformations in either real or isospin space form a group called the SU(2) group, which stands for the Special Unitary group in 2 dimensions. Under such a transformation a nucleus of A nucleons, of which Z are protons and A - Z are neutrons, would be changed into one with Z' protons and A - Z' neutrons, without any change in its properties so far as the strong (nuclear) interactions are concerned. This is what is meant by isospin invariance or rotational invariance in isospin space. The simplest representation of the group SU(2) is that having T = 1/2 and containing p and n. This is called the 2 representation from the number of components, since 2T + 1 = 2(1/2) + 1 = 2. The other simple representation is called the $\bar{2}$ and contains $\bar{p}$ and $\bar{n}$, and hence also has T = 1/2. The one result of group theory that we need is that larger representations of that group can be made from these simpler ones.

not elementary particles but that they had a structure. The results of these and other experiments could be explained to a surprising degree of accuracy by the simple parton model proposed by Feynman in 1969. In this model the partons acted as almost free, pointlike constituents. The partons participating in the electron or neutrino scattering discussed above are those which interact electromagnetically or weakly. However, the lepton-nucleon scattering experiments also demonstrated that there are some partons which are inert to leptons. It was found that the partons which were responsible for the scattering of leptons made up only about half the energy-momentum available in the nucleon. The nature of these inert partons will be discussed in Section 18-5. In that section also there will be an explanation of how the partons, which must have large binding energies and be relativistic, can act like almost unbound, nonrelativistic particles, as required in the parton model.
We have just seen that the 2 and $\bar{2}$ representations can make a singlet and a triplet, or $2 \otimes \bar{2} = 1 \oplus 3$. The circles around the symbols indicate that although the results are like simple arithmetic, we are dealing with groups. The singlet and triplet are said to be irreducible because they cannot be transformed into each other. Thus (p, n) and ($\bar{p}$, $\bar{n}$) make the singlet $\eta^0$ and the triplet $\pi^+$, $\pi^0$, $\pi^-$. This is just a fancy way of saying that two spins 1/2 (with 2 components each) can add to give spin 0 and spin 1 (with 1 and 3 components, respectively). Thus SU(2) classifies many of the hadrons just using T.

However, when strange particles were discovered, SU(2) was obviously no longer adequate to classify the particles having strangeness. If it was to be useful at all, a group of greater dimensionality was needed, and in 1961 Gell-Mann and Ne'eman independently proposed using the group SU(3). This permitted introducing another quantum number, which could be strangeness. However, a related quantity which is called hypercharge Y and is just the sum of strangeness S and baryon number B (i.e., Y = S + B) is more convenient, since it treats baryons and mesons on an equal basis. Just as the 2 and $\bar{2}$ were the simplest representations of SU(2), so the 3 and the $\bar{3}$ are the simplest representations of SU(3), and we shall have much more to say about these shortly. For mesons the 3 and $\bar{3}$ can be combined to produce a singlet and an octet, or $3 \otimes \bar{3} = 1 \oplus 8$. The octet of mesons having spin 0 and odd parity is of particular interest, and it is shown in Figure 18-6, which plots the hypercharge Y against $T_z$. It will be noticed that the $\pi^0$ and $\eta^0$ both occupy the $Y = T_z = 0$ position, but we have already seen that one has T = 1 and the other T = 0. The singlet is the $\eta'(958)$ with T = 0. Note that all the members of the multiplet have the same spin and parity. In the limit of exact SU(3) symmetry they would also all have the same mass.
Since the $\pi$, K, and $\eta$ masses are quite different, that symmetry is badly broken. This is our first example of what is called a broken symmetry, but we shall encounter more later. Regardless of the symmetry breaking, each such multiplet would have a different central mass. Several such multiplets are now known. One example is the spin-one, odd-parity vector mesons consisting of the $\rho$, $\omega$, and K*(891).

Figure 18-6 The odd parity, spin 0 meson octet in a plot of hypercharge Y against the z component of isospin $T_z$.

Figure 18-7 The even parity, spin 1/2 baryon octet, plotted as hypercharge Y against $T_z$.

Baryons are formed in a different way, combining three 3 representations, or $3 \otimes 3 \otimes 3 = 1 \oplus 8 \oplus 8 \oplus 10$. The octets have exactly the same $T_z$ and Y quantum numbers as for the meson octets, as Figure 18-7 shows in the case of the spin-1/2, even-parity baryons. Again, the $\Lambda^0$ with T = 0 and $\Sigma^0$ with T = 1 occupy the $Y = T_z = 0$ position. Since this octet also has the nucleons and $\Xi$, it includes most of the baryons we have discussed so far, but other octets with different spins and parities are now known. The 10 representation, or decuplet, is particularly interesting for learning more about the structure of the particles, as we shall see. It is shown plotted in Figure 18-8. In the decuplet only the $\Omega^-$ decays by the weak interactions, while the rest of the multiplet consists of strongly decaying particles, of which the $\Delta(1232)$ has been specifically discussed in Section 17-7. All of the particles in the decuplet have spin 3/2 and even parity. Even from this brief description we can see that SU(3) was useful in bringing some order out of the chaos of particles. However, this theory of unitary symmetry and, in particular, its specification of how SU(3) symmetry was broken, did much more in making successful predictions.
Most impressive was the prediction of the quantum numbers and mass of the $\Omega^-$ before it was discovered in 1964. However, we no longer need to know about these details of the theory because it has been superseded by the hypothesis of quark constituents, and it is much easier to understand the successful result in terms of the quarks.

Figure 18-8 The even parity, spin 3/2 baryon decuplet, plotted as hypercharge Y against $T_z$: the four $\Delta(1232)$ states at Y = +1, the three $\Sigma(1385)$ states at Y = 0, the two $\Xi(1530)$ states at Y = -1, and the $\Omega^-$ at Y = -2.

In 1964 Gell-Mann and Zweig independently realized that the 3 representation could be more than a mathematical construct and could describe more fundamental constituent particles. Gell-Mann called these particles quarks. Just as in the SU(2) case in which the 2 representation was a $T_z = +1/2$ particle (the p) and a $T_z = -1/2$ particle (the n), so in the SU(3) case the 3 representation gave three fundamental particles. Unlike the Fermi-Yang model, these constituents could not be known particles. For example, if three of them are needed to make a baryon, they must each have baryon number B = 1/3. The decuplet of Figure 18-8 will be used to determine other quark quantum numbers. Since the $\Omega^-$ has strangeness S = -3, it must be made up of three quarks each having S = -1. Thus one of the quarks, which shall be called the s quark, has S = -1 and $T_z = 0$, since the $\Omega^-$ has $T = T_z = 0$. To make other members of the decuplet, the other two quarks, called the u quark and the d quark, must have S = 0. To make the $\Delta^{++}$, which has $T_z = 3/2$, would require three quarks each with $T_z = +1/2$; call this one the u quark. To make the $\Delta^-$ with $T_z = -3/2$ would require three quarks each with $T_z = -1/2$; call this one the d quark.
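The way three quark types fill out the decuplet can be checked by simple counting. The following is a minimal sketch, assuming only the $T_z$ and S values assigned to the u, d, and s quarks above, B = 1/3 per quark, and Y = S + B; it enumerates the flavor combinations of three quarks without regard to order. It is a bookkeeping illustration, not a group-theoretical derivation, and the function and variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction as F
from itertools import combinations_with_replacement

# (T_z, S) for each quark, as assigned in the text.
QUARKS = {"u": (F(1, 2), 0), "d": (F(-1, 2), 0), "s": (F(0), -1)}
BARYON_NUMBER = 1  # three quarks, each with B = 1/3


def decuplet_states():
    """List (quark content, T_z, Y) for every unordered choice of three quarks."""
    states = []
    for combo in combinations_with_replacement("uds", 3):
        t_z = sum(QUARKS[q][0] for q in combo)
        strangeness = sum(QUARKS[q][1] for q in combo)
        hypercharge = strangeness + BARYON_NUMBER  # Y = S + B
        states.append(("".join(combo), t_z, hypercharge))
    return states


states = decuplet_states()
per_row = Counter(y for _, _, y in states)

for content, t_z, y in states:
    print(f"{content}: T_z = {t_z}, Y = {y}")
print("states per hypercharge row:", dict(per_row))
```

The enumeration gives ten combinations arranged in rows of four, three, two, and one as Y runs from +1 to -2, which is the triangular pattern of Figure 18-8, with uuu, ddd, and sss at the three corners.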
If the quarks are really the constituents that make up these particles, they must obey the Gell-Mann-Nishijima relation, (17-36), just as the particles do. Using this, the charges of the quarks can be determined. The charge in units of the electron charge is given by the following:

For the up (isospin direction) or u quark

$$Q = T_z + \tfrac{1}{2}(B + S) = \tfrac{1}{2} + \tfrac{1}{2}\left(\tfrac{1}{3} + 0\right) = +\tfrac{2}{3} \qquad (18\text{-}1)$$

For the down or d quark

$$Q = T_z + \tfrac{1}{2}(B + S) = -\tfrac{1}{2} + \tfrac{1}{2}\left(\tfrac{1}{3} + 0\right) = -\tfrac{1}{3} \qquad (18\text{-}2)$$

For the strange or s quark

$$Q = 0 + \tfrac{1}{2}\left(\tfrac{1}{3} - 1\right) = -\tfrac{1}{3} \qquad (18\text{-}3)$$

We therefore get peculiar fractional charges. Experimental searches for quarks have sought this unique signature. Despite extensive attempts, the results have been generally negative. When QCD is discussed in Section 18-7, reasons will be presented for believing that quarks will never be detected directly, and that they are permanently confined to the hadrons they make up.

To show that these charge assignments work, consider the $\Omega^-$ which is sss (that is, it consists of three s quarks). Each s has charge -1/3, giving the correct total of -1. The $\Delta^-$ is ddd, and again the -1/3 quarks add properly to -1. The $\Delta^{++}$ is uuu, and three charges of +2/3 give the expected +2.

Example 18-2. Show that the quark quantum numbers give the corresponding quantities for the $\Sigma^0(1385)$ particle.

The $\Sigma^0(1385)$ has Q = 0, B = 1, S = -1 (hence Y = 0), T = 1, and $T_z = 0$. It is made up of one of each kind of quark, or uds. Taking the u, d, and s properties in order, we have

Q = +2/3 - 1/3 - 1/3 = 0
B = 1/3 + 1/3 + 1/3 = 1
S = 0 + 0 - 1 = -1
T = 1/2 + 1/2 + 0 = 1
$T_z$ = +1/2 - 1/2 + 0 = 0

To give appropriate spin to all the particles it is necessary that each quark have spin 1/2. For instance, take the $\Sigma^0(1385)$, which has spin 3/2. In this case if the three quark spins are essentially parallel, they will give the proper value of 3/2.
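The charge assignments (18-1) to (18-3) and the sums of Example 18-2 are easy to verify with exact fractions. The sketch below assumes only the relation $Q = T_z + (B + S)/2$ and the quark values of $T_z$, B, and S given above; the helper names are hypothetical, and only additive quantum numbers are summed (the total isospin T is not additive in general and is left out).

```python
from fractions import Fraction as F

# (T_z, B, S) for each quark, as assigned in the text.
QUARKS = {
    "u": (F(1, 2), F(1, 3), 0),
    "d": (F(-1, 2), F(1, 3), 0),
    "s": (F(0), F(1, 3), -1),
}


def quark_charge(name):
    """Charge in units of e from Q = T_z + (B + S)/2."""
    t_z, b, s = QUARKS[name]
    return t_z + (b + s) / 2


def hadron_numbers(content):
    """Sum the additive quantum numbers (Q, B, S, T_z) over the quarks in a baryon."""
    return {
        "Q": sum(quark_charge(q) for q in content),
        "B": sum(QUARKS[q][1] for q in content),
        "S": sum(QUARKS[q][2] for q in content),
        "T_z": sum(QUARKS[q][0] for q in content),
    }


for q in "uds":
    print(q, "quark charge:", quark_charge(q))
for baryon in ("sss", "ddd", "uuu", "uds"):
    print(baryon, {k: str(v) for k, v in hadron_numbers(baryon).items()})
```

Using `Fraction` rather than floating point keeps the thirds exact, so the totals can be compared with integers directly.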
Detailed quark models have been constructed which predict the masses of essentially all the hadrons, based on just a few constants which have to be determined from measured masses. The constants include not only the quark masses, but also the details of the potential well in which the quarks are placed and the degree of such effects as spin-spin and spin-orbit interactions. The process is very like that of finding nuclear binding energies in the shell model. The success of such models adds credence to the quark picture of hadrons.

We close this section by discussing the quark content of mesons. In SU(3), mesons are combinations of 3 and $\bar{3}$. That is, they are combinations of a quark and an antiquark. For example, the $\pi^+$ is $u\bar{d}$. This is true since the antiquark, being a fermion, has opposite charge to the quark, so that $\bar{d}$ has Q = +1/3 and hence $T_z = +1/2$. This quark assignment correctly gives Q = +1 and $T_z = +1$ for the $\pi^+$, since u has Q = +2/3 and $T_z = +1/2$. The $\pi^-$ is the charge conjugate $\bar{u}d$, while the $\pi^0$ is a combination of $u\bar{u}$ and $d\bar{d}$. Since the $\bar{s}$ will have S = +1, opposite to that of the s, the $K^+$ meson is $u\bar{s}$, and the $K^0$ is $d\bar{s}$. The quark-antiquark pairs forming these pseudoscalar mesons are in a $^1S_0$ state, whereas the same combinations in a $^3S_1$ state form the vector meson octet.

Because they cannot be determined relative to anything else, the s quark parity and the parity of either the u or d quark must be defined as even. Since the $\Delta^-$, which is ddd, and $\Delta^{++}$, which is uuu, are two charge states of the same particle, they must have the same parity. Thus ddd and uuu have the same parity, so the u and d parities must be the same, or all three quarks have even parity. Because the spin of the $\Sigma^0(1385)$ or of the $\Delta$ can be made 3/2 from just quark spins, no relative quark angular momentum is required. Thus there is no angular momentum factor (i.e., $(-1)^l$) in determining the $\Sigma^0(1385)$ or $\Delta$ parity.
It will be just the product of the three even quark parities, in agreement with experiment.

While the s quark is an isospin singlet, since no other quark possesses strangeness, the u and d quarks form an isospin doublet. This implies that the u and d quarks are alike except for $T_z$ and Q. It would be more correct to turn this statement around and say that the real basis of isospin is that there are two quarks which have, aside from electromagnetic effects, the same mass and interactions. Since isospin utilizes the well developed mathematics of spin, it is a very useful concept. But its content can always be reduced to this simple quark basis. Thus the proton and neutron have the same strong interaction because they are, respectively, uud and ddu, and substituting a d for a u quark makes no difference in the strong interaction. Understanding isospin and its conservation then means understanding why these two quarks exist which differ in just their electromagnetic properties, and so far there is no answer to that question.

This similarity in the masses of the u and d quarks is apparent from the small mass differences among isospin multiplets, such as between p and n. The difference between the u or d quark mass and that of the s quark is responsible for the success, mentioned above, in predicting the mass of the $\Omega^-$. However, that was not known at the time, and the prediction was made on a different basis. In going from row to row in Figure 18-8, that is, from Y = +1 to Y = -2, each step means substituting an s quark for a u or d quark. Thus $\Delta$ has no s, $\Sigma(1385)$ has one s, $\Xi(1530)$ has two s's, and $\Omega^-$ has three s's. Now strange particles are more massive than their ordinary particle counterparts, so the mass of the s quark must be greater than that of the u or d quark. Thus each step in Y means adding the mass difference between the s quark and a u or d quark.
To avoid electromagnetic mass differences we can compare the differences between the masses of the $\Delta^-$, $\Sigma^-(1385)$, $\Xi^-(1530)$, and $\Omega^-$. The first two give a mass difference of about 150 MeV/c$^2$, so $m_s - m_{u\ \mathrm{or}\ d} \approx 150$ MeV/c$^2$. We can then predict, correctly, that the $\Omega^-$ is more massive than the $\Xi^-(1530)$ by about 150 MeV/c$^2$.

Figure 18-9 Quark flow diagrams showing (a) strangeness conservation (production of an $s\bar{s}$ pair of quarks) in the strong reaction $\pi^- + p \to \Lambda^0 + K^0$ and strangeness violation in the weak decays (b) $\Lambda^0 \to p + \pi^-$ and (c) $K^0 \to \pi^+ + \pi^-$. In the decays the weak interaction is represented by a circle, but this will be treated more completely in Section 18-8.

Just as the meaning of isospin is simplified in the quark picture, so also is strangeness. The conservation of strangeness in the strong interaction, such as $\pi^- + p \to \Lambda^0 + K^0$, merely means that an $s\bar{s}$ pair must be created. This is a manifestation of the requirement that any fermion has to be created in a fermion-antifermion pair. The process is shown in Figure 18-9a, which is a Feynman diagram on the quark level. It represents the history of the quarks as a function of time, which increases to the right. Note that the u and $\bar{u}$ quarks annihilate and an $s\bar{s}$ pair is produced. Strangeness nonconservation in the weak interaction, such as $\Lambda^0 \to p + \pi^-$ and $K^0 \to \pi^+ + \pi^-$, is then the conversion of the s or the $\bar{s}$ to a nonstrange quark. This is shown in an oversimplified way in Figures 18-9b and c, where it is seen also that in each case a $u\bar{u}$ pair must be created. In Section 18-8 on the electroweak interaction this conversion of one type of quark to another, which must involve the W intermediate boson, will be treated more correctly.
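The equal-spacing argument for the decuplet made earlier in this section can be put into numbers. The sketch below takes the central masses from the multiplet labels $\Delta(1232)$, $\Sigma(1385)$, and $\Xi(1530)$ and extrapolates one more step in Y. The measured $\Omega^-$ mass used for comparison, about 1672 MeV/c$^2$, is not quoted in this section and is included here as an outside reference value; the variable names are illustrative.

```python
# Central masses in MeV/c^2, read from the multiplet labels in the text.
DELTA_MASS = 1232
SIGMA_STAR_MASS = 1385
XI_STAR_MASS = 1530

# Outside reference value (not quoted in this section): measured Omega-minus mass.
OMEGA_MEASURED = 1672

# Each step down in Y replaces a u or d quark by an s quark.
first_step = SIGMA_STAR_MASS - DELTA_MASS      # Delta -> Sigma(1385)
second_step = XI_STAR_MASS - SIGMA_STAR_MASS   # Sigma(1385) -> Xi(1530)
mean_step = (first_step + second_step) / 2     # estimate of m_s - m_(u or d)

omega_predicted = XI_STAR_MASS + mean_step

print(f"steps: {first_step} and {second_step} MeV/c^2, mean {mean_step:.0f}")
print(f"predicted Omega-minus mass: {omega_predicted:.0f} MeV/c^2")
print(f"difference from measured value: {omega_predicted - OMEGA_MEASURED:.0f} MeV/c^2")
```

The two observed steps agree with each other to within a few MeV/c$^2$, and the extrapolated mass lands within about 10 MeV/c$^2$ of the measured value, which is the sense in which the prediction is called correct.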
18-4 EXTENSIONS OF SU(3) - MORE QUARKS

The unitary symmetry theory of SU(3) was successful in classifying particles then known and in predicting the existence of others found later. It was particularly useful in introducing the u, d, and s quarks. In a development in 1967 which will be discussed in Section 18-8 yet another type of quark was needed to explain some experimental results. It was not until 1974 that direct evidence for the new quark was found. The new quark has to possess a property like strangeness which was called "charm." In other words, there needed to be a new quantum number making this c quark different from the others. The u, d, s, and c quarks are then of different types, or "flavors," as these properties are usually designated.

The 1974 experiments actually detected a meson which was the combination $c\bar{c}$ and hence does not itself possess charm, since c has the charm quantum number $\mathcal{C} = +1$ and $\bar{c}$ has $\mathcal{C} = -1$. The $c\bar{c}$ is a vector (spin 1, parity odd, charge conjugation eigenvalue negative) meson, just like the $\rho$, $\omega$, or $\varphi$, and just like those mesons it can decay electromagnetically (via a virtual photon) into a $\mu^+\mu^-$ pair. This is shown in Figure 18-10 and was the means by which it, designated the J meson, was detected in an experiment at the Brookhaven National Laboratory.

Figure 18-10 Electromagnetic decay of the $c\bar{c}$ state $\psi/J$ into a $\mu^+\mu^-$ pair.

At about the same time an experiment at the Stanford Linear Accelerator Center (SLAC) also detected this particle, there designated the $\psi$ meson, by quite a different means. At SLAC a colliding beam accelerator, called SPEAR, was used. In this device counter-rotating beams of $e^+$ and $e^-$ are guided in a ring by magnets, colliding at designated positions (two at SPEAR). Particle detectors in the interaction regions measure the products of the collision. These detectors have to be very large and complex to study the results of each collision, since the collisions are relatively few.
The more usual type of accelerator, in which a beam hits a fixed target with an extremely large number of particles in it, gives vastly more collisions. However, the collisions are in the laboratory system, whereas they are in the center of mass system in a colliding beam accelerator. This makes a vast difference in the available energy. For example, an $e^+$-$e^-$ collider with 10 GeV (= 10,000 MeV) per beam gives a collision with 20 GeV in the center of mass. To get that same energy in the collision of an $e^+$ with an $e^-$ at rest would require a laboratory energy of about $4 \times 10^5$ GeV!

When the $e^+$ and $e^-$ collide they produce a virtual photon, which then can turn into other particles. Because the $\psi/J$ is a vector meson, it has the same quantum numbers as the photon, so it is readily produced. It can decay electromagnetically, as in Figure 18-10, but since it is so massive (3097 MeV/c$^2$) it would be expected to decay with a very short lifetime via the strong interaction into hadrons. Thus it would be expected to have a very large mass width ($\sim 10^2$ MeV/c$^2$) like the strongly decaying particles discussed in Section 17-7. Instead it has a strikingly narrow width, which is the reason it was discovered. The mass width, which can be deduced from measurements although it is smaller than the experimental resolution, is only 0.06 MeV/c$^2$.

Why does such a small width occur? The problem is that the $c\bar{c}$ state could decay readily into two mesons, one containing a c and the other a $\bar{c}$, but the masses of even the least massive mesons (called the D meson and the $\bar{D}$ meson) with such constituents are too large. That is, $M_D + M_{\bar{D}} > M_{\psi}$, so the decay cannot occur. Any other hadronic decay, such as into 3$\pi$'s, is greatly inhibited or, as it is said, Zweig forbidden. The $e^+$-$e^-$ production of the $\psi/J$ and its subsequent Zweig forbidden decay into $\pi^+ + \pi^- + \pi^0$ is shown in Figure 18-11.
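The fixed-target energy quoted above for the collider comparison follows from the invariance of the total energy-momentum of the colliding pair. As a minimal sketch, assuming units with c = 1, equal and opposite beam momenta in the collider, and the electron rest energy from the table of constants, the center of mass energy $E_{cm}$ satisfies $E_{cm}^2 = 2m^2 + 2mE_{lab}$ when one particle is at rest. The function names are illustrative.

```python
M_E_GEV = 0.5110e-3  # electron rest energy in GeV (table of constants)


def collider_cm_energy(beam_energy_gev):
    """Center of mass energy for equal-energy beams colliding head on."""
    return 2.0 * beam_energy_gev


def fixed_target_lab_energy(cm_energy_gev, mass_gev=M_E_GEV):
    """Total lab energy of the projectile needed to reach a given center of mass
    energy on a target of the same mass at rest: E_cm^2 = 2 m^2 + 2 m E_lab."""
    return (cm_energy_gev**2 - 2.0 * mass_gev**2) / (2.0 * mass_gev)


e_cm = collider_cm_energy(10.0)
e_lab = fixed_target_lab_energy(e_cm)
print(f"collider: E_cm = {e_cm:.0f} GeV")
print(f"fixed target: E_lab = {e_lab:.3g} GeV for the same E_cm")
```

For 10 GeV per beam the required laboratory energy comes out close to $3.9 \times 10^5$ GeV, consistent with the rounded figure of about $4 \times 10^5$ GeV in the text. Because $E_{lab}$ grows as $E_{cm}^2/2m$, the advantage of colliding beams is greatest for light particles such as electrons.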
The forbiddenness comes from the difficulty in going from the $c\bar{c}$ annihilation to the unconnected $u\bar{u}$ (or $d\bar{d}$; one is drawn in the figure but both occur) pair production. It was the narrowness of the $\psi/J$ peak that indicated a new quantum number was involved.

Figure 18-11 Production of the $c\bar{c}$ state electromagnetically by $e^+e^-$ annihilation and its subsequent Zweig-forbidden strong decay into pions.

This same reason for the inhibition of a strong decay was actually encountered before, in Section 17-7. There it was mentioned that the $\varphi^0$ vector meson did not decay into pions. The reason is that the $\varphi^0$ is an $s\bar{s}$ state, so it can decay readily only into two mesons, one of which carries the s and the other the $\bar{s}$. In this case the mass of the two K mesons is slightly smaller than the mass of the $\varphi^0$ so such a decay is allowed.

An excited state of $c\bar{c}$, called $\psi'$, at 3685 MeV/c$^2$ was discovered in 1975 at SPEAR, and subsequently another state $\psi''$ at 3767. Since the $\psi''$ is massive enough to decay into $D + \bar{D}$, it has a large mass width. Subsequently other so-called charmonium states (that is, states of $c\bar{c}$), shown in Figure 18-12(a), were discovered. The $\psi$ states are (like the other vector mesons) $^3S_1$ states of $c\bar{c}$, whereas the $\chi$ states shown are $^3P$ and hence have opposite parity and charge conjugation quantum numbers. The $\eta_c$ state at 2976 is a pseudoscalar ($^1S_0$) $c\bar{c}$ combination. If the quark model is correct, then the $c\bar{c}$ states are analogous to those of the $e^+e^-$ in positronium (Sections 2-7 and 4-7); both are

Levels shown in panel (a) of Figure 18-12: $\psi(3767)$, the $D\bar{D}$ threshold, $\psi(3685)$, $\chi(3551)$, $\chi(3507)$, $\chi(3414)$, $\psi(3097)$, and $\eta_c(2976)$. Panel (b) shows the n = 1 and n = 2 positronium levels $^1S_0$, $^3S_1$, $^3P_0$, $^3P_1$, and $^3P_2$. The horizontal axis is $J^{PC}$, with values $0^{-+}$, $1^{--}$, $0^{++}$, $1^{++}$, and $2^{++}$.

Figure 18-12 Energy levels of (a) charmonium ($c\bar{c}$) and (b) positronium ($e^+e^-$).
The relative energy of the level is plotted against its quantum numbers, which are designated as $J^{PC}$, where J is the spin, P is the sign of the parity, and C is the sign of the charge conjugation quantum number. The angular momentum of the fermion-antifermion $c\bar{c}$ system is the same as that given in spectroscopic notation for the corresponding state in the $e^+e^-$ system.

the $\mathcal{C}$, $\mathcal{B}$, and $\mathcal{T}$ quantum numbers are conserved in the strong and electromagnetic interactions and change by one unit in the weak interaction. This simply means that the number of quarks minus antiquarks for each of s, c, b, and t must remain constant in strong or electromagnetic interactions, while in the weak interaction there is a change of quark flavor with the preferred sequence being $t \to b \to c \to s$. Thus a favored decay is $D^0 \to K^- + \pi^+$, or $c\bar{u} \to s\bar{u} + u\bar{d}$, which has $\Delta\mathcal{C} = 1$ and $c \to s$.

Because of the uniqueness of their $\mathcal{B}$ or $\mathcal{T}$ quantum numbers, the b and t quarks must each be T = 0, hence $T_z = 0$. As in the case of other quarks, they each have baryon number B = 1/3. The b quark with $\mathcal{B} = -1$ has Q = -1/3, and the t quark with $\mathcal{T} = +1$ has Q = +2/3. These assignments are compatible with

$$Q = T_z + (B + S + \mathcal{C} + \mathcal{B} + \mathcal{T})/2 \qquad (18\text{-}5)$$

which, hopefully, is the final form of that relation, and which should now apply to all hadrons. The quark quantum numbers are summarized in Table 18-1.

The b quark is well established. In 1977 at Fermilab narrow resonances in the mass range of 9.5 to 10.5 GeV/c$^2$ were seen in the mass spectrum of muon pairs, similarly to the discovery of the narrow J at Brookhaven. It was deduced that two or three $b\bar{b}$ resonances were present. The lowest mass state was called the upsilon, or $\Upsilon$, and the higher states the $\Upsilon'$ and $\Upsilon''$. One year later at the DORIS $e^+e^-$ collider in Hamburg the $\Upsilon$ and $\Upsilon'$ were clearly resolved, and later at the CESR collider (Cornell) the $\Upsilon''$ was observed distinctly, and a fourth state ($\Upsilon'''$) was also identified.
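Relation (18-5) can be checked against the six quark flavors in a few lines. The sketch below assumes the quantum number assignments described in the text (with $\mathcal{B} = -1$ for the b quark and $\mathcal{T} = +1$ for the t quark) and takes the charge of an antiquark to be the negative of that of the quark. It also checks charge conservation in the quark-level decay $c\bar{u} \to s\bar{u} + u\bar{d}$ mentioned above. All function names are hypothetical.

```python
from fractions import Fraction as F

# flavor: (T_z, B, S, charm, bottom, top), following the assignments in the text.
FLAVORS = {
    "d": (F(-1, 2), F(1, 3), 0, 0, 0, 0),
    "u": (F(1, 2), F(1, 3), 0, 0, 0, 0),
    "s": (F(0), F(1, 3), -1, 0, 0, 0),
    "c": (F(0), F(1, 3), 0, 1, 0, 0),
    "b": (F(0), F(1, 3), 0, 0, -1, 0),
    "t": (F(0), F(1, 3), 0, 0, 0, 1),
}


def quark_charge(flavor):
    """Q = T_z + (B + S + charm + bottom + top)/2, in units of e."""
    t_z, *additive = FLAVORS[flavor]
    return t_z + sum(additive) / 2


def meson_charge(quark, antiquark):
    """Charge of a quark-antiquark pair; the antiquark carries the opposite charge."""
    return quark_charge(quark) - quark_charge(antiquark)


for flavor in FLAVORS:
    print(flavor, quark_charge(flavor))

# D0 (c ubar) -> K- (s ubar) + pi+ (u dbar): total charge before and after.
initial = meson_charge("c", "u")
final = meson_charge("s", "u") + meson_charge("u", "d")
print("D0 charge:", initial, " K- plus pi+ charge:", final)
```

The charges alternate between -1/3 and +2/3, and the neutral $D^0$ decays to a final state of total charge zero, as it must.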
These are all $^3S_1$ states of $b\bar{b}$ with different radial excitations, analogous to the principal quantum number of atomic physics. The four states are at 9.46, 10.02, 10.35, and 10.57 GeV/c$^2$, with energy spacings well predicted by the quark model. The first three states

fermion-antifermion pointlike particles in a potential well. Indeed Figure 18-12b shows that the positronium levels are remarkably similar, despite a difference in the energy scale of a factor of $10^8$! This is strong evidence for the quark model.

The charmonium ($c\bar{c}$) states do not possess charm, and actual observation of a particle having that quantum number came later. Hints of the decay of such a particle were seen in a neutrino experiment at Fermilab, and one bubble chamber event at Brookhaven was interpreted as the $\Lambda_c$ particle. Charm was clearly seen at SPEAR in 1976 (the $D^0$ meson) and then in photoproduction at Fermilab (the $\Lambda_c$ baryon). A few more states have since been observed, but a large number are possible.

If thought is given to extending SU(3) to SU(4) to include charm, the possible number of particles is greatly increased. Consider Figures 18-6, 18-7, and 18-8 made three-dimensional, with charm as the third axis. Because the c quark mass is much larger than those of the u, d, or s quarks, SU(4) is a much more badly broken symmetry than SU(3). Recall that the symmetry requires all the particles in a multiplet to have the same mass. Thus it is better simply to consider the additional combinations that can be made with the added freedom of including one to three c quarks in making baryons and a c or a $\bar{c}$ in making mesons. As examples, the $D^+$ is $c\bar{d}$, the $D^0$ is $c\bar{u}$, the $\Lambda_c$ is udc (i.e., like the $\Lambda$, but with c replacing s), and the $F^+$ meson is $c\bar{s}$. In making those combinations we note that the c quark must have Q = +2/3, like the u quark.
Since it has charm $\mathcal{C} = +1$, the now extended Gell-Mann-Nishijima relation

$$Q = T_z + (B + S + \mathcal{C})/2 \qquad (18\text{-}4)$$

would (with B = 1/3 again and S = 0) properly give $T_z = 0$. The c quark must have $T_z = 0$, since as a singlet (the only quark with $\mathcal{C}$) it must have T = 0.

The much-amended equation which is presently (18-4) is still not complete, for there are at least two more flavors of quarks. Each of these two quarks possesses a separate quantum number, analogous to strangeness or charm. One quark is labeled b for bottom or "beauty" and the other is labeled t for top or "truth." Like strangeness,

Table 18-1 Quark Quantum Numbers, Utilizing $Q = T_z + (B + S + \mathcal{C} + \mathcal{B} + \mathcal{T})/2$

                                        Quark Flavor
Quantum Number                     d      u      s      c      b      t
Charge, Q (in units of e)        -1/3   +2/3   -1/3   +2/3   -1/3   +2/3
Isospin, T                        1/2    1/2     0      0      0      0
Isospin z component, T_z         -1/2   +1/2     0      0      0      0
Baryon number, B                  1/3    1/3    1/3    1/3    1/3    1/3
Strangeness, S                     0      0     -1      0      0      0
Charm, $\mathcal{C}$               0      0      0     +1      0      0
Bottom (beauty), $\mathcal{B}$     0      0      0      0     -1      0
Top (truth), $\mathcal{T}$         0      0      0      0      0     +1

are very narrow; e.g., the $\Upsilon$ has the same width as the $\psi$, 0.06 MeV/c$^2$. The fourth is quite broad, indicating that its mass is above that necessary for decay into a $B\bar{B}$ pair of mesons, where the $B^+$ is $\bar{b}u$ and the $B^0$ is $\bar{b}d$. By running the CESR accelerator at an energy corresponding to the peak of the $\Upsilon'''$ mass, the B meson has been identified, and it has a mass of 5.27 GeV/c$^2$.

Thus quark masses get rapidly heavier in going from one flavor to the next. We can get a rough idea of the effective mass of the quarks inside a hadron from the hadronic masses. Thus the u and d quarks must have a mass close to one-third the nucleon mass, or about 0.3 GeV/c$^2$. From the mass differences in the baryon decuplet we have seen that $m_s - m_{u\ \mathrm{or}\ d} \approx 0.15$ GeV/c$^2$. Hence the strange quark mass is about 0.5 GeV/c$^2$. We can check this since the $\varphi^0$ meson (1.02 GeV/c$^2$) is an $s\bar{s}$ state, so the s mass is about half of 1 GeV/c$^2$. Similarly using the $\psi$ masses, the c quark must be about 1.7 GeV/c$^2$.
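The width arguments used above for the $\psi$ and $\Upsilon$ families, and earlier for the $\varphi^0$, all reduce to asking whether a quark-antiquark state is heavier than a pair of the lightest mesons carrying its flavor. The sketch below tabulates that comparison. The $\psi$, $\Upsilon$, $\varphi^0$, and B masses are those quoted in this section; the D and K masses (about 1865 and 494 MeV/c$^2$) are approximate outside values that this section does not quote, and are labeled as assumptions in the code. The names are illustrative.

```python
# Masses in MeV/c^2.
M_B = 5270   # B meson, quoted in the text
M_D = 1865   # D meson: approximate outside value, NOT quoted in this section
M_K = 494    # K meson: approximate outside value, NOT quoted in this section

# state name: (state mass, mass of the lightest meson carrying the open flavor)
STATES = {
    "phi": (1020, M_K),
    "psi": (3097, M_D),
    "psi'": (3685, M_D),
    "psi''": (3767, M_D),
    "Upsilon": (9460, M_B),
    "Upsilon'": (10020, M_B),
    "Upsilon''": (10350, M_B),
    "Upsilon'''": (10570, M_B),
}


def open_flavor_decay_allowed(state_mass, meson_mass):
    """True if the state is heavier than a meson-antimeson pair of the given mass."""
    return state_mass > 2 * meson_mass


allowed = {
    name: open_flavor_decay_allowed(mass, meson)
    for name, (mass, meson) in STATES.items()
}

for name, ok in allowed.items():
    label = "above threshold (broad)" if ok else "below threshold (narrow)"
    print(f"{name:11s} {label}")
```

With these inputs only the $\varphi^0$, the $\psi''$, and the fourth $\Upsilon$ state lie above their respective thresholds, matching the pattern of narrow and broad widths described in the text.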
From the $\Upsilon$ mass the b quark must be about 5 GeV/c$^2$. From this progression, the t quark can be expected to be quite heavy. Indeed, late in 1983 experiments indicated that it may be around 30 GeV/c$^2$. One caveat must be introduced: What is meant by a quark mass depends on the application, since quarks are not observed in the free state.

Although at the time of writing the evidence for particles possessing the t quark is not conclusive, there is strong reason to believe that this quark exists. The reason will be given in Section 18-8, but suffice it to say now that it has to do with a symmetry between quarks and leptons. Both classes of particles are, as far as it is known at present, pointlike and apparently elementary. The symmetry is that there should be equal numbers of quarks and leptons. There are 6 leptons ($e$, $\nu_e$, $\mu$, $\nu_\mu$, $\tau$, $\nu_\tau$) and there then ought to be 6 quarks (u, d, s, c, b, t).

One way that has been used to search for the t quark is to look at the total cross section for $e^+ + e^- \to$ hadrons, because this goes through an intermediate step in which the virtual photon from $e^+$-$e^-$ annihilation produces a quark-antiquark pair. This is shown in Figure 18-13a. The quark and antiquark subsequently become hadrons, which are observed experimentally. This process can be compared with $e^+ + e^- \to \mu^+ + \mu^-$, shown in Figure 18-13b. The relative rates for these two processes can be obtained by closer examination of the diagrams. The first part of both, $e^+$-$e^-$ annihilation to produce a virtual photon, is the same and hence does not enter into the relative rates for the two processes. In an electromagnetic interaction the photon coupling is to the charge, which is e for the muon and Qe for the quark, where Q is 1/3 or 2/3. The diagram represents an amplitude, and the probability or cross section is the square of the amplitude.
Note in passing that $e^2$, which enters into the probability for a process, is usually expressed as the dimensionless coupling constant, $e^2/4\pi\epsilon_0\hbar c$, which is also called the fine structure constant. Hence the ratio of the cross sections for the two similar processes at a given energy will be just the ratio of their coupling constants (or the squares of the charges), that is $Q^2$. The photon will couple to as many quarks as is allowed energetically. Thus at a given beam energy the ratio

$$R = \frac{\sigma(e^+ + e^- \to \text{hadrons})}{\sigma(e^+ + e^- \to \mu^+ + \mu^-)} = \sum_q Q_q^2 \qquad (18\text{-}6)$$

is the sum of the squares of the quark charges for all quarks which can be produced. It follows that at the threshold energy for producing the t quark, R ought to increase by $(2/3)^2 = 4/9$. This is appreciable, since $\sum_q Q_q^2$ for u, d, s, c, and b quarks is just $2(2/3)^2 + 3(1/3)^2 = 11/9$. We shall see in the next section how well the latter prediction is borne out.

Figure 18-13 Annihilation of $e^+$ with $e^-$ to produce a virtual photon. In (a), the photon produces a quark-antiquark pair, which subsequently forms hadrons. In (b) the photon produces a $\mu^+\mu^-$ pair. The cross section for the process depends on the coupling of the photon to the charge of the fermion-antifermion pair at each vertex.

18-5 COLOR AND THE COLOR INTERACTION

With six leptons and six quarks there are already an appreciable number of elementary particles, but even this is not sufficient. Consider the difficulty encountered when the quark structure of three members of the $3/2^+$ baryon decuplet is examined closely. Recall that the $\Delta^-$ is made up of three d quarks, the $\Delta^{++}$ of three u quarks, and the $\Omega^-$ of three s quarks. To get spin 3/2, the spins of all the quarks must be essentially parallel since the spin of each is 1/2. To then have even parity, the quarks must all have zero relative orbital angular momentum. Therefore, all the quarks would be in the same quantum state.
Since the quarks are fermions, for them all to be in the same state would violate the Pauli exclusion principle. Each of the quarks must therefore have a different value of some new quantum number, and the quantum number must have at least three different values. Because this quantum number has never been observed, the $\Delta^-$, $\Delta^{++}$, and $\Omega^-$ must not possess it, even though their constituents do. Thus the quantum numbers assigned to the three quarks have to cancel to give zero.

These considerations suggest an analogy to color, since the three primary colors taken together are colorless. Then the observed $\Delta^-$, $\Delta^{++}$, and $\Omega^-$, described as "color singlets," are "colorless," while each of their three constituent quarks possesses a different "color." The three possibilities for the quantum number "color" will here be designated as the subtractive primary colors red, yellow, and blue, since these three mix as pigments to give colorless black. Often red, green, and blue are used, since these additive primary colors when mixed as light give colorless white. Note that this color analogy works for mesons as well as baryons, since the color of a quark will just cancel the anticolor of the antiquark to which it is bound. Since the resulting particles must be colorless, there are just two combinations, quark-antiquark and three quarks, which achieve this, and hence only these combinations of quarks produce bound states.

Providing some understanding of the problem of binding and eliminating an apparent violation of the exclusion principle are both important gains. However, these gains are obtained at the cost of having 18 quarks (three colors of each of six flavors) and yet another quantum number.

Is there experimental evidence for color? Returning to the subject at the end of the previous section, we see in Figure 18-14 measurements of $R$ as defined in (18-6).
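The sum in (18-6) is simple enough to tabulate directly. The sketch below is only an illustration: it uses the standard quark charges and evaluates the sum with and without a multiplicity of three for color. The names `QUARK_CHARGES`, `r_ratio`, and `n_colors` are hypothetical and introduced here for the example.

```python
from fractions import Fraction

# Quark electric charges in units of e (standard quark-model assignments).
QUARK_CHARGES = {
    "u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
    "c": Fraction(2, 3), "b": Fraction(-1, 3), "t": Fraction(2, 3),
}

def r_ratio(flavors, n_colors=1):
    """Sum of squared charges (18-6) over the flavors that can be produced,
    multiplied by an assumed color multiplicity n_colors."""
    return n_colors * sum(QUARK_CHARGES[f] ** 2 for f in flavors)

below_top = "udscb"
print(r_ratio(below_top))               # 11/9  (no color)
print(r_ratio(below_top, n_colors=3))   # 11/3  (three colors)
# Step in R expected at the t-quark threshold if there are three colors:
print(r_ratio("udscbt", 3) - r_ratio(below_top, 3))  # 4/3
```

Exact fractions are used so that the results can be compared with the text's 11/9 without rounding.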
At energies high enough to be above resonances for vector meson production (> 10 GeV center of mass beam energy), the measurements of $R$ from the PETRA collider at DESY in Hamburg have the constant value of 11/3. At this energy the u, d, s, c, and b quarks can contribute, and the sum of the squares of their charges is 11/9. However, if there are three times that number of quarks because of the color degree of freedom, then the value of 11/3 is expected. The excellent agreement between this expectation and the experimental result gives direct evidence for color. Note also from the figure that up to a mass value of about 37 GeV/$c^2$ the $t\bar{t}$ state has not appeared. This could produce a resonance, but also it would surely increase $R$ by $3(2/3)^2 = 4/3$.

The existence of the quantum number which is conveniently called color has a significance well beyond satisfying the exclusion principle or providing a rationale for the way in which quark combinations bind. The color quantum number is to the true strong interaction as the electric charge is to the electromagnetic interaction. Just as the electromagnetic interaction is the exchange of photons emitted and absorbed by electric charge, so the real strong interaction is the exchange of gluons emitted and absorbed by color "charge." This color interaction is to be distinguished from the interaction between hadrons, sometimes referred to as the nuclear interaction. The latter has been called the strong interaction, but the true strong interaction is that due to color. That which we have been calling the strong interaction is to the color interaction much as the van der Waals interaction (Section 13-2) between molecules is to the electromagnetic interaction. In other words, the basic strong interaction is that which binds quarks together to form particles, the exchange of which gives rise to the apparent strong interaction.
It is ironic that because its manifestations are so indirect, the very existence of this fundamental interaction was not even guessed until the 1970s.

Figure 18-14 The ratio $R$ of the cross sections for $e^+ + e^- \to$ hadrons to $e^+ + e^- \to \mu^+ + \mu^-$ is plotted versus the energy $E$ the $e^+$ and $e^-$ provide in their center of mass collision (horizontal axis: center of mass energy in GeV, 0 to 40). The positions of the sharp vector meson resonances ($\rho$, $\omega$, $\phi$, $\psi$, $\psi'$, $\Upsilon$, $\Upsilon'$, $\Upsilon''$) are shown. The data come from many storage ring experiments, with the points above 10 GeV from PETRA (Hamburg). In this upper energy region, if u, d, s, c, and b quarks, each with three colors, contribute, $R$ should be 11/3.

$$V = -\frac{k_1}{r} + k_2 r \tag{18-7}$$

The first term is the expected Coulomb-like form due to the exchange of massless gluons, which are emitted and absorbed by color charge. The constant $k_1$ can be fixed from one level separation, and then it works not only for the other levels but for those of the $\Upsilon$ states as well. The unexpected second term is all-important in providing the distinguishing features of the color force. First, being proportional to $r$, this term is small at small distances, a feature which is called asymptotic freedom. Thus the $\psi$ and $\Upsilon$ energy levels are determined mainly by the first term. The color potential is weak at small distances because $k_1$ is very small. This short-distance weakness is the feature that makes the parton model work. When they are close together, the quarks are in a rather weak potential, and hence they act as almost free, nonrelativistic particles. Another aspect of the parton model is now also explained: In Section 18-2 it was stated that the lepton-nucleon scattering experiments gave evidence for the existence of partons without weak or electromagnetic interactions.
The gluons are those inert partons, since they possess color charge, but not weak or electric charge. Electrons and neutrinos cannot scatter from gluons.

Returning to the term in (18-7) proportional to distance and going to large $r$, we find that the potential gets very strong. This is the feature that confines quarks and gluons to the hadrons. The quarks and gluons cannot escape to be detected in the free state, and hence color is never observed directly. Implicit in this statement is the information, which will be discussed in Section 18-7, that gluons possess color. This is an important distinction between photons and gluons, since photons do not carry electric charge, while gluons do carry color charge.

A qualitative picture can be given of the process by which quarks and gluons are confined and only colorless particles are detected. Consider trying to separate a quark from a proton. The gluon field binding that quark increases in energy as the quark moves away from the other two quarks. As that energy increases it becomes more likely that the gluon (which carries anticolor as well as color) will break up into a quark-antiquark pair. The new quark would reconstitute the proton, and the new antiquark would combine with the separating quark to

Because of its importance as one of the four fundamental interactions of nature, it is obviously necessary to discuss the color interaction further. Important features of the color interaction will be described in this section, but the theory of that interaction will be taken up in Section 18-7 after necessary background information has been supplied in the next section. That theory is called quantum chromodynamics (QCD), combining the concept of color with guidance from the most successful theory in physics, quantum electrodynamics (QED). Since the theory will come later, let us seek the features of the interaction empirically, instead of deriving them from QCD.
In Figure 18-12 the similarity between the energy levels of positronium and charmonium was seen. For this to be true it is necessary not only that the $e^+e^-$ and $c\bar{c}$ both be pointlike fermion-antifermion pairs, but also that the potential which describes their interaction be of similar form. For positronium that Coulomb potential is proportional to the square of the electric charges and inversely proportional to the distance between them. Since there is a factor of $10^8$ difference in energy scale between charmonium and positronium, the strength factor (square of the charges) is obviously irrelevant to the similarity of the spectrum. However, the $1/r$ distance dependence is crucial. A potential with a $1/r$ dependence is obtained only if the exchanged particle is massless, which means that the gluon must be massless like the photon. If instead of merely exploiting the similarity between positronium and charmonium energy levels, a detailed fitting of the charmonium levels is performed, the form of potential needed turns out to be that of (18-7).

Figure 18-15 (a) Electric lines of force between a positive and negative charge. (b) Color lines of force between a quark and an antiquark. The color lines are pulled together because of the interaction among the gluons carrying the color force. (c) Crude model of a meson in which the color force lines are drawn together into a rotating tube of force.

form a meson. In this way, colorless particles are produced until all the available energy is dissipated, and the quarks and gluons remain confined and unobservable. The color potential providing confinement can become very strong indeed, as we shall see from a simple calculation in the next example.

Because the gluon possesses color, there is a very strong interaction between gluons, giving a characteristic form to the color force field.
This is best illustrated by contrasting it with the electric force field, such as that between two charges, which is shown in Figure 18-15a. Since the photon carries no charge, there is no interaction between electric lines of force. However, the lines of force between a quark and an antiquark, shown in Figure 18-15b, look quite different. The gluon-gluon interaction pulls these together. As the separation between the quark pair increases, the interaction energy increases, and the color lines get closer together. This is analogous to the quarks being tied together by rubber bands which stretch as the distance increases.

Example 18-3. Determine $k_2$ in (18-7) from the energy in the color lines of force between a quark and an antiquark by determining the angular momentum of this meson.

■ Suppose the color lines of force have been pulled together until they form a tube, and the interaction energy is then so high that the masses of the quarks can be neglected in comparison to it. If this system is now considered to be rotating, we have a crude model for a meson with angular momentum. We can use this to deduce $k_2$, which will be the energy per unit length of the force tube, and also the second constant in (18-7). For definiteness, assume the ends of the force tube rotate at velocity $c$ and that the tube has a half length of $\rho$, as shown in Figure 18-15c. The total mass $M$ of the system is given by

$$Mc^2 = 2\int_0^{\rho} \frac{k_2\,dr}{\sqrt{1 - v^2/c^2}} \tag{18-8}$$

This is true since $k_2\,dr$ is the rest mass energy of an infinitesimal length $dr$, so that its total relativistic energy is $k_2\,dr/\sqrt{1 - v^2/c^2}$ (see Appendix A). At a distance $r$ from the center of the tube the velocity will be $v = cr/\rho$. Making this substitution in (18-8) gives

$$Mc^2 = 2k_2\int_0^{\rho} \frac{dr}{\sqrt{1 - r^2/\rho^2}} = \pi k_2 \rho \tag{18-9}$$

Now the angular momentum of the infinitesimal mass at the distance $r$ from the center, where the velocity is $v$, is $v r k_2\,dr / \left(c^2\sqrt{1 - v^2/c^2}\right)$.
Thus the total angular momentum of the tube in units of $\hbar$ is

$$J = \frac{2}{\hbar}\int_0^{\rho} \frac{v r k_2\,dr}{c^2\sqrt{1 - v^2/c^2}} = \frac{2k_2}{\hbar c \rho}\int_0^{\rho} \frac{r^2\,dr}{\sqrt{1 - r^2/\rho^2}} = \frac{\pi k_2 \rho^2}{2\hbar c} = \frac{(Mc^2)^2}{2\pi k_2 \hbar c} \tag{18-10}$$

Although this is a crude model, the result that $J \propto M^2$ is in agreement with experiment. If the mass squared of mesons of the same structure but differing in angular momentum is plotted against that quantity, a straight line is obtained with the slope $dJ/d(Mc^2)^2 = 0.9\ \text{GeV}^{-2}$. A similar plot, Figure 18-16, for baryons is more spectacular because there are more known examples. Again, straight lines and the same slope are obtained. According to the model this slope has the value

$$\frac{dJ}{d(Mc^2)^2} = 0.9\ \text{GeV}^{-2} = (2\pi k_2 \hbar c)^{-1} \tag{18-11}$$

Solving (18-11) for $k_2$ gives

$$k_2 = \left[2\pi\,(0.9\ \text{GeV}^{-2})(0.2\ \text{GeV-F})\right]^{-1} \simeq 1\ \text{GeV-F}^{-1} \tag{18-12}$$

where we have used the convenient value $\hbar c = 197$ MeV-F. Is this result reasonable? Since the proton has a rest mass energy of about 1 GeV and a radius of about 1 F, this is indeed a correct order of magnitude energy density for a hadron. Accepting this value, we then find that at a distance of a typical hadron radius of 1 F the confinement energy of the quark is about 1 GeV, which is a hundred times nuclear binding energies.

Figure 18-16 Baryon spins $J$ versus the square of their masses, $M^2$ in (GeV/$c^2$)$^2$, for three sequences: $\Delta$ has $T = 3/2$, $S = 0$, and spin $J$ and sign of parity, $P$, expressed as $J^P = 3/2^+, 7/2^+, 11/2^+, \ldots$; $\Lambda$ has $T = 0$, $S = -1$, $J^P = 1/2^+, 3/2^-, 5/2^+, \ldots$; and $\Sigma$ has $T = 1$, $S = -1$, $J^P = 3/2^+, 5/2^-, 7/2^+, \ldots$. Particles for which the spin-parity is not well established at the time of writing have a question mark with their mass value in MeV/$c^2$.
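The arithmetic behind (18-12) can be reproduced in a few lines. This is a minimal sketch, assuming the slope 0.9 GeV$^{-2}$ and $\hbar c = 197$ MeV-F quoted in the example, together with the standard conversions 1 GeV = 1.602 x 10$^{-10}$ joule and 1 F = 10$^{-15}$ m. The function name `string_tension` is hypothetical.

```python
import math

HBAR_C_GEV_F = 0.197      # hbar*c in GeV-F (197 MeV-F, as quoted in the example)
SLOPE_PER_GEV2 = 0.9      # dJ/d(Mc^2)^2 in GeV^-2, the measured slope quoted above
GEV_IN_JOULE = 1.602e-10  # 1 GeV expressed in joules
FERMI_IN_M = 1.0e-15      # 1 F expressed in meters

def string_tension(slope, hbar_c=HBAR_C_GEV_F):
    """Solve dJ/d(Mc^2)^2 = 1/(2*pi*k2*hbar*c) for k2, in GeV per fermi."""
    return 1.0 / (2.0 * math.pi * slope * hbar_c)

k2 = string_tension(SLOPE_PER_GEV2)
# k2 is an energy per unit length, so it is also the (constant) confining force.
force_newton = k2 * GEV_IN_JOULE / FERMI_IN_M

print(f"k2    = {k2:.2f} GeV/F")
print(f"force = {force_newton:.2e} N")
```

With the unrounded $\hbar c$ the result is slightly below 1 GeV/F, consistent with the order-of-magnitude value used in the text.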
Put another way, the force, which is constant with distance, is $10^{15}$ GeV/m (~$10^5$ newtons), or about 10 tons on each pointlike quark!

18-6 INTRODUCTION TO GAUGE THEORIES

In the previous section some of the features of quantum chromodynamics were discussed. This theory has provided a remarkably successful explanation of hadronic interactions. It is an example of a gauge theory. Another gauge theory is quantum electrodynamics, which has given more precise predictions than any other theory. Yet another gauge theory is general relativity. We shall be discussing an additional gauge theory which combines the weak and electromagnetic interactions and also has been extremely successful. In short, all the fundamental interactions in nature are described by gauge theories. Hence it is important to have at least a qualitative understanding of the content and approach of such theories.

Since gauge theories stem from the concept of gauge invariance in classical electromagnetism, this subject will be explained qualitatively. Then a description will be given of how the ideas are extended to the quantum domain. (A simplified quantitative treatment of classical and quantum mechanical gauge invariance is given in Appendix R.) The final subject of this section will be a short description of a pioneering attempt to construct a gauge theory of the strong interactions. This was unsuccessful but was important to the later successful work, and it illustrates some of the needed procedures. The following section will provide some more information on QCD, followed by a section on the electroweak gauge theory, and then finally a brief discussion of grand unified theories.

To start on familiar ground, we begin with classical electromagnetism. The fact that charge conservation is assured by gauge invariance has already been discussed in Section 17-8. In that demonstration only electric fields were dealt with. The indefiniteness of the scalar potential $V$ is what is known as a global gauge symmetry.
Changing the value of $V$ everywhere has no physical effect. A squirrel can walk as safely on a high voltage transmission line as on a grounded one; he must simply avoid a large difference of potential. This global symmetry assures global charge conservation: the total charge in the universe is a constant.

Can this global symmetry be converted into a local gauge symmetry, assuring local charge conservation? That is exactly what Maxwell did in 1868. While the details are spelled out in Appendix R, a summary of this point and other aspects of gauge invariance in classical and quantum electromagnetism covered in that appendix will be presented here. Maxwell noticed that Ampere's Law in differential form was not consistent with the continuity equation connecting current flow and the rate of change of electric charge. To restore charge conservation in an arbitrarily small volume, he had to add a term involving the electric field to Ampere's Law, which otherwise deals with just the magnetic field. In other words, to convert global charge conservation to local charge conservation it was necessary to couple together the electric and magnetic fields.

Although we shall not go into it, relativity also follows this pattern. In brief, the global space-time coordinate transformations of special relativity are turned into local ones by the addition of a field, gravity. The result is the gauge theory of general relativity.

We turn now to electromagnetic gauge invariance in quantum mechanics. Akin to the indeterminacy of the absolute value of the potential $V$ is the fact that the absolute phase of a wave function cannot be measured. As discussed in Section 5-4, a physical observable is the expectation value $\bar{O}$ of an operator $O_{op}$, given by

$$\bar{O} = \int \Psi^*(x,t)\, O_{op}\, \Psi(x,t)\, dx$$

where $x$ stands for $x$, $y$, and $z$. It is invariant under a global phase transformation

$$\Psi(x,t) \to \Psi'(x,t) = e^{i\theta}\,\Psi(x,t) \tag{18-13}$$

This is a global phase transformation because $\theta$ is any scalar, not dependent on $x$ or $t$.
To demand local phase invariance would require the transformation

$$\Psi(x,t) \to \Psi'(x,t) = e^{i\theta(x,t)}\,\Psi(x,t)$$

It is left to the student to put $\Psi'(x,t)$ into a free particle Schroedinger equation, and show that $\Psi'(x,t)$ will not satisfy that equation because of the space and time derivatives.

How can local phase invariance be obtained? If the classical procedure is followed, this would be done by introducing a new field to provide compensating local changes. If that is done the appropriate Schroedinger equation will no longer be force free, and so will no longer describe a free particle. The invariance will be manifested in the inability to distinguish whether particle motion is due to the local phase change or the new field of force. The compensating field needed is just the electromagnetic field. In the phase transformation, if $\theta = Q\chi(x,t)$, where $Q$ is the charge of the particle involved and $\chi(x,t)$ is an arbitrary function, then

$$\Psi(x,t) \to \Psi'(x,t) = e^{iQ\chi(x,t)}\,\Psi(x,t) \tag{18-14}$$

Since the electromagnetic field is now included, it is necessary when (18-14) occurs to make the same correlated gauge transformation on the potentials $\mathbf{A}$ and $V$ as in the classical case. If the gauge and phase transformations are made simultaneously, then the Schroedinger equation will be satisfied. That is, the Schroedinger equation will be invariant to these changes, and it is then said to be gauge invariant. However,

This result can be put in a different way. Recall that the indefiniteness of the scalar potential $V$ is a global gauge symmetry and leads to global charge conservation (see Section 17-8). Since it was necessary to introduce another field to get local charge conservation, it is equivalently necessary to introduce another potential, the vector potential $\mathbf{A}$, to produce the same result. Just as the electric field can be obtained from $V$, so the magnetic field can be obtained from $\mathbf{A}$.
Indeed, Maxwell's addition to Ampere's Law has its counterpart in changing the way the electric field is obtained from the potential, since now $\mathbf{A}$ is involved as well as $V$. The result is a local gauge symmetry: $\mathbf{A}$ and $V$ are not unique for the given physical electric and magnetic fields. The corresponding local gauge invariance is that the equations determining the electric and magnetic fields, which are the only physical observables, are unchanged despite quite arbitrary, but correlated, changes in $\mathbf{A}$ and $V$. The correlation between $\mathbf{A}$ and $V$ is important. Now $V$ can be made different at any point (local symmetry), not just changed everywhere at once (global symmetry), because a compensating change can be made in $\mathbf{A}$. To change a global symmetry into a local symmetry a new field had to be introduced, either $\mathbf{A}$ with $V$, or equivalently the magnetic field with the electric field.

as promised, this is not the free-particle Schroedinger equation, but rather one which includes the electromagnetic field. This equation is obtained in Appendix R, but suffice it to say here that turning the free-particle Schroedinger equation into one containing the electromagnetic field involves inserting $Q\mathbf{A}$ in the spatial derivatives and $QV$ in the time derivative. This is important to note because a very similar substitution of derivatives works to insert the compensating fields in the other gauge theories we shall discuss. In fact, exactly the same substitution is needed in the relativistic wave equations, the Klein-Gordon equation (Section 17-4) and the Dirac equation (Section 5-2).
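The substitution of derivatives just described can be written out explicitly. The block below is a sketch in one common convention (SI units, with $\hbar$ written explicitly in the phase). The text's (18-14) writes the phase as $Q\chi$, which corresponds to absorbing $\hbar$ into $\chi$, and the conventions of Appendix R may differ in signs or factors.

```latex
% Substitution of derivatives (assumed convention: SI units, explicit \hbar)
-i\hbar\nabla \;\to\; -i\hbar\nabla - Q\mathbf{A},
\qquad
i\hbar\frac{\partial}{\partial t} \;\to\; i\hbar\frac{\partial}{\partial t} - QV

% Free-particle Schroedinger equation after the substitution
\frac{1}{2m}\left(-i\hbar\nabla - Q\mathbf{A}\right)^{2}\Psi
  = \left(i\hbar\frac{\partial}{\partial t} - QV\right)\Psi

% Simultaneous phase and gauge transformations that leave it unchanged
\Psi \to e^{iQ\chi/\hbar}\,\Psi,
\qquad
\mathbf{A} \to \mathbf{A} + \nabla\chi,
\qquad
V \to V - \frac{\partial\chi}{\partial t}

% Why it works: each substituted derivative passes through the phase factor,
% because the extra term from differentiating the phase cancels the change in A or V
\left(-i\hbar\nabla - Q\mathbf{A} - Q\nabla\chi\right) e^{iQ\chi/\hbar}\Psi
  = e^{iQ\chi/\hbar}\left(-i\hbar\nabla - Q\mathbf{A}\right)\Psi
```

The time derivative behaves in the same way, so both sides of the equation pick up the same overall phase factor and the transformed wave function satisfies the transformed equation.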
To summarize in simplified form the procedure for setting up a gauge theory: (1) a global gauge symmetry (invariance) must be found which can be expressed by a transformation; (2) this global symmetry is converted to a local symmetry by changing the transformation so that it depends on space and time coordinates and contains something equivalent to a charge; and (3) the local transformation is compensated by adding new fields which can be put into the field-free wave equation by a suitable substitution of derivatives.

Since even the same substitution of derivatives works in relativistic wave equations, the relativistic quantum theory of electromagnetism follows along the same lines as the nonrelativistic case discussed above. This theory, quantum electrodynamics (QED), is interesting to understand qualitatively. The vector potential $\mathbf{A}$ becomes the wave function of the photon. The general idea is that a particle, say an electron, emits a photon and by that emission process the phase of its wave function changes. However, when that photon is reabsorbed by the same or a different electron, there is a compensating phase change. The photon emission and absorption correlates the phase changes, maintaining the overall symmetry because the electrons are indistinguishable. This process is directly equivalent in the nonrelativistic case to the simultaneous phase and gauge transformations.

Since QED works so well, it was natural that it should be used as a guide in trying to develop a theory of the strong interaction. The pioneering work of Yang and Mills in 1954 is instructive to review in a brief, qualitative way. They sought to make a local symmetry out of the global symmetry of isospin invariance as a means of arriving at a theory of the strong interaction. The global symmetry is that, in the absence of the electromagnetic interaction, changing all protons to neutrons and vice versa would leave the world unaltered.
The global symmetry can be expressed as a phase transformation similar to (18-13). However, in this case the wave function must have two components, one for the protons and one for the neutrons. This is most conveniently expressed by putting each wave function in a column matrix

$$\begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix}$$

The transformation then acts on both wave function components and so correlates the change in the number of protons and the number of neutrons. To make this transformation on a two-component wave function requires a 2 x 2 matrix instead of the simple phase angle of (18-13). This difference is important, making the electromagnetism case an Abelian gauge theory and the Yang-Mills theory a non-Abelian one. All subsequent gauge theories we discuss will be non-Abelian.

An Abelian transformation is commutative: If two transformations are made in succession, the result is the same regardless of the order in which they are made. An example is a rotation in two dimensions; the angles add regardless of which comes first. Thus in the electromagnetic case successive phase shifts can be made without regard to order. Non-Abelian transformations are not commutative. An example is a sequence of three-dimensional rotations. An airplane flying horizontally which makes first a left turn and then dives downward will be

18-7 QUANTUM CHROMODYNAMICS

Recall that the Fermi-Yang composite model of hadrons (Section 18-3) based on SU(2) of isospin had to be replaced by the unitary symmetry (and later quark) model based on SU(3) of flavor. Similarly the Yang-Mills theory of the strong interaction based again on SU(2) of isospin had to be replaced by QCD based on SU(3) of color. Now SU(3) of flavor, underlying which are the u, d, and s quarks, is an inexact or broken symmetry because the s quark is more massive than the u or d quarks. However, SU(3) of color is an exact symmetry, because all three colors are equivalent.
The global symmetry of color is that if every red quark became a yellow quark, every yellow quark became a blue quark, and every blue quark became a red quark, all hadrons would still be colorless. The symmetry is such that a total change in color can occur without its being observable. Once again this symmetry can be expressed as a transformation, but now three-component wave functions are needed, corresponding to the three colors. Therefore, 3 x 3 matrices are involved in the transformation itself.

To convert the global symmetry to a local one the same prescription is followed as for electromagnetism or Yang-Mills. The transformation is altered to include a coupling constant and to make it a function of space and time. This transformation by itself would change the color of one quark without simultaneously altering others and hence give a hadron color. Thus, as before, compensating fields, called gauge fields, must be added. Once more the fields are included in the wave equation by a

traveling in quite a different final direction than if it made first the dive downward and then the left turn. The Yang-Mills theory is non-Abelian because two isospin rotations will usually lead to different final numbers of protons and neutrons, depending upon the order in which they were done. We shall see, especially in the case of QCD, that the non-Abelian nature of the theory has important physical consequences.

Returning to Yang-Mills, the next step after setting up the transformation which expresses the global symmetry is to turn it into one expressing a local symmetry. As before in going from (18-13) to (18-14), the global transformation is altered by (1) inserting a "charge" and (2) making the transformation depend on space and time. The "charge" in this case is a coupling constant, but that is the role charge plays in electromagnetism (i.e., $\alpha = e^2/4\pi\epsilon_0\hbar c$).
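The contrast between commuting two-dimensional rotations and non-commuting three-dimensional ones can be checked numerically. The sketch below is illustrative only. It models the airplane's left turn and dive as quarter-turn rotations about fixed z and y axes, which is a simplifying assumption, and all function names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot2(theta):
    """Rotation by theta in a plane (an Abelian example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rot_z(theta):
    """Rotation about the vertical z axis: a left turn for theta > 0."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(theta):
    """Rotation about the y axis: tips the +x heading downward for theta > 0."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def same(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Two-dimensional rotations: the order does not matter.
commute_2d = same(matmul(rot2(0.3), rot2(1.1)), matmul(rot2(1.1), rot2(0.3)))

# Three-dimensional rotations: turn-then-dive differs from dive-then-turn.
quarter = math.pi / 2
commute_3d = same(matmul(rot_z(quarter), rot_y(quarter)),
                  matmul(rot_y(quarter), rot_z(quarter)))

print(commute_2d, commute_3d)  # True False
```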
Also as before, fields have to be introduced to compensate for the equivalent of a local phase change. Introducing the fields into the wave equation is done in a manner quite similar in form to the substitution of derivatives previously discussed, except that 2 x 2 matrices are involved. Just as 2 x 2 matrices are required for transforming the two-component wave functions, so also is it necessary in this case to introduce more than one compensating field. Recall from Section 18-3 that the symmetry group of isospin is SU(2) and that the simplest representations are $2$ and $\bar{2}$. To compensate the phase changes in these simplest representations, $2 \otimes \bar{2} = 1 \oplus 3$ fields are needed. The singlet field is, as in QED, just $\mathbf{A}$, which is the wave function of the photon. The triplet of fields are also massless like the photon. However, unlike the photon, these fields carry isospin, which means that they must have charges +1, 0, and -1.

This is the important distinction between an Abelian transformation and a non-Abelian transformation. In the Abelian case, as in QED, the result is a carrier of the field (photon) which does not possess the source of the field (charge). In the non-Abelian case, as in Yang-Mills, the carrier of the field also has the source of the field (isospin). The non-Abelian nature of the Yang-Mills theory destroys it, because charged massless fields or particles would have been detected, so they do not exist. However, it is just this feature which makes the theory valuable, since QCD and the electroweak theory, which build on this base, are non-Abelian theories.

substitution of derivatives in the manner described above, but now 3 x 3 matrices are involved. Since, as was discussed in Section 18-3, the simplest representations of SU(3) are the $3$ (corresponding to the three colors) and the $\bar{3}$ (corresponding to the three anticolors), we expect $3 \otimes \bar{3} = 1 \oplus 8$ gauge fields.
The octet of gauge fields are the gluons, which have already been discussed. Each gluon possesses a color (red = $r$, yellow = $y$, blue = $b$) and an anticolor ($\bar{r}$, $\bar{y}$, $\bar{b}$). There are nine combinations of color and anticolor, of which six are obvious: $r\bar{y}$, $r\bar{b}$, $y\bar{r}$, $y\bar{b}$, $b\bar{r}$, $b\bar{y}$. The remaining three are not the obvious $r\bar{r}$, $y\bar{y}$, and $b\bar{b}$, but rather the mixtures which form orthogonal eigenfunctions (see Appendix J), one of which has no net color and is the singlet. The other two combinations still have color and are

$$(r\bar{r} - y\bar{y})/\sqrt{2} \quad\text{and}\quad (r\bar{r} + y\bar{y} - 2b\bar{b})/\sqrt{6} \tag{18-16}$$

This is like combining three spins of 1/2, and so is reminiscent of the familiar combining of two spins of 1/2 to form spin 0 and 1. Recall that in the latter case the symmetric combination of spin up and spin down has spin 1 but zero projection on the z axis, while the antisymmetric combination has both projection and total spin of zero. For three combinations (of color and anticolor), the symmetry is opposite to that for adding two spins of 1/2. In the color case the singlet is the symmetric combination, $(r\bar{r} + y\bar{y} + b\bar{b})/\sqrt{3}$, which would then violate the exclusion principle for the quarks in the $\Delta^-$, $\Delta^{++}$, and $\Omega^-$. Recall that color was introduced to prevent such a violation by making the total eigenfunction of these fermions antisymmetric, since the space, spin, and isospin parts are symmetric.

How does the octet of gluons provide local color symmetry? This is illustrated in Figure 18-17 for a baryon. The red quark becomes a blue quark by emitting a red-antiblue gluon. When a blue quark absorbs that gluon its blue color is canceled, and it becomes red. Since the quarks are indistinguishable, the baryon remains colorless, and there is no way to observe the transformation. Color can then be changed differently at any point of space-time, and the gluon field restores the symmetry. The three colors of quark necessitate having eight gluons to bring this about.
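The counting of nine color-anticolor combinations into one singlet and eight gluons can be made concrete by writing each combination as a vector of amplitudes. This is a minimal sketch, assuming the usual normalizations $\sqrt{2}$, $\sqrt{6}$, and $\sqrt{3}$ for the three diagonal mixtures. The helper names `state` and `dot` are hypothetical.

```python
import math
from itertools import product

COLORS = ("r", "y", "b")
# Basis of (color, anticolor) pairs: ('r','r') stands for r r-bar, and so on.
BASIS = list(product(COLORS, repeat=2))

def state(coeffs):
    """Build a 9-component vector from a {(color, anticolor): amplitude} map."""
    return [coeffs.get(pair, 0.0) for pair in BASIS]

def dot(u, v):
    """Inner product of two real amplitude vectors."""
    return sum(a * b for a, b in zip(u, v))

# The colorless singlet and the two colored diagonal mixtures.
singlet = state({(c, c): 1 / math.sqrt(3) for c in COLORS})
diag_a = state({("r", "r"): 1 / math.sqrt(2), ("y", "y"): -1 / math.sqrt(2)})
diag_b = state({("r", "r"): 1 / math.sqrt(6), ("y", "y"): 1 / math.sqrt(6),
                ("b", "b"): -2 / math.sqrt(6)})

# The six "obvious" gluons: color and anticolor differ.
off_diagonal = [state({pair: 1.0}) for pair in BASIS if pair[0] != pair[1]]
octet = off_diagonal + [diag_a, diag_b]

print(len(BASIS), "=", 1, "+", len(octet))                          # 9 = 1 + 8
print(all(abs(dot(singlet, g)) < 1e-12 for g in octet))             # True
print(all(abs(dot(g, h) - (g is h)) < 1e-12
          for g in octet for h in octet))                           # True
```

The last two checks confirm that the singlet is orthogonal to every octet member and that the eight octet states are orthonormal among themselves.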
The gluons perform the necessary function of converting a global symmetry into a local one because they have color. That the carrier of the field possesses the source of the field (color charge) is an attribute of a non-Abelian gauge theory, as was discussed in the Yang-Mills case. In Section 18-5 one of the physical consequences of gluons having color charge was stated. It was seen that the strong gluon-gluon interaction pulls the field lines together, unlike the electromagnetic case. This strong gluon-gluon force should produce binding, and meson-like glueballs probably exist. At the time of writing there are some candidates for glueballs, but it

Figure 18-17 Local color symmetry permits individual quarks to change color but leave the hadron colorless. In the illustration, the baryon is colorless because in (a) it has $r$, $b$, and $y$ quarks. If the $r$ quark changes to $b$ by emitting an $r\bar{b}$ gluon, as in (b), the $b$ quark will absorb that gluon, turning into an $r$ quark and leaving the baryon colorless as in (c). Gluons are usually represented by a coil-like line, as shown here and in subsequent figures.

There is direct evidence for the existence of gluons. Mentioned in Sections 18-2 and 18-5 was the indirect evidence for inert partons from lepton-nucleon scattering which could be interpreted as due to gluons. The PETRA (Hamburg) $e^+e^-$ colliding beam accelerator has yielded much more direct evidence for gluons. Recall Figure 18-13a, in which the $e^+$ and $e^-$ collide to produce a virtual photon, which then makes a quark-antiquark pair. The quark and antiquark start off back-to-back to conserve energy and momentum, since the $e^+$ and $e^-$ have equal energies in their head-on collision. The quark and antiquark each soon form other particles. At high incident energies the number of particles formed can be quite large and, because they are produced with relatively small momentum transverse to the beam direction, these particles can be close together.
Thus the quark forms one jet of particles, and the antiquark forms another jet. This two jet structure is shown in Figure 18-18. It is interesting to note that the angular distribution of the axis of the two narrow jets with respect to the colliding beam direction is the same as for the axis of the ,u + e + + e - —> µ+ + µ- (see Figure 18-13b). Since the ,u has spin 1/2, thispairfom is direct evidence that the quark also has spin 1/2. Figure 18-18 Example of a two-jet event in e + -e collisions in the TASSO detector at PETRA (Hamburg). This is a computer reproduction of the measured particle tracks projected onto a plane. The particle tracks are curved because they are in a magnetic field. A small three-dimensional representation of the event is also shown. SJIWbNAd0IN0a HJW f11Nd f10 is experimentally difficult to distinguish these from quark-antiquark mesons, or worse, from possible mixtures of the two kinds of structure. MORE ELEMENTARY PARTIC LES Figure 18-19 Gluon emission in e t -e - production of a quark- antiquark pair. At large center of mass energies this process gives three jets of hadrons. Returning to the jet structure, as the energy of the beams is increased, one of the jets is increasingly often observed to be broad. This occurs because either the quark or antiquark radiates a gluon, from which another group of particles is formed. See Figure 18-19. As the beam energy is raised even more, this gluon-induced group of particles forms its own jet, and distinct three jet events are seen, as in Figure 18-20. 36152 Figure 18-20 Example of a three-jet event in e t -e - collisions at PETRA (Hamburg), as found in the TASSO detector. Show that a baryon made of a colorless combination of three quarks does bind. • Since a baryon will have to have a totally antisymmetric color eigenfunction for its three quarks, it will be of the form Example 18-4. 
[(rb — br)y + (by — yb)r + (yr — ry)b]/J (18 17) - Its antisymmetry can be seen by interchanging any two color labels. This eigenfunction is to be used to determine the interaction between quarks, which occurs by gluon exchange. Any one interaction must be between the two quarks exchanging the gluon, with the third quark not participating, but all possible two-quark interactions must be considered. The mathematical form expressing such an interaction involves the product of the initial state eigenfunction, the final state eigenfunction, and the interaction potential (it is a matrix element; see Appendix K). The part of the interaction potential relevant here is the gluon exchange color charge product, given in Figure 18-21. Equation (18-17) is the form of both the initial and final state eigenfunctions. SO IW `dNAd OIN OaH OWf11Mdf1 O At even higher energies two gluons often are radiated, causing four jet events. The energy and angle distributions of the jets correspond closely to QCD calculations, quantitatively confirming the existence of gluons. The gluons provide a simple quantitative explanation for the formation of quarkantiquark and three-quark hadrons but no other combinations. The qualitative explanation given in Section 18-5 is that only these combinations are colorless, but it is possible to go a step further and show why it is that the colorless combinations bind and other combinations do not. To do this it is first necessary to figure out the probabilities for various couplings between quarks due to gluons. In the electromagnetic cases associated with (18-6) we have seen that these probabilities depend on the charge involved. In the gluon case they will similarly depend on the color charge, which will be designated as x. The possible couplings are shown in Figure 18-21, where it will be noted that for an antiquark the color charge is denoted as — x, just as the sign of the electric charge reverses for an antiparticle. 
Starting with Figure 18-21a, a red quark couples to a blue quark by emitting a red-antiblue gluon (reversing the colors of the two quarks), and the resulting coupling probability is given by just the product of the color charges x 2 . For a red and blue quark interacting without changing their color, as in Figure 18-21b, the coupling is provided by that color nonchanging gluon having both red and blue, which is the second combination in (18-16). At the upper vertex r —+ r, so the part of the gluon eigenfunction which contributes involves rr, which is 1/N/6 of the whole eigenfunction. This coefficient multiplies the color charge x at the upper vertex, giving x// as the contribution to the coupling. At the lower vertex b —+ b and the bb part of the gluon eigenfunction has a coefficient of — 2/J. The lower vertex then contributes — 2x/J, giving a total color charge product of (x/N/6( — 2 x/J) = —x 2/3. For a red quark coupling to a red quark, as in Figure 18-21c, both color nonchanging gluons can contribute. At the upper vertex the ri part of one contributes x/ Nii, and the rr part of the other contributes x/\. Since the lower vertex is just the same, there will again be x// from one and x/J from the other. Thus the color charge product is x 212 from the exchange of one gluon and x 2/6 from the exchange of the other, for a total of x 2/2 + x2 /6 = 2x2 /3. Now the last three diagrams in Figure 18-21 involve the exchange of the same gluons as do the first three. So the color charge products are the same, but with opposite signs, since one vertex always involves antiquarks and hence has — x instead of x. We shall now use these results to calculate three examples. The first two will show that colorless combinations of three quarks bind and that a quark-antiquark pair bind. The last example will be of one simple case, a quark-antiquark combination with color, which does not bind. 
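The color charge products of Figure 18-21, and the totals that the three examples arrive at, can be reproduced with exact fractions. The following is only an illustrative sketch: the names `DIAGONAL_GLUONS`, `SWAP`, and `same_color_product` are introduced here and do not come from the text, and every result is expressed in units of $x^2$.

```python
from fractions import Fraction

# The two color-nonchanging gluons of (18-16), stored as integer coefficients
# of (r rbar, y ybar, b bbar) plus the square of the normalization factor.
DIAGONAL_GLUONS = [
    ({"r": 1, "y": -1, "b": 0}, 2),   # (r rbar - y ybar)/sqrt(2)
    ({"r": 1, "y": 1, "b": -2}, 6),   # (r rbar + y ybar - 2 b bbar)/sqrt(6)
]

# Exchange of a color-changing gluon such as r bbar (Figure 18-21a).
SWAP = Fraction(1)


def same_color_product(c1, c2):
    """Color charge product (units of x^2) for two quarks of colors c1 and c2
    that keep their colors, summed over both color-nonchanging gluons."""
    return sum(Fraction(n[c1] * n[c2], norm_sq) for n, norm_sq in DIAGONAL_GLUONS)


# Quark-quark diagrams (a), (b), (c) of Figure 18-21.
qq = {"a": SWAP, "b": same_color_product("r", "b"), "c": same_color_product("r", "r")}
# Quark-antiquark diagrams (d), (e), (f): one vertex carries -x, so signs flip.
qqbar = {"d": -qq["a"], "e": -qq["b"], "f": -qq["c"]}

# Colorless baryon (18-17): for each quark pair, two squared terms use (b) and
# the cross term (factor -2) uses (a); each is weighted by (1/sqrt 6)^2.
baryon = 3 * (2 * Fraction(1, 6) * qq["b"] - 2 * Fraction(1, 6) * qq["a"])
# Colorless meson (18-18): three color-preserving terms use (f), six
# color-changing terms use (d); each is weighted by (1/sqrt 3)^2.
meson = 3 * Fraction(1, 3) * qqbar["f"] + 6 * Fraction(1, 3) * qqbar["d"]
# Colored pair r bbar: only diagram (e) contributes.
colored_pair = qqbar["e"]

print(baryon, meson, colored_pair)  # -4/3 -8/3 1/3
```

Negative totals correspond to binding potentials and the positive one to a non-binding potential, which is the pattern the examples establish.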
Consider first the interaction of an r and a b quark, with y not participating. This interaction comes from the first parentheses in (18-17), i.e., $(rb - br)$. Since (18-17) appears in both the initial and final state eigenfunction, the interaction strength (or matrix element) involves the square of (18-17). Hence the interaction of the r and b quark is described by $(rb - br)^2$. We expand, and then investigate the two squared terms, each of which represents the process $rb \to rb$. This process involves the gluon exchange of Figure 18-21b, which has a color charge product of $-x^2/3$. This value is multiplied by $(1/\sqrt{6})^2$ from the square of the normalization factor in (18-17). Recalling that there are two squared terms, we find that the total contribution from $rb \to rb$ is $2(1/6)(-x^2/3) = -x^2/9$. The cross term in $(rb - br)^2$, which contains a factor of $-2$, describes $rb \to br$, for which Figure 18-21a gives a color charge product of $x^2$. When we include the square of the multiplicative normalization factor, $1/\sqrt{6}$, the total contribution from $rb \to br$ becomes $-2(1/6)x^2 = -x^2/3$. This gives a total for both possible rb interactions of $-x^2/9 - x^2/3 = -4x^2/9$. However, the other two color combinations, by and yr in the second and third parentheses, have exactly the same couplings as in the rb case, differing only in color labels. Thus the net contribution from all three sets of two-quark interactions is $3(-4x^2/9) = -4x^2/3$. Just as $-e^2$ gives the strength of the coupling in the Coulomb potential between a positron and an electron, $-e^2/4\pi\epsilon_0 r$, so this result gives the strength of the Coulomb-like potential for quarks to be $-4x^2/3r$. The minus sign in both the positronium and the three-quark case shows that there is binding.

Figure 18-21 Gluon coupling between quarks. All possible types of gluon exchange are represented by these six diagrams. That is, all other exchanges just involve a permutation of color labels. The color eigenfunction is given for each exchanged gluon. The relative probability for each type of exchange is given by the "color charge product," where $x$ is the color charge. [Color charge products shown with the diagrams: (a) $x^2$; (b) $-x^2/3$; (c) $2x^2/3$; (d) $-x^2$; (e) $x^2/3$; (f) $-2x^2/3$.]

Example 18-5. Show that the gluon couplings give binding also for a colorless quark and antiquark.

• Since the quark-antiquark pair, if bound, form a meson (which is a boson), it will have a totally symmetric color part to its eigenfunction

$$(r\bar r + y\bar y + b\bar b)/\sqrt{3} \qquad (18\text{-}18)$$

The first term, $r\bar r \to r\bar r$, contributes $(1/\sqrt{3})^2(-2x^2/3) = -2x^2/9$ from Figure 18-21f, but each of the other two terms in (18-18) is identical in form with different color labels. All three then give a total of $3(-2x^2/9) = -2x^2/3$. Also $r\bar r \to b\bar b$ or $y\bar y$, each giving $(1/\sqrt{3})^2(-x^2)$ from Figure 18-21d, for a total of $-2x^2/3$. However, $y\bar y \to r\bar r$ or $b\bar b$ and $b\bar b \to r\bar r$ or $y\bar y$, giving the same contributions as the $r\bar r$. So the total is $-2x^2$. The net coupling strength is $-2x^2/3 - 2x^2 = -8x^2/3$, giving a potential of $-8x^2/3r$. Again, the minus sign indicates binding. But other quark combinations give positive signs and nonbinding potentials.

Example 18-6. Suppose a quark-antiquark pair possess color. Then it would have color-anticolor like a gluon. For definiteness, say it is $r\bar b$. Find the form of the potential.

• The gluon exchange between $r$ and $\bar b$ cannot involve swapping colors since $r \to \bar b$ is not possible because a quark cannot become an antiquark. Thus only a non-color-changing gluon can be involved. Of the two available, only one has both r and b color; it is $(r\bar r + y\bar y - 2b\bar b)/\sqrt{6}$. Thus only Figure 18-21e is involved. For that diagram, the red part of the gluon couples at the upper vertex with color charge $x/\sqrt{6}$. The antiblue part of the gluon couples at the lower vertex with color charge $(-x)(-2/\sqrt{6}) = 2x/\sqrt{6}$. The color charge product is then $(x/\sqrt{6})(2x/\sqrt{6}) = x^2/3$. This gives the positive, non-binding potential $x^2/3r$.

In addition to the question of the sign of the potential, there is its $1/r$ dependence to explain. Recall from Section 18-5 that the $1/r$ nature of the potential which is required to give $c\bar c$ and $b\bar b$ energy levels means the gluon must be massless. That is indeed the result QCD gives, for the same reason the photon from QED and the gauge fields from Yang-Mills are massless. Gauge invariance requires them to be massless, and producing a mass would require adding something new to the theory. In the Yang-Mills case this masslessness was in fact the feature that made the theory surely incorrect. However, for gluons the situation is different in two respects. First, in QCD the only gauge fields which get added to the free-particle wave equation are the gluons. There are neither electromagnetic nor weak interactions. Since the gluons do not possess such interactions, this helps make them unobservable. Second, the gluons are confined inside hadrons because they carry color charge, just as the colored quarks are confined. Since gluons cannot be observed directly, their masslessness is no problem.

Confinement and its accompanying feature at the other end of the distance scale, asymptotic freedom, have been discussed in Section 18-5 on the basis of an empirical term proportional to distance in the quark binding potential. These two features are absolutely essential to the success of QCD, and hence the origin of the $k_2 r$ term requires explanation. Starting again with electrostatics, we consider a negative charge $Q$ in a dielectric such as water. The polar water molecules near the charge line up with their positive end toward the charge, as shown in Figure 18-22a. This presence of positive charge decreases the effectiveness of the negative charge $Q$, reducing the electric field it produces. This could be described as saying that the effective magnitude of $Q$ is reduced (say to $Q'$), provided the distance from $Q$ at which the electric field is measured is larger than the size of a water molecule. For smaller distances the magnitude of the effective charge quickly increases from $Q'$ to $Q$.

Going next to QED, we find that the same sort of effect will occur even with a charge in the vacuum by a process called vacuum polarization. This occurs because an electron is always emitting and absorbing virtual photons, and often these are energetic enough to create virtual positron-electron pairs. The $e^+e^-$ pairs align themselves with respect to the electron in the same manner as did the polar water molecules. Again the effective charge of the electron is reduced by this screening of the charge, as shown in Figure 18-22b. Because of the distribution of $e^+e^-$ pairs with distance from the electron, the effective charge increases as distance to the electron decreases.

The same vacuum polarization phenomenon occurs for the quarks, reducing the quark's effective color charge $x$, or strong coupling constant $\alpha_s = x^2/4\pi\hbar c$ (like $\alpha = e^2/4\pi\epsilon_0\hbar c$). This causes $\alpha_s$ to increase as distance decreases. (Because its value depends on distance, $\alpha_s$ is sometimes called a running coupling constant.) However, the non-Abelian color field behaves differently from the Abelian electromagnetic field. Because the gluon carries color charge, unlike the photon with no electric charge, the gluons the quark emits and absorbs produce a dominating opposite effect, shown in Figure 18-22c. The farther apart the quarks get, the more the gluons (which attract each other) crowd together, as was described in terms of lines of force in Section 18-5. This antiscreening effect increases as the distance between quarks increases. Thus the effective color charge, the coupling constant, and the potential become larger with distance, producing quark and gluon confinement.

The fact that $\alpha_s$ changes in this way, giving asymptotic freedom at small distances, was first worked out by Gross and Wilczek and independently by Politzer in 1973. The smallness of the potential at small distances enables the use of perturbation methods (see Appendices J, K, and L), and these QCD calculations agree very well with experiment. Calculations become difficult as the potential increases, and the details of confinement had not been worked out at the time this was written. However, every indication at that time was that there is at last a successful theory of the strong interactions.

Figure 18-22 (a) A polarizable dielectric screens a free charge. (b) Vacuum polarization resulting from virtual positron-electron pairs screens the charge around a real electron. (c) Because gluons carry color, they have an antiscreening effect, enhancing the color field between a quark and an antiquark. As shown in the figure, the antiblue quark "sees" more red due to the gluons. This effect increases with distance, since more and more gluons appear.

18-8 ELECTROWEAK THEORY

With successful gauge theories of the strong, electromagnetic, and gravitational interactions, it is natural to suppose that such a theory must exist for the weak interaction as well. While such is the case, it is surprising that this theory is not just of the weak interaction, but it includes the electromagnetic interaction as well, giving a common origin to both. It is also rather unexpected that this electroweak theory would stem so directly from the Yang-Mills theory, which was an attempt to explain the strong interaction.

Recall from Section 18-6 that the Yang-Mills theory produced four gauge fields. One of these could be identified with the massless photon. But the others had three values of isospin, +1, 0, and −1, and hence three values of charge, also +1, 0, and −1, like the pion. Such massless charged particles would have been detected, and hence the theory could not correspond to reality. The only way the charged particles could exist and not have been detected is if they were so massive that no accelerator yet had enough energy to produce them. The desired result of giving the gauge fields mass is doubly difficult. First, it cannot be done arbitrarily; a mechanism must exist to produce mass. Second, if a gauge boson did have mass, it would violate gauge invariance! (Recall that gauge invariant electromagnetism has a massless photon.) Now gauge invariance is needed not just to have a gauge theory, but more importantly this gauge symmetry makes it possible to have a finite or renormalizable theory.

A brief diversion is necessary to explain renormalizability. In the discussion of vacuum polarization in the previous section, the effect of virtual particles on the effective charge of the electron was described. The emission and reabsorption of virtual photons also affects the mass of the electron. Consider the diagram in Figure 18-23, in which a virtual photon is emitted and reabsorbed by an electron. The time, $\Delta t$, taken by this process limits the energy, $\Delta E$, associated with it by the uncertainty principle, $\Delta E\,\Delta t \sim \hbar$. As the photon loop gets smaller, $\Delta t$ gets smaller and $\Delta E$ gets larger. As the loop size approaches zero, $\Delta E \to \infty$ and the effective energy or mass of the electron can seemingly become infinite. This makes no sense physically, but such infinities appear in the calculation.

Figure 18-23 A virtual photon is emitted and reabsorbed by an electron in a time $\Delta t$. As the photon loop and hence $\Delta t$ is made smaller, the energy associated with this process, $\Delta E \sim \hbar/\Delta t$, becomes larger.
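To get a feel for how quickly the energy of the virtual-photon loop grows as the loop shrinks, the uncertainty estimate $\Delta E \sim \hbar/\Delta t$ can be evaluated for a few loop times. This is only an order-of-magnitude sketch: the function name `loop_energy_ev` and the sample times are chosen here for illustration, and $\hbar$ is taken as $0.6582 \times 10^{-15}$ eV-sec.

```python
HBAR_EV_S = 0.6582e-15  # hbar in eV*s


def loop_energy_ev(dt_seconds):
    """Order-of-magnitude energy (eV) available to a virtual process
    that lasts dt_seconds, using Delta E ~ hbar / Delta t."""
    if dt_seconds <= 0:
        raise ValueError("dt_seconds must be positive")
    return HBAR_EV_S / dt_seconds


# Illustrative loop times (seconds); each step is 1000 times shorter.
for dt in (1e-15, 1e-18, 1e-21, 1e-24):
    print(f"dt = {dt:.0e} s  ->  Delta E ~ {loop_energy_ev(dt):.3g} eV")
```

The energy grows without bound as the loop time goes to zero, which is the divergence that renormalization has to deal with.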
The problem was finally solved for mass and charge infinities in QED in 1948, especially through the efforts of Feynman, Schwinger, and Tomonaga, who shared the Nobel Prize in 1965. This process, called renormalization, is to find one negative infinity for each positive infinity so that these cancel, leaving a finite residue which is defined as the observed mass or charge. The bare mass or bare charge of the electron are never observed, since the electron is always surrounded by a cloud of virtual particles. A highly symmetric theory is needed to get the canceling infinities, which is the importance of gauge symmetry in this connection. The previously available Fermi theory of the weak interaction was not renormalizable, but we shall return later to the problem of infinities in the weak interaction. It appears as if a miracle is needed to get a weak interaction theory. Consider the conflicting requirements. First, gauge invariance is needed to get a renormalizable theory. Second, the gauge bosons have to be sufficiently massive so they would not have been detected long ago. Third, massive gauge bosons break gauge invariance. Indeed a rather miraculous solution did appear in the form of what is called spontaneous symmetry breaking. This provided a mechanism for giving the gauge bosons mass as well as preserving gauge invariance. In mentioning SU(3) and SU(4), we have stated that they are broken symmetries because all quarks do not have the same mass. Now we are discussing a process that causes a symmetry to be broken spontaneously. To understand spontaneous symmetry breaking it is necessary to know about systems with hidden symmetries. A simple example is a rod under axial pressure. Although the equations describing this situation are symmetric under rotations about the axis of the rod, as the pressure on the rod increases it will suddenly buckle in some definite but arbitrary direction. Another example is a perfect ferromagnet. 
The spins of the atoms have a rotational symmetry above the Curie temperature (see Section 14-4), but as the magnet is cooled below the Curie temperature the spins of the atoms in a domain suddenly line up in a definite but arbitrary direction. In both of these examples it cannot be predicted which of the infinite number of equivalent nonsymmetric final states will be chosen, but all of them have a lower energy than the symmetric ones. The original symmetry of the equations of motion is hidden in observations of the final states. In both cases of hidden symmetry there exists a critical value of some quantity (pressure or temperature in the cases just discussed) beyond which spontaneous symmetry breaking will occur. The spontaneous symmetry breaking holds out the hope that the gauge invariance can still exist in the theory, but that the solutions in breaking the gauge symmetry will allow massive gauge bosons.

Figure 18-24 The potential $V = \mu^2\Psi^*\Psi + \lambda(\Psi^*\Psi)^2$ for the cases (a) $\mu^2 > 0$ and (b) $\mu^2 < 0$. Re stands for "real part of" and Im stands for "imaginary part of."

As a step along the way to the desired solution, in 1961 Goldstone investigated spontaneously broken global symmetry. Consider a potential of the form $\mu^2\Psi^*\Psi + \lambda(\Psi^*\Psi)^2$, where $\mu$ and $\lambda$ are constants. This is plotted for the case $\mu^2 > 0$ in Figure 18-24a. It clearly is a symmetric potential, and the ground state at $\Psi = 0$ is symmetric under a global phase transformation $\Psi \to \Psi' = e^{i\theta}\Psi$. However, as the parameter $\mu^2$ is decreased, the critical value (like the pressure that breaks the rod, or the Curie temperature for the ferromagnet) is reached at $\mu^2 = 0$. For $\mu^2 < 0$ (i.e., for $\mu$ imaginary) the potential is still symmetric, as shown in Figure 18-24b. Now the phase transformation changes the relative amounts of the real and imaginary parts of $\Psi$, which have become independent. There is now a ring of ground states, all nonsymmetric.
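The location of the ring of ground states can be made explicit with a short calculation. The sketch below assumes $\lambda > 0$ (so that the potential is bounded below; the text does not state the sign) and writes $\rho = \Psi^*\Psi$, a symbol introduced here only for convenience.

```latex
\begin{align*}
V(\rho) &= \mu^2 \rho + \lambda \rho^2, \qquad \rho \equiv \Psi^*\Psi \ge 0, \\
\frac{dV}{d\rho} &= \mu^2 + 2\lambda\rho = 0
  \quad\Longrightarrow\quad \rho_0 = -\frac{\mu^2}{2\lambda}, \\
V(\rho_0) &= \mu^2\left(-\frac{\mu^2}{2\lambda}\right)
  + \lambda\left(\frac{\mu^4}{4\lambda^2}\right)
  = -\frac{\mu^4}{4\lambda} \;<\; 0 = V(0) \qquad (\mu^2 < 0).
\end{align*}
```

For $\mu^2 > 0$ the stationary point $\rho_0$ would be negative, which is not allowed, so the minimum stays at $\Psi = 0$. For $\mu^2 < 0$ every $\Psi = \sqrt{\rho_0}\,e^{i\theta}$ has the same energy because $V$ depends only on $\rho$, and this set of states is the ring in Figure 18-24b.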
Note that just like the ferromagnet or the broken rod, the system will be in a definite but arbitrary ground state, and the energy of any of the nonsymmetric states is lower than the symmetric one. Although the argument cannot be presented here, it is important to know that when the symmetry is broken the field $\Psi$ breaks up into two scalar fields, one of which is massless (the so-called Goldstone boson) but the other of which acquires a mass.

The next step was taken by Higgs in 1964 when he investigated spontaneously broken local symmetry. He used a local phase transformation of the type discussed above for QED. It will be useful to note for later reference that the group of such transformations is the U(1) group, a unitary group in one dimension. The local phase transformation is compensated by a field which, like that of the photon, is a vector. Using the potential of Figure 18-24, Higgs again obtained from the spontaneous symmetry breaking two scalar fields, one with mass and one without, in addition to the vector field. Then came the amazing result: by a suitable gauge transformation, the massless Goldstone boson disappeared, and the vector field acquired a mass. This has been described as the vector particle eating the Goldstone boson and getting heavy.

The form of the electroweak gauge theory was set up by Glashow in 1961, but he had no way then to make the gauge bosons massive. Independently in 1967 Weinberg and in 1968 Salam applied the Higgs mechanism to give mass to the gauge bosons and produced a consistent theory. In 1971 't Hooft proved the theory was renormalizable, after which it was taken more seriously. Glashow, Salam, and Weinberg received the Nobel prize in 1979 for their work on this topic. A qualitative account of the structure of the theory will now be given. The electroweak theory is based on the Yang-Mills theory already described.
The latter theory was an attempt to make a local symmetry out of the global SU(2) symmetry of isospin. Since isospin is a property of the strong interaction only, what can this have to do with the weak interaction? Formally the two-component wave function for protons and neutrons, say $\binom{p}{n}$, is like a similar two-component wave function for the electron and its neutrino, $\binom{\nu_e}{e^-}_L$. Here the subscript L denotes that only a left-handed helicity for particles is considered, as required by parity nonconservation in the weak interaction. To introduce the equivalent of isospin for the p and n, a weak isospin $T_w$ is defined for the leptons, with $\nu_e$ having $T_{wz} = +1/2$ and $e^-$ having $T_{wz} = -1/2$. This weak isospin has nothing to do with the usual isospin, but from the standpoint of a Yang-Mills type of gauge theory it then makes $\binom{\nu_e}{e^-}_L$ equivalent to $\binom{p}{n}$. Of course, the other leptons can similarly be arranged in weak isospin doublets as $\binom{\nu_\mu}{\mu^-}_L$ and $\binom{\nu_\tau}{\tau^-}_L$, but only one of these need be dealt with, since the results for the others will be the same.

While the weak interaction produces left-handed particles, the $e^-$ with a right-handed helicity does exist and the theory cannot deal with one state and ignore the existence of the other. Since electromagnetism is not parity violating, it treats $e_L$ and $e_R$ on an equal footing. So to include $e_R$, electromagnetism had to be built in. In the theory it is assumed that the neutrinos are massless, so there is then no $\nu_R$ possible (see Section 16-4). Thus a local phase symmetry with U(1) transformations as in QED was included, as well as a Yang-Mills-like local phase symmetry with SU(2) transformations. This is then often referred to as a U(1) × SU(2) theory. To compensate for these local changes, four gauge fields were needed; call them $B$ (for the U(1) transformation), and $W_1$, $W_2$, and $W_3$ (for the SU(2) transformation).
The object to be identified with the massless photon is actually a combination of $B$ and $W_3$; call it $A$, where

$$A = B\cos\theta_W + W_3\sin\theta_W$$

The parameter $\theta_W$, called the weak mixing angle, must be found from experiment. There is another linear combination of the $B$ and $W_3$ orthogonal to $A$ called the $Z^0$. It is

$$Z^0 = W_3\cos\theta_W - B\sin\theta_W$$

Like $B$, the $W_3$ is electrically neutral, but $W_1$ and $W_2$ carry electric charge. The states of definite charge are the combinations

$$W^\pm = W_1 \pm iW_2$$

Just as the field $A$ is to be identified with the photon which carries the electromagnetic force, so are the $W$ and $Z$ fields to be identified with the particles which carry the weak force. In relativistic quantum mechanics the terms field and particle become interchangeable.

The simplest way to give the particles $W^+$, $W^-$, and $Z^0$ a mass via spontaneous symmetry breaking is to introduce four Higgs scalar fields, of which two are charged (+ and −) and two are neutral. The charged Higgs particles give the $W^+$ and $W^-$ masses, one of the neutral Higgs particles gives the $Z^0$ a mass, and these three Higgs particles disappear with a suitable gauge transformation. The other neutral Higgs remains as a real particle. This remaining Higgs particle, the $\phi^0$, plays an unusual role. Unlike any other known particle it has a nonzero vacuum expectation value. That is to say, normally the vacuum in its lowest energy state has no particle in it, but such is not the case for the $\phi^0$. Instead, it costs energy to make the $\phi^0$ disappear from the vacuum. Because of this feature, which makes the vacuum grainy at a scale on the order of $10^{-18}$ m, the hidden symmetry is preserved. The weak isospin direction is defined with respect to the $\phi^0$ field direction, but the latter direction is arbitrary.

To describe some of the consequences of the electroweak theory, we take up first the role of the weak gauge bosons.
Recall in Section 18-2 that the neutrino-nucleon cross section was proportional to neutrino energy and that this was cited as evidence for the existence of partons. This is a useful result and causes no problems as far as measurements have gone, but it would be a disaster if such an energy dependence were to continue. Not only would this weak interaction cross section soon become bigger than those for strong interactions, but it would continue to grow to infinite size, which hardly describes a nucleon. The infinity in the cross section arises because the weak interaction is assumed to occur at a point. Well before the development of the electroweak theory it was realized that a way to avoid such infinities was to have the weak interaction carried by a virtual particle, so as to spread out the interaction spatially. Because the range of the weak interaction is so small, this intermediate boson had to be very massive indeed, in accordance with the uncertainty principle.

In the electroweak theory the particles necessary for this purpose, the $W^+$ and $W^-$, are a consequence of the local gauge symmetry. These particles give the standard charge-changing weak interactions, such as beta decay, K decay, or neutrino scattering. Quark-level diagrams for these processes are given in Figure 18-25a, b, and c. The second of these diagrams is the more complete K decay process promised in Section 18-3. Quarks are involved here, as well as leptons, and it is an interesting consequence of the theory that the coupling of the $W^\pm$ to both quarks and leptons is the same. That is, quarks and leptons have equal strengths of weak interactions. This point will be explained more fully shortly.

While the $W^+$ and $W^-$ coming out of the theory fulfilled their expected role, the $Z^0$ was not anticipated. This gauge boson would mediate non-charge-changing weak interactions, and none had ever been observed. An example of such a so-called neutral current process is shown in Figure 18-25d.
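The link between a heavy intermediate boson and a short range can be made quantitative with the usual uncertainty-principle estimate $R \sim \hbar/Mc = \hbar c/Mc^2$. The sketch below is illustrative only: the function name `range_fm` is introduced here, $\hbar c \approx 197.327$ MeV·fm is a standard value not quoted in this passage, and the W mass of about 80 GeV/c² is the figure given later in this section. The charged pion (about 139.6 MeV/c²) is included for comparison with the range of the nuclear force.

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm


def range_fm(rest_energy_mev):
    """Range (in fm) of a force carried by a virtual particle whose
    rest energy M c^2 is given in MeV, using R ~ hbar c / (M c^2)."""
    if rest_energy_mev <= 0:
        raise ValueError("rest energy must be positive")
    return HBAR_C_MEV_FM / rest_energy_mev


pion_range = range_fm(139.6)     # about 1.4 fm
w_range = range_fm(80.0e3)       # about 2.5e-3 fm, i.e. about 2.5e-18 m

print(f"pion: {pion_range:.2f} fm")
print(f"W:    {w_range:.2e} fm = {w_range * 1e-15:.1e} m")
```

With these inputs the weak range comes out near $10^{-18}$ m, roughly three orders of magnitude shorter than the pion-exchange range.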
These were searched for and eventually found in a CERN (Geneva) bubble chamber experiment in 1973. This was obviously a triumph for the electroweak theory. However, neutral current processes also raised a severe problem for the theory. To understand this point it is necessary to know a little more about the coupling of the quarks to the intermediate bosons.

Comparing rates for various types of weak decay, Cabibbo in 1963 found that if the Fermi decay constant for a purely leptonic process like $\mu^- \to e^- + \bar\nu_e + \nu_\mu$ is $\beta$, then that for a non-strangeness-changing process like $\pi^- \to \mu^- + \bar\nu_\mu$ is $\beta\cos\theta_c$, and that for one in which $\Delta S = 1$ like $K^- \to \mu^- + \bar\nu_\mu$ is $\beta\sin\theta_c$. Experimentally the Cabibbo angle $\theta_c$ turns out to be about 0.23 rad. Thus the ratio of rates for $\Delta S = 1$ to $\Delta S = 0$ decays, aside from phase space factors, is $\tan^2\theta_c \simeq 0.06$. Going to the quark level, this means that the s quark does not couple to the $W^{\pm}$ as strongly as, say, the u quark does.

To be more specific, in the electroweak theory we have used two-component wave functions for the three lepton families, such as $\binom{\nu_e}{e}_L$, assigning weak isospin as $\binom{+1/2}{-1/2}$. To determine their weak interactions, the quarks can be treated the same way. However, the doublet of particles is $\binom{u}{d_c}_L$, with u having weak isospin z component $T_{w,z} = +1/2$ (and it also has isospin z component $T_z = +1/2$) and $d_c$ having $T_{w,z} = -1/2$. Now $d_c$ is not the state d which has $T_z = -1/2$, but rather it is the mixture $d\cos\theta_c + s\sin\theta_c$. This then gives the correct Cabibbo couplings already discussed for $\Delta S = 0$ and $\Delta S = 1$ decays.

Figure 18-25 Quark diagrams for weak interactions, with (a) neutron decay, (b) the decay of a K meson into pions, and (c) $\nu_\mu + n \to \mu^- + p$, all charged current processes, and (d) $\nu_\mu + p \to \nu_\mu + p$, a neutral-current process. Double lines represent the exchanged bosons.

This scheme works well with charged current weak interactions involving the $W^{\pm}$. However, the $Z^0$ creates a problem. If the quark the $Z^0$ interacts with is the u, there is no difficulty, but if it is the $d_c$, this mixes d and s quarks. That makes possible $s \leftrightarrow d$ (strangeness changing) neutral current processes, and these are known experimentally not to exist for ordinary weak interactions. A solution to this problem, now called the GIM mechanism, was proposed by Glashow, Iliopoulos, and Maiani in 1970 when they suggested that if a c quark existed there would be another quark doublet $\binom{c}{s_c}_L$, and that this would cancel $s \leftrightarrow d$ processes. This saving cancellation would occur because $s_c = s\cos\theta_c - d\sin\theta_c$ is orthogonal to $d_c$, and whenever one is present in a neutral current process the other can be also. When the c quark (the "charm" to ward off the evil strangeness-changing neutral current) was discovered in 1974, the electroweak theory and the GIM mechanism triumphed.

When it was later found that there is another quark doublet, the $\binom{t}{b_c}_L$, the mixing among quarks became more complicated. It was expressed in terms of a 3 x 3 matrix by Kobayashi and Maskawa in 1972, but it is worked out similarly to the GIM mechanism so that there is no flavor-changing neutral current process. It is important to note that this quark-mixing matrix has a phase which gives CP violation. At the time of writing it is widely believed, although not experimentally proved, that this is indeed the way in which CP violation occurs in K decay, and hence CP violation would not be seen in leptonic processes. The corresponding effect of time-reversal violation, which by the CPT theorem must accompany CP violation, would then result from this quark mixing.

The electroweak theory arranges the quarks and leptons in the following symmetric way, so far as their weak interactions are concerned:

$\binom{\nu_e}{e}_L,\ \binom{\nu_\mu}{\mu}_L,\ \binom{\nu_\tau}{\tau}_L$ and $\binom{u}{d_c}_L,\ \binom{c}{s_c}_L,\ \binom{t}{b_c}_L$

These are all weak isospin doublets, and the right-handed helicity components are all weak isospin singlets. As alluded to in Section 18-4, there is a reason to believe, aside from its esthetic appeal, that even if more leptons or quarks are discovered, this type of symmetry will be preserved. The reason is that a process called a triangle anomaly, illustrated by a diagram in Figure 18-26, can give devastating infinities unless the charges of all the left-handed fermions add to zero. Each quark doublet has charges +2/3 and -1/3, adding to +1/3, but there are three colors of quark, so the total charge is +1, just canceling the -1 of the corresponding lepton doublet. Each paired quark and lepton doublet is called a generation, so for each generation the charges add to zero. So long as this symmetry holds within each generation the triangle anomalies disappear. This ties together quark-lepton symmetry, fractional quark charges, and color!

Figure 18-26 An example of a triangle anomaly graph. While any one graph would give infinity, the effects of graphs for each fermion within one generation cancel, if the charges of all left-handed fermions add to zero. The solid lines are the fermions.

To the bigger successes of the electroweak theory, the discovery of neutral currents and the c quark, can be added the discovery in 1983 of the $W^{\pm}$ and also of the $Z^0$. It is not just that these necessary particles have been found, but that they apparently have about the right masses. The electroweak theory predicts the gauge boson masses to be

$M_{W^{\pm}} = \left( \dfrac{\sqrt{2}\, e^2}{8 \beta \sin^2\theta_W} \right)^{1/2} = M_{Z^0} \cos\theta_W$    (18-19)

That is, the masses of the $W^{\pm}$ and the $Z^0$ are related, and both depend on just the strengths of the electromagnetic ($e$) and the weak ($\beta$) interactions and on their mixing ($\theta_W$). The angle $\theta_W$, while an undetermined parameter in the theory, is measurable in many different kinds of experiments. It is an important test of the theory that the results for $\theta_W$ agree well from these diverse determinations. Examples of experiments are, for charged currents, $\nu_e$-$e^-$ scattering (involving only leptons) and $\nu_\mu$-nucleon scattering (leptons and quarks), and, for neutral current processes, the asymmetry measurements due to $Z^0$-$\gamma$ interference in $e^+ + e^- \to \mu^+ + \mu^-$ (leptons) and electron-deuteron scattering (leptons and quarks). The results of all of these give $\sin^2\theta_W \simeq 0.23$. Inserting that value in (18-19) gives a mass of about 80 GeV/$c^2$ for the W, in agreement with experiments with 270 GeV proton-antiproton colliding beams at CERN (Geneva). From the same experiments in 1983 there was also reported the discovery of the $Z^0$ at about the expected mass of around 90 GeV/$c^2$. At the time of writing, two accelerators (LEP at CERN and SLC at SLAC) are being built just to explore the large amount of physics that can be done with $e^+$-$e^-$ collisions at the $Z^0$ mass. Much will undoubtedly be learned with the new accelerators, but already it is clear that the electroweak theory must be close to correct. The area about which there is the most uncertainty involves the Higgs particle $\phi^0$. Unfortunately there is no prediction of its mass, but the $\phi^0$ is actively being sought.

Except for noting the existence of the gauge boson, A, in the theory and identifying this massless particle as the photon, little has been said about the electro part of the electroweak theory. All of QED comes out of the theory, but that is old stuff. What is new is that a surprising relation results between the electric charge $e$, expressing the strength of the electromagnetic interaction, and the weak charge $g_w$, expressing the strength of the weak interaction. It is remarkable that this works, since everything is determined once $\theta_W$ is known. The simple relation, already put in (18-19), is

$e = 2\sqrt{2}\, g_w \sin\theta_W$    (18-20)

This shows that the electromagnetic and weak interactions are of about the same strength.
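The mass estimate quoted for the W and the Z can be checked with a short numerical sketch. It assumes the lowest-order relation written in natural units, $M_W = [\pi\alpha/(\sqrt{2}\,G_F)]^{1/2}/\sin\theta_W$ and $M_Z = M_W/\cos\theta_W$, which is what (18-19) becomes if $e^2$ is taken as $4\pi\alpha$ and $\beta$ is identified with the Fermi constant $G_F$. The numerical values of $\alpha$ and $G_F$ and the function name are supplied here for illustration and are not taken from the text; radiative corrections are ignored.

```python
import math

# Assumed inputs (not from the text): low-energy fine-structure constant
# and Fermi constant in natural units (GeV^-2).
ALPHA = 1 / 137.036
G_FERMI = 1.16638e-5


def gauge_boson_masses(sin2_theta_w):
    """Return (M_W, M_Z) in GeV/c^2 from the lowest-order relations."""
    sin_theta = math.sqrt(sin2_theta_w)
    cos_theta = math.sqrt(1.0 - sin2_theta_w)
    # M_W = sqrt(pi * alpha / (sqrt(2) * G_F)) / sin(theta_W)
    m_w = math.sqrt(math.pi * ALPHA / (math.sqrt(2.0) * G_FERMI)) / sin_theta
    # M_Z follows from M_W = M_Z * cos(theta_W)
    m_z = m_w / cos_theta
    return m_w, m_z


m_w, m_z = gauge_boson_masses(0.23)  # sin^2(theta_W) as quoted in the text
print(f"M_W ~ {m_w:.1f} GeV/c^2, M_Z ~ {m_z:.1f} GeV/c^2")
```

With $\sin^2\theta_W = 0.23$ this sketch gives values a little below 80 and 90 GeV/$c^2$, consistent with the "about 80" and "around 90" quoted above at the level of precision of a lowest-order estimate.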
What makes the weak interaction appear so weak are the large values of the masses $M_{W^{\pm}}$ and $M_{Z^0}$, making the range of the interaction so short. The fact is clearly shown when $g_w$ is related to the Fermi weak interaction coupling constant $\beta$. The relation

$\beta = \dfrac{\sqrt{2}\, g_w^2}{M_W^2}$    (18-21)

can be obtained by combining (18-19) and (18-20). The electroweak theory combines and relates, particularly through (18-20), the electromagnetic and weak interactions.

18-9 GRAND UNIFICATION OF THE FUNDAMENTAL INTERACTIONS

Although the results of the electroweak theory include a close relationship between electromagnetism and the weak interaction, that is a result of spontaneous symmetry breaking. The underlying symmetry of the theory, if not broken, would make these two the same interaction. At some high enough energy this symmetry should apply.

To get some idea of the unification energy, we can look at the behavior with energy of the electromagnetic and weak coupling constants or charges. Instead of using the electric charge $e$ appropriate to the photon (or gauge boson A) which results from symmetry breaking, it is more appropriate to use the corresponding coupling $g'$ for the gauge boson B of the U(1) transformations before symmetry breaking. But $g' = e/\cos\theta_W$, so the two are almost alike, except that the weak mixing angle $\theta_W$ increases slowly with energy. Similarly, instead of using $g_w$ appropriate to the $W^{\pm}$ and $Z^0$ after symmetry breaking, the coupling $g$ for the $W_1$, $W_2$, and $W_3$ of SU(2) before the symmetry breaking is to be used. Again the two are closely related: $g = 2\sqrt{2}\, g_w$. However, this means $g$ starts out at low energy larger than $g'$, since from the relations given above $g' = g\tan\theta_W$. Now $g$ decreases as the energy increases, while $g'$ increases slowly as $\theta_W$ increases with energy. Thus, $g$ and $g'$ approach each other as the energy increases. Note that this increase in energy corresponds to a decrease in distance.
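The chain of relations among $e$, $g_w$, $g$, and $g'$ can be illustrated numerically. The sketch below is only a consistency check under stated assumptions: it takes $e = \sqrt{4\pi\alpha}$ as a dimensionless charge in units with $\hbar = c = 1$ (a convention chosen here, not stated in the text), uses the low-energy value $\sin^2\theta_W = 0.23$, and the function and key names are hypothetical.

```python
import math

ALPHA = 1 / 137.036   # assumed low-energy fine-structure constant
SIN2_THETA_W = 0.23   # value quoted in the text


def electroweak_couplings(sin2_theta_w=SIN2_THETA_W, alpha=ALPHA):
    """Evaluate e, g_w, g, and g' from the relations quoted in the text."""
    sin_t = math.sqrt(sin2_theta_w)
    cos_t = math.sqrt(1.0 - sin2_theta_w)
    e = math.sqrt(4.0 * math.pi * alpha)       # assumed unit convention
    g_w = e / (2.0 * math.sqrt(2.0) * sin_t)   # inverted form of (18-20)
    g = 2.0 * math.sqrt(2.0) * g_w             # SU(2) coupling before breaking
    g_prime = e / cos_t                        # U(1) coupling before breaking
    return {"e": e, "g_w": g_w, "g": g, "g_prime": g_prime,
            "tan_theta_w": sin_t / cos_t}


c = electroweak_couplings()
print(f"g = {c['g']:.3f}, g' = {c['g_prime']:.3f}, "
      f"g'/g = {c['g_prime'] / c['g']:.3f}")
```

The ratio $g'/g$ comes out equal to $\tan\theta_W$, so at low energy $g$ is nearly twice $g'$, which is the starting point for the convergence argument.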
High energy behavior means short-distance behavior, as can be seen either from the uncertainty principle, $\Delta x \simeq \hbar/\Delta p_x$, or the de Broglie wavelength, $\lambda = h/p = hc/E$ (see Section 3-1 and Appendix A). Since the strong coupling constant $\alpha_s$ also decreases as the distance decreases or the energy increases (see Section 18-4), it is interesting to find out if the strong interaction approaches the other two at high energy. Using $\kappa$ (where we recall that $\alpha_s = \kappa^2/4\pi\hbar c$) to obtain a chargelike quantity as are $g$ and $g'$, we see the remarkable result in Figure 18-27. At an energy of about $2 \times 10^{14}$ GeV the three come together. This energy corresponds to a distance which is best specified as $\lambda/2\pi = \hbar c/E = 0.2\ \text{GeV-F} / 2 \times 10^{14}\ \text{GeV} \sim 10^{-30}$ m. At this extremely high unification energy or very small distance there is a strong possibility that all three interactions become the same. For this reason much effort has gone into developing grand unified theories (called GUTs for brevity) in which SU(3) of the strong color interaction, SU(2) of the weak interaction, and U(1) of the electromagnetic interaction result from a further symmetry breaking of a unified interaction.

Figure 18-27 The coupling constants of the strong (color) and electroweak (weak with electromagnetism) interactions seem to extrapolate to a single value at about $2 \times 10^{14}$ GeV. (Horizontal axis: log energy.)

Many methods have been employed to incorporate the SU(3), SU(2), and U(1) symmetries into a more inclusive gauge symmetry. One such attempt used the larger group SU(5). This work of Georgi and Glashow (1974) is worth discussing briefly because it is the simplest to appear at the time of writing, although experimental evidence may rule it out. The procedure in obtaining this gauge theory is like that discussed before, with the added complexity that there are 5-component wave functions and gauge transformations involving 5 x 5 matrices. Thus $5 \otimes \bar{5} = 1 \oplus 24$ gauge bosons are needed to compensate for the local phase transformations.
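The distance associated with the unification energy follows from $\lambda/2\pi = \hbar c/E$ and can be evaluated in a few lines. In this sketch the constant $\hbar c = 0.1973$ GeV-F is the usual value (the text rounds it to 0.2), and the function name is chosen here for illustration.

```python
HBAR_C_GEV_FERMI = 0.1973   # hbar*c in GeV-fermi (the text rounds to 0.2)
FERMI_IN_METERS = 1.0e-15   # 1 F = 10^-15 m


def reduced_wavelength_m(energy_gev):
    """Return lambda/(2*pi) = hbar*c/E in meters for an energy in GeV."""
    return HBAR_C_GEV_FERMI / energy_gev * FERMI_IN_METERS


unification_energy_gev = 2.0e14   # unification energy quoted in the text
distance_m = reduced_wavelength_m(unification_energy_gev)
print(f"lambda/2pi ~ {distance_m:.1e} m")
```

The result is of order $10^{-30}$ m, some fifteen orders of magnitude smaller than the size of a nucleon.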
As usual, the singlet is not of interest, but of the 24, 8 are the gluons for the color interaction, 4 others are the $\gamma$, $W^{\pm}$, and $Z^0$, and the remaining 12 are the so-called X and Y bosons. The X and Y bosons are also called leptoquarks and have antiparticles $\bar{X}$ and $\bar{Y}$. These four particles come in three colors, giving twelve particles in all. To give an idea of the relationship among the leptons and quarks, a typical 5 representation of the group is shown, with the particles among which the reactions discussed below take place:

$5 = \begin{pmatrix} \nu_e \\ e^- \\ \bar{d}_r \\ \bar{d}_b \\ \bar{d}_y \end{pmatrix}$    (18-22)

Of the reactions carried by gauge bosons between the 5 particles, two are as before: the $\nu_e$ combining with a $W^-$ to produce an $e^-$, and a blue antidown quark emitting a blue-antiyellow gluon to become a yellow antidown quark. The new third reaction, carried by the X boson of charge -4/3, is between a quark $\bar{d}_r$ of charge +1/3 and a lepton $e^-$ of charge -1; thus $\bar{d}_r + X \to e^-$ conserves charge. This last type of reaction would cause nucleons to decay, but the X and Y bosons have masses about equal to the unification energy, making this an extremely weak reaction.

The SU(5) theory has a number of highly desirable features, some of which are shared by other unification theories. For example, the total electric charge of any multiplet, such as the 5 given in (18-22), must add to zero. This condition, like that for eliminating triangle anomalies, works if the quarks have fractional charge and also have color. This would give a reason for the proton to have the same magnitude of charge as the electron. The leptoquark unification gives a reason for the weak lepton and quark doublet patterns, such as $\binom{\nu_e}{e}_L$ and $\binom{u}{d_c}_L$, and the fact that the difference in charge within each doublet is the same; i.e., $Q(\nu_e) - Q(e^-) = Q(u) - Q(d_c)$.
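The charge bookkeeping described above can be verified with exact fractions. The sketch below checks that the charges in the 5 multiplet of (18-22) add to zero, that $\bar{d}_r + X \to e^-$ conserves charge, that the charges of the left-handed fermions in one generation add to zero once three colors are counted, and that the lepton and quark doublets have the same charge difference. The dictionary keys and variable names are labels chosen here for illustration.

```python
from fractions import Fraction as F

# Electric charges (in units of e) of the members of the 5 multiplet (18-22).
five_multiplet = {
    "nu_e": F(0),
    "e-": F(-1),
    "dbar_r": F(1, 3),
    "dbar_b": F(1, 3),
    "dbar_y": F(1, 3),
}
multiplet_total = sum(five_multiplet.values())

# Leptoquark reaction dbar_r + X -> e-, with the X boson of charge -4/3.
x_charge = F(-4, 3)
reaction_conserves_charge = (
    five_multiplet["dbar_r"] + x_charge == five_multiplet["e-"]
)

# One generation of left-handed doublets: (u, d_c) in three colors plus (nu_e, e).
N_COLORS = 3
quark_doublet = F(2, 3) + F(-1, 3)
lepton_doublet = F(0) + F(-1)
generation_total = N_COLORS * quark_doublet + lepton_doublet

# Charge difference within each doublet.
lepton_difference = F(0) - F(-1)
quark_difference = F(2, 3) - F(-1, 3)

print(multiplet_total, reaction_conserves_charge,
      generation_total, lepton_difference == quark_difference)
```

Setting `N_COLORS` to any value other than 3 leaves `generation_total` nonzero, which is the sense in which the cancellation ties the fractional quark charges to the number of colors.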
More quantitatively, the SU(5) theory predicts with remarkable accuracy the weak mixing angle $\theta_W$, the important undetermined parameter of the electroweak theory.

Unfortunately the theory has two serious difficulties. The first is called the hierarchy problem, resulting from the tremendous difference in the masses of the weak gauge bosons ($10^{2}$ GeV/$c^2$) and of the leptoquarks ($10^{15}$ GeV/$c^2$). To achieve that huge difference in masses requires an unbelievable fine-tuning of parameters, and there are added difficulties with the stability of these solutions under renormalization. The other problem is experimental: the predicted proton partial lifetime for the $p \to e^+ + \pi^0$ decay mode is $4.5 \times 10^{29 \pm 1.7}$ years, while the limit from the experiment by the University of California Irvine, University of Michigan, and Brookhaven National Laboratory is $> 10^{32}$ years as this material is written.

The question of experimental tests of grand unification deserves a little more discussion. For a long time it has been believed that lepton number and particularly baryon number were absolutely conserved quantities, like charge. However, as was discussed in Section 17-8, absolute conservation laws are connected with exact invariance principles and symmetries. We have learned that charge conservation depends upon gauge invariance and the existence of an associated massless field. This is a general result for charge-like conservation laws in gauge theories. There is no gauge invariance with a massless field that can be associated with the conservation of baryons or leptons. These are probably approximate conservation laws which appear to be so exact because the unification energy, perhaps expressed as leptoquark mass, is so large. Whatever the theory, if quarks and leptons are unified, baryons and leptons will not be conserved. Shortly after the universe began expanding, at a time when its thermal energy was comparable to the unification energy, these unification effects were large.
Now these effects are extremely small because the thermal energy or temperature of the universe is so low. Two of these effects will be cited briefly as examples, the first being the already mentioned proton decay.

Man could not exist with the radiation from his own body if the lifetime of the proton were not at least a million times longer than the age of the universe, which is about $10^{10}$ years. To detect a proton lifetime in the $10^{30}$ years range requires a great deal more material than the human body, as well as a much more sensitive detector. The experiment which at the time of writing is giving the best limit of $10^{32}$ years for the proton lifetime uses about 8,000 tons of highly purified water viewed by particle detectors and held in a plastic container lining the walls of a huge pit dug in a very deep salt mine. It is necessary to go deep underground to eliminate the effect of cosmic rays, particularly extremely high-energy muons. Cosmic ray neutrinos cannot be absorbed out; they sometimes produce events difficult to separate from proton decays, and these may set a limit of about $10^{33}$ years on the sensitivity of the experiments.

Besides the great experimental difficulty in detecting proton decay events in such a huge bulk of material, there is the problem of knowing for which decay to design the instrumentation. While the initial experiments were made particularly to detect $p \to e^+ + \pi^0$ as favored by SU(5), other theories suggest different decays. Other, more finely grained detecting systems may do a better job on some of these other decays. It may take some time to have definitive results, but the existence of proton decay is crucial to grand unified theories.

Less crucial, because the effects could be unobservably small, but nevertheless important, is the issue of the violation of lepton number conservation. Experiments on this topic are even more theory dependent, but the most sensitive test is nuclear double beta decay. Even-even nuclei are bound much more tightly than their neighboring odd-odd nuclei because of the pairing energy explained in Section 15-9. For many of the even-even nuclei, while single beta decay is energetically impossible, double beta decay, via a two-step weak interaction, could give a transition to the next even-even nucleus. The expected process, involving the emission of $2e^- + 2\bar\nu_e$, is highly improbable but has possibly been observed in one laboratory experiment and also indirectly by looking for noble gas decay products in billion-year-old rocks. Having a much larger phase space volume is the decay in which only $2e^-$ are emitted. This neutrinoless double beta decay would obviously not conserve lepton number. As shown in Figure 18-28, in the first decay an $e^-$ and a virtual $\bar\nu_e$ are emitted. To get a second $e^-$ from the other beta decay requires that a virtual $\nu_e$ be absorbed. Thus, this decay demands the condition $\bar\nu_e \equiv \nu_e$, and if it is satisfied lepton number conservation is violated. A neutrino which is identical to its antineutrino is called a Majorana neutrino.

Figure 18-28 Neutrinoless double beta decay requires that the virtual right-handed antineutrino emitted in the first neutron decay becomes absorbed as a left-handed neutrino in order that the second neutron decay occur.

Because of parity violation in the weak decay, the neutrino emitted in the first decay will have a right-handed helicity. Because the $e^-$, being a particle instead of an antiparticle, has to have left-handed helicity, angular momentum conservation requires the absorbed neutrino producing the second $e^-$ to be left-handed. There are two ways to provide the required helicity reversal of the neutrino. One way is if the weak interaction, through the existence of a very massive ($\gg 100$ GeV/$c^2$) right-handed W boson, can sometimes give particles (as opposed to antiparticles) a right-handed helicity. The other way is if the neutrino has a nonzero rest mass, since then its helicity is reversed simply by having a coordinate system which travels faster than the neutrino, which no longer has $v = c$. (See the argument at the end of Section 16-4.) In principle, it is possible experimentally to separate these two helicity-reversing effects and provide a measure of the $\nu_e$ or $W_R$ mass. So far such decays have not been observed, the experiments setting lifetime limits of greater than about $10^{22}$ years. Expressed purely as a $\nu_e$ mass effect, this places a limit of $< 10$ eV/$c^2$.

The possible existence of a neutrino rest mass would most likely be a consequence of the violation of lepton-number conservation. That is, all known mechanisms for giving a neutrino a mass require that it be a Majorana neutrino. Many experiments have been done to detect a neutrino mass. One Soviet experiment, examining closely the end-point energy spectrum of tritium beta decay, reported a nonzero mass and quoted a limit of $> 20$ eV/$c^2$. This result and the double beta decay limit are not necessarily incompatible, because of a possible mixing among the different types of neutrinos; but the Soviet result was contested on experimental grounds when this was written, and similar experiments were being done as a check.

Another class of experiments looks for neutrino oscillations, which require that at least one flavor of neutrino have a mass and that flavor changing can occur among different kinds of neutrinos. The oscillations are, from a mathematical point of view, closely related to the $K^0$-$\bar{K}^0$ oscillations discussed in Section 17-8. In the neutrino case the measurements, of which there have been many, give a product of neutrino mass and degree of neutrino mixing. The mass limits set are quite small, unless neutrino mixing is at least as small as quark mixing. One of the motivations for looking for neutrino oscillations is provided by the observation of Davis and others, who find only about one-fourth as many solar neutrinos as are expected to reach the earth. If oscillations among the three kinds of neutrinos exist, then at the earth-sun distance only about one-third of the $\nu_e$'s would be detected.

The questions of baryon and lepton conservation and of neutrino mass apply generally in a qualitative way whatever the grand unification scheme, although quantitative predictions differ. With so much uncertainty in the theoretical area, there is little point in devoting much space here to rival theories. Other groups, such as SO(10), have been used. Perhaps, as in the case of SU(3) of flavor which introduced quarks, one of these groups will lead to the next level of fundamental particles. Much work has already been done on this topic of preons, which are supposed to be the constituents of quarks and leptons. Another alternative is the supersymmetry theory, which was designed to avoid the hierarchy problem. In this theory every boson has a fermion partner, and vice versa. At the time of writing, the theory is very popular, but there is no experimental evidence for these squarks, sleptons, photinos, gluinos, etc. Another version of this theory, called supergravity, has the appealing feature that gravity does the symmetry breaking. This theory extends the hope that all four fundamental interactions may one day be unified in a single theory.

The manifestations of grand unification apply not only in particle physics, but also in cosmology. This is a large subject and so only a few topics will be touched upon briefly in the following paragraphs. Neutrino mass may play a role in explaining the "dark mass" of the universe.
From the rotation rate of galaxies it is known that 80 to 90% of galactic masses are not observed. There are so many neutrinos that if one type of neutrino had a mass between 4 and 80 eV/$c^2$, even this minuscule value could provide most of the dark (i.e., unobserved) mass of the universe. This would also provide a mechanism to produce galaxy formation, presently an unsolved problem, and to give stability to galaxies. If the neutrino mass were sufficiently large it would eventually stop the expansion of the universe and hence close it.

While neutrino mass is a by-product of grand unification, there are more direct manifestations of this subject for cosmology. For example, the antibaryon-to-baryon ratio in the universe has been difficult to understand. At an early stage of the universe's expansion this ratio should have been unity. From observations of heavy cosmic ray nuclei and lack of observation of the x-ray emission which would result from the annihilation of galactic matter with intergalactic antimatter, it is known that this ratio is now $< 10^{-4}$. Explanations for this large change have come from theories like SU(5) in which baryon nonconservation occurs and which have a baryon-creating process that is CP violating, so that more baryons than antibaryons are created. More generally, since the very early universe was controlled by unified interactions, it is to be expected that there are presently detectable results of that early era. About $10^{-40}$ sec after the singularity that began the expansion of the universe (the big bang), its thermal energy was at the grand unification level, and the breakdown of unifying gauge invariance was just starting to appear.

The gauge theories have produced impressive increases in our understanding at both ends of the distance scale, with applications to cosmology and to particles. Those simplifications and unifications give hope that all of physics is being brought together into an understandable whole.

QUESTIONS

1. What is really meant by an elementary particle? Consider such properties as mass, lifetime, size, and reactions, especially decays into other particles and fusion to make other particles.
2. How would the cross section for antineutrinos scattering from nucleons depend upon laboratory energy? Why? From the reaction, how could you tell if a $\nu$ or $\bar\nu$ was incident?
3. The threshold laboratory kinetic energy for producing antiprotons by the reaction $p + p \to p + p + p + \bar{p}$ is 5630 MeV. If instead of a free proton target, protons bound in a nucleus are used, would you expect the threshold energy to be lower, higher, or the same, and why?
4. The elastic electron-proton cross section decreases rapidly with increasing electron energy, whereas the inelastic cross section does not. On the basis of the essential physical difference between those two processes, what is the reason for the disparity between the two energy dependencies?
5. The nucleon and antinucleon are each about 7 times more massive than the pion. How is it even conceivable that the $\pi$ could be a combination of nucleon and antinucleon?
6. Why is isospin, like SU(3), a broken symmetry, and how is it broken?
7. What is the hypercharge of the u, d, and s quarks?
8. The $3$ and $\bar{3}$ representations make a singlet and an octet. Would you expect the singlet to have the same spin and parity as the octet? Why?
9. If a strong decay mass width for a particle is $\sim 10^{2}$ MeV/$c^2$, what would you expect an electromagnetic decay width to be? How does this compare with the width of the $\psi/J$?
10. Explain why the mass width of the $\phi^0$ is much smaller than that of the other vector mesons $\rho$ and $\omega$, which have an even lower mass.
11. The decay $D^+ \to K^- + \pi^+ + \pi^+$ is allowed, but $D^+ \to K^+ + \pi^0$ and $D^+ \to K^+ + \pi^+ + \pi^-$ are strongly suppressed. Why is this?
12. Out of the spin 3/2, even parity decuplet only three members ($\Delta^-$, $\Delta^{++}$, and $\Omega^-$) have been selected to demonstrate a need for the color quantum number. Why have the others not been utilized?
13. In what ways are electromagnetic and color charges similar and different?
14. The fact that the photon is massless makes the electromagnetic interaction one of long range. If the gluon is also massless, why is the strong color interaction also not of long range?
15. Suppose you have two dice, each of which you are going to rotate in some prescribed manner. Is the finite rotation of one die an Abelian or a non-Abelian operation? Is the choice of which die to rotate first an Abelian or a non-Abelian operation?
16. When a local phase transformation is constructed in the electromagnetic case, a charge is inserted and the phase angle is made to depend on space and time coordinates. In the Yang-Mills theory, what sort of chargelike quantity would be inserted? That is, what interaction would it relate to?
17. Local electromagnetic charge conservation depends upon gauge invariance and the existence of an associated massless field, the photon. Do similar conditions apply in the color interaction and is there a similar absolutely conserved quantity?
18. Why is vacuum polarization necessarily a quantum effect only?
19. The cross section ratio R of (18-6) is based on the quark-parton model. This result is altered slightly in QCD because of the appearance of gluons. Considering what happens to hadronic jets as the energy increases, in what direction would you expect R to change due to QCD corrections and why?
20. In what way is the non-Abelian nature of QCD essential in converting the global symmetry of color to a local symmetry? Why can the same result be achieved in Abelian QED?
21. What is the hidden symmetry in the electroweak theory? In answering this it may be useful to recall the Yang-Mills theory and the role of the Higgs boson.
22. Before the electroweak theory it was difficult to compare the weak coupling constant to the electromagnetic one because they have different dimensions. Explain these dimensions and how the electroweak theory gives an appropriate strength to a dimensionless weak coupling constant.
23. What is the relationship, if any, between a Goldstone boson and a Higgs particle?
24. If neutrinoless double beta decay occurs, the neutrino is of the Majorana type, requiring $\bar\nu \equiv \nu$. In neutrino-nucleon scattering, beams of "$\nu$" and of "$\bar\nu$" are utilized, and they produce different results. What physical characteristic makes an apparent "$\nu$" in a beam differ from a "$\bar\nu$" and yet would allow these really to be Majorana neutrinos?

PROBLEMS

1. Prove the relation $p^2 = mE/2$ quoted in the third paragraph of Section 18-2. (Hint: Use results obtained in the last problem of Appendix A.)
2. (a) The intensity of a beam of particles diminishes fractionally by $dI/I = -dx/\lambda$ in a distance $dx$, if the mean free path for collision with $n$ other particles per unit volume is $\lambda = 1/n\sigma$ for an interaction cross section $\sigma$. Using these relations, estimate the probability that a solar neutrino will pass through the earth along a diameter without interacting. Take $\sigma = 4 \times 10^{-44}$ m$^2$/nucleon, and the radius and mass of the earth to be $6.4 \times 10^{6}$ m and $6 \times 10^{24}$ kg. (b) For a flux of neutrinos from the sun of $4 \times 10^{14}$ m$^{-2}$-sec$^{-1}$, make a rough estimate of the number of neutrino-induced reactions in your body per day.
3. (a) Draw a Feynman diagram for the pion charge exchange reaction, $\pi^- + p \to \pi^0 + n$. In this case the exchanged particle is a $\rho$ meson. Explain what latitude you have in choosing the charge of the $\rho$. (b) Redraw the diagram of part (a) as a quark flow diagram (a Feynman diagram on the quark level).
4. The meson octet of Figure 18-6 is formed by quarks $q_i \bar{q}_j$, where $q_i$ can be u, d, and s and $\bar{q}_j$ their antiparticles. Show that the baryon octet of Figure 18-7, which is made up of $q_i q_j q_k$, can have the same $T_z$ and $Y$ quantum numbers as that of the meson octet.
Proceed by finding which combinations of $q_j q_k$ have the same $T_z$ and $Y$ quantum numbers as $\bar{q}_j$.
5. (a) Using Table 18-1, determine the quark structure of the antiproton ($\bar{p}$), $\Sigma^+$ baryon, and $\rho^-$ meson. (b) Since the $\pi$ has spin 0 and the $\rho$ has spin 1, what is the internal structure of the $\pi$ and $\rho$? The angular momenta should be specified in spectroscopic notation (e.g., $^3D_2$).
6. In an $e^+$-$e^-$ colliding beam accelerator, the ring radius is 350 m. Each beam has 15 milliamps of current, which can be considered as electrons or positrons (charge $1.6 \times 10^{-19}$ coulombs) traveling at velocity c. Determine first the number of circulating $e^+$ and $e^-$. The luminosity L of the accelerator is defined so that there is a reaction rate of $\sigma L$ per second for a process with cross section $\sigma$. Now L depends on the particle density transverse to the beam (i.e., particles per unit area) of each beam, the beam area, and the frequency of revolution. Find L if each beam has an area of $10^{-6}$ m$^2$.
7. (a) Draw a quark-flow diagram for the strong decay $\psi(3767) \to D^+ + D^-$. (b) Using the quark content as a guide, assign isospins ($T$ and $T_z$) to the $D^+$, $D^-$, $D^0$, and $\bar{D}^0$. In what way are these mesons similar to and different from the K mesons?
8. The D meson is a pseudoscalar and the D* meson is a vector with the same quark content. What would you expect to be the quark-antiquark states for the D and D*? Use spectroscopic notation.
9. A charmed baryon, $\Sigma_c$, with $T = 1$ has been discovered. From its name, what would you expect its quark content to be? Consider all three charge states.
10. Using (18-5) find the isospin of the B meson. How is this like the K meson?
11. Draw a quark-flow diagram for $\Upsilon \to \pi^+ + \pi^0 + \pi^-$ and state how this decay relates to the narrow mass width of the $\Upsilon$.
12. Draw a quark-flow diagram for the decay $\Upsilon \to \mu^+ + \mu^-$. Recalling (18-6), determine the ratio of the probability for … $\to \mu^+ + \mu^-$ to that for $\Upsilon \to \mu^+ + \mu^-$.
13. Show that the condition for local phase invariance, $\Psi(x,t) \to \Psi'(x,t) = e^{i\theta(x,t)}\Psi(x,t)$, will not satisfy the free-particle Schroedinger equation; i.e., $\Psi'(x,t)$ is not a solution if $\Psi(x,t)$ is. To save algebra, consider only one space variable x, although all three may be involved.
14. As an example of a possible particle possessing color, consider the color eigenfunction for a member of a "sextet" representation of color SU(3) made from a quark pair:
$(QQ)_6 = [rr + bb + yy + (rb + br) + (ry + yr) + (yb + by)]$
From the quark couplings of Figure 18-21 find for this eigenfunction the $(QQ)_6$ potential.
15. (a) Draw a quark-flow diagram for the weak decay $\pi^- \to \mu^- + \bar\nu_\mu$. Explicitly include the appropriate intermediate vector boson. (b) By considering the production of $\mu^-$ and $\bar\nu_\mu$ in the rest frame of the vector boson, show from the necessary parity nonconservation that the boson is indeed a vector type, that is, that it has spin one.
16. In neutrino-nucleon scattering, the actual interaction is mainly with u or d quarks. (a) Give Feynman diagrams for charged-current $\nu_\mu$ and $\bar\nu_\mu$ scattering from u and d quarks, being sure to conserve all necessary quantum numbers. (b) Because gluons form virtual quark-antiquark pairs, scattering can occur with reduced probability from $\bar{u}$ and $\bar{d}$ quarks; give Feynman diagrams for $\nu_\mu$ and $\bar\nu_\mu$ scattering from $\bar{u}$ and $\bar{d}$ quarks. (c) For u and d quarks, give Feynman diagrams for neutral-current scattering with $\nu_\mu$ incident. (d) For the processes in parts (a) and (c) and using proton or neutron (in a nucleus) targets, what would be the initial state nucleon and what would be the final state nucleon?
17. Show why observation of the process $\nu_\mu + e^- \to e^- + \nu_\mu$ provides proof of the existence of neutral currents while $\nu_e + e^- \to e^- + \nu_e$ does not.
18. Among the gluons are the combinations with color charges $(r\bar{r} - y\bar{y})/\sqrt{2}$ and $(r\bar{r} + y\bar{y} - 2b\bar{b})/\sqrt{6}$. These appear to treat the different colors unequally so that it would matter which color had a specific label. Show that this is not true by taking the specific case of Figure 18-21b; compute the coupling for the quark reaction $r + y \to r + y$ and get the same coupling $-\kappa^2/3$ as was the case for $r + b \to r + b$.
19. A neutral-current coupling to a u quark can be pictured as a u quark emitting or absorbing a $Z^0$ and going on as a u quark with a different momentum. This is equivalent to a u and $\bar{u}$ quark annihilating to form a $Z^0$. Draw Feynman diagrams for both processes and state why they are equivalent. From the $u + \bar{u} \to Z^0$ point of view, the amplitude for the process will involve the wave functions for $u\bar{u}$. Similarly if $d_c$ and $s_c$ are involved, the amplitude will be proportional to the sum $d_c\bar{d}_c + s_c\bar{s}_c$. Show that the strangeness-changing part of this amplitude vanishes because $s_c\bar{s}_c$ has been added to $d_c\bar{d}_c$; i.e., show that the GIM mechanism works.

Appendix A
THE SPECIAL THEORY OF RELATIVITY

The object of this appendix is to develop those results of Einstein's special theory of relativity that we shall need in our study of quantum physics. Of course it is likely that many students will have worked with relativity, in studying classical mechanics and/or electromagnetism, before embarking on the study of quantum physics. For those students, this appendix can be useful as a review. For others, it should be useful as a concise treatment of the most important results of relativity.

THE GALILEAN TRANSFORMATION AND MECHANICS

In classical physics the state of a mechanical system at some instant can be described completely by constructing a frame of reference and using it to specify the coordinates, and the time derivative of the coordinates, for the particles comprising the system at that instant. If we know the masses of the particles and the forces acting between them, Newton's equations of motion make it possible to calculate the state of the system at any future time in terms of its state at the initial time.
Now, it is often desirable that during or after such a calculation we specify the state of the system in terms of a new frame of reference which is moving in translation (i.e., not rotating) relative to the first frame with constant velocity. Two questions arise: (1) How do we transform our description of the system from the old to the new frame? (2) What happens to the equations which govern the behavior of the system when we make the transformation? These questions are the ones with which the special theory of relativity concerns itself. (In the general theory, which we shall not need in our study of quantum physics, transformations involving acceleration of one frame relative to the other are considered.)

Figure A-1 shows a particle of mass m whose motion under the influence of force F is specified in terms of a primed and an unprimed frame of reference. The primed frame is moving relative to the unprimed frame with constant velocity v in a direction which, by construction, is the positive direction of their collinear x' and x axes. By definition, the times t' and t measured in the two frames are both zero at the instant when the y'z' plane coincides with the yz plane.

Figure A-1 An x', y', z', t' frame of reference moving in translation with constant velocity relative to an x, y, z, t frame. The x' and x axes are supposed to be collinear.

With these two frames there are two sets of four numbers, (x',y',z',t') and (x,y,z,t), that can equally well be used to specify the coordinates of the particle at any instant of time. What are the relations between these sets of numbers? According to classical physics they are

$$x' = x - vt, \qquad y' = y, \qquad z' = z, \qquad t' = t \tag{A-1}$$

These are known as the Galilean transformation. The simple arguments of classical physics leading to them are:

1. If the zeros of the time scales used in different frames are defined to be the same at any time and location, then in classical physics both time scales will remain the same for all times and all locations, so t' = t.
2. Since by construction the x'y' and xy planes always coincide, we have z' = z; and similarly for y' = y.
3. Since in the time interval between zero and t' = t the y'z' plane moves in the positive direction a distance vt, the x' coordinate will be smaller than the x coordinate by that amount. So x' = x - vt.

The Galilean transformation constitutes the answer that classical physics gives to the first question posed earlier. The answer to the second question is given in classical mechanics by using the Galilean transformation to convert Newton's equations in the x, y, z, t frame

$$m\frac{d^2x}{dt^2} = F_x, \qquad m\frac{d^2y}{dt^2} = F_y, \qquad m\frac{d^2z}{dt^2} = F_z \tag{A-2}$$

into whatever form these equations assume in the x', y', z', t' frame. Note that for (A-2) to be valid the x, y, z, t frame must be an inertial frame; i.e., one in which a body not under the influence of a force, and initially at rest, will remain at rest. By differentiating each of the first three of (A-1) twice with respect to t, and then using the fourth to write t = t', it is trivial to show that

$$\frac{d^2x'}{dt'^2} = \frac{d^2x}{dt^2}, \qquad \frac{d^2y'}{dt'^2} = \frac{d^2y}{dt^2}, \qquad \frac{d^2z'}{dt'^2} = \frac{d^2z}{dt^2}$$

In other words, the acceleration of the mass m measured in the primed frame is the same as it is when measured in the unprimed frame. Of course, the reason is that two frames related by a Galilean transformation are not accelerating with respect to each other, so the transformation does not change the measured acceleration. Furthermore

$$F_{x'} = F_x, \qquad F_{y'} = F_y, \qquad F_{z'} = F_z$$

because the component of the force F acting on m in the direction of the x' or x axis is the same as seen in either frame, and similarly for its other components.
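The statement that the Galilean transformation leaves the measured acceleration unchanged can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the trajectory, the frame velocity `V_FRAME`, the acceleration `ACCEL`, and the helper `second_derivative` are hypothetical choices made for illustration and do not come from the text.

```python
def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of d^2 f / dt^2 at time t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


V_FRAME = 5.0  # assumed velocity of the primed frame along x, in m/sec
ACCEL = 2.0    # assumed constant acceleration of the particle, in m/sec^2


def x_unprimed(t):
    # Hypothetical trajectory in the unprimed frame: uniform acceleration.
    return 1.0 + 3.0 * t + 0.5 * ACCEL * t * t


def x_primed(t):
    # First of (A-1): x' = x - vt. The fourth, t' = t, lets us use the same t.
    return x_unprimed(t) - V_FRAME * t


a_unprimed = second_derivative(x_unprimed, 1.5)
a_primed = second_derivative(x_primed, 1.5)

# The two printed accelerations should agree up to rounding error.
print(a_unprimed, a_primed)
```

The term vt is linear in t, so it drops out of the second derivative; this is why the two estimates coincide whatever value is assumed for `V_FRAME`.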
Evaluating the unprimed components of acceleration and force in (A-2) in terms of their primed counterparts, but doing nothing to the mass, since in classical physics mass is an intrinsic property of a particle whose value cannot depend on the frame of reference, we find the equations of motion in the primed frame

$$m\frac{d^2x'}{dt'^2} = F_{x'}, \qquad m\frac{d^2y'}{dt'^2} = F_{y'}, \qquad m\frac{d^2z'}{dt'^2} = F_{z'} \tag{A-3}$$

Note that (A-3) have exactly the same mathematical form as (A-2). Thus part of the answer to the second question is that Newton's equations, which govern the behavior of the mechanical system, do not change when we make a Galilean transformation. The x, y, z, t frame was an inertial frame because $d^2x/dt^2 = d^2y/dt^2 = d^2z/dt^2 = 0$ if F = 0. From (A-3) we see that x', y', z', t' is also an inertial frame because $d^2x'/dt'^2 = d^2y'/dt'^2 = d^2z'/dt'^2 = 0$ if F = 0. Since Newton's equations are identical in any two inertial frames, and since the behavior of a mechanical system is governed by these equations, it follows that the behavior of all mechanical systems will be identical in all inertial frames, although these frames move at constant velocity with respect to each other. This prediction is verified by a wide variety of experimental evidence.

THE GALILEAN TRANSFORMATION AND ELECTROMAGNETISM

Next we inquire into the behavior of electromagnetic systems when we perform a Galilean transformation. Electromagnetic phenomena are treated in classical physics in terms of Maxwell's equations, which govern their behavior just as Newton's equations govern the behavior of mechanical phenomena. We shall not actually carry through the Galilean transformation of Maxwell's equations, as we have for Newton's, since the calculation is complicated. Instead we shall state the results: Maxwell's equations do change their mathematical form under a Galilean transformation, in sharp contrast to the behavior of Newton's equations. We shall also discuss the physical significance of these results.

As the student probably knows, Maxwell's equations predict the existence of electromagnetic disturbances which propagate through space in the characteristic manner of wave motion. The nineteenth century physicists, who were very mechanistic in their outlook, felt quite sure that the propagation of waves predicted by Maxwell's equations requires the existence of a mechanical propagation medium. Just as sound waves propagate through a mechanical medium, air, so, according to their view, electromagnetic waves must propagate through a mechanical medium, which they called the ether. This propagation medium was required to have quite strange properties in order not to disagree with certain known facts. For instance, it would have to be massless since electromagnetic waves such as light can travel through vacuum; but it would have to have elastic properties to be able to transmit the vibrations inherent in the idea of wave motion. Nevertheless, physicists of that era felt the concept of the ether was more attractive than the alternative of electromagnetic waves propagating without the aid of a propagation medium.

It was assumed that the electromagnetic equations in the form presented by Maxwell were valid for the frame of reference at rest with respect to the ether, the so-called ether frame. A solution of these equations led to a prediction of the magnitude of the propagation velocity of electromagnetic waves in vacuum. The result was $2.998 \times 10^{8}$ m/sec = c, in agreement within experimental error with the value of the velocity of light that had been measured by Fizeau. However, in a frame of reference moving with constant velocity with respect to the ether, Maxwell's equations changed form when the Galilean transformation was used to evaluate them in that moving frame. As might be expected, when these changed equations were used to obtain a prediction of the electromagnetic wave propagation velocity that would be measured in the frame moving with respect to the ether, the velocity was found to have a magnitude different from c. The complicated calculation which predicted the velocity of light measured in a frame of reference moving with respect to the ether, performed by making a Galilean transformation of Maxwell's equations to the moving frame and then solving them in that frame, led to the simple prediction

$$\mathbf{v}_{\text{light wrt moving frame}} = \mathbf{v}_{\text{light wrt ether}} - \mathbf{v}_{\text{moving frame wrt ether}} \tag{A-4}$$

where wrt = with respect to, and $v_{\text{light wrt ether}} = c$. The prediction agreed with two simple physical ideas:

1. Light propagates with a velocity of fixed magnitude c with respect to its propagation medium, the ether, just as sound waves propagate with a velocity of fixed magnitude with respect to their propagation medium, the air.
2. The velocity of light with respect to a frame moving with respect to the ether can be found from a normal vector addition of relative velocities.

It should be pointed out that the arguments justifying vector addition of velocities are really the same as those justifying the Galilean transformation. For instance, in a case when all motion is along the x' or x axis, (A-4) can be obtained immediately by a time differentiation of the first of (A-1), using also the fourth one, t' = t.

In summary, theoretical physics near the end of the nineteenth century was based on three fundamentals: Newton's equations, Maxwell's equations, and the Galilean transformation. Almost everything that could be derived from these fundamentals agreed well with the experiments that had been performed to that time. With regard to the questions we have been discussing, they predicted that reference frames in uniform motion with respect to each other were completely equivalent as far as mechanical phenomena were concerned, but in regard to electromagnetic phenomena they were not equivalent; there was only one frame, the ether frame, in which the velocity of light had a magnitude with the numerical value c.

THE MICHELSON-MORLEY EXPERIMENT

In 1887 Michelson and Morley carried out an experiment which proved to be of extreme importance. The experiment was designed to investigate the motion of the earth with respect to the ether frame. Since the earth is moving about the sun, it would seem unrealistic to make the a priori assumption that the ether frame travels with the earth and, as we shall indicate later, experimental observations arguing against such an assumption were known at the time. It would be much more reasonable to assume that the ether frame was at rest with respect to the center of mass of the solar system, or the center of mass of the universe. In the first case the velocity of the earth with respect to the ether frame would have a magnitude of the order of $10^{4}$ m/sec; in the second case the magnitude of the velocity would be somewhat greater.

The basic idea of the experiment was to measure the velocity of light in two perpendicular directions from a frame of reference fixed to the earth. A moment's consideration of the classical theory, as summarized by the vector addition (A-4), will show that the theory predicts the measured velocities should have different magnitudes for light traveling in different directions relative to the direction of motion of the observer through the ether.
Although the difference in the two measured light velocities was expected to be small, because the velocity of the earth with respect to the ether is small compared to the velocity of light with respect to the ether, Michelson and Morley built a device incorporating an interferometer that should have been more than sensitive enough to detect and measure the difference. To their extreme surprise, they could not even detect a difference. They, and many other subsequent investigators, repeated the measurements with improved equipment, but an effect was never observed. Despite the predictions of the classical theory, the Michelson-Morley experiment showed that the velocity of light has the same magnitude, c, measured in perpendicular directions in a reference frame which is, presumably, moving through the ether frame. These results captured the attention of most physicists, and a number of them tried to devise explanations that would be consistent with the Michelson-Morley results and yet retain as much as possible of the physical theories then in existence. Notable among them were the "ether drag hypothesis" and the "emission theory." The ether drag hypothesis assumed that the ether frame was locally attached to all bodies of finite mass. It was attractive because it would explain the Michelson-Morley results and yet did not involve modification of the existing theories. But it could not be accepted for several reasons, the principal one having to do with an astronomical phenomenon called stellar aberration. It had been known since the 1700s that the apparent positions of stars move annually in circles of very small diameter. This is a purely kinematical effect due to the motion of the earth about the sun; in fact, it is the same as the effect causing a vertical shower of rain to appear to a moving observer to be falling at an angle to the vertical. 
From this analogy it is easy to see that stellar aberration would not be present if light were to travel with velocity of fixed magnitude with respect to the ether frame, and if that frame were dragged along by the earth.

In the emission theory Maxwell's equations are modified in such a way that the velocity of light remains associated with the velocity of its source. This too would explain the Michelson-Morley results since their light source was fixed to the interferometer used to measure the light velocity difference, but it must be rejected because it conflicts with astronomical measurements concerning binary stars. Binary stars are pairs of stars which are rotating rapidly about their common center of mass. Consider such a pair at a time when one is moving toward the earth and the other is moving away. Then, if the emission theory is valid, relative to the earth the velocity of the light from one star would be larger than that of the light from the other star. This would cause the stars to appear to move in very unusual orbits. However, in 1913 De Sitter showed that observed motions of binary stars are accurately accounted for by Newtonian mechanics when the velocity of the light they emit is taken to have a magnitude independent of their motion.

All the experimental evidence (including evidence from a number of highly accurate contemporary experiments) is consistent only with the conclusion that there is no special frame of reference, the ether frame, with the unique property that the velocity of light measured in that frame alone has a magnitude equal to c. Just as for inertial frames and mechanical phenomena, all frames in relative motion with constant velocity are equivalent in that the velocity of light measured in each frame has the same magnitude c. To succinctly put the experimental evidence: The velocity of light in vacuum is independent of the motion of the observer and of the motion of the source.

EINSTEIN'S POSTULATE

Einstein, in 1905, was the first to realize that physicists should abandon the fruitless and misleading concept of the ether. In essence, he accepted the fact that light propagates through vacuum, and that vacuum really is empty! With no ether frame, the only frame of reference that can have any significance to an observer measuring the velocity of light is the frame fixed relative to himself. Then it is not surprising that an observer in all cases obtains the same numerical result, c, when he measures the magnitude of the velocity of light. Einstein stated as a postulate: The laws of electromagnetic phenomena, as well as the laws of mechanics, are the same in all inertial frames of reference, despite the fact that these frames move with respect to each other. Consequently, all inertial frames are completely equivalent for all phenomena.

This postulate required that Einstein modify either Maxwell's equations or the Galilean transformation, since the two together imply the contrary of the postulate. Although in 1905 the emission theory could still be considered acceptable, he chose not to modify Maxwell's equations. He was then forced to modify the Galilean transformation. This was a bold move. The intuitive belief in the validity of the Galilean transformation was so strong that his contemporaries had never seriously questioned it. Yet, as we shall see, the very different transformation that Einstein adopted in lieu of the Galilean one is based on realistic physical considerations, whereas the Galilean transformation is grossly unrealistic. Another indication of the boldness of Einstein is that our earlier considerations imply that any modification of the Galilean transformation would require some compensating modification of Newton's equations in order that the postulate continue to be satisfied for mechanics. We shall see soon what results this leads to, but first we must study the new transformation equations.

SIMULTANEITY

Consider the fourth of the Galilean transformation (A-1), which is

$$t' = t$$

The equation says there is the same time scale at all places and for all times in any two frames of reference moving uniformly with respect to each other. This is equivalent to saying that there exists a universal time scale for all such frames. Is this true? To find out we must realistically investigate the procedures used in time measurement.

Let us first concern ourselves with the problem of defining a time scale in a single frame. Now the basic process involved in any time measurement is a measurement of simultaneity. As Einstein wrote, "If I say 'That train arrives here at 7 o'clock,' I mean something like this: 'The pointing of the small hand of my watch to 7 and the arrival of the train are simultaneous events'." Of course there is no problem at all in determining the simultaneity of events which occur at essentially the same location, like the train and the nearby watch or clock used to time its arrival. But there is a problem in determining the simultaneity of events which occur at separated locations. In fact this is the key problem involved in setting up a time scale for a frame of reference. In order to have a time scale valid for a whole frame of reference we must have a number of clocks distributed throughout the frame so that there will everywhere be a nearby clock which can be used to measure time in its vicinity. These clocks must be synchronized; that is, we must be able to say of any two of these separated clocks A and B: "The little hand of clock A and the little hand of clock B pointed to 7 simultaneously."

A number of methods for determining simultaneity at separated locations are probably now suggesting themselves to the student. They surely all involve the transmission of signals between the two locations. If we had at our disposal a method of transmitting signals with infinite velocity there would be no more of a problem in determining the simultaneity of events occurring at separated locations than there is of doing it for events occurring at the same location. This is where the Galilean transformation goes wrong by implicitly assuming the existence of such a method of synchronization. In fact, there is no such method. Since we have agreed to be realistic in developing a time scale, we must use real synchronization signals. Light (or other electromagnetic) signals are clearly the most appropriate because they have the same propagation velocity under all circumstances. This property enormously simplifies the process of determining simultaneity. Thus we are led to Einstein's definition of simultaneity of separated events: An event occurring at time $t_1$ and location $x_1$ is simultaneous with an event occurring at time $t_2$ and location $x_2$ if light signals emitted at $t_1$ from $x_1$ and at $t_2$ from $x_2$ arrive simultaneously at the geometrically measured midpoint between $x_1$ and $x_2$. This definition, illustrated in Figure A-2, makes the very reasonable statement that two separated events are simultaneous to an observer located at their midpoint if he sees them happening simultaneously.

Figure A-2 Illustrating Einstein's definition of simultaneity of separated events.

Note that in Einstein's theory simultaneity in time does not have an absolute meaning, independent of location in space, as it does in the classical theory. The definition intimately mixes the times $t_1$, $t_2$ and the space coordinates $x_1$, $x_2$. A consequence of this is that two events which are simultaneous when observed from one frame of reference are generally not simultaneous when observed from a second frame of reference which is moving relative to the first. To see this, we consider a very simple "thought experiment," adapted from one used by Einstein.
Figure A-3 illustrates the following sequence of events from the point of view of an observer O who is at rest relative to the ground. This observer has so placed two charges of dynamite $C_1$ and $C_2$ that the distances $OC_1$ and $OC_2$ are equal. He causes them to explode simultaneously in his frame of reference by simultaneously sending out light signals to $C_1$ and $C_2$ which actuate detonators. (He is invoking a reciprocal of the definition quoted earlier.) Assume that he does this so that, in his frame, the explosions occur when he is abreast of O', an observer stationed on a train moving by at a very high velocity v. The explosions leave marks $C'_1$ and $C'_2$ on the side of the train. After the experiment, O' can measure the distances $O'C'_1$ and $O'C'_2$. He must, and will, find them equal because otherwise space would not be homogeneous. The explosions also produce flashes of light. Observer O will receive the flashes simultaneously, confirming that in his frame the explosions occurred simultaneously. However O' will receive the flash which originated at $C'_2$ before he receives the flash from $C'_1$ simply because the train moved during the finite time required for the light to reach him. Since the explosions occurred at points equidistant from O', but the light signals were not received simultaneously, he must conclude that in his frame of reference the explosions were not simultaneous.

Figure A-3 Two successive views of a train moving with constant velocity v, from the viewpoint of a ground based observer O. The small arrows indicate flashes of light.

Such disagreements concerning simultaneity lead to interesting results. From the viewpoint of O, $C_1C_2 = C'_1C'_2$. But according to O', $C'_2$ passed $C_2$ before $C'_1$ passed $C_1$ since he received the signal from $C'_2$ first. Therefore O' must conclude that $C_1C_2 < C'_1C'_2$. If this is not apparent, it can be demonstrated by constructing diagrams showing the sequence of events from the viewpoint of O'. The simultaneity disagreement will also cause the two observers to disagree concerning the rates of clocks fixed in their respective frames of reference. As we shall see, the nature of their disagreements about the measurement of distance and time intervals is such as to allow both O and O' to find the same value c for the velocity of the light pulses which came from $C_1$ or $C_2$.

TIME DILATION AND LENGTH CONTRACTION

We consider here a second thought experiment designed to facilitate the quantitative evaluation of two relativistic effects that were noted qualitatively in the preceding thought experiment. An observer O', moving with velocity v relative to observer O, wishes to compare a time interval measured by his clock with a measurement of the same time interval made by clocks belonging to O. They have already established that, when at rest with respect to each other, all the clocks involved run at the same rate and are synchronized. Now it is apparent that, even when in relative motion, the reading of an O' clock can be compared with the reading of an O clock that happens to be momentarily coincident with the former without any complication. Thus measurements of a time interval made with clocks in the two frames can be compared by the procedure illustrated in Figure A-4. O' sends a light signal to a mirror, which reflects it back to him. Both O and O' record the emission of the signal with clocks $C_1$ and C', which are coincident at that instant. They use the clocks $C_2$ and C', which are coincident when the light signal is received back from the mirror, to record the time of its reception. The two events defining the beginning and end of the time interval to be compared are the emission and reception of the light signal. The elapsed time between these two events measured by O' is $T' = 2\Delta t'$, where $\Delta t' = l'/c$ with l' the distance to the mirror measured in his frame. The elapsed time measured by O is $T = 2\Delta t$. From the figure, and the Pythagorean theorem, it is apparent that

$$c^2\Delta t^2 = v^2\Delta t^2 + l^2$$

where l is the distance to the mirror as measured by O. Solving for $\Delta t$, we have

$$\Delta t^2 = \frac{l^2}{c^2 - v^2} = \frac{l^2}{c^2}\,\frac{1}{1 - v^2/c^2}$$

or

$$\Delta t = \frac{l}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}}$$

Figure A-4 The comparison of a time interval measured by two observers. Left: The figure shows the situation at the instant of emission of a light signal (the small arrow), from the point of view of O'. Right: The figure shows the situation at the instant of its reception, from the point of view of O.

Now it is easy to show that observers in relative motion cannot disagree about the measurement of distances perpendicular to the direction of motion because disagreements about simultaneity concern finite synchronization signal propagation times for propagation in the direction parallel to the direction of relative motion. Thus we have l = l', and so

$$\Delta t = \frac{l'}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}} = \frac{\Delta t'}{\sqrt{1 - v^2/c^2}}$$

Therefore we obtain

$$T = \frac{T'}{\sqrt{1 - v^2/c^2}} \tag{A-5}$$

We have found that a time interval between two events occurring at the same place in a certain frame is measured to be longer by a factor of $1/\sqrt{1 - v^2/c^2}$ in a frame moving relative to the first frame and, consequently, in which the two events occur at separated locations. The time interval measured in the frame in which the events occurred in the same place is called the proper time. The effect involved is called time dilation.

Next we consider the same thought experiment, but we imagine a measuring rod placed in the O frame with one end at clock $C_1$ and the other end at clock $C_2$. Designate by L the length of the rod measured in the O frame, with respect to which it is at rest. We want to evaluate L', the length of the rod measured from the O' frame. In this frame the rod is moving in a direction parallel to its own length.
Since the velocity of O' with respect to O is v, the velocity of O, and also of the rod, with respect to O' must be precisely -v. Otherwise there would be an inherent asymmetry between the two frames that is not allowed by Einstein's postulate. T' is the time interval between the instant when O' sees the front end of the rod pass his clock C' and the instant when he sees the rear end pass the clock. This time interval is related to the length L' of the rod as measured in the O' frame, and to the magnitude v of its velocity measured in that frame, by the equation

$$L' = vT'$$

We may also establish an equation connecting the corresponding quantities as measured in the O frame. In this frame C', which is moving with velocity of magnitude v, travels the distance L in time T. Thus

$$L = vT$$

From the last two equations we obtain

$$L' = L\,\frac{T'}{T}$$

But the time dilation argument shows that

$$\frac{T'}{T} = \sqrt{1 - v^2/c^2}$$

Therefore

$$L' = \sqrt{1 - v^2/c^2}\;L \tag{A-6}$$

We have found that a rod is measured to be shorter by a factor $\sqrt{1 - v^2/c^2}$ when the measurement is made in a frame in which it is moving parallel to its own length, compared to its length measured in a frame in which it is at rest. The length of the rod measured in the frame in which it is at rest is called its proper length. The effect is called the Lorentz contraction. Note that a comparison of (A-6) with the equation immediately above it shows the factor relating the primed to the unprimed time interval is the same as (and not the reciprocal of) the factor relating the primed to the unprimed distance interval.

It is not difficult to understand why the phenomenon of Lorentz contraction is unobservable in classical physics. Consider a railroad train which when stationary with respect to the ground has a measured length of 1 km. This is its proper length. If it is moving over the ground at velocity v = 100 km/hr = 27.8 m/sec and its length is measured from the ground, (A-6) predicts that the value obtained will be less than 1 km. But not by much.
In fact, since $v^2/c^2 = (27.8/3.00 \times 10^{8})^2 = 8.59 \times 10^{-15}$, the value of the Lorentz contraction factor is $\sqrt{1 - v^2/c^2} = \sqrt{1 - 8.59 \times 10^{-15}} \simeq 1 - (1/2) \times 8.59 \times 10^{-15} = 1 - 4.30 \times 10^{-15}$. Thus the length of the train is predicted to be contracted by about four parts in $10^{15}$. Such an effect would be completely unobservable because the lengths of objects dealt with in classical physics cannot be measured with the necessary accuracy.

However, time intervals occurring in classical physics can be measured with very great accuracy using atomic clocks. This makes it just possible to observe time dilation with classical objects. An experiment performed in 1971 did so by sending atomic clocks on a trip around the earth in commercial airliners, and comparing the readings of the traveling clocks with a reference atomic clock at the U.S. Naval Observatory. After various corrections were made to account for things having nothing to do with time dilation, the traveling clocks showed smaller readings, compared to the reference clock, which amounted to about $3 \times 10^{-7}$ sec for the entire round trip. This agreed, to the $0.2 \times 10^{-7}$ sec accuracy of the measurement, with the predictions of (A-5).

Both length contraction and time dilation are easy to observe for objects moving at velocities whose magnitudes are an appreciable fraction of that of light. A particularly convincing example is found in the behavior of particles called muons. These are known to be formed at an elevation of around 10,000 m, near the top of the atmosphere, as a byproduct of collisions of rapidly moving cosmic rays with the molecular constituents of the atmosphere. The muons are projected toward the surface of the earth at velocities of about 0.999c. They are unstable particles; on the average each lives for $2.2 \times 10^{-6}$ sec, as measured in a reference frame in which the muons are stationary, before decaying into other particles. Now a particle moving at essentially $3.0 \times 10^{8}$ m/sec for $2.2 \times 10^{-6}$ sec will travel only 660 m. Hence it might seem that all muons would have decayed long before they are able to reach the ground, since they must travel around 10,000 m to do so. But, in fact, observations show that nearly all the muons formed at the top of the atmosphere reach ground level.

Time dilation explains the observations. A prediction as to whether or not a muon can traverse the thickness of the atmosphere before it decays should not use $2.2 \times 10^{-6}$ sec for the time available. This value is the proper time the particles live, on the average, because it is measured in a reference frame in which they are at rest. Instead, the corresponding dilated time should be used since the observations are made in a reference frame in which the muons are moving at a very high velocity. For v/c = 0.999, the time dilation factor has the value $1/\sqrt{1 - v^2/c^2} = 1/\sqrt{1 - 0.998} = 1/0.045 = 22$. Hence the dilated lifetime has the value $22 \times 2.2 \times 10^{-6}$ sec $= 4.9 \times 10^{-5}$ sec. A particle moving at $3.0 \times 10^{8}$ m/sec for this time will travel a distance of 14,000 m, more than enough to reach ground level before decaying.

An alternative explanation of the observations concerning muons involves Lorentz contraction. It carries out the calculation in a reference frame in which the muons are stationary, instead of in one in which the atmosphere is stationary. The muons live their proper lifetime $2.2 \times 10^{-6}$ sec in this reference frame. But in it the proper thickness of the atmosphere is Lorentz-contracted by the factor $\sqrt{1 - v^2/c^2} = 0.045$, and is only 0.045 x 10,000 m = 450 m thick. The time required for the atmosphere to move past the muons, as observed in the reference frame in which they are stationary, is its contracted thickness divided by its velocity, or 450 m/($3.0 \times 10^{8}$ m/sec) $= 1.5 \times 10^{-6}$ sec. Since this is less than their proper lifetime, there is no difficulty in understanding how it happens.

THE LORENTZ TRANSFORMATION

Now we shall obtain the equations that are used in relativity theory to transform space and time variables from one frame to another moving with constant velocity relative to the first. Our argument will be guided by what we have already learned, but in the final analysis it is an independent derivation based on the experimental evidence that the velocity of light is independent of the motion of the observer and of the source.

We consider a third thought experiment involving two observers O' and O, with O' moving relative to O at velocity of magnitude v in the positive direction of the x' and x axes. Their x'y' and xy planes always coincide, as in Figure A-1, and the origins of their reference frames coincide at the instant t' = t = 0. At that instant O' ignites a flash bulb at his origin which produces a wavefront of light that expands away from the point of emission with velocity of magnitude c in all directions. Therefore, according to O' at time t', the wave front will be a sphere, centered on his origin, of radius r' = ct'. The coordinates of any point on the wave front at that time will thus satisfy the equation of a sphere

$$x'^2 + y'^2 + z'^2 = c^2t'^2 \tag{A-7}$$

But it will be equally true that according to O the light is expanding away from the point of emission, his origin, with velocity of magnitude c in all directions. Thus from the point of view of O the wave front at time t is also a sphere of radius r = ct centered on his own origin, and satisfying the equation

$$x^2 + y^2 + z^2 = c^2t^2 \tag{A-8}$$

We shall find relations between the two sets of variables (x',y',z',t') and (x,y,z,t) which allow both (A-7) and (A-8) to be valid, i.e., which transform one equation into the other.
We are guided by our earlier considerations to assume the following form for the transformation equations

$$x' = \gamma(x - vt) \qquad y' = y \qquad z' = z \qquad t' = \gamma(t + \delta) \qquad \text{(A-9)}$$

where $\gamma$ is a dimensionless quantity, presumably involving the relative velocity of the two frames, v, and the velocity of light, c, and where $\delta$ is a quantity, also presumably involving these velocities, which must have the dimensions of time. Expressions for $\gamma$ and $\delta$ will be determined soon, but we can say even now that we should have $\gamma \to 1$ and $\delta \to 0$ if $v/c \to 0$. The reason is that for $\gamma = 1$ and $\delta = 0$ (A-9) reduce to the Galilean transformation (A-1), which is as it should be since the Galilean transformation would be essentially correct if the relative velocity v of the frames is extremely small compared to the velocity c of the signals used to synchronize the clocks in the frames.

We inserted the additive term $\delta$ in the fourth equation when v/c is not small because according to O' the time of some event measured by O must be corrected for a synchronization error between the clock used by O at the event and the clock used by O at his origin, as discussed in our first thought experiment. Having accounted for synchronization, we put the multiplicative factor $\gamma$ in the fourth equation to account for the discrepancy in time intervals measured by O' and O, as discussed in our second thought experiment. As was also discussed there, the same factor $\gamma$ should appear in the first of (A-9) to account for the discrepancy in distance intervals measured by the two observers. Since y and z are distances measured perpendicular to the direction of relative motion, we assumed that their values will not be changed by the transformation.

Now let us see whether the forms assumed in (A-9) can actually transform (A-7) into (A-8) and, if so, what expressions for $\delta$ and $\gamma$ are required to accomplish this.
Using (A-9) to rewrite each variable in (A-7) in terms of the unprimed variables, we have

$$\gamma^2(x^2 - 2vxt + v^2t^2) + y^2 + z^2 = c^2\gamma^2(t^2 + 2\delta t + \delta^2)$$

As we must obtain from this (A-8), which does not contain a term with the combination of variables xt, the second term in the parentheses on the left side must be canceled by something on the right side. For the cancelation to be obtained for all values of the independent variable t, it must be due only to the second term in the parentheses on the right. Thus we must have

$$-\gamma^2\, 2vxt = c^2\gamma^2\, 2\delta t$$

or

$$\delta = -vx/c^2 \qquad \text{(A-10)}$$

Note that $\delta$ has the dimensions of time, and that $\delta \to 0$ if $v/c \to 0$, as predicted earlier. A reconsideration of our first thought experiment will make it apparent why the synchronization correction $\delta$ is linearly proportional to both v and x. Gathering the factors of $x^2$ and $t^2$ in the remaining terms of the equation after evaluating $\delta^2$, we obtain

$$x^2\gamma^2(1 - v^2/c^2) + y^2 + z^2 = c^2t^2\gamma^2(1 - v^2/c^2)$$

Comparing this with the required form, (A-8), we see that we shall obtain it if

$$\gamma^2(1 - v^2/c^2) = 1$$

or

$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}} \qquad \text{(A-11)}$$

Note that $\gamma$ is dimensionless, and that $\gamma \to 1$ if $v/c \to 0$, as also predicted earlier. Considering the results of our second thought experiment, it is not surprising that $\gamma$ involves the expression $\sqrt{1 - v^2/c^2}$. Finally, we use (A-10) and (A-11) to evaluate $\gamma$ and $\delta$ in (A-9), and successfully complete our derivation of the Lorentz transformation

$$x' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(x - vt) \qquad y' = y \qquad z' = z \qquad t' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(t - vx/c^2) \qquad \text{(A-12)}$$

The space-time variables transformation of relativity is called the Lorentz transformation for the historical reason that equations of the same mathematical form (but with a very different physical significance because v represented a velocity with respect to the ether frame instead of a velocity of any inertial frame with respect to any other inertial frame) had been proposed by Lorentz in connection with a classical theory of electrons some years before the work of Einstein.
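As a numerical check of (A-12), the following minimal sketch transforms an event lying on the light wave front and evaluates $x^2 + y^2 + z^2 - c^2t^2$ in both frames; the quantity should vanish in each, which is what carrying (A-8) into (A-7) requires. The function names `lorentz_transform` and `interval`, and the sample event, are illustrative assumptions and do not come from the text.

```python
import math

C = 2.998e8  # velocity of light in m/sec, as quoted in the text


def lorentz_transform(x, y, z, t, v, c=C):
    """Apply (A-12) for a frame O' moving at velocity v along +x relative to O."""
    gamma = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    x_prime = gamma * (x - v * t)
    t_prime = gamma * (t - v * x / c**2)
    return x_prime, y, z, t_prime


def interval(x, y, z, t, c=C):
    """Return x^2 + y^2 + z^2 - c^2 t^2, which is zero on the wave front."""
    return x**2 + y**2 + z**2 - (c * t) ** 2


# Hypothetical sample event: a point on the wave front 1 microsecond after the flash.
t = 1.0e-6
r = C * t
event = (0.6 * r, 0.8 * r, 0.0, t)

primed_event = lorentz_transform(*event, v=0.5 * C)
print("unprimed frame:", interval(*event))
print("primed frame:  ", interval(*primed_event))
```

Both printed values should be zero to within floating-point rounding. Calling `lorentz_transform` with a very small `v` gives values indistinguishable from the Galilean result $x' = x - vt$, $t' = t$.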
The Lorentz transformation reduces, as expected, to the Galilean transformation when the relative velocity of the two frames, v, is small compared to the velocity of light, c. But significant differences between the predictions of the Galilean transformation and those of the rigorously correct Lorentz transformation are found when v is comparable to c. These had not been observed in classical physics because the appropriate experiments had not been performed. Many experimental results of quantum physics, some of which are discussed in this book, show that the Lorentz transformation is, in fact, the one that accurately describes nature.

Note that for v larger than c the Lorentz transformation equations are meaningless, in that real coordinates and times are transformed into imaginary ones. Thus c appears to play the role of a limiting velocity for all physical phenomena. We shall obtain a better understanding of this as we go further into relativity theory.

THE RELATIVISTIC VELOCITY TRANSFORMATION

Consider the particle shown in Figure A-5, moving with velocity u as measured in a frame of reference O. We would like to evaluate the velocity u' of the particle as measured in the frame O', which is itself moving relative to O with velocity v. Measured in the O frame, the velocity vector of the particle has components

$$u_x = \frac{dx}{dt} \qquad u_y = \frac{dy}{dt} \qquad u_z = \frac{dz}{dt}$$

Figure A-5 A moving particle observed from two frames of reference O and O', with the latter moving relative to the former at velocity v.

The velocity vector, as measured in the O' frame, has components

$$u_x' = \frac{dx'}{dt'} \qquad u_y' = \frac{dy'}{dt'} \qquad u_z' = \frac{dz'}{dt'}$$

To establish the required relationships, we take the differentials of the Lorentz transformation, (A-12), remembering that v is a constant. This gives

$$dx' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(dx - v\,dt) \qquad dy' = dy \qquad dz' = dz \qquad dt' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(dt - v\,dx/c^2)$$

So we obtain

$$u_x' = \frac{dx'}{dt'} = \frac{dx - v\,dt}{dt - v\,dx/c^2} = \frac{u_x - v}{1 - vu_x/c^2}$$

$$u_y' = \frac{dy'}{dt'} = \frac{dy\,\sqrt{1 - v^2/c^2}}{dt - v\,dx/c^2} = \frac{u_y\sqrt{1 - v^2/c^2}}{1 - vu_x/c^2} \qquad \text{(A-13)}$$

$$u_z' = \frac{dz'}{dt'} = \frac{dz\,\sqrt{1 - v^2/c^2}}{dt - v\,dx/c^2} = \frac{u_z\sqrt{1 - v^2/c^2}}{1 - vu_x/c^2}$$

These equations constitute the relativistic velocity transformation. Note that as v/c approaches zero (A-13) approach those which would be derived from the Galilean transformation. Another interesting property is that it is impossible to choose u and v such that u', the magnitude of the velocity measured in the new frame, is greater than c. Consider the example illustrated in Figure A-6. As measured by O, particle 1 has velocity 0.8c in the positive x direction and particle 2 has velocity 0.9c in the negative x direction. We evaluate the velocity of particle 1 as measured in a frame O' moving with particle 2 using the first of (A-13), with $u_x = u_1 = 0.8c$ and $v = -0.9c$. We obtain

$$u_1' = \frac{0.8c - (-0.9c)}{1 - \dfrac{(-0.9c)(0.8c)}{c^2}} = \frac{1.70c}{1.72} = 0.99c$$

The velocity transformation equations demonstrate another aspect of the fact that c acts as a limiting velocity for all physical phenomena.

Figure A-6 Illustrating an example of the relativistic addition of velocities.

RELATIVISTIC MASS

It has been emphasized that Einstein's modification of the transformation equations would necessitate some compensating modification in the equations of mechanics, so that these equations continue to satisfy the requirement of not changing form in a transformation from one inertial frame to another moving relative to the first. Now we shall begin to develop the new mechanics, which is called relativistic mechanics. Clearly it is desirable to carry over into relativistic mechanics as much of classical mechanics as the circumstances allow.
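The relativistic velocity transformation (A-13) of the preceding section can be tried out numerically. The sketch below works in units where c = 1, reproduces the 0.8c and 0.9c example, and scans a few hypothetical velocity pairs to illustrate that the transformed speed stays below c. The function name `transform_velocity` and the scanned values are assumptions made for illustration.

```python
import math


def transform_velocity(u_x, u_y, v, c=1.0):
    """Apply the first two of (A-13); returns (u_x', u_y') as measured in O'."""
    denominator = 1.0 - v * u_x / c**2
    root = math.sqrt(1.0 - v**2 / c**2)
    return (u_x - v) / denominator, u_y * root / denominator


# Example of Figure A-6: particle 1 at +0.8c, frame O' riding with particle 2 at -0.9c.
u1_prime, _ = transform_velocity(u_x=0.8, u_y=0.0, v=-0.9)
print(f"u1' = {u1_prime:.3f} c")

# Hypothetical scan: the transformed speed never reaches c.
for u_x, u_y, v in [(0.99, 0.0, -0.99), (0.6, 0.79, -0.9), (-0.5, 0.8, 0.95)]:
    ux_p, uy_p = transform_velocity(u_x, u_y, v)
    print(u_x, u_y, v, "->", math.hypot(ux_p, uy_p))
```

The first result is 1.70/1.72 of c, which rounds to the 0.99c quoted in the example.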
We shall see that it is possible to preserve Newton's equation of motion, in a form equivalent to the one originally given by Newton

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad \text{(A-14)}$$

where p is the momentum of a particle acted on by force F. It is also possible to preserve the very closely related classical law of momentum conservation for the particles in an isolated system

$$\Big[\sum_{\text{all particles}} \mathbf{p}\Big]_{\text{initial}} = \Big[\sum_{\text{all particles}} \mathbf{p}\Big]_{\text{final}} \qquad \text{(A-15)}$$

It will even be possible to preserve the classical definition of the momentum of a particle

$$\mathbf{p} = m\mathbf{v} \qquad \text{(A-16)}$$

where m is its mass and v is its velocity. But to do all this it will be necessary to allow the mass of a particle to be a function of the magnitude of its velocity, i.e.

$$m = m(v) \qquad \text{(A-17)}$$

The form of this function is to be determined. However, we know a priori that we must have $m(v) = m_0$ if $v/c \ll 1$, where the constant $m_0$ is the classically measured mass of the particle. The reason is that when a characteristic velocity becomes very much smaller than the velocity of light the pertinent Lorentz transformation approaches a Galilean transformation and no modification of mechanics is necessary.

In order to evaluate the function m(v), we consider the following thought experiment. As measured in the x, y, z, t frame indicated in Figure A-7, observers $O_1$ and $O_2$ are moving in directions parallel to the x axis with equal magnitude but oppositely directed velocities. These observers have identical particles, say billiard balls $B_1$ and $B_2$, each of mass $m_0$ as measured when they are at rest. While passing, each throws his ball so as to hit the other's ball with a velocity which, from his own point of view, is directed perpendicular to the x axis and is of magnitude u. As observed in the x, y, z, t frame, $B_1$ and $B_2$ will approach along parallel paths making angles $\theta_{1i} = \theta_{2i}$ with the x axis, and rebound on paths at angles $\theta_{1f}$ and $\theta_{2f}$ to that axis.
Assuming conservation of momentum and that the collision is elastic, it is easy to show that $\theta_{1f} = \theta_{2f}$ and that the magnitude of the velocity of the balls is the same after the collision as before. The actual value of $\theta_{1f}$ and $\theta_{2f}$ depends on the impact parameter d, which we assume to be such that $\theta_{1f} = \theta_{1i}$, as shown in the figure.

Figure A-7 A symmetrical collision between two balls of identical rest mass.

Figure A-8 A symmetrical collision, as observed by $O_1$. Since u is supposed to be very much smaller than v, the angles made by the trajectories of $B_2$ and the x axis are actually very much smaller than shown.

Now consider the process from the point of view of $O_1$, as illustrated in Figure A-8. $O_1$ throws $B_1$ along a line parallel to his y axis with velocity of magnitude u, which we shall take to be very small compared to c. It returns along the same line with velocity of the same magnitude but opposite sign. He sees $B_2$ maintain a constant x component of velocity just equal to v, the velocity of $O_2$ relative to $O_1$, which we shall take to be comparable to c. The component of velocity of $B_2$ along his y axis is observed by $O_1$ to change sign during the collision but to maintain a constant magnitude. To evaluate this magnitude we realize that the y component of the velocity of $B_2$, as measured by $O_2$, is u. Then we transform this to the $O_1$ frame with the aid of the second of (A-13) and obtain $u\sqrt{1 - v^2/c^2}$ for the magnitude of the y component of velocity of $B_2$ as measured by $O_1$.

The y momenta of both $B_1$ and $B_2$, as measured in the $O_1$ frame, simply change sign during the collision. Consequently the total y momentum of the isolated system of two colliding balls changes sign. If the momentum conservation law (A-15) is to be valid, the total y momentum before the collision must equal the total y momentum after.
This can be true only if the total y component of momentum of the system measured by $O_1$ is zero before the collision because zero is the only quantity which can change sign without changing value. Evaluating y components of momentum as the masses times the y components of velocity from the definition of (A-16), and equating their sum to zero, we obtain an equation that is obviously self-contradictory if we insist that both masses have the value $m_0$ that they have when measured in frames in which they are at rest. The reason is that according to $O_1$ the magnitude of the y component of velocity of $B_1$ is u, while the magnitude of the y component of velocity of $B_2$ is $u\sqrt{1 - v^2/c^2}$. However, if we allow the mass of a particle to be a function of the magnitude of its total velocity vector we can satisfy the momentum conservation law. Since u is very small compared to v, the magnitude of the velocity vector of $B_2$ as measured by $O_1$ is essentially v, as can be seen in Figure A-8. The magnitude of the velocity vector of $B_1$ according to $O_1$ is just u. Thus $O_1$ would write the requirement imposed by the momentum conservation law for y components as

$$m(u)\,u - m(v)\,u\sqrt{1 - v^2/c^2} = 0$$

or

$$m(u) = m(v)\sqrt{1 - v^2/c^2}$$

Since u is very small compared to c, we may take $m(u) = m_0$ and obtain

$$m(v) = \frac{m_0}{\sqrt{1 - v^2/c^2}} \qquad \text{(A-18)}$$

A theory of relativistic mechanics consistent with momentum conservation demands that the mass m(v) of a particle measured when it is moving with velocity of magnitude v be larger than its mass $m_0$ measured when it is at rest by the factor $1/\sqrt{1 - v^2/c^2}$. The mass m(v) is called the relativistic mass of the particle and $m_0$ is called the rest mass.
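To get a feeling for the size of the factor in (A-18), the short sketch below tabulates $m(v)/m_0$ for a few values of v/c. The function name `mass_ratio` and the chosen velocities are illustrative assumptions.

```python
import math


def mass_ratio(beta):
    """Return m(v)/m0 from (A-18), where beta = v/c."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta**2)


# Hypothetical sample velocities, expressed as fractions of c.
for beta in (0.1, 0.5, 0.9, 0.999):
    print(f"v/c = {beta}: m/m0 = {mass_ratio(beta):.4f}")
```

The ratio stays within about half a percent of unity at v = 0.1c and grows without limit as v/c approaches 1, which is why the function rejects `beta >= 1`.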
A reconsideration of our arguments will show that the two observers in the thought experiment measure different values for the mass of the particle because of the difference in their measurements of its velocity component perpendicular to the direction of their relative motion, and that this arises because of the difference in their measurements of time intervals.

Figure A-9 An experimental verification of the dependence of mass on velocity. The plot shows $m/m_0$ against $v/c$, together with the curve $1/\sqrt{1 - v^2/c^2}$ evaluated with $c = 2.998 \times 10^8$ m/sec.

For the quite high velocity v = 0.1c the relativistic mass is only one-half of 1% greater than the rest mass. But with increasing v the relativistic mass rapidly increases since $m(v) \to \infty$ as $v \to c$ if $m_0$ has any finite value. It is apparent that the velocity of a particle cannot exceed c.

The first experimental confirmation of the predictions of relativity theory concerning the dependence of mass on velocity was provided by Bucherer in 1909. He applied to electrons of high velocity a variation of the technique used by Thomson to measure the charge to mass ratio of slowly moving electrons (described in most elementary physics texts). Bucherer's results are shown by the crosses in Figure A-9, some more extensive results obtained in recent years are shown by dots, and the predictions of (A-18) are shown by the solid curve. Note that these results prove not only that (A-18) has the correct functional form, but also that the velocity c, which essentially enters the theory of relativity as the limiting velocity for the transmission of information, actually is equal to the velocity of light, $2.998 \times 10^8$ m/sec.

RELATIVISTIC ENERGY

Consider a particle of rest mass $m_0$ initially stationary at $x = x_i$. A force of magnitude F is then applied in the positive x direction and the particle moves under the influence of the force.
It is interesting to calculate the total work done by the force when the particle moves to $x = x_f$. We shall label this work K. Taking the usual definition of work, we have

$$K = \int_{x_i}^{x_f} F\,dx$$

In order to evaluate the integral we must know the relativistic form of Newton's equation of motion. With a relativistically acceptable expression for momentum p = mv, where m is the relativistic mass, we can with confidence take over into relativity Newton's equation in the form of (A-14). For the one-dimensional situation of interest here, it reads

$$F = \frac{d(mv)}{dt} = m\frac{dv}{dt} + v\frac{dm}{dt}$$

Hence we have

$$K = \int_{x_i}^{x_f} F\,dx = \int_{x_i}^{x_f}\left(m\frac{dv}{dt} + v\frac{dm}{dt}\right)dx$$

To obtain an easy evaluation of this integral, we go through the following sequence of manipulations. First we write the relation (A-18) between m and v in the form

$$m^2(1 - v^2/c^2) = m_0^2$$

This immediately yields

$$m^2c^2 - m^2v^2 = m_0^2c^2$$

Next we differentiate each term with respect to time, to obtain

$$c^2\frac{d(m^2)}{dt} - \frac{d(m^2v^2)}{dt} = 0$$

or

$$2c^2m\frac{dm}{dt} - 2m^2v\frac{dv}{dt} - 2v^2m\frac{dm}{dt} = 0$$

or

$$m\frac{dv}{dt} + v\frac{dm}{dt} = \frac{c^2}{v}\frac{dm}{dt} = c^2\frac{dm}{dt}\frac{dt}{dx} = c^2\frac{dm}{dx}$$

We have used the fact that v = dx/dt so that 1/v = dt/dx. Now we can write

$$K = \int_{x_i}^{x_f} c^2\frac{dm}{dx}\,dx = c^2\int_{m_i}^{m_f} dm = c^2(m_f - m_i)$$

where $m_i$ and $m_f$ are the masses of the particle when it is at positions $x_i$ and $x_f$, respectively. But $m_i = m_0$ since the particle starts from rest at $x_i$ and, according to (A-18), the mass of the particle as it moves past $x_f$ with velocity v is $m_f = m_0/\sqrt{1 - v^2/c^2}$. So we have

$$K = \frac{m_0c^2}{\sqrt{1 - v^2/c^2}} - m_0c^2 \qquad \text{(A-19)}$$

Now the classical law of energy conservation implies that the total work done by the force acting on the particle should equal its kinetic energy. Thus we would like to call K the kinetic energy of the particle.
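The result (A-19) can also be checked by brute force. Since $F\,dx = (dp/dt)\,dx = v\,dp$, the work integral may be written as $\int v\,dp$ with $p = m_0v/\sqrt{1 - v^2/c^2}$, and summed numerically. The sketch below does this with the trapezoidal rule in units where $m_0 = c = 1$; the function names, the step count, and the final velocity 0.8c are assumptions chosen for illustration.

```python
import math


def kinetic_energy_from_work(m0, v, c=1.0, steps=20000):
    """Approximate K = integral of F dx = integral of u dp by the trapezoidal rule.

    Uses p = m0 u / sqrt(1 - u^2/c^2), i.e. (A-16) with the mass given by (A-18).
    """
    work = 0.0
    u_prev, p_prev = 0.0, 0.0
    for i in range(1, steps + 1):
        u = v * i / steps
        p = m0 * u / math.sqrt(1.0 - u**2 / c**2)
        work += 0.5 * (u + u_prev) * (p - p_prev)
        u_prev, p_prev = u, p
    return work


def kinetic_energy_closed_form(m0, v, c=1.0):
    """Evaluate (A-19): K = m0 c^2 / sqrt(1 - v^2/c^2) - m0 c^2."""
    return m0 * c**2 / math.sqrt(1.0 - v**2 / c**2) - m0 * c**2


numeric = kinetic_energy_from_work(1.0, 0.8)
exact = kinetic_energy_closed_form(1.0, 0.8)
print(numeric, exact)
```

For v = 0.8c the closed form gives two-thirds of $m_0c^2$, and the numerical sum should agree with it closely.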
To check in the classical limit take $v/c \ll 1$, and expand the reciprocal of the square root, to obtain

$$K = m_0c^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right] \simeq m_0c^2\left[1 + \frac{1}{2}\frac{v^2}{c^2} - 1\right]$$

or

$$K \simeq \frac{m_0c^2}{2}\frac{v^2}{c^2} = \frac{m_0v^2}{2}$$

This agrees with the classical expression for kinetic energy, and confirms our identification of K in (A-19) as the relativistic kinetic energy.

Continuing the interpretation of (A-19), we observe that K is a function of v which can be written as the difference between a term depending on v and a constant term, as follows

$$K(v) = E(v) - E(0)$$

where $E(v) = m_0c^2/\sqrt{1 - v^2/c^2} = mc^2$, with m the relativistic mass; and where E(0) is the value of E(v) for v = 0, i.e., $E(0) = m_0c^2$. Since K is an energy, E(v) and E(0) must also be energies, E(v) being some energy associated with the particle when its velocity is v, and E(0) some energy associated with the particle when its velocity is 0. To identify these energies, we rewrite the equation as

$$E(v) = K(v) + E(0)$$

The conclusion is inescapable. We must interpret E(v) as the total energy of the particle moving with velocity v, since it is the sum of the kinetic energy K(v) of the particle and an intrinsic energy E(0) associated with the particle when it is at rest. The energy E(v) is called the total relativistic energy, and E(0) is called the rest mass energy.

We have established Einstein's well known relations between mass and energy: The rest mass energy E(0) of a particle is $c^2$ times its rest mass $m_0$

$$E(0) = m_0c^2 \qquad \text{(A-20)}$$

and the total relativistic energy E of a particle is $c^2$ times its relativistic mass m

$$E = mc^2 \qquad \text{(A-21)}$$

Equation (A-19) tells us the relation between total relativistic energy E, relativistic kinetic energy K, and the rest mass energy $m_0c^2$

$$E = K + m_0c^2 \qquad \text{(A-22)}$$

It is often convenient to have an expression for the total relativistic energy that explicitly involves the momentum p.
Such can be obtained by evaluating the quantity

$$m^2c^4 - m_0^2c^4 = m_0^2c^4\left(\frac{1}{1 - v^2/c^2} - 1\right) = m_0^2c^4\,\frac{v^2/c^2}{1 - v^2/c^2} = c^2\,\frac{m_0^2v^2}{1 - v^2/c^2} = c^2m^2v^2 = c^2p^2$$

Thus

$$m^2c^4 = c^2p^2 + m_0^2c^4$$

or

$$E^2 = c^2p^2 + m_0^2c^4 \qquad \text{(A-23)}$$

As an example of the relativistic theory of energy, we will calculate the relativistic kinetic energy, total relativistic energy, rest mass energy, and relativistic momentum of a muon moving at velocity 0.999c, in terms of its known rest mass $1.9 \times 10^{-28}$ kg. The first thing to do is to calculate the rest mass energy. According to (A-20), it is $m_0c^2 = 1.9 \times 10^{-28}$ kg $\times\ (3.0 \times 10^8 \text{ m/sec})^2 = 1.7 \times 10^{-11}$ joule. Now we can employ a result obtained in discussing time dilation for muons moving at 0.999c, namely $1/\sqrt{1 - v^2/c^2} = 22$. Using (A-18) in (A-21), we find that the total relativistic energy is $mc^2 = 22\,m_0c^2 = 22 \times 1.7 \times 10^{-11}$ joule $= 3.8 \times 10^{-10}$ joule. The relativistic kinetic energy is then obtained from (A-19) to be $K = mc^2 - m_0c^2 = 3.8 \times 10^{-10}$ joule $- 0.17 \times 10^{-10}$ joule $= 3.6 \times 10^{-10}$ joule. Finally, we use (A-16) to write the relativistic momentum as $p = mv = mc^2(v/c)/c = 22\,m_0c^2(v/c)/c = 3.8 \times 10^{-10}$ joule $\times\ 0.999/(3.0 \times 10^8 \text{ m/sec}) = 1.3 \times 10^{-18}$ kg-m/sec. Another way would be to solve (A-23) for p in terms of $mc^2$, $m_0c^2$, and c. But the procedure we followed is easier in this case. A case in which (A-23) is truly useful is found in Section 2-4.

Although the choices made in the theory of relativistic mechanics seem reasonable, their ultimate justification is found in comparing the predictions of the theory with appropriate experiments. Several very successful comparisons are given in the text, but it is worthwhile here to point out that the existence of a rest mass energy $m_0c^2$ is not in conflict with classical physics. Since the experiments in that field all involve systems in which the total rest mass is essentially constant, the appropriate rest mass energies can be added to both sides of all classical energy balance equations without destroying their validity.
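The worked muon example above can be reproduced in a few lines, and the same numbers give a check of the energy-momentum relation $E^2 = c^2p^2 + m_0^2c^4$. This is only a sketch: the function name `energy_summary` is an assumption, and because it evaluates the square-root factor exactly for 0.999c rather than rounding it to 22, the results can differ from the text in the last quoted digit.

```python
import math

C = 3.0e8          # velocity of light in m/sec, rounded as in the worked example
M0_MUON = 1.9e-28  # muon rest mass in kg, as quoted in the text


def energy_summary(m0, beta, c=C):
    """Return (rest energy, total energy, kinetic energy, momentum) in SI units."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    rest_energy = m0 * c**2                # (A-20)
    total_energy = gamma * rest_energy     # (A-21) with the mass from (A-18)
    kinetic_energy = total_energy - rest_energy
    momentum = gamma * m0 * beta * c       # (A-16) with the relativistic mass
    return rest_energy, total_energy, kinetic_energy, momentum


rest, total, kinetic, p = energy_summary(M0_MUON, 0.999)
print(f"rest mass energy   = {rest:.2e} joule")
print(f"total energy       = {total:.2e} joule")
print(f"kinetic energy     = {kinetic:.2e} joule")
print(f"momentum           = {p:.2e} kg-m/sec")

# Consistency check of E^2 = c^2 p^2 + m0^2 c^4.
print(math.isclose(total**2, (C * p) ** 2 + rest**2, rel_tol=1e-9))
```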
The theory is, however, of more than academic interest because there are important processes in nature in which the total rest mass of an isolated system changes significantly. For such processes the experiments of quantum physics show that the change in rest mass energy is exactly compensated for by a change in kinetic energy in such a way as to conserve the total relativistic energy of the system. This is, of course, what happens in a nuclear reactor. Consequently, in relativity we must replace the separate classical laws of conservation of mass and conservation of energy by a single comprehensive law of conservation of total relativistic energy: As measured in a given inertial frame of reference, the total relativistic energy of an isolated system remains constant.

We close our concise development of relativity by stating that explicit calculations demonstrate that neither Newton's equation as expressed in (A-14), nor Maxwell's equations, change form under a Lorentz transformation from one frame of reference to another moving relative to the first. However, these calculations show that the force in the case of the mechanical equation, and the electric and magnetic fields in the case of the electromagnetic equations, change when Lorentz transformed from one frame to the other. Although we cannot go into these matters here, their study elsewhere is recommended to the student as adding very worthwhile physical insight, particularly into the relationship between electric and magnetic fields.

PROBLEMS

1. At what speed will the Galilean and Lorentz expressions for x' (see (A-1) and (A-12)) differ by (a) 0.10%; (b) 1%; (c) 10%?
2. (a) Construct diagrams, similar to those in Figure A-3, showing the sequence of events from the point of view of the observer O' stationed at the center of the train. Use them to prove that $C_1C_2 < C_1'C_2'$.
(b) Repeat the argument associated with Figure A-3, but letting O' be the one who sends the light signals to detonate the two charges of dynamite, so that they explode simultaneously from his point of view. Present diagrams of the situation from his point of view and also from the point of view of O. Explain both the similarities and the differences for this case and for the case treated in Appendix A.
3. The distance to the farthest star in our galaxy is of the order of $10^5$ light years. Explain why it is possible, in principle, for a human being to travel to this star within his lifetime, and estimate the required velocity.
4. The length of a spaceship is measured to be exactly half its proper length. (a) What is the speed of the spaceship relative to the observer's frame? (b) What is the dilation of the spaceship's unit time?
5. Two spaceships, each of proper length 100 m, pass near each other heading in opposite directions. If an astronaut at the front of one ship measures a time interval of $2.50 \times 10^{-6}$ sec for the second ship to pass him, then (a) what is the relative velocity of the spaceships? (b) What time interval is measured on the first ship for the front of the second ship to pass from the front to the back of the first ship?
6. A passenger walks forward along the aisle of a train at a speed of 1.3 m/sec as the train moves along a straight track at a constant speed of 30.2 m/sec with respect to the ground. What is the passenger's speed relative to the ground? To the accuracy cited, do classical and relativistic predictions differ?
7. One cosmic-ray particle approaches the earth along its axis with a velocity 0.80c toward the North Pole, and another with a velocity 0.60c toward the South Pole. What is the relative speed of approach of one particle with respect to the other? (Hint: It is useful to consider the earth and one of the particles as the two inertial systems.)
8. In frame O, particle 1 is at rest and particle 2 is moving to the right with velocity u.
Now consider a frame O' which, relative to O, is moving to the right with velocity v. Find the value of v such that the two particles appear in O' to be approaching each other with equal but opposite velocities.
9. What is the speed of an electron whose kinetic energy equals its rest energy? Does the result depend on the rest mass of the electron?
10. Compute the speed of (a) electrons and (b) protons that fall through an electrostatic potential difference of 10 million volts. (c) What is the ratio of relativistic mass to rest mass in each case?
11. (a) What potential difference will accelerate an electron to the speed of light according to classical physics? (b) With this potential difference, what speed will an electron acquire relativistically? (c) What would its relativistic mass be at this speed? (d) Its relativistic kinetic energy?
12. If $m/m_0 = 40{,}000$ for electrons emerging from the Stanford linear accelerator, what is their laboratory speed, in m/sec and in terms of c?
13. (a) Show that when $v/c < 1/10$, then $K/m_0c^2 < 1/200$, and the classical expressions for kinetic energy and momentum may be used with an error of less than 1%. (b) Show that when $v/c > 99/100$, then $K/m_0c^2 > 6$, and the relativistic relation p = E/c for a zero rest-mass particle may be used for a particle of rest mass $m_0$ with an error of less than 1%.
14. (a) Show that a particle that travels at the speed of light must have a rest mass of zero. (b) Show that for a particle of zero rest mass v = c, K = E, and p = E/c.
15. The "effective mass" of a photon (bundle of electromagnetic radiation of zero rest mass and energy $h\nu$) can be determined from the relation $m = E/c^2$. Compute the "effective mass" for photons of wavelengths (a) 5000 Å (visible region), and (b) 1.0 Å (x-ray region).
16. (a) How much energy is released in the explosion of a fission bomb containing 3.0 kg of fissionable material? Assume that 0.10% of the rest mass is converted to released energy.
(b) What mass of TNT would have to explode to provide the same energy release? Assume that each mole of TNT liberates 820,000 calories upon exploding. The molecular mass of TNT is 0.227 kg/mole. (c) For the same mass of explosive, how much more effective are fission explosions than TNT explosions? That is, find the ratio, fission/TNT, of the fraction of rest mass converted to released energy.
17. The nucleus $_6\mathrm{C}^{12}$ consists of six protons and six neutrons held in close association by strong nuclear forces. The atomic rest masses are
$_6\mathrm{C}^{12}$: 12.000000u
$_1\mathrm{H}^{1}$: 1.007825u
n: 1.008665u
in terms of the atomic mass unit u = $1.66 \times 10^{-27}$ kg. How much energy would be required to separate a $_6\mathrm{C}^{12}$ nucleus into its constituent protons and neutrons? This energy is called the binding energy of the $_6\mathrm{C}^{12}$ nucleus. (The masses, except for the neutron, are really those of neutral atoms, but the extranuclear electrons have relatively negligible binding energy and are of equal number before and after the breakup of the nucleus.)
18. As observed in an inertial reference frame O, a particle of rest mass $m_0$ moves at velocity u in the positive x direction. The components of its total relativistic momentum in that frame are $p_x = m_0u/\sqrt{1 - u^2/c^2}$, $p_y = 0$, $p_z = 0$, and its total relativistic energy is $E = m_0c^2/\sqrt{1 - u^2/c^2}$. The inertial reference frame O' is moving relative to O in the positive x direction at velocity v, where v < u. In that frame the particle's components of relativistic momentum, and its total relativistic energy, are $p_x' = m_0u'/\sqrt{1 - u'^2/c^2}$, $p_y' = 0$, $p_z' = 0$, and $E' = m_0c^2/\sqrt{1 - u'^2/c^2}$, where u' is the velocity of the particle relative to O'. Evaluate u' from the relativistic velocity transformation. Then use it in the expressions for $p_x'$ and E' to derive the following:

$$p_x' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(p_x - vE/c^2) \qquad p_y' = p_y \qquad p_z' = p_z \qquad E' = \frac{1}{\sqrt{1 - v^2/c^2}}\,(E - vp_x)$$

These equations are called the Lorentz transformation for momentum and energy. Compare them with the Lorentz transformation for space and time, (A-12), and show that the quantities $p_x$, $p_y$, $p_z$, $E/c^2$ transform in ways that are identical to the ways the quantities x, y, z, t, respectively, transform. This fact forms the basis of a more advanced treatment of special relativity employing the "four-vectors" with components $(x, y, z, t)$ and $(p_x, p_y, p_z, E/c^2)$.

Appendix B
THE RADIATION FROM AN ACCELERATED CHARGE

Here we give a largely qualitative view of the classical theory of emission of electromagnetic radiation from an accelerated charge, restricting ourselves to the case of a stationary charge in vacuum that is suddenly accelerated to a non-relativistic velocity $v \ll c$. We know that a stationary charge has an associated static electric field E whose energy per unit volume is given by

$$\rho = \frac{1}{2}\epsilon_0E^2 \qquad \text{(B-1)}$$

This energy is stored in the field and is not radiated away. If the charge moves with a uniform velocity, there is a magnetic field B associated with it as well as an electric field. The total energy stored in the nonstatic field of a uniformly moving charge is larger than for the static field of a stationary charge, the additional energy being supplied from the work done by the forces that initially produced the motion of the charge. The energy density in this case is given by

$$\rho = \frac{1}{2}\epsilon_0E^2 + \frac{1}{2\mu_0}B^2 \qquad \text{(B-2)}$$

and the energy stored in the field moves along with the charge. That the energy is not radiated away, even in this case, follows from transforming to a reference frame in which the charge is stationary and applying the relativistic requirement that the behavior of the charge, including whether or not it radiates, cannot depend on the frame of reference from which it is viewed. Hence for a charge having constant velocity, the electric and magnetic fields are able to adjust themselves in such a way that no energy is radiated, even though these fields are not static.
For an accelerated charge, however, the nonstatic electric and magnetic fields cannot adjust themselves in such a way that none of the stored energy is radiated. We can understand this qualitatively by considering the behavior of the electric field. In Figure B-1 we describe this field by drawing some of the lines of force surrounding a charge which was at rest at the initial instant $t$, suffered a constant acceleration $a$ to the right during the interval $t$ to $t'$, and then continued moving with a constant final velocity $v$. The figure shows the lines of force at some later instant $t''$, as viewed from the frame of reference moving at that velocity $v$. At small distances the lines of force are directed radially outward from the present position of the charge. At large distances they emanate from where the field would anticipate it to be if unaccelerated. The reason is that information concerning the position of the charge cannot be transmitted to distant locations with infinite velocity, but only with the velocity $c$. As a result, there are kinks in the lines of force found between a sphere centered on the anticipated position and of radius $c(t'' - t)$, which is the minimum distance at which the field can "know" the acceleration started, and a sphere centered on the actual position and of radius $c(t'' - t')$, which is the minimum distance at which the field can know that the acceleration stopped. As $t''$ increases, the region containing the kinks expands outward with velocity $c$. That is, each kink of adjustment propagates along its line of force in much the same way as a kink set up at one end of a long stretched rope propagates along the rope. The electric field in the region containing kinks has components which are both longitudinal and transverse to the direction of expansion.
But, by constructing diagrams for several values of $t''$, it is easy to see that the longitudinal component dies out very rapidly and can soon be ignored, whereas the transverse component dies out slowly. In fact, electromagnetic theory shows, by calculations based upon the same idea as in our qualitative discussion, that at large distances from the region of the acceleration (large $t''$) the transverse electric field obeys the equation

$$E_\perp = \frac{qa}{4\pi\epsilon_0 c^2}\,\frac{\sin\theta}{r} \tag{B-3}$$

In this equation, which is valid only if $v/c \ll 1$, $r = c(t'' - t)$ is the magnitude of the vector $\mathbf{r}$ from the region at which the acceleration $\mathbf{a}$ took place to the point at which the transverse field is evaluated, and $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{a}$. The dependence of $E_\perp$ on $\theta$ and $r$ can be seen from Figure B-1 and comparable diagrams for larger values of $t''$, and it should be clear from our discussion that $E_\perp$ must be proportional to $q$ and $a$. Similarly, there is a transverse magnetic field moving along with $E_\perp$, and at large distances from the region of acceleration its strength, if $v/c \ll 1$, is given by

$$B_\perp = \frac{\mu_0 qa}{4\pi c}\,\frac{\sin\theta}{r} \tag{B-4}$$

These two transverse fields propagating outward with velocity $c$ form the electromagnetic radiation emitted by the accelerated charge. The radiated field is polarized with $\mathbf{E}$ in the plane of $\mathbf{a}$ and $\mathbf{r}$ and with $\mathbf{B}$ at right angles to this plane.

Figure B-1. The lines of force surrounding an accelerated charge. Only some of the lines are shown.
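As a rough numerical illustration of (B-3) and (B-4), the sketch below evaluates both transverse fields and shows that they vanish along the direction of acceleration and peak at right angles to it. The function name `transverse_fields`, the rounded SI constants, and the sample acceleration and distance are assumptions chosen for illustration; they do not come from the text.

```python
import math

# Rounded SI constants (assumed values, not taken from this appendix).
EPS0 = 8.854e-12           # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi     # permeability of free space, H/m
C = 2.998e8                # speed of light, m/s


def transverse_fields(q, a, r, theta):
    """Return (E_perp, B_perp) from (B-3) and (B-4); valid only for v/c << 1."""
    e_perp = q * a * math.sin(theta) / (4.0 * math.pi * EPS0 * C**2 * r)
    b_perp = MU0 * q * a * math.sin(theta) / (4.0 * math.pi * C * r)
    return e_perp, b_perp


if __name__ == "__main__":
    q = 1.602e-19   # electron charge magnitude, coul
    a = 1.0e18      # hypothetical acceleration, m/s^2
    r = 1.0         # hypothetical distance from the region of acceleration, m
    for degrees in (0, 30, 60, 90):
        e_perp, b_perp = transverse_fields(q, a, r, math.radians(degrees))
        # The ratio E_perp / B_perp should be close to c at every nonzero angle.
        print(degrees, e_perp, b_perp)
```

With these rounded constants the ratio of the two fields agrees with $c$ only to about the precision of the constants themselves, which is all the sketch is meant to show.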
The energy density of the radiation is

$$\rho = \tfrac{1}{2}\epsilon_0 E_\perp^2 + \frac{1}{2\mu_0} B_\perp^2$$

or, with $c = 1/\sqrt{\mu_0\epsilon_0}$ and $B_\perp = E_\perp/c$,

$$\rho = \tfrac{1}{2}\epsilon_0 E_\perp^2 + \tfrac{1}{2}\epsilon_0 E_\perp^2 = \epsilon_0 E_\perp^2 \tag{B-5}$$

The "Poynting vector," which gives the energy flow per unit area (i.e., the intensity of radiation), is directed along $\mathbf{r}$ and has a magnitude

$$S = \rho c = \epsilon_0 c E_\perp^2$$

Hence, from (B-3),

$$S = \frac{q^2 a^2 \sin^2\theta}{16\pi^2 \epsilon_0 c^3 r^2} \tag{B-6}$$

which can also be obtained from the relation defining the Poynting vector

$$\mathbf{S} = \frac{1}{\mu_0}\,\mathbf{E} \times \mathbf{B}$$

Notice that no energy is emitted forward or backward along the direction of acceleration ($\theta = 0°$ or $180°$) and that the energy emitted is a maximum at right angles to this direction ($\theta = 90°$ or $270°$). The radiated energy is distributed symmetrically about the line of accelerated motion and with respect to the forward and backward directions. We see also from (B-6) that the radiated intensity obeys the familiar inverse square law, $S \propto 1/r^2$.

To get the rate $R$ at which total energy is radiated in all directions per unit time, i.e., the power, we integrate $S$ over the area of a sphere of arbitrary radius $r$. That is

$$R = \int S(\theta)\,dA = \int_0^\pi S(\theta)\,2\pi r^2 \sin\theta\,d\theta$$

in which $dA = 2\pi r^2 \sin\theta\,d\theta$ is the differential ring-shaped element of area on the sphere in a range between $\theta$ and $\theta + d\theta$. Carrying out the integration yields

$$R = \frac{1}{4\pi\epsilon_0}\,\frac{2}{3}\,\frac{q^2 a^2}{c^3} \tag{B-7}$$

which is the rate of radiation of energy from the accelerated charge. The rate of radiation is seen to be proportional to the square of the acceleration.

It should be pointed out that energy must be supplied to maintain a constant linear acceleration of the charge, some of it simply to compensate for the energy radiated away. However, the radiation loss is usually negligible at nonrelativistic speeds. In the case of deceleration the radiated energy is supplied by the energy stored in the electromagnetic field of the charge whose velocity is decreasing. This is the bremsstrahlung radiation discussed in Chapter 2.

A frequent application of (B-7) is to a vibrating electric dipole.
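Before that application is worked out, the angular integration behind (B-7) can be checked numerically. The sketch below integrates the intensity (B-6) over a sphere with a simple midpoint rule and compares the result with the closed form (B-7). The function names, the rounded constants, and the sample charge and acceleration are illustrative assumptions, not part of the text.

```python
import math

# Rounded SI constants (assumed values).
EPS0 = 8.854e-12   # permittivity of free space, F/m
C = 2.998e8        # speed of light, m/s


def intensity(q, a, r, theta):
    """Radiated intensity S(theta) at distance r, as in (B-6)."""
    return (q**2 * a**2 * math.sin(theta)**2
            / (16.0 * math.pi**2 * EPS0 * C**3 * r**2))


def radiated_power_numeric(q, a, r=1.0, n=2000):
    """Midpoint-rule estimate of the integral of S over ring elements
    dA = 2 pi r^2 sin(theta) dtheta, with theta running from 0 to pi."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        ring_area = 2.0 * math.pi * r**2 * math.sin(theta) * d_theta
        total += intensity(q, a, r, theta) * ring_area
    return total


def larmor_power(q, a):
    """Closed-form rate of radiation R from (B-7)."""
    return 2.0 * q**2 * a**2 / (4.0 * math.pi * EPS0 * 3.0 * C**3)


if __name__ == "__main__":
    q, a = 1.602e-19, 1.0e18   # electron charge; hypothetical acceleration
    print(radiated_power_numeric(q, a, r=1.0), larmor_power(q, a))
    # Changing r should not change the total power (inverse square law).
    print(radiated_power_numeric(q, a, r=50.0))
```

The agreement at two different radii reflects the inverse square law: the sphere's area grows as $r^2$ while $S$ falls as $1/r^2$.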
Let a charge $q$ be vibrating about the origin of the $x$ axis with simple harmonic motion. Then the displacement of the charge as a function of time is

$$x = A \sin\omega t$$

where $A$ is the amplitude of the vibration and $\omega = 2\pi\nu$ its angular frequency. The acceleration of the charge is given by $a = d^2x/dt^2 = -\omega^2 A \sin\omega t = -\omega^2 x$. If we substitute this for $a$ in (B-7) we obtain

$$R = \frac{2 q^2 \omega^4 x^2}{4\pi\epsilon_0\,3c^3} \tag{B-8}$$

Because $x$ varies with time, the power radiated also varies with time at the same frequency as the vibration of the dipole. The average value of $x^2 = A^2 \sin^2\omega t$ over one period of vibration, however, is simply $A^2/2$, so that the average rate of radiation is given by

$$\bar{R} = \frac{q^2 \omega^4 A^2}{4\pi\epsilon_0\,3c^3}$$

or, with $\omega = 2\pi\nu$,

$$\bar{R} = \frac{16\pi^4 \nu^4 q^2 A^2}{4\pi\epsilon_0\,3c^3} \tag{B-9}$$

Now $qx$ is the electric dipole moment of the vibrating dipole when the charge is at $x$. So $qA$ is the amplitude of the electric dipole moment. Writing $qA = p$, we have the useful expression

$$\bar{R} = \frac{4\pi^3 \nu^4 p^2}{3\epsilon_0 c^3} \tag{B-10}$$

PROBLEM

1. According to the classical electromagnetic theory of Appendix B, what power is radiated by a single electron in a gold atom during the roughly $10^{-12}$ sec that it takes to collapse from an orbit of radius $1.0 \times 10^{-10}$ m to the surface of the nucleus, the nuclear radius being about $6.9 \times 10^{-15}$ m? Assume that all the lost electrostatic energy is radiated, the electron's kinetic energy remaining unchanged during the motion.

Appendix C
THE BOLTZMANN DISTRIBUTION

We present here a simple numerical argument that leads to an approximation of the Boltzmann distribution, and then an even simpler general argument that verifies the exact form of the distribution.

Consider a system containing a large number of physical entities of the same kind that are in thermal equilibrium at temperature $T$. To be in equilibrium they must be able to exchange energy with each other. In the exchanges the energies of the entities will fluctuate, and at any time some will have more than the average energy and some will have less.
However, the classical theory of statistical mechanics demands that these energies $\mathcal{E}$ be distributed according to a definite probability distribution, whose form is specified by $T$. One reason is that the average value $\bar{\mathcal{E}}$ of the energy of each entity is determined by the probability distribution, and $\bar{\mathcal{E}}$ should have a definite value for a particular $T$.

To illustrate these ideas, consider a system consisting of entities, of the same kind, which can contain energy. An example would be a set of identical coil springs, each of which contains energy if its length is vibrating. Assume the system is isolated from the surrounding environment so that the total energy content is constant, and assume also that the entities can exchange energy with each other through some mechanism so that the constituents of the system can come into thermal equilibrium with each other. Purely for the purpose of simplifying the subsequent calculations, we shall, for the moment, also assume that the energy of any entity is restricted to one of the values $\mathcal{E} = 0, \Delta\mathcal{E}, 2\Delta\mathcal{E}, 3\Delta\mathcal{E}, 4\Delta\mathcal{E}, \ldots$. Later we shall let the interval $\Delta\mathcal{E}$ go to zero so that all the values of energy are permitted. For additional simplicity, we shall at first also consider that there are only four (an arbitrarily chosen small number) entities in the system and that the total energy of the system has the value $3\Delta\mathcal{E}$ (which is also chosen arbitrarily to be a small one of the integral multiples of $\Delta\mathcal{E}$ that the total energy must, by the above assumption, necessarily be). Later we shall generalize to systems having a large number of entities and any total energy.

Because the four entities can exchange energy with one another, all possible divisions of the total energy $3\Delta\mathcal{E}$ between the four entities can occur. In Figure C-1 we show all the possible divisions, the divisions being labelled by the letter $i$. For $i = 1$, three entities have $\mathcal{E} = 0$ and the fourth entity has $\mathcal{E} = 3\Delta\mathcal{E}$, giving us the required total energy of $3\Delta\mathcal{E}$.
Actually we can distinguish among four different ways of getting such a division, because any one of the four entities can be the one in the energy state $\mathcal{E} = 3\Delta\mathcal{E}$. We indicate this in the figure in the column marked "number of distinguishable duplicate divisions." A second possible type of division, labelled $i = 2$, is one in which two entities have $\mathcal{E} = 0$, the third entity has $\mathcal{E} = \Delta\mathcal{E}$, and the fourth has $\mathcal{E} = 2\Delta\mathcal{E}$. There are twelve distinguishable duplicate divisions in this case, as we verify in the next paragraph. The third possible division, labelled $i = 3$, also has four distinguishable duplicate ways of letting one entity have $\mathcal{E} = 0$ and the other three have $\mathcal{E} = \Delta\mathcal{E}$, giving the required total energy $3\Delta\mathcal{E}$.

Figure C-1. Illustrating a simple calculation leading to an approximation to the Boltzmann distribution.

  Division   Entities at E = 0, ΔE, 2ΔE, 3ΔE, 4ΔE   Number of distinguishable duplicate divisions   P_i
  i = 1      3, 0, 0, 1, 0                           4                                               4/20
  i = 2      2, 1, 1, 0, 0                           12                                              12/20
  i = 3      1, 3, 0, 0, 0                           4                                               4/20
  n'(E)      40/20, 24/20, 12/20, 4/20, 0/20

In evaluating the number of duplicate divisions we count as distinguishable duplicates any rearrangement of entities between different energy states. However, any rearrangement of entities in the same energy state is not counted as a duplicate, because entities of the same kind having the same energy cannot be distinguished experimentally from one another. That is, the identical entities are treated as if they are distinguishable, except for rearrangements within the same energy state. The total number of rearrangements (permutations) of the four entities is $4! = 4 \times 3 \times 2 \times 1$. (The number of different ways of ordering four objects is $4!$ since there are four choices of which object is taken first, three choices of which of the remaining objects is taken next, two choices of which is taken next, and one choice only for the last object. The total number of choices is $4 \times 3 \times 2 \times 1 = 4!$. For $n$ objects the number of different orderings is $n! = n(n - 1)(n - 2) \cdots 1$.) But rearrangements within the same energy state do not count.
Hence, for example, in the case $i = 2$, the number of distinguishable duplicate divisions is reduced from $4!$ to $4!/2! = 12$ because there are $2!$ rearrangements within the state $\mathcal{E} = 0$ that do not count as distinguishable. In cases $i = 1$, or $i = 3$, the number of such divisions is reduced from $4!$ to $4!/3! = 4$ since there are $3!$ rearrangements within the state $\mathcal{E} = 0$, or the state $\mathcal{E} = \Delta\mathcal{E}$, that do not count as distinguishable.

We now make the final assumption: all possible divisions of the energy of the system occur with the same probability. Then the probability that the divisions of a given type (or label) will occur is proportional to the number of distinguishable duplicate divisions of that type. The relative probability, $P_i$, is just equal to that number divided by the total number of such divisions. The relative probabilities are listed in the column marked $P_i$ in Figure C-1.

Next let us calculate $n'(\mathcal{E})$, the probable number of entities in the energy state $\mathcal{E}$. Consider the energy state $\mathcal{E} = 0$. For divisions of the type $i = 1$, there are three entities in this state, and the relative probability $P_i$ that these divisions occur is 4/20; for $i = 2$ there are two entities in this state, and $P_i$ is 12/20; for $i = 3$ there is one entity, and $P_i$ is 4/20. Thus $n'(0)$, the probable number of entities in the state $\mathcal{E} = 0$, is $3 \times (4/20) + 2 \times (12/20) + 1 \times (4/20) = 40/20$. The values of $n'(\mathcal{E})$ calculated in the same way for the other values of $\mathcal{E}$ are listed on the bottom of Figure C-1, marked $n'(\mathcal{E})$. (Note that the sum of these numbers is four, so that we find a correct total of four entities in all the states.) The values of $n'(\mathcal{E})$ are also plotted as points in Figure C-2. The solid curve in Figure C-2 is the decreasing exponential function

$$n(\mathcal{E}) = A e^{-\mathcal{E}/\mathcal{E}_0} \tag{C-1}$$

where $A$ and $\mathcal{E}_0$ are constants which have been adjusted to give the best fit of the curve to the points representing the results of our calculation.
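The counting above can be reproduced by brute force. The sketch below lists every way of assigning whole numbers of quanta $\Delta\mathcal{E}$ to four labelled entities so that the total is $3\Delta\mathcal{E}$, treats each such assignment as equally probable, and tallies the probable number of entities in each energy state. The function name `probable_occupations` and the use of labelled tuples are illustrative choices, not something taken from the text; labelling the entities reproduces the text's count of distinguishable duplicate divisions.

```python
from fractions import Fraction
from itertools import product


def probable_occupations(n_entities=4, total_quanta=3):
    """Return (number of distinguishable divisions, [n'(0), n'(dE), n'(2dE), ...]).

    Each division is a tuple giving the number of quanta held by each labelled
    entity. Rearrangements within one energy state give the same tuple, so they
    are automatically not double counted.
    """
    divisions = [d for d in product(range(total_quanta + 1), repeat=n_entities)
                 if sum(d) == total_quanta]
    counts = [0] * (total_quanta + 1)
    for division in divisions:
        for quanta in division:
            counts[quanta] += 1
    n_divisions = len(divisions)
    return n_divisions, [Fraction(c, n_divisions) for c in counts]


if __name__ == "__main__":
    n_divisions, occupations = probable_occupations()
    print("distinguishable divisions:", n_divisions)
    for k, n_prime in enumerate(occupations):
        print(f"n'({k} dE) = {n_prime}")
    print("total entities:", sum(occupations))
```

The fractions come out in lowest terms (2, 6/5, 3/5, 1/5), which are the values 40/20, 24/20, 12/20 and 4/20 quoted from Figure C-1.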
The rapid drop in $n'(\mathcal{E})$ with increasing $\mathcal{E}$ reflects the fact that, if one entity takes a larger share of the total energy of the system, the remainder of the system must necessarily have a reduced energy, and so a considerably reduced number of ways of dividing that energy between its constituents. That is, there are many fewer divisions of the total energy of the system in situations where a relatively large part of the energy is concentrated on one entity.

Figure C-2. A comparison of the results of a simple calculation and the Boltzmann distribution.

Imagine now that we successively make $\Delta\mathcal{E}$ smaller and smaller, increasing the number of allowed states at the same time so as to keep the total energy at its previous value. The result of such a process is that the calculated function $n'(\mathcal{E})$ becomes defined for values of $\mathcal{E}$ which are closer and closer together. (That is, we get more points on our distribution.) In the limit as $\Delta\mathcal{E} \to 0$, the energy $\mathcal{E}$ of an entity becomes a continuous variable, as classical physics demands, and the distribution $n'(\mathcal{E})$ becomes a continuous function. If, finally, we allow the number of entities in the system to become large, this function is found to be identical with the decreasing exponential $n(\mathcal{E})$ of (C-1). (That is, as the points become closer and closer together, they no longer scatter about the decreasing exponential but fall right on it.) To verify this, by a straightforward extension of our calculation to the case of a very large number of energy states and entities, involves some formidable bookkeeping in enumerating the distinguishable divisions that have the required values of total energy and number of entities, and then calculating the many relative probabilities. We shall verify the validity of the probability distribution given in (C-1) by a more subtle, but much simpler, procedure.
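Although the text avoids the bookkeeping, a computer can do it for moderately large systems. The sketch below is not the text's method: it assumes the standard "stars and bars" count, in which the number of ways $N - 1$ labelled entities can share $m$ quanta is the binomial coefficient $\binom{m + N - 2}{N - 2}$. Under that assumption, the probable number of entities holding $k$ quanta is $N$ times the fraction of all divisions in which one chosen entity holds exactly $k$. A pure exponential would make the ratio $n'((k+1)\Delta\mathcal{E})/n'(k\Delta\mathcal{E})$ the same for every $k$, so the sketch prints those ratios for a small and a large system. The function names are hypothetical.

```python
from math import comb


def probable_number(k, n_entities, total_quanta):
    """Probable number of entities holding k quanta when all distinguishable
    divisions of total_quanta among n_entities are equally probable."""
    if k > total_quanta:
        return 0.0
    # Ways for the other n_entities - 1 entities to share the remaining quanta.
    rest = comb(total_quanta - k + n_entities - 2, n_entities - 2)
    # Total number of distinguishable divisions of the whole system.
    total = comb(total_quanta + n_entities - 1, n_entities - 1)
    return n_entities * rest / total


def successive_ratios(n_entities, total_quanta, k_max):
    """Ratios n'((k+1) dE) / n'(k dE) for k = 0 .. k_max - 1."""
    return [probable_number(k + 1, n_entities, total_quanta)
            / probable_number(k, n_entities, total_quanta)
            for k in range(k_max)]


if __name__ == "__main__":
    # Four entities sharing three quanta: the ratios vary strongly with k.
    print(successive_ratios(4, 3, 3))
    # Many entities and quanta: the ratios are nearly constant, as for an exponential.
    print(successive_ratios(2000, 2000, 6))
```

For the four-entity system the function reproduces the values in Figure C-1, while for the large system the nearly constant ratio is what a decreasing exponential of the form (C-1) requires.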
Consider a system of many identical entities in thermal equilibrium with each other, enclosed in walls which isolate it from the surroundings. Equilibrium requires that the entities be able to exchange energy. For instance, in interacting with the walls of the system, the entities can exchange energy with the walls and so indirectly exchange energy with each other. Thus the entities interact with each other in that if one gains energy, it does so at the expense of the total energy content of the remainder of the system (all the other entities, plus the walls). Except for this energy conservation constraint, the entities are independent of each other. The presence of one entity in some particular energy state in no way inhibits or enhances the chance that another identical entity will be in that state.

Now consider two of these entities. Let the probability of finding one of them in an energy state at energy $\mathcal{E}_1$ be given by $p(\mathcal{E}_1)$. Then the probability of finding the other in a state at energy $\mathcal{E}_2$ will be given by the same probability distribution function, since the entities have identical properties, but evaluated at the energy $\mathcal{E}_2$. The probability will be $p(\mathcal{E}_2)$. Because of the independent behavior of the entities, these two probabilities are independent of each other. As a consequence, the probability that the energy of one entity will be $\mathcal{E}_1$ and that the energy of the other will be $\mathcal{E}_2$ is given by $p(\mathcal{E}_1)p(\mathcal{E}_2)$. The reason is that independent probabilities are multiplicative. (If the probability of obtaining heads in one flip of a coin is 1/2, then the probability of obtaining heads in each of two flips is $(1/2) \times (1/2) = 1/4$, since the flips are independent.)

Next consider all divisions of the energy of the system in which the sum of the energies of the two entities has the same fixed value $\mathcal{E}_1 + \mathcal{E}_2$ as in the particular case just discussed, but in which the two entities take different shares of that energy.
Since the total energy of the isolated system is constant, for all of these divisions the remainder of the system will also have a fixed value of energy. So for all of them there are the same possible number of ways for the remainder of the system to divide its energy between its constituents. As a consequence, the probability of those divisions in which there is a certain sharing of the energy $\mathcal{E}_1 + \mathcal{E}_2$ between the two entities can differ from the probability of other divisions, in which there is a different sharing of that energy, only if these different sharings occur with different probabilities. If we again assume that all possible divisions of the energy of the system occur with the same probability we see that this cannot be, and we conclude that all divisions in which the same energy $\mathcal{E}_1 + \mathcal{E}_2$ is shared between the two entities in different ways occur with the same probability. In other words, the probability of all such divisions is a function only of $\mathcal{E}_1 + \mathcal{E}_2$ and so can be written as, say, $q(\mathcal{E}_1 + \mathcal{E}_2)$. However, we concluded earlier that the probability for a particular case can also be written as $p(\mathcal{E}_1)p(\mathcal{E}_2)$. Thus we find that $p(\mathcal{E}_1)p(\mathcal{E}_2) = q(\mathcal{E}_1 + \mathcal{E}_2)$.

The essential point here is that the probability distribution function $p(\mathcal{E})$ has the property that the product of two of these functions, evaluated at two different values of the variables, $\mathcal{E}_1$ and $\mathcal{E}_2$, is a function of the sum, $\mathcal{E}_1 + \mathcal{E}_2$, of these variables. But an exponential function, and only an exponential function, has this property. Recall that the product of two exponentials with different exponents is an exponential whose exponent is the sum of the two exponents.
Specifically, if we take the probability $p(\mathcal{E})$ of finding an entity in a state at energy $\mathcal{E}$ to be proportional to the probable number $n(\mathcal{E})$ of entities in that state, as it certainly should be, and use (C-1) to evaluate $n(\mathcal{E})$, we have the function

$$p(\mathcal{E}) = B e^{-\mathcal{E}/\mathcal{E}_0} \tag{C-2}$$

where $B$ is proportional to the $A$ in (C-1). This function demonstrates the required property since

$$p(\mathcal{E}_1)p(\mathcal{E}_2) = B e^{-\mathcal{E}_1/\mathcal{E}_0} B e^{-\mathcal{E}_2/\mathcal{E}_0} = B^2 e^{-(\mathcal{E}_1 + \mathcal{E}_2)/\mathcal{E}_0} = q(\mathcal{E}_1 + \mathcal{E}_2)$$

(There is no loss of generality in choosing $e$ to be the base of the exponential function instead of some other number, such as 10. The reason is that an exponential function using any other base $b$ can be transformed into an exponential with base $e$ by the relation

$$b^x = e^{x \ln b}$$

Hence changing the base amounts to no more than changing the as-yet-not-evaluated constant $\mathcal{E}_0$.) Our argument does not actually prove that $n(\mathcal{E})$ is a decreasing, instead of increasing, exponential, but an increasing exponential can be ruled out on physical grounds as its value goes to infinity for large values of $\mathcal{E}$. Thus we have verified the general validity of (C-1).

Now we shall evaluate the constant $\mathcal{E}_0$ in (C-1)

$$n(\mathcal{E}) = A e^{-\mathcal{E}/\mathcal{E}_0}$$

By treating a system containing two different kinds of entities in thermal equilibrium, it is not difficult to prove that the value of $\mathcal{E}_0$ does not depend on the type of entities comprising a system. Thus we shall use in our argument entities with the simplest properties. Since $n(\mathcal{E})$ is the probable number of entities of the system in an energy state at $\mathcal{E}$, the number of entities whose energies would be found in the interval from $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$ equals $n(\mathcal{E})$ times the number of states in that interval. If that number is independent of the value of $\mathcal{E}$ (i.e., if the states are uniformly distributed in energy), then the number will be proportional to the size $d\mathcal{E}$ of the interval. This is the case if the entities are simple harmonic oscillators, like the coil springs mentioned earlier.
So the probable number of simple harmonic oscillators with an energy from $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$, in an equilibrium system containing many of them, is proportional to $n(\mathcal{E})\,d\mathcal{E}$. If the multiplicative constant $A$ is given the proper value, this number can be made equal to $n(\mathcal{E})\,d\mathcal{E}$. Then the average energy of one of the oscillators is

$$\bar{\mathcal{E}} = \frac{\int_0^\infty \mathcal{E}\,n(\mathcal{E})\,d\mathcal{E}}{\int_0^\infty n(\mathcal{E})\,d\mathcal{E}}$$

The integral in the numerator has an integrand which is the energy weighted by the number of oscillators having that energy; the integral in the denominator is just the total number of oscillators. If we evaluate $n(\mathcal{E})$ from (C-1), we have

$$\bar{\mathcal{E}} = \frac{\int_0^\infty A\,\mathcal{E}\,e^{-\mathcal{E}/\mathcal{E}_0}\,d\mathcal{E}}{\int_0^\infty A\,e^{-\mathcal{E}/\mathcal{E}_0}\,d\mathcal{E}}$$

(Note that we do not need to know the actual value of $A$.) By proceeding in a manner completely analogous to what is done in Example 1-4, except that integrals are involved instead of sums, we find

$$\bar{\mathcal{E}} = \mathcal{E}_0 \tag{C-3}$$

But according to the classical law of equipartition of energy, as expressed in (1-16), for simple harmonic oscillators in equilibrium at temperature $T$

$$\bar{\mathcal{E}} = kT \tag{C-4}$$

where Boltzmann's constant $k = 1.38 \times 10^{-23}$ joule/°K. Combining (C-3) and (C-4), we have

$$\mathcal{E}_0 = kT \tag{C-5}$$

This result is correct for entities of any type, even though we have obtained it for the particular case of simple harmonic oscillators. Therefore we may write (C-1) as

$$n(\mathcal{E}) = A e^{-\mathcal{E}/kT} \tag{C-6}$$

This is the famous Boltzmann distribution. Since the value of $A$ is not specified, (C-6) actually tells us about a proportionality: the probable number of entities of a system in equilibrium at temperature $T$ that will be in a state of energy $\mathcal{E}$ is proportional to $e^{-\mathcal{E}/kT}$. Expressed in different terms: the probability that the state of energy $\mathcal{E}$ will be occupied by an entity is proportional to $e^{-\mathcal{E}/kT}$. The value chosen for the constant $A$ is dictated by convenience.

In Chapter 1 we apply the Boltzmann distribution to a system of simple harmonic oscillators. As discussed here, in such a system $n(\mathcal{E})\,d\mathcal{E}$ is proportional to the probable number of oscillators with energy in the range $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$, since the states of a simple harmonic oscillator are uniformly distributed in energy. Of course, $n(\mathcal{E})\,d\mathcal{E}$ is also proportional to the probability $P(\mathcal{E})\,d\mathcal{E}$ of finding a particular one of the oscillators with energy in this range. Thus we have

$$P(\mathcal{E}) = C e^{-\mathcal{E}/kT}$$

providing the constant $C$ is properly chosen. This is done by setting

$$\int_0^\infty P(\mathcal{E})\,d\mathcal{E} = \int_0^\infty C e^{-\mathcal{E}/kT}\,d\mathcal{E} = C \int_0^\infty e^{-\mathcal{E}/kT}\,d\mathcal{E} = 1 \tag{C-7}$$

That is, we define $P(\mathcal{E})\,d\mathcal{E}$ to be the probability of finding a particular simple harmonic oscillator with energy from $\mathcal{E}$ to $\mathcal{E} + d\mathcal{E}$, and so for consistency we must then demand that the integral $\int_0^\infty P(\mathcal{E})\,d\mathcal{E}$ have the value one, because the integral is just the probability of finding it with any energy. By evaluating $\int_0^\infty e^{-\mathcal{E}/kT}\,d\mathcal{E}$ in (C-7), and then solving for $C$, we find $C = 1/kT$. Then we have a special form of the Boltzmann distribution

$$P(\mathcal{E}) = \frac{e^{-\mathcal{E}/kT}}{kT} \tag{C-8}$$

which is used in Chapter 1.

Appendix D
FOURIER INTEGRAL DESCRIPTION OF A WAVE GROUP

Section 3-4 presented a qualitative argument explaining how a single group of waves can be formed by combining an infinitely large number of component sinusoidal waves, each with infinitesimally different reciprocal wavelengths. Here the argument is made quantitative.

The work depicted in Figure 3-9 amounts to evaluating, at time $t = 0$ and for a particular set of $A_K$ and $K$, the summation

$$\Psi = \sum_K A_K \cos 2\pi(Kx - \nu t) \tag{D-1}$$

The $A_K$ are the amplitudes of the component sinusoidal waves of reciprocal wavelengths $K$ which when added form the pattern at the bottom of the figure. The central group is the one of interest in representing the behavior of a freely moving particle. But auxiliary groups, such as the one shown partially on the right, are also formed by the addition because there are only a finite number of component sinusoidal waves.
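The appearance of auxiliary groups in a finite sum can be seen in a few lines of code. The sketch below evaluates the summation at $t = 0$ for five components with uniformly spaced reciprocal wavelengths. The amplitudes, the central value `k0`, and the spacing `dk` are hypothetical choices made for illustration; they are not the values used in Figure 3-9.

```python
import math


def finite_wave_sum(x, k0=10.0, dk=1.0, amplitudes=(0.25, 0.5, 1.0, 0.5, 0.25)):
    """Sum of A_K cos(2 pi K x) at t = 0 over uniformly spaced K centred on k0.

    The amplitudes form an assumed tapered set; K takes the values
    k0 - 2 dk, k0 - dk, k0, k0 + dk, k0 + 2 dk.
    """
    half = len(amplitudes) // 2
    return sum(a * math.cos(2.0 * math.pi * (k0 + (j - half) * dk) * x)
               for j, a in enumerate(amplitudes))


if __name__ == "__main__":
    # With k0 a whole multiple of dk, every component repeats after a distance
    # 1/dk, so a copy of the central group reappears at x = 1/dk, 2/dk, ...
    for x in (0.0, 0.25, 0.5, 0.75, 1.0, 2.0):
        print(x, round(finite_wave_sum(x), 6))
```

Because the spacing between the reciprocal wavelengths is finite, the pattern repeats at intervals of $1/dk$; those repeats are auxiliary groups of the kind the text describes.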
To prepare for adding an infinite number, we evaluate (D-1) for $t = 0$, obtaining

$$\Psi = \sum_K A_K \cos 2\pi Kx \tag{D-2}$$

Then we make the transition by replacing the summation by an integration, as follows

$$\Psi = \int_0^\infty A(K) \cos 2\pi Kx\,dK \tag{D-3}$$

In this integral the reciprocal wavelength $K$ is treated as the variable and the coordinate $x$ is treated as a constant. The quantity $A(K)$ is the amplitude of the component sinusoidal wave whose reciprocal wavelength is $K$, and there are an infinite number of them with reciprocal wavelengths differing by the infinitesimal amounts $dK$. The right side of (D-3) is a form of what is called a Fourier integral.

A simple example of the Fourier integral is found in the case where the amplitude function $A(K)$ has the form specified in Figure D-1. The amplitude has the value 1 for component sinusoidal waves whose reciprocal wavelengths lie in the range $K_0 - \Delta K$ to $K_0 + \Delta K$, and the value 0 for those whose reciprocal wavelengths lie outside this range. In this case (D-3) reduces immediately to

$$\Psi = \int_{K_0 - \Delta K}^{K_0 + \Delta K} \cos 2\pi Kx\,dK \tag{D-4}$$

Figure D-1. A flat-topped amplitude distribution.

This is equivalent to

$$\Psi = \frac{1}{2\pi x} \int_{2\pi(K_0 - \Delta K)x}^{2\pi(K_0 + \Delta K)x} \cos 2\pi Kx\,d(2\pi Kx)$$

which integrates to

$$\Psi = \frac{1}{2\pi x}\left[\sin 2\pi(K_0 + \Delta K)x - \sin 2\pi(K_0 - \Delta K)x\right]$$

Now

$$\sin 2\pi(K_0 + \Delta K)x = (\sin 2\pi K_0 x)(\cos 2\pi \Delta K x) + (\cos 2\pi K_0 x)(\sin 2\pi \Delta K x)$$

and

$$\sin 2\pi(K_0 - \Delta K)x = (\sin 2\pi K_0 x)(\cos 2\pi \Delta K x) - (\cos 2\pi K_0 x)(\sin 2\pi \Delta K x)$$

Therefore we have

$$\Psi = \frac{1}{\pi x}(\cos 2\pi K_0 x)(\sin 2\pi \Delta K x)$$

or

$$\Psi = 2\Delta K \cos 2\pi K_0 x\,\frac{\sin 2\pi \Delta K x}{2\pi \Delta K x} \tag{D-5}$$

Figure D-2. [Plot of $\Psi/2\Delta K = \cos 2\pi K_0 x\,(\sin 2\pi \Delta K x)/(2\pi \Delta K x)$ versus $\Delta K\,x$ for $\Delta K = 0.1K_0$, together with $(\sin 2\pi \Delta K x)/(2\pi \Delta K x)$ versus $\Delta K\,x$.] The wave group obtained from a Fourier integral of the amplitude distribution in Figure D-1. Since the group is symmetrical about the origin, only the right half is plotted.
The continuous curve shows the detailed structure of the group, while the dashed curve shows only the factor responsible for its gross structure.

This result is illustrated in Figure D-2 by plotting $\Psi/2\Delta K$ versus $\Delta K\,x$ for the typical case $\Delta K = 0.1K_0$. Since $\Psi$ has symmetry about the point $\Delta K\,x = 0$, only positive values of $\Delta K\,x$ need be used in the plot. The rapid oscillations arise from the $\cos 2\pi K_0 x$ factor. The slow variation of their amplitudes, which forms the group, is due to the factor $\sin 2\pi \Delta K x/2\pi \Delta K x$. Because of the $x$ in its denominator, this factor becomes negligible for large values of $x$. Hence there are no auxiliary groups formed at values of $x$ larger than those shown in the figure; there is only the central group. This is in contrast to the case illustrated by Figure 3-9 where there are an infinite number of uniformly spaced auxiliary groups formed, in addition to the central group, because there are only a finite number of component sinusoidal waves.

Inspection of Figure D-2 shows that the amplitude of the group falls to half its maximum value when $\Delta K\,x = 0.30$. Hence, if we define the length $\Delta x$ of the group as its half width at half maximum amplitude, as in Section 3-4, this quantity has a value given by $\Delta K \Delta x = 0.30$, or

$$\Delta x \Delta K = 0.30 \tag{D-6}$$

But Figure D-1 makes it clear that the $\Delta K$ in this result represents the range of reciprocal wavelengths used to compose the group, measured in terms of half width at half maximum amplitude. Therefore 0.30 is the value of the length-reciprocal wavelength product $\Delta x \Delta K$ for the single group formed by combining an infinite number of component sinusoidal waves, using the "flat-topped" amplitude distribution of Figure D-1. This $\Delta x \Delta K$ value is larger than the value 1/12 = 0.083 found in the work depicted in Figure 3-9. The reason is that there the component sinusoidal waves have a "tapered" distribution of amplitudes, whereas here it is flat-topped. A smaller $\Delta x \Delta K$ value can be obtained from the Fourier integral, while still producing only a single group, by properly adjusting the form of the function $A(K)$ specifying the amplitudes of the component sinusoidal waves. As is stated in Section 3-4, the smallest value that can be obtained is $1/4\pi = 0.080$. It is obtained by using a Gaussian distribution

$$A(K) = e^{-[(K - K_0)/1.201\,\Delta K]^2}$$

But some rather complicated mathematics must be employed to evaluate the integral for this case.

Appendix E
RUTHERFORD SCATTERING TRAJECTORIES

Figure 4-4 shows the parameters for the scattering trajectory of a light particle of positive charge $+ze$ by a heavy nucleus of positive charge $+Ze$. We saw in the text that the angular momentum $L = Mr^2\,d\varphi/dt$ is constant because the force on the particle is always acting in the radial direction. Let us apply Newton's law to the radial component of the motion, therefore, to determine the particle's trajectory. From $F = Ma$ we obtain

$$\frac{zZe^2}{4\pi\epsilon_0 r^2} = M\left[\frac{d^2r}{dt^2} - r\left(\frac{d\varphi}{dt}\right)^2\right] \tag{E-1}$$

wherein the left-hand term is the Coulomb force and the right-hand terms are as follows: $d^2r/dt^2$ is the radial acceleration due to the change in the magnitude of $\mathbf{r}$ and $-r(d\varphi/dt)^2 = -\omega^2 r$ is the centripetal acceleration (which is also radially directed) due to the change in the direction of $\mathbf{r}$.

To get the trajectory we need to find $r$ as a function of $\varphi$. It simplifies the solution of (E-1) to write it, not in terms of the coordinates $r$, $\varphi$, but instead in terms of the coordinates $u$, $\varphi$, where

$$r = 1/u \tag{E-2}$$

Then

$$\frac{dr}{dt} = \frac{dr}{d\varphi}\frac{d\varphi}{dt} = \frac{dr}{du}\frac{du}{d\varphi}\frac{d\varphi}{dt}$$

or

$$\frac{dr}{dt} = -\frac{1}{u^2}\frac{du}{d\varphi}\frac{Lu^2}{M} = -\frac{L}{M}\frac{du}{d\varphi}$$

and

$$\frac{d^2r}{dt^2} = \frac{d}{d\varphi}\left(\frac{dr}{dt}\right)\frac{d\varphi}{dt} = -\frac{L}{M}\frac{d^2u}{d\varphi^2}\frac{Lu^2}{M}$$

or

$$\frac{d^2r}{dt^2} = -\frac{L^2u^2}{M^2}\frac{d^2u}{d\varphi^2}$$

Substituting this into (E-1), we have

$$-\frac{L^2u^2}{M^2}\frac{d^2u}{d\varphi^2} - \frac{1}{u}\left(\frac{Lu^2}{M}\right)^2 = \frac{zZe^2u^2}{4\pi\epsilon_0 M}$$

or

$$\frac{d^2u}{d\varphi^2} + u = -\frac{zZe^2M}{4\pi\epsilon_0 L^2} = -\frac{zZe^2M}{4\pi\epsilon_0 M^2v^2b^2} \tag{E-3}$$

since $L = Mvb$, where $v$ is the initial speed of the particle and $b$ is its impact parameter defined in Figure 4-4.
If we let $D = (zZe^2/4\pi\epsilon_0)/(Mv^2/2)$, as in (4-4), this simplifies to

$$\frac{d^2u}{d\varphi^2} + u = -\frac{D}{2b^2} \tag{E-4}$$

This is a second order ordinary differential equation for $u$ as a function of $\varphi$. The general solution to (E-4) is

$$u = A\cos\varphi + B\sin\varphi - D/2b^2 \tag{E-5}$$

which contains the two arbitrary constants, $A$ and $B$. We can prove that (E-5) is, in fact, the solution to (E-4) by evaluating

$$\frac{du}{d\varphi} = -A\sin\varphi + B\cos\varphi$$

and

$$\frac{d^2u}{d\varphi^2} = -A\cos\varphi - B\sin\varphi$$

and substituting these into (E-4). This gives us

$$-A\cos\varphi - B\sin\varphi + A\cos\varphi + B\sin\varphi - \frac{D}{2b^2} = -\frac{D}{2b^2}$$

This identity proves the validity of the general solution.

To get the particular solution we must evaluate the constants $A$ and $B$. We require that (E-5) conform to the initial conditions: $\varphi \to 0$ as $r \to \infty$ and $dr/dt \to -v$ as $r \to \infty$. Thus

$$u = \frac{1}{r} = 0 = A\cos 0 + B\sin 0 - \frac{D}{2b^2}$$

or

$$A = \frac{D}{2b^2}$$

and

$$\frac{dr}{dt} = -v = -\frac{L}{M}\frac{du}{d\varphi} = -\frac{L}{M}(-A\sin 0 + B\cos 0)$$

or

$$B = \frac{Mv}{L} = \frac{Mv}{Mvb} = \frac{1}{b}$$

Therefore, the particular solution is

$$u = \frac{D}{2b^2}\cos\varphi + \frac{1}{b}\sin\varphi - \frac{D}{2b^2}$$

or

$$\frac{1}{r} = \frac{1}{b}\sin\varphi + \frac{D}{2b^2}(\cos\varphi - 1) \tag{E-6}$$

This is the orbit equation, giving $r$ as a function of $\varphi$. We see that the trajectory is hyperbolic, since (E-6) is the equation of a hyperbola in polar coordinates.

Appendix F
COMPLEX QUANTITIES

The imaginary number $i$ is a unit defined so that

$$i^2 = -1 \quad\text{or}\quad i = \sqrt{-1} \tag{F-1}$$

The name is appropriate because none of the real (i.e., ordinary) numbers have squares which are negative. A complex number $z$ can be written in the general form

$$z = x + iy \tag{F-2}$$

where both $x$ and $y$ are real numbers. The number $x$ is called the real part of $z$, and the number $y$ is called the imaginary part of $z$ (even though $y$ is real). Note that $z$ reduces to a pure real number if $y = 0$, while it reduces to a pure imaginary number if $x = 0$.

Complex numbers obey the same laws of algebra that apply to real numbers, except for the property specified in the definition (F-1).
Also, the definition of equality is extended so that two complex numbers are equal if, and only if, the real part of one equals the real part of the other, and the imaginary part of one equals the imaginary part of the other. That is

$$z_1 = z_2 \tag{F-3a}$$

implies

$$x_1 = x_2 \qquad y_1 = y_2 \tag{F-3b}$$

and vice versa.

The complex conjugate of the number $z = x + iy$ is written as $z^*$, and is defined as

$$z^* = x - iy \tag{F-4}$$

From the definition it follows that

$$z^*z = (x - iy)(x + iy) = x^2 - i^2y^2 - ixy + ixy = x^2 - i^2y^2$$

So

$$z^*z = x^2 + y^2 \tag{F-5}$$

That is, the product of a complex number times its own complex conjugate always equals a real number.

Equation (F-5) is suggestive of the Pythagorean theorem. In fact, there is a very useful geometrical representation of complex numbers shown in Figure F-1. The location of a point $P$, relative to what are called the real and imaginary axes of the complex plane, is used in the manner defined in the figure to specify the real part $x$ and the imaginary part $y$ of the associated complex number. The location of the representative point $P$ can also be specified by the polar coordinates $r$ and $\theta$, called the modulus and phase, which are defined in the figure. The two sets of coordinates are related by

$$x = r\cos\theta \qquad y = r\sin\theta \tag{F-6}$$

and

$$r^2 = x^2 + y^2 \qquad \cos\theta = \frac{x}{r} \qquad \sin\theta = \frac{y}{r} \tag{F-7}$$

From (F-2) and (F-6), we see that the general complex number can be expressed in polar coordinates as

$$z = r(\cos\theta + i\sin\theta) \tag{F-8}$$

Figure F-1 The geometrical representation of a complex number. The relations between the rectangular and polar coordinates of the representative point P can be determined by inspecting the figure.

Note also that

$$z^*z = r^2 \tag{F-9}$$

Important relations can be developed by considering rotations in the complex plane of the representative point $P$. In Figure F-2, $z$ is a complex number that is represented by a point $P$ lying on the real axis. If the representative point is rotated at constant $r$ through an angle $d\theta$, the corresponding complex number becomes $z + dz$. It is apparent from the figure that

$$dz = iz\,d\theta \quad\text{or}\quad \frac{dz}{z} = i\,d\theta$$

As this relation can be seen to be true independent of the initial location of the representative point, it can be integrated as follows

$$\int_{z_{\text{initial}}}^{z_{\text{final}}} \frac{dz}{z} = i\int_0^{\theta} d\theta$$

This yields

$$\ln\frac{z_{\text{final}}}{z_{\text{initial}}} = i\theta \quad\text{or}\quad z_{\text{final}} = z_{\text{initial}}\,e^{i\theta}$$

If we take $r = 1$, then $z_{\text{initial}} = 1$ and, from (F-8), we also have $z_{\text{final}} = \cos\theta + i\sin\theta$. Thus we obtain an evaluation of the complex exponential

$$e^{i\theta} = \cos\theta + i\sin\theta \tag{F-10}$$

Figure F-2 Illustrating a rotation, at constant distance from the origin, of a representative point.

Rotation in the negative sense yields

$$e^{-i\theta} = \cos(-\theta) + i\sin(-\theta)$$

which is

$$e^{-i\theta} = \cos\theta - i\sin\theta \tag{F-11}$$

By adding and subtracting (F-10) and (F-11), it follows immediately that

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \tag{F-12}$$

and

$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \tag{F-13}$$

Comparison of the definition (F-4) with (F-10) and (F-11) shows that the complex conjugate of a complex exponential is obtained by reversing the sign of the $i$ appearing in the exponent. That is

$$(e^{i\theta})^* = e^{-i\theta} \tag{F-14}$$

Applying (F-9) and (F-14) to a complex exponential, we find

$$r^2 = z^*z = (e^{i\theta})^*e^{i\theta} = e^{-i\theta}e^{i\theta} = e^0 = 1$$

Thus a complex exponential maintains a constant modulus $r = 1$, even if its phase is changing. But its real and imaginary parts, which are from (F-2) and (F-6) equal to $\cos\theta$ and $\sin\theta$, are oscillatory functions of the phase $\theta$. If its phase is continually increasing from 0 to $\pi/2$ to $\pi$ to $3\pi/2$ to $2\pi$, and so on, a complex exponential changes in value from $+1$ to $+i$ to $-1$ to $-i$ to $+1$, and repeats this cyclically. In this sense it is an oscillatory function of its phase.

In differentiating or integrating a complex quantity, the standard procedures of calculus are used with $i$ treated as any other constant. An example of integration is found in the calculation leading to (F-10). As another example, the first derivative of the complex exponential is

$$\frac{de^{i\theta}}{d\theta} = ie^{i\theta} \tag{F-15}$$

Although the geometrical interpretation leads naturally to writing the phase of a complex exponential as an angle $\theta$, it can actually be any quantity which, like an angle, is dimensionless. In quantum mechanics, complex exponentials frequently used are

$$e^{ikx} \qquad e^{i(kx - \omega t)} \qquad e^{-iEt/\hbar}$$

In the first of these, for example, the wave number $k$ has the dimensions of (length)$^{-1}$, so $k$ times the length $x$ is dimensionless. All relations quoted for $e^{i\theta}$ have obvious extensions to $e^{ikx}$, and the others. For example, application of the rules of differentiation to $e^{ikx}$, with $k$ constant, yields

$$\frac{de^{ikx}}{dx} = ike^{ikx} \tag{F-16}$$

Appendix G

NUMERICAL SOLUTION OF THE TIME-INDEPENDENT SCHROEDINGER EQUATION FOR A SQUARE WELL POTENTIAL

In quantum mechanics, as in other fields of science and engineering, many of the calculations that arise in current professional work are carried out on computers using numerical techniques. In some cases the potential energy function of interest is of such a form that its time-independent Schroedinger equation cannot be solved by even the most general analytical techniques (for reasons explained in Appendix I). In other cases analytical solutions can be obtained, but numerical solutions can be obtained more conveniently.

As a simple illustration of the numerical techniques, and of the "thought calculations" of Section 5-7, we shall obtain here a numerical solution of the time-independent Schroedinger equation for the potential energy function

$$V(x) = \begin{cases} V_0,\ \text{a constant} & x < -a/2 \ \text{or}\ x > +a/2 \\ V_0/2 & x = \pm a/2 \\ 0 & -a/2 < x < +a/2 \end{cases} \tag{G-1}$$

This is called a square well potential, for reasons that are apparent from inspection of its form plotted in Figure G-1. (The figure implies that $V(x)$ has no definite value at $x = \pm a/2$. In the analytical work with a square well potential found elsewhere in this book, there is no need to define its value at these two points. But this is not true of numerical work, and so $V(x)$ is defined to have the reasonable value $V_0/2$ at $x = \pm a/2$.) For this potential a numerical solution can be found quite easily on any computer. The time-independent Schroedinger equation for a square well potential can also be treated with fairly simple analytical techniques (see Appendix H), so we shall be able to compare the resulting exact solution with the results we obtain from our numerical solution.

Figure G-1 A square well potential. (The well edge at $x = +a/2$ corresponds to $u = +0.5$.)

Using the square well potential (G-1), we seek a numerical solution to the time-independent Schroedinger equation (5-45) for the eigenfunction $\psi(x)$. The equation is

$$\frac{d^2\psi(x)}{dx^2} = \frac{2m}{\hbar^2}\left[V(x) - E\right]\psi(x) \tag{G-2}$$

Since numerical calculations can deal only with pure numbers, the first step is to switch to the dimensionless coordinate

$$u = \frac{x}{a} \tag{G-3}$$

The relation between the second derivatives with respect to $x$ and $u$ is

$$\frac{d^2\psi(x)}{dx^2} = \frac{1}{a^2}\frac{d^2\psi(u)}{du^2}$$

Thus we have

$$\frac{d^2\psi(u)}{du^2} = \frac{2ma^2}{\hbar^2}\left[V(u) - E\right]\psi(u)$$

Evaluating $V(u)$ from (G-1) gives us

$$\frac{d^2\psi(u)}{du^2} = \begin{cases} -\dfrac{2ma^2V_0}{\hbar^2}\left[\dfrac{E}{V_0} - 1\right]\psi(u) & u < -1/2 \ \text{or}\ u > +1/2 \\[2ex] -\dfrac{2ma^2V_0}{\hbar^2}\left[\dfrac{E}{V_0} - \dfrac{1}{2}\right]\psi(u) & u = \pm 1/2 \\[2ex] -\dfrac{2ma^2V_0}{\hbar^2}\,\dfrac{E}{V_0}\,\psi(u) & -1/2 < u < +1/2 \end{cases}$$

We write this as

$$\frac{d^2\psi}{du^2} = F \tag{G-4}$$

where

$$F = -\beta(\epsilon - 1)\psi \qquad u < -1/2 \ \text{or}\ u > +1/2 \tag{G-5a}$$

$$F = -\beta(\epsilon - 1/2)\psi \qquad u = \pm 1/2 \tag{G-5b}$$

$$F = -\beta\epsilon\psi \qquad -1/2 < u < +1/2 \tag{G-5c}$$

with

$$\beta = \frac{2ma^2V_0}{\hbar^2} \qquad \epsilon = \frac{E}{V_0} \tag{G-6}$$

The dimensionless parameter $\beta = 2ma^2V_0/\hbar^2$ is a measure of the "strength" of the square well potential, and $\epsilon = E/V_0$ is a dimensionless measure of the total energy of the system. The quantity $F$ specifies the functional dependence of the second derivative on $u$ and $\psi$.
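As a minimal sketch, the piecewise definition (G-5a) through (G-5c) can be written as one Python function. The name `second_derivative` and the edge tolerance of 0.00001 are our own choices for illustration (the tolerance is the one used in the BASIC subroutine of Table G-2), and the sample values of β, ε, and ψ are arbitrary.

```python
def second_derivative(u, psi, beta, eps):
    """F of (G-4): the second derivative of psi with respect to u."""
    if abs(abs(u) - 0.5) < 1e-5:          # at an edge of the well, (G-5b)
        return -beta * (eps - 0.5) * psi
    if abs(u) > 0.5:                      # outside the well, (G-5a)
        return -beta * (eps - 1.0) * psi
    return -beta * eps * psi              # inside the well, (G-5c)


# Illustrative values only: beta = 64, eps = 0.1, psi = 1
for u in (0.0, 0.5, 1.0):
    print(u, second_derivative(u, 1.0, 64.0, 0.1))
```

Inside the well F has the opposite sign to ψ, so ψ curves toward the axis. Outside the well, for ε < 1, F has the same sign as ψ, so ψ curves away from the axis.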
From the arguments of Section 5-7, we know that the behavior of a solution $\psi$ to the time-independent Schroedinger equation (G-2), with given values of the potential parameter $\beta$ and the energy parameter $\epsilon$, should be completely determined for all values of $u$ by the form of the equation and by the assumed initial values of $\psi$ and $d\psi/du$. A procedure for doing this follows:

First calculate

$$\left[\frac{d\psi}{du}\right]_{1/2} = \left[\frac{d\psi}{du}\right]_0 + F\,\frac{\Delta u}{2} \tag{G-7a}$$

Then calculate

$$\psi_1 = \psi_0 + \left[\frac{d\psi}{du}\right]_{1/2}\Delta u \tag{G-7b}$$

Then set

$$u_1 = u_0 + \Delta u \tag{G-7c}$$

Next calculate

$$\left[\frac{d\psi}{du}\right]_{3/2} = \left[\frac{d\psi}{du}\right]_{1/2} + F\,\Delta u \tag{G-8a}$$

Then calculate

$$\psi_2 = \psi_1 + \left[\frac{d\psi}{du}\right]_{3/2}\Delta u \tag{G-8b}$$

Then set

$$u_2 = u_1 + \Delta u \tag{G-8c}$$

Next calculate

$$\left[\frac{d\psi}{du}\right]_{5/2} = \left[\frac{d\psi}{du}\right]_{3/2} + F\,\Delta u \tag{G-9a}$$

Then calculate

$$\psi_3 = \psi_2 + \left[\frac{d\psi}{du}\right]_{5/2}\Delta u \tag{G-9b}$$

Then set

$$u_3 = u_2 + \Delta u \tag{G-9c}$$

Etc.

In these equations $\Delta u$ is a small increment in the independent variable $u$. The quantity $F$, being the second derivative with respect to $u$ of the dependent variable $\psi$, is the derivative with respect to $u$ of the first derivative $d\psi/du$. Initial values of the independent variable, dependent variable, and the first derivative are written as $u_0$, $\psi_0$, and $[d\psi/du]_0$. The first equation evaluates $[d\psi/du]_{1/2}$, the derivative for $u$ greater than its initial value by $(1/2)\Delta u$. It does so by adding to the initial value of the derivative the product $F\Delta u/2$ of its rate of change with respect to $u$ and the change in $u$. Then in the second equation $\psi_1$, the dependent variable for $u$ greater than the initial value by $(1)\Delta u$, is found by adding to its initial value its rate of change with respect to $u$, at the midpoint of the increment in $u$, times the change in $u$. Then the value of $u$ is updated in the third equation. The second set of three equations is similar. But in the first set the value of $F$ is fixed by the initial values of the variables $u$ and $\psi$ on which it depends, whereas in the second set the value of $F$ is fixed by the values of $u$ and $\psi$ obtained from the first set.
The third set of equations, and all subsequent sets, are identical to the second set except that in each the $F$ that is used is fixed at the value calculated from the latest values of $u$ and $\psi$. For sufficiently small $\Delta u$, these equations provide good approximations to the values of $\psi$ and $d\psi/du$.

Tables G-1 and G-2 list a computer program in BASIC which carries out the numerical procedure.

Table G-1 A Universal Program in BASIC for Solving Second-Order Ordinary Differential Equations

100 REM UNIVERSAL PROGRAM FOR SOLVING SECOND-ORDER DIFFERENTIAL EQUATIONS
110 REM REQUIRES SUBROUTINES TO INPUT PARAMETERS AND INITIAL CONDITIONS AND TO CALCULATE THE SECOND DERIVATIVE
120 REM PROGRAM IS WRITTEN IN THE IBM PERSONAL COMPUTER DIALECT OF BASIC-MINOR CHANGES MAY BE REQUIRED TO TRANSLATE IT TO ANOTHER DIALECT
130 DEF FNR(A)=INT(10^P*A+.5)/10^P: REM FUNCTION R ROUNDS ANY VARIABLE A TO P DIGITS PAST THE DECIMAL PLACE
140 GOSUB 1000: REM INPUT PARAMETERS AND INITIAL CONDITIONS
150 CLS: REM CLEAR MONITOR SCREEN
160 PRINT "TO CONTINUE RUN AFTER A SET OF VALUES ARE DISPLAYED, PRESS C. PRESS ANY OTHER KEY TO HALT": REM PUT INSTRUCTIONS ON SCREEN
170 PRINT: REM PUT BLANK LINE ON SCREEN
180 LET N=0: REM ZERO INDEX COUNTING SETS OF VALUES DISPLAYED
190 PRINT "INDEPENDENT VARIABLE","DEPENDENT VARIABLE": REM PUT TABLE HEADINGS ON SCREEN
200 PRINT
210 GOSUB 2000: REM CALCULATE SECOND DERIVATIVE D2
220 LET D1=D1+D2*DEL/2: REM INCREMENT FIRST DERIVATIVE D1, FOR CHANGE DEL/2 IN INDEPENDENT VARIABLE, USING (G-7A)
230 PRINT FNR(I),,FNR(D0): REM DISPLAY ROUNDED VALUES OF INDEPENDENT VARIABLE I AND DEPENDENT VARIABLE D0
240 LET N=N+1: REM INCREMENT INDEX N
250 LET D0=D0+D1*DEL: REM INCREMENT DEPENDENT VARIABLE USING (G-7B) OR (G-8B), ETC.
260 LET I=I+DEL: REM INCREMENT INDEPENDENT VARIABLE
270 GOSUB 2000
280 LET D1=D1+D2*DEL: REM INCREMENT FIRST DERIVATIVE USING (G-8A) OR (G-9A), ETC.
290 IF N<10 THEN 230: REM IF <10 SETS OF VALUES DISPLAYED, CALCULATE ANOTHER
300 PRINT
310 LET N=0: REM REZERO INDEX N
320 LET A$=INKEY$: REM LABEL KEY PRESSED ON KEYBOARD AS A$
330 IF A$="" THEN 320: REM IF NO KEY PRESSED TRY AGAIN
340 IF A$="C" THEN 230: REM IF C PRESSED CALCULATE 10 MORE SETS OF VALUES
350 END: REM TERMINATE PROGRAM AND RETURN TO COMMAND LEVEL

Table G-2 Subroutines Adapting the Universal Program to the Solution of the Time-Independent Schroedinger Equation for the Square Well Potential

1000 REM FINITE SQUARE WELL SCHROEDINGER EQUATION-INPUT PARAMETERS AND INITIAL CONDITIONS
1010 CLS
1020 PRINT "FINITE SQUARE WELL SCHROEDINGER EQUATION": REM PUT TITLE ON SCREEN
1030 PRINT
1040 PRINT "INITIAL PSI = ";: REM PUT QUERY ON SCREEN
1050 INPUT D0: REM ADD QUESTION MARK, AWAIT INPUT, ACCEPT IT AND LABEL AS D0
1060 PRINT "INITIAL DPSI/DU = ";
1070 INPUT D1
1080 PRINT "INITIAL U (USUALLY 0) = ";
1090 INPUT I
1100 PRINT "DELTA U (MUST DIVIDE EVENLY INTO .5) = ";
1110 INPUT DEL
1120 PRINT "BETA = ";
1130 INPUT B
1140 PRINT "EPSILON = ";
1150 INPUT E
1160 PRINT "NUMBER OF DIGITS PAST DECIMAL POINT TO BE SHOWN (USUALLY 3) = ";
1170 INPUT P
1180 RETURN: REM TERMINATE SUBROUTINE AND RETURN TO PROGRAM
2000 REM FINITE SQUARE WELL SCHROEDINGER EQUATION-CALCULATE THE SECOND DERIVATIVE
2010 IF ABS(I)>.50001 THEN 2070: REM TEST IF OUTSIDE WELL
2020 IF ABS(ABS(I)-.5)<.00001 THEN 2050: REM TEST IF AT EDGE OF WELL
2030 LET D2=-B*E*D0: REM CALCULATE SECOND DERIVATIVE USING (G-5C)
2040 RETURN
2050 LET D2=-B*(E-.5)*D0: REM CALCULATE SECOND DERIVATIVE USING (G-5B)
2060 RETURN
2070 LET D2=-B*(E-1)*D0: REM CALCULATE SECOND DERIVATIVE USING (G-5A)
2080 RETURN

Several comments should be made about this program:

1. It consists of a main program, listed in Table G-1, plus two related subroutines, listed in Table G-2. The main program is a universal one, which can be used to solve any second-order ordinary differential equation. This is true because the numerical procedure it follows is universal; all such equations can be written in the form of (G-4) if $u$ represents any independent variable, $\psi$ represents any dependent variable, and $F$ represents any function of the independent variable and/or the dependent variable and/or the first derivative. As an example, the differential equation for a damped, sinusoidally driven, classical oscillator can be written as

$$\frac{d^2x}{dt^2} = F \qquad\text{where}\qquad F = -\frac{C}{m}x - \frac{f}{m}\frac{dx}{dt} + \frac{a}{m}\sin\omega t$$

with $m$ the mass, $C$ the force constant, $f$ the frictional constant, $a$ the amplitude of the driving force, and $\omega$ its angular frequency. Hence (G-7), and the following equations, can be applied to solve this differential equation if $u$ is replaced by $t$, $\psi$ is replaced by $x$, and $F$ is evaluated from the equation immediately above.

2. To make the universal character of the main program apparent, and to conform to the restrictions on variable names in BASIC, the symbols it uses internally to represent the independent variable, dependent variable, first derivative, and second derivative are I, D0, D1, and D2, instead of those used externally, that is: $u$, $\psi$, $d\psi/du$, and $F$.

3. The subroutines listed in Table G-2 cause the main program to solve the differential equation specified by (G-4) through (G-6). One of the subroutines inputs initial values of the variables and values of the parameters. The other calculates the second derivative. In doing this, they connect the symbols used internally and externally for the variables and do the same for the parameters, which are represented by B, E, and DEL internally and by $\beta$, $\epsilon$, and $\Delta u$ externally. A different set of subroutines must be written if the main program is to be used for a different differential equation.

4. Both the main program and the subroutines are liberally documented with REMark statements.
But they can be deleted, for the sake of rapid keyboard entry, if desired.

For the purpose of the illustrative calculations that we shall perform, any reasonable value of the parameter $\beta$ specifying the strength of the square well potential can be used. So we take, rather arbitrarily

$$\beta = 64 \tag{G-10}$$

We also must specify a numerical value of the energy parameter $\epsilon$ to use in the calculations. Now we know, from the qualitative arguments of Section 5-7, that in the interior region of the square well the lowest energy eigenfunction will look something like half of a cosine wave fitted into the region. However, it will have a longer wavelength since it does extend for some distance into the exterior regions. By evaluating the momentum $p$ corresponding to a half wavelength $\lambda/2 = a$ just fitting into the interior region, from de Broglie's relation $p = h/\lambda = h/2a$, we can use the corresponding energy $E = p^2/2m = h^2/8ma^2 = \pi^2\hbar^2/2ma^2$ to help us estimate the actual value of $E$, and save effort in the numerical calculations. In terms of $\epsilon$, the estimated value of $E$ is $\epsilon = E/V_0 = (\pi^2\hbar^2/2ma^2)/(32\hbar^2/ma^2) = \pi^2/64 = 0.1542$, where $V_0 = 32\hbar^2/ma^2$ follows from (G-6) with $\beta = 64$. Since $\lambda$ is an underestimate, $E$ and $\epsilon$ are overestimates. We therefore make an educated guess and try, in the initial calculations, the value $\epsilon = 0.1000$.

In consideration of what was learned in the qualitative arguments, it is apparent that the eigenfunction for the lowest allowed energy in the square well potential should be symmetrical about the point $u = 0$, relative to which the potential itself is symmetrical. This very much simplifies things because we need only carry out calculations in the range $u > 0$, and because the symmetry immediately leads to the conclusion that $d\psi(u)/du = 0$ at $u = 0$. We shall therefore start the calculations at $u = 0$. Since the choice of $\psi(u)$ at $u = 0$ is immaterial because of the linearity of the differential equation, we shall take $\psi(u) = +1.000$ at that point. Sufficient accuracy will be obtained by taking $\Delta u = 0.025$.
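The stepping procedure of (G-7) through (G-9), started from the values just chosen, can be sketched in Python as follows. This is a transcription under our own naming, not part of the BASIC program: `second_derivative` and `integrate` are hypothetical names, and computing u as n·Δu rather than by repeated addition is our own choice to limit round-off.

```python
def second_derivative(u, psi, beta, eps):
    """F of (G-5a)-(G-5c) for the square well."""
    if abs(abs(u) - 0.5) < 1e-5:          # edge of the well
        return -beta * (eps - 0.5) * psi
    if abs(u) > 0.5:                      # outside the well
        return -beta * (eps - 1.0) * psi
    return -beta * eps * psi              # inside the well


def integrate(eps, beta=64.0, du=0.025, u_max=0.95, psi0=1.0, dpsi0=0.0):
    """Return a list of (u, psi) pairs, starting at u = 0."""
    n_steps = round(u_max / du)
    u, psi = 0.0, psi0
    # (G-7a): carry the first derivative forward by half an increment
    dpsi = dpsi0 + second_derivative(u, psi, beta, eps) * du / 2.0
    points = [(u, psi)]
    for n in range(1, n_steps + 1):
        psi += dpsi * du                  # (G-7b), (G-8b), ...
        u = n * du                        # (G-7c), (G-8c), ...
        dpsi += second_derivative(u, psi, beta, eps) * du   # (G-8a), (G-9a), ...
        points.append((u, psi))
    return points


# Tabulate every fourth point for the first trial value of the energy parameter
for u, psi in integrate(0.1000)[::4]:
    print(f"{u:5.3f}  {psi:8.3f}")
```

Calling `integrate` with a different first argument repeats the calculation for another trial value of ε, and a larger `u_max` follows the solution farther into the exterior region.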
The results of the calculations are shown by the dots labeled $\epsilon = 0.1000$ in Figure G-2. The calculations were terminated at $u = 0.950$ because $\psi$ was rapidly going to $-\infty$. This happened because the chosen value of $\epsilon$ was too large. As a result, $\psi$ bends too rapidly in the interior region, and consequently it goes through zero just a little way outside this region. Once it goes through zero, nothing can prevent it from going to $-\infty$.

In an attempt to prevent the divergent behavior of $\psi$, a second set of calculations was performed. Because of the obvious sensitivity, the value of $\epsilon$ was reduced by only 2%, to $\epsilon = 0.0980$. The results are shown in Figure G-2 by the crosses labeled with this value of $\epsilon$. These calculations failed also, but in the opposite sense, because $\psi$ bent away from the axis in the exterior region and began to go to $+\infty$.

Figure G-2 Solutions to the time-independent Schroedinger equation for a square well potential with four values of the energy parameter $\epsilon$. (Vertical axis: $\psi$, from $-0.2$ to 1.0. Horizontal axis: $u$, from 0 to 1.8, with the edge of the well marked. Curves are labeled $\epsilon = 0.0980$, 0.0981, 0.0982, and 0.1000.)

The figure also shows results obtained in two more sets of calculations, using $\epsilon = 0.0981$ and $\epsilon = 0.0982$. None of them produced a solution to the differential equation which never diverges to infinity. But it is apparent that the divergence can be postponed more and more by getting closer and closer to a certain value of $\epsilon$, and that that allowed value of the energy parameter lies between 0.0981 and 0.0982.
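The trial-and-error search just described can be automated. The sketch below is one possible approach, not the procedure used to produce Figure G-2: it bisects on the sign of ψ far outside the well, using the fact that too small an ε sends ψ to +∞ and too large an ε sends it to −∞. The names `psi_far` and `bracket_eigenvalue`, the choice u = 1.5 as "far outside," and the 30 bisection steps are all our own assumptions.

```python
def second_derivative(u, psi, beta, eps):
    """F of (G-5a)-(G-5c) for the square well."""
    if abs(abs(u) - 0.5) < 1e-5:
        return -beta * (eps - 0.5) * psi
    if abs(u) > 0.5:
        return -beta * (eps - 1.0) * psi
    return -beta * eps * psi


def psi_far(eps, beta=64.0, du=0.025, u_far=1.5):
    """Integrate from u = 0 (psi = 1, dpsi/du = 0) and return psi at u_far."""
    n_steps = round(u_far / du)
    psi = 1.0
    dpsi = second_derivative(0.0, psi, beta, eps) * du / 2.0   # (G-7a)
    for n in range(1, n_steps + 1):
        psi += dpsi * du
        dpsi += second_derivative(n * du, psi, beta, eps) * du
    return psi


def bracket_eigenvalue(eps_low, eps_high, n_iter=30):
    """Narrow [eps_low, eps_high]; psi must go to +inf at eps_low, -inf at eps_high."""
    for _ in range(n_iter):
        mid = 0.5 * (eps_low + eps_high)
        if psi_far(mid) > 0.0:
            eps_low = mid      # psi still diverges upward: eps too small
        else:
            eps_high = mid     # psi diverges downward: eps too large
    return eps_low, eps_high


low, high = bracket_eigenvalue(0.0980, 0.1000)
print(low, high)
```

The bracket found this way is only as accurate as the step size Δu = 0.025 allows; it locates the eigenvalue of the discretized equation, not of the differential equation itself.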
Additional calculations can be used to narrow the limits, but it would be necessary to decrease the value of $\Delta u$ in order to reduce the numerical inaccuracy of the calculation.

A solution to the time-independent Schroedinger equation for this potential using analytic methods (see Appendix H) yields $\epsilon = 0.0980$ for the lowest allowed value of the energy parameter. The agreement with our numerical calculations is very good, but not perfect due to the numerical inaccuracy just mentioned. The analytic solution also shows that there are two additional bound allowed energies, corresponding to $\epsilon = 0.383$ and $\epsilon = 0.808$. Of course, any unbound energy, corresponding to $\epsilon > 1$, is allowed.

The procedure we have just used is sometimes called numerical integration. The second word is appropriate because we started with an equation containing $d^2\psi/dx^2$ and finally obtained $\psi$ itself; therefore, we have carried out a process which is the inverse of differentiation. If the student has access to a computer, of even the smallest size, he will find that by performing numerical integrations for bound and unbound states in various potentials he can rapidly develop a real intuitive feeling for many of the important features of quantum mechanics.

PROBLEMS

1. (a) Repeat the numerical integration of Appendix G for assumed values of $\epsilon$ of higher energy, and find the first excited state of the potential treated there. (Hint: (i) For this state, $\psi = 0$ at $u = 0$. (ii) Take $d\psi/du = +1$ at that point, since linearity allows it to have any value. (iii) The eigenfunction looks something like a full sine wave fitted into the region of the well.) (b) Find the second excited state by numerical integration.

2. Find, via numerical integration, an acceptable solution to the square well potential equation for a value of the energy parameter $\epsilon$ greater than one. Comment on the difference between the results obtained here and those obtained for the bound states.

3. (a) Use the numerical integration procedure, developed in Appendix G, to find the lowest allowed energy value $E_1$, and the form of the corresponding eigenfunction $\psi_1(x)$, for a particle of mass $m$ moving in the potential

$$V(x) = \begin{cases} \infty & x < -a/2 \ \text{or}\ x > +a/2 \\ 0 & -a/2 < x < +a/2 \end{cases}$$

As is proven in Chapter 6, since $V(x)$ increases without limit when $x$ is outside the region of length $a$, the particle is strictly prohibited from being found outside that region. Therefore $\psi_1(x)$ goes to zero at $x = \pm a/2$. Symmetry arguments show that for the lowest eigenfunction $d\psi_1(x)/dx$ is zero at $x = 0$. (Hint: The parameter $\beta$ cannot be defined in this problem, but the function $F$ can still be defined directly in terms of $E_1$.) (b) Compare the value of $E_1$ you obtain with the exact solution to this problem obtained analytically in Example 5-9.

4. Make the same calculation indicated in Problem 3, except for a potential containing a rectangular bump of height $v_0$ and width $a/2$, centered at the bottom of the binding region. That is

$$V(x) = \begin{cases} \infty & x < -a/2 \ \text{or}\ x > +a/2 \\ 0 & -a/2 < x < -a/4 \ \text{or}\ +a/4 < x < +a/2 \\ v_0 & -a/4 < x < +a/4 \\ v_0/2 & x = \pm a/2 \end{cases}$$

Take $v_0$ to have the value

$$v_0 = \frac{\pi^2\hbar^2}{8ma^2}$$

Problem 3 of Appendix H asks for an analytical solution to the time-independent Schroedinger equation for this potential. (Hint: A guess concerning an appropriate initial choice of $E_1$ can be obtained from the qualitative considerations of Problem 25 of Chapter 5.)

5. Use the numerical integration procedure developed in Appendix G to find the first two eigenfunctions and eigenvalues of a simple harmonic oscillator potential. (Hint: Use (I-7) from Appendix I to write the time-independent Schroedinger equation in the form $d^2\psi/du^2 = -(\epsilon - u^2)\psi$.) Compare the results you obtain with those obtained in Examples 5-3 and 6-7.

6. Use the numerical integration procedure developed in Appendix G to find the first three eigenfunctions and eigenvalues for an anharmonic oscillator with potential energy of the form

$$V(x) = \frac{C}{2}x^2 + \frac{D}{2}x^4$$

Convert the time-independent Schroedinger equation to the dimensionless form

$$\frac{d^2\psi}{du^2} = -(\epsilon - u^2 - \delta u^4)\psi$$

Then express $\delta$ in terms of $D$. Make calculations for the particular case $\delta = 0.25$. Compare the eigenvalues you obtain with the corresponding harmonic oscillator eigenvalues (that is, with those that would be obtained for $\delta = 0$). There is no analytical solution to the anharmonic oscillator time-independent Schroedinger equation; it can only be solved numerically.

Appendix H

ANALYTICAL SOLUTION OF THE TIME-INDEPENDENT SCHROEDINGER EQUATION FOR A SQUARE WELL POTENTIAL

Here we develop the general solution of the time-independent Schroedinger equation for the bound states of a square well potential of finite depth, following the procedure that is discussed in a qualitative way in Section 6-7. Then we apply the results to the particular case of a square well potential with the same parameters that were used in the numerical solution of Appendix G.

The description of the classical motion of a particle bound by a square well suggests that it would be most appropriate to look for solutions to the Schroedinger equation in the form of standing waves. Thus we take, as a general solution to the time-independent Schroedinger equation in the region $-a/2 < x < +a/2$ where $V(x) = 0$, the free particle standing wave eigenfunction of (6-62), which we write here as

$$\psi(x) = A\sin k_{\rm I}x + B\cos k_{\rm I}x \qquad -a/2 < x < +a/2 \tag{H-1}$$

where

$$k_{\rm I} = \frac{\sqrt{2mE}}{\hbar}$$

In the regions $x < -a/2$ and $x > +a/2$ the time-independent Schroedinger equation has the general solutions displayed in (6-63) and (6-64). These are

$$\psi(x) = Ce^{k_{\rm II}x} + De^{-k_{\rm II}x} \qquad x < -a/2 \tag{H-2}$$

and

$$\psi(x) = Fe^{k_{\rm II}x} + Ge^{-k_{\rm II}x} \qquad x > +a/2 \tag{H-3}$$

where

$$k_{\rm II} = \frac{\sqrt{2m(V_0 - E)}}{\hbar} \qquad\text{with } E < V_0$$

To determine the arbitrary constants first impose the requirement that the eigenfunctions remain finite for all $x$. Consider (H-2) in the limit $x \to -\infty$.
It is apparent that this requirement demands

$$D = 0 \tag{H-4}$$

Similarly, it is necessary to set

$$F = 0 \tag{H-5}$$

in order that (H-3) remain finite in the limit $x \to +\infty$. Next impose the requirement that the eigenfunctions and their first derivatives be continuous at $x = -a/2$ and $x = +a/2$. Four equations are obtained. They are

$$-A\sin(k_{\rm I}a/2) + B\cos(k_{\rm I}a/2) = Ce^{-k_{\rm II}a/2} \tag{H-6}$$

$$Ak_{\rm I}\cos(k_{\rm I}a/2) + Bk_{\rm I}\sin(k_{\rm I}a/2) = Ck_{\rm II}e^{-k_{\rm II}a/2} \tag{H-7}$$

$$A\sin(k_{\rm I}a/2) + B\cos(k_{\rm I}a/2) = Ge^{-k_{\rm II}a/2} \tag{H-8}$$

$$Ak_{\rm I}\cos(k_{\rm I}a/2) - Bk_{\rm I}\sin(k_{\rm I}a/2) = -Gk_{\rm II}e^{-k_{\rm II}a/2} \tag{H-9}$$

Subtracting (H-6) from (H-8) yields

$$2A\sin(k_{\rm I}a/2) = (G - C)e^{-k_{\rm II}a/2} \tag{H-10}$$

Adding (H-6) to (H-8) yields

$$2B\cos(k_{\rm I}a/2) = (G + C)e^{-k_{\rm II}a/2} \tag{H-11}$$

Subtracting (H-9) from (H-7) yields

$$2Bk_{\rm I}\sin(k_{\rm I}a/2) = (G + C)k_{\rm II}e^{-k_{\rm II}a/2} \tag{H-12}$$

Adding (H-9) to (H-7) yields

$$2Ak_{\rm I}\cos(k_{\rm I}a/2) = -(G - C)k_{\rm II}e^{-k_{\rm II}a/2} \tag{H-13}$$

Provided $B \neq 0$ and $(G + C) \neq 0$, we may divide (H-12) by (H-11) and obtain

$$k_{\rm I}\tan(k_{\rm I}a/2) = k_{\rm II} \qquad\text{if } B \neq 0 \text{ and } (G + C) \neq 0 \tag{H-14}$$

Provided $A \neq 0$ and $(G - C) \neq 0$, we may divide (H-13) by (H-10) and obtain

$$k_{\rm I}\cot(k_{\rm I}a/2) = -k_{\rm II} \qquad\text{if } A \neq 0 \text{ and } (G - C) \neq 0 \tag{H-15}$$

It is easy to see that both (H-14) and (H-15) cannot be satisfied simultaneously. If they could, the equation obtained by adding these two

$$k_{\rm I}\tan(k_{\rm I}a/2) + k_{\rm I}\cot(k_{\rm I}a/2) = 0$$

would be valid. Multiply through by $\tan(k_{\rm I}a/2)$. Then the equation becomes

$$k_{\rm I}\tan^2(k_{\rm I}a/2) + k_{\rm I} = 0 \quad\text{or}\quad \tan^2(k_{\rm I}a/2) = -1$$

But this cannot be valid as both $k_{\rm I}$ and $a/2$ are real. Thus it is only possible either to satisfy (H-14) but not (H-15) or to satisfy (H-15) but not (H-14). The eigenfunctions of the square well potential form two classes.
For the first class

$$k_{\rm I}\tan(k_{\rm I}a/2) = k_{\rm II} \qquad A = 0 \qquad G - C = 0 \tag{H-16}$$

Then (H-8) reads

$$B\cos(k_{\rm I}a/2) = Ge^{-k_{\rm II}a/2} \qquad G = B\cos(k_{\rm I}a/2)\,e^{k_{\rm II}a/2} = C$$

and the eigenfunctions are

$$\psi(x) = \begin{cases} \left[B\cos(k_{\rm I}a/2)\,e^{k_{\rm II}a/2}\right]e^{k_{\rm II}x} & x < -a/2 \\ \left[B\right]\cos(k_{\rm I}x) & -a/2 < x < a/2 \\ \left[B\cos(k_{\rm I}a/2)\,e^{k_{\rm II}a/2}\right]e^{-k_{\rm II}x} & x > a/2 \end{cases} \tag{H-17}$$

For the second class

$$k_{\rm I}\cot(k_{\rm I}a/2) = -k_{\rm II} \qquad B = 0 \qquad G + C = 0 \tag{H-18}$$

Then (H-8) reads

$$A\sin(k_{\rm I}a/2) = Ge^{-k_{\rm II}a/2} \qquad G = A\sin(k_{\rm I}a/2)\,e^{k_{\rm II}a/2} = -C$$

and the eigenfunctions are

$$\psi(x) = \begin{cases} \left[-A\sin(k_{\rm I}a/2)\,e^{k_{\rm II}a/2}\right]e^{k_{\rm II}x} & x < -a/2 \\ \left[A\right]\sin(k_{\rm I}x) & -a/2 < x < a/2 \\ \left[A\sin(k_{\rm I}a/2)\,e^{k_{\rm II}a/2}\right]e^{-k_{\rm II}x} & x > a/2 \end{cases} \tag{H-19}$$

Consider the first of (H-16). Evaluating $k_{\rm I}$ and $k_{\rm II}$, and multiplying through by $a/2$, the equation becomes

$$\sqrt{mEa^2/2\hbar^2}\,\tan\left(\sqrt{mEa^2/2\hbar^2}\right) = \sqrt{m(V_0 - E)a^2/2\hbar^2} \tag{H-20}$$

For a given particle of mass $m$ and a given potential well of depth $V_0$ and width $a$, this is an equation in the single unknown $E$. Its solutions are the allowed values of the total energy of the particle, that is, the eigenvalues for eigenfunctions of the first class. Solutions of this transcendental equation can be obtained only by numerical or graphical methods. We present a simple graphical method which will illustrate the important features of the equation. Let us make the change of variable

$$\mathcal{E} \equiv \sqrt{mEa^2/2\hbar^2} \tag{H-21}$$

so the equation becomes

$$\mathcal{E}\tan\mathcal{E} = \sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2} \tag{H-22}$$

If we plot the function

$$p(\mathcal{E}) = \mathcal{E}\tan\mathcal{E}$$

and the function

$$q(\mathcal{E}) = \sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2}$$

the intersections specify values of $\mathcal{E}$ which are solutions to (H-22). Such a plot is shown in Figure H-1. The function $p(\mathcal{E})$ has zeros at $\mathcal{E} = 0, \pi, 2\pi, \ldots$ and has asymptotes at $\mathcal{E} = \pi/2, 3\pi/2, 5\pi/2, \ldots$. The function $q(\mathcal{E})$ is a quarter-circle of radius $\sqrt{mV_0a^2/2\hbar^2}$. It is clear from the figure that the number of solutions which exist for (H-22) depends on the radius of the quarter-circle. Each solution gives an eigenvalue for $E < V_0$ corresponding to an eigenfunction of the first class. There exists one such eigenvalue if $\sqrt{mV_0a^2/2\hbar^2} < \pi$; two if $\pi < \sqrt{mV_0a^2/2\hbar^2} < 2\pi$; three if $2\pi < \sqrt{mV_0a^2/2\hbar^2} < 3\pi$; etc. The case $\sqrt{mV_0a^2/2\hbar^2} = 4$ is illustrated in the figure. Note that this corresponds to $2mV_0a^2/\hbar^2 = 64$, the value used in the numerical integration of Appendix G. For this case accurate graphical (or numerical) work shows that there are two solutions: $\mathcal{E} \simeq 1.252$ and $\mathcal{E} \simeq 3.595$. From (H-21), the eigenvalues are

$$E = \mathcal{E}^2\frac{2\hbar^2}{ma^2} = \mathcal{E}^2\frac{2\hbar^2}{mV_0a^2}V_0 \simeq \frac{(1.252)^2}{16}V_0 \simeq 0.0980\,V_0$$

and

$$E = \mathcal{E}^2\frac{2\hbar^2}{mV_0a^2}V_0 \simeq \frac{(3.595)^2}{16}V_0 \simeq 0.808\,V_0$$

Figure H-1 A graphical solution of the equation for eigenvalues of the first class of a particular square well potential. Solution of $\mathcal{E}\tan\mathcal{E} = \sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2}$, or $p(\mathcal{E}) = q(\mathcal{E})$.

The eigenvalues corresponding to eigenfunctions of the second class are found from the solutions of an analogous equation obtained from (H-18), which is

$$-\mathcal{E}\cot\mathcal{E} = \sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2} \tag{H-23}$$

Figure H-2 illustrates the solution of this equation. It is apparent that there will be no eigenvalues for $E < V_0$ corresponding to eigenfunctions of the second class if $\sqrt{mV_0a^2/2\hbar^2} < \pi/2$; there will be one if $\pi/2 < \sqrt{mV_0a^2/2\hbar^2} < 3\pi/2$; two if $3\pi/2 < \sqrt{mV_0a^2/2\hbar^2} < 5\pi/2$; etc. The figure illustrates the case $\sqrt{mV_0a^2/2\hbar^2} = 4$. The single solution to (H-23) is $\mathcal{E} \simeq 2.475$, and the eigenvalue is

$$E = \mathcal{E}^2\frac{2\hbar^2}{mV_0a^2}V_0 \simeq \frac{(2.475)^2}{16}V_0 \simeq 0.383\,V_0$$

Figure H-2 A graphical solution of the equation for eigenvalues of the second class of a particular square well potential. Solution of $-\mathcal{E}\cot\mathcal{E} = \sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2}$, or $r(\mathcal{E}) = q(\mathcal{E})$.

We see that for a given potential well there are only a restricted number of allowed values of total energy $E$ for $E < V_0$. These are the discrete eigenvalues for the bound states of the particle. On the other hand, we know that any value of $E$ is allowed for $E > V_0$; the eigenvalues for the unbound states form a continuum.

For a potential well which is very shallow or very narrow or both, only a single eigenvalue of the first class will be bound.
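The values quoted for the case $\sqrt{mV_0a^2/2\hbar^2} = 4$ can be checked with a short numerical sketch in place of the graphical construction. The approach below is our own choice, not the book's: (H-22) is multiplied through by $\cos\mathcal{E}$ and (H-23) by $\sin\mathcal{E}$ so that the poles of the tangent and cotangent disappear, and the roots are then located by a grid scan followed by bisection. All function names and the grid size are hypothetical.

```python
import math

STRENGTH = 4.0   # sqrt(m*V0*a^2 / (2*hbar^2)), the case treated in the text


def q(e):
    """Quarter-circle q of radius STRENGTH."""
    return math.sqrt(max(STRENGTH**2 - e**2, 0.0))


def first_class(e):
    # (H-22) multiplied through by cos(e): e sin(e) - q cos(e) = 0
    return e * math.sin(e) - q(e) * math.cos(e)


def second_class(e):
    # (H-23) multiplied through by sin(e): e cos(e) + q sin(e) = 0
    return e * math.cos(e) + q(e) * math.sin(e)


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f changes sign there."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def roots(f, n_grid=400):
    """Scan 0 < e <= STRENGTH for sign changes and refine each one."""
    xs = [STRENGTH * i / n_grid for i in range(1, n_grid + 1)]
    return [bisect(f, lo, hi) for lo, hi in zip(xs, xs[1:])
            if (f(lo) > 0) != (f(hi) > 0)]


first = roots(first_class)
second = roots(second_class)
for e in sorted(first + second):
    print(f"script-E = {e:.3f}   E/V0 = {(e / STRENGTH) ** 2:.4f}")
```

Changing `STRENGTH` shows how the number of bound eigenvalues of each class depends on the radius of the quarter-circle.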
With increasing values of $\sqrt{mV_0a^2/2\hbar^2}$ an eigenvalue of the second class will be bound. For even larger values of this parameter an additional eigenvalue of the first class will be bound. Next, an additional eigenvalue of the second class will be bound, etc. As an example consider the case $\sqrt{mV_0a^2/2\hbar^2} = 4$. The potential and the discrete and continuum eigenvalues are illustrated to scale in Figure H-3. We have used the quantum numbers $n = 1, 2, 3, 4, 5, \ldots$ to label the eigenvalues in order of increasing energy. For this potential only the first three eigenvalues are bound.

Figure H-3 The eigenvalues of a particular square well potential. (The diagram shows the continuum above $V_0$ and the bound levels $E_3 = 0.808V_0$ (1st class), $E_2 = 0.383V_0$ (2nd class), and $E_1 = 0.0980V_0$ (1st class).)

From the solutions $\mathcal{E}$ of (H-22) and (H-23) for a given value of $\sqrt{mV_0a^2/2\hbar^2}$, the explicit forms of the eigenfunctions, (H-17) and (H-19), may be evaluated. The required relations are

$$k_{\rm I} = \frac{\mathcal{E}}{a/2} \qquad\text{and}\qquad k_{\rm II} = \frac{\sqrt{mV_0a^2/2\hbar^2 - \mathcal{E}^2}}{a/2} \tag{H-24}$$

The value of the constant $A$ or $B$ must be adjusted so that each eigenfunction satisfies the normalization condition. For the case $\sqrt{mV_0a^2/2\hbar^2} = 4$, the three normalized eigenfunctions corresponding to the eigenvalues $E_1$, $E_2$, and $E_3$ are

$$\psi_1(x) = \frac{1}{\sqrt{a}}\begin{cases} 17.9\,e^{3.80\,x/(a/2)} & x \le -a/2 \\ 1.26\cos\left(1.25\,\dfrac{x}{a/2}\right) & -a/2 < x < a/2 \\ 17.9\,e^{-3.80\,x/(a/2)} & x \ge a/2 \end{cases}$$

$$\psi_2(x) = \frac{1}{\sqrt{a}}\begin{cases} -18.6\,e^{3.16\,x/(a/2)} & x \le -a/2 \\ 1.23\sin\left(2.48\,\dfrac{x}{a/2}\right) & -a/2 < x < a/2 \\ 18.6\,e^{-3.16\,x/(a/2)} & x \ge a/2 \end{cases} \tag{H-25}$$

$$\psi_3(x) = \frac{1}{\sqrt{a}}\begin{cases} -5.80\,e^{1.74\,x/(a/2)} & x \le -a/2 \\ 1.13\cos\left(3.60\,\dfrac{x}{a/2}\right) & -a/2 < x < a/2 \\ -5.80\,e^{-1.74\,x/(a/2)} & x \ge a/2 \end{cases}$$

The eigenfunctions, multiplied by $\sqrt{a}$, are plotted in Figure H-4 as a function of $x/(a/2)$.

Figure H-4 The eigenfunctions for the bound eigenstates of a particular square well potential.

PROBLEMS

1.
Use a trial-and-error numerical procedure to find with three-decimal-place accuracy the solutions to the transcendental equations (H-20) and (H-23) for $\sqrt{mV_0a^2/2\hbar^2} = 4$. Thereby verify the values quoted in Appendix H.

2. Use a graphical procedure to find with one-decimal-place accuracy all the solutions to the transcendental equations (H-20) and (H-23) for $\sqrt{mV_0a^2/2\hbar^2} = 5$. (Hint: Additions to Figures H-1 and H-2 will yield results of sufficient accuracy.)

3. Obtain an analytical solution, as in Appendix H, to find the first eigenvalue of the potential

$$V(x) = \begin{cases} \infty & x < -a/2 \ \text{or}\ x > +a/2 \\ 0 & -a/2 < x < -a/4 \ \text{or}\ +a/4 < x < +a/2 \\ v_0 & -a/4 < x < +a/4 \end{cases}$$

where $v_0 = \pi^2\hbar^2/8ma^2$. Compare with the numerical integration of Problem 4 of Appendix G. (Hint: (i) Because of the symmetry of $V(x)$, the first eigenfunction $\psi$ must be of even parity. This means there can be no sine term in the form assumed by $\psi$ in the region $-a/4 < x < +a/4$ surrounding $x = 0$. (ii) Because of this symmetry, it is necessary only to match $\psi$ and $d\psi/dx$ at $x = +a/4$, and to make $\psi = 0$ at $x = +a/2$.)

Appendix I

SERIES SOLUTION OF THE TIME-INDEPENDENT SCHROEDINGER EQUATION FOR A SIMPLE HARMONIC OSCILLATOR POTENTIAL

In this appendix we shall use analytical techniques to solve the time-independent Schroedinger equation for a particle of mass $m$ bound in the simple harmonic oscillator potential

$$V(x) = \frac{C}{2}x^2 \tag{I-1}$$

where $C$ is the force constant of the corresponding linear restoring force. These techniques are worth studying not only because of the importance of the simple harmonic oscillator, but also because the solution of the time-independent Schroedinger equation for the even more important one-electron atom involves techniques which are almost identical. Mathematically inclined students will, furthermore, find them to be quite interesting.
The time-independent Schroedinger equation for the potential is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{C}{2}x^2\psi = E\psi \tag{I-2}$$

If we evaluate the force constant $C$ in terms of the classical oscillation frequency $\nu = (1/2\pi)(C/m)^{1/2}$, the equation becomes

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + 2\pi^2m\nu^2x^2\psi = E\psi$$

or

$$\frac{d^2\psi}{dx^2} + \left[\frac{2mE}{\hbar^2} - \left(\frac{2\pi m\nu}{\hbar}\right)^2x^2\right]\psi = 0$$

Introducing the parameters

$$\alpha = 2\pi m\nu/\hbar \qquad\text{and}\qquad \beta = 2mE/\hbar^2 \tag{I-4}$$

the equation assumes the more compact form

$$\frac{d^2\psi}{dx^2} + (\beta - \alpha^2x^2)\psi = 0$$

It is convenient to express this in terms of the dimensionless variable

$$u = \sqrt{\alpha}\,x = \left[\frac{2\pi m}{\hbar}\,\frac{1}{2\pi}\left(\frac{C}{m}\right)^{1/2}\right]^{1/2}x = \frac{(Cm)^{1/4}}{\hbar^{1/2}}\,x$$

We have

$$\frac{d\psi}{dx} = \frac{d\psi}{du}\frac{du}{dx} = \sqrt{\alpha}\,\frac{d\psi}{du}$$

and

$$\frac{d^2\psi}{dx^2} = \frac{d}{du}\left(\frac{d\psi}{dx}\right)\frac{du}{dx} = \alpha\,\frac{d^2\psi}{du^2}$$

So the equation becomes

$$\alpha\,\frac{d^2\psi}{du^2} + (\beta - \alpha u^2)\psi = 0$$

or

$$\frac{d^2\psi}{du^2} + \left(\frac{\beta}{\alpha} - u^2\right)\psi = 0 \tag{I-7}$$

We must find solutions for which $\psi(u)$ and its first derivative are single valued, continuous, and finite, for all $u$ from $-\infty$ to $+\infty$. The first two conditions will automatically be satisfied by the solutions we shall obtain. However, it will be necessary to take explicit consideration of the requirement that $\psi(u)$ remain finite as $|u| \to \infty$. For this purpose it is useful first to consider the form of $\psi(u)$ for very large values of $|u|$.

Now for any finite value of the total energy $E$, the quantity $\beta/\alpha$ becomes negligible compared to $u^2$ for very large values of $|u|$. Thus we may write, from (I-7)

$$\frac{d^2\psi}{du^2} = u^2\psi \qquad |u| \to \infty \tag{I-8}$$

The general solution to this differential equation is

$$\psi = Ae^{-u^2/2} + Be^{u^2/2} \tag{I-9}$$

where $A$ and $B$ are arbitrary constants. We verify that this is a solution to (I-8) by calculating

$$\frac{d\psi}{du} = A(-u)e^{-u^2/2} + Bue^{u^2/2}$$

and

$$\frac{d^2\psi}{du^2} = A(-u)^2e^{-u^2/2} - Ae^{-u^2/2} + Bu^2e^{u^2/2} + Be^{u^2/2} = A(u^2 - 1)e^{-u^2/2} + B(u^2 + 1)e^{u^2/2}$$

Since, for $|u| \to \infty$, this is essentially

$$\frac{d^2\psi}{du^2} = Au^2e^{-u^2/2} + Bu^2e^{u^2/2}$$

or

$$\frac{d^2\psi}{du^2} = u^2\left(Ae^{-u^2/2} + Be^{u^2/2}\right) = u^2\psi$$

it is obvious that it satisfies (I-8) identically. Next we apply the condition that the eigenfunction must remain finite as $|u| \to \infty$.
It is apparent from (I-9) that this requires us to set $B = 0$. Thus the form of the eigenfunctions for very large $|u|$ must be

$$\psi(u) = Ae^{-u^2/2} \qquad |u| \to \infty \tag{I-10}$$

The form we have found in (I-10) suggests that we search for solutions to the full-fledged differential equation, (I-7), that can be written

$$\psi(u) = Ae^{-u^2/2}H(u) \tag{I-11}$$

These solutions are to be valid for all $u$. So the $H(u)$ must be functions which are slowly varying compared to $e^{-u^2/2}$ as $|u| \to \infty$, in order that (I-11) agree with (I-10). Elsewhere, the $H(u)$ must have whatever forms are required to yield the correct forms for the $\psi(u)$. To evaluate the $H(u)$, we calculate

$$\frac{d\psi}{du} = -Aue^{-u^2/2}H + Ae^{-u^2/2}\frac{dH}{du}$$

and

$$\frac{d^2\psi}{du^2} = -Ae^{-u^2/2}H + Au^2e^{-u^2/2}H - Aue^{-u^2/2}\frac{dH}{du} - Aue^{-u^2/2}\frac{dH}{du} + Ae^{-u^2/2}\frac{d^2H}{du^2} = Ae^{-u^2/2}\left(-H + u^2H - 2u\frac{dH}{du} + \frac{d^2H}{du^2}\right)$$

Then we substitute $\psi$ and $d^2\psi/du^2$ into (I-7), to obtain

$$Ae^{-u^2/2}\left(-H + u^2H - 2u\frac{dH}{du} + \frac{d^2H}{du^2}\right) + \frac{\beta}{\alpha}Ae^{-u^2/2}H - Au^2e^{-u^2/2}H = 0$$

Dividing by $Ae^{-u^2/2}$, and cancelling the terms involving $u^2H$, we have

$$\frac{d^2H}{du^2} - 2u\frac{dH}{du} + \left(\frac{\beta}{\alpha} - 1\right)H = 0 \tag{I-12}$$

This differential equation determines the functions $H(u)$.

Let us recapitulate. We started with the time-independent Schroedinger equation, (I-7). For reasons that will be explained, this equation cannot be directly solved. However, by writing the solutions to the equation as products of the function $Ae^{-u^2/2}$, which is the form of the solutions for $|u| \to \infty$, times the functions $H(u)$, we transform the problem to one of solving (I-12). This equation is solvable by means of the power series technique. In this, the most general technique available for the analytical solution of a differential equation, we begin by assuming that the solution can be written as a power series in the independent variable. That is, we assume

$$H(u) = \sum_{l=0}^{\infty}a_lu^l = a_0 + a_1u + a_2u^2 + a_3u^3 + \cdots \tag{I-13}$$

The coefficients $a_0, a_1, a_2, \ldots$ are then determined by substituting (I-13) into (I-12), and demanding that the resulting equation be satisfied for any value of $u$. Calculating the derivatives

$$\frac{dH}{du} = \sum_{l=1}^{\infty}la_lu^{l-1} = 1a_1 + 2a_2u + 3a_3u^2 + \cdots$$

and

$$\frac{d^2H}{du^2} = \sum_{l=2}^{\infty}(l-1)la_lu^{l-2} = 1\cdot 2a_2 + 2\cdot 3a_3u + 3\cdot 4a_4u^2 + \cdots$$

and substituting them into the differential equation, we obtain

$$\begin{aligned} &1\cdot 2a_2 + 2\cdot 3a_3u + 3\cdot 4a_4u^2 + 4\cdot 5a_5u^3 + \cdots \\ &\quad - 2\cdot 1a_1u - 2\cdot 2a_2u^2 - 2\cdot 3a_3u^3 - \cdots \\ &\quad + (\beta/\alpha - 1)a_0 + (\beta/\alpha - 1)a_1u + (\beta/\alpha - 1)a_2u^2 + (\beta/\alpha - 1)a_3u^3 + \cdots = 0 \end{aligned}$$

Since this is to be true for all values of $u$, the coefficients of each power of $u$ must vanish individually so that the validity of the equation will not depend on the value of $u$. Gathering the coefficients together, and equating them to zero, we have

$$\begin{aligned} u^0&: \quad 1\cdot 2a_2 + (\beta/\alpha - 1)a_0 = 0 \\ u^1&: \quad 2\cdot 3a_3 + (\beta/\alpha - 1 - 2\cdot 1)a_1 = 0 \\ u^2&: \quad 3\cdot 4a_4 + (\beta/\alpha - 1 - 2\cdot 2)a_2 = 0 \\ u^3&: \quad 4\cdot 5a_5 + (\beta/\alpha - 1 - 2\cdot 3)a_3 = 0 \end{aligned}$$

For the $l$th power of $u$, the relation is

$$u^l: \quad (l+1)(l+2)a_{l+2} + (\beta/\alpha - 1 - 2l)a_l = 0$$

or

$$a_{l+2} = -\frac{(\beta/\alpha - 1 - 2l)}{(l+1)(l+2)}\,a_l \tag{I-14}$$

This is called the recursion relation. The relation allows us to calculate, successively, the coefficients $a_2, a_4, a_6, \ldots$ in terms of $a_0$, and the coefficients $a_3, a_5, a_7, \ldots$ in terms of $a_1$. The coefficients $a_0$ and $a_1$ are not specified by the recursion relation, but this is as it should be. Since the differential equation for $H(u)$ contains a second derivative, its general solution should contain two arbitrary constants. We see then that the general solution splits up into two independent series, which we write as

$$H(u) = a_0\left(1 + \frac{a_2}{a_0}u^2 + \frac{a_4}{a_2}\frac{a_2}{a_0}u^4 + \frac{a_6}{a_4}\frac{a_4}{a_2}\frac{a_2}{a_0}u^6 + \cdots\right) + a_1\left(u + \frac{a_3}{a_1}u^3 + \frac{a_5}{a_3}\frac{a_3}{a_1}u^5 + \frac{a_7}{a_5}\frac{a_5}{a_3}\frac{a_3}{a_1}u^7 + \cdots\right) \tag{I-15}$$

The ratios $a_{l+2}/a_l$ are given by the recursion relation. The first series is an even function of $u$, and the second series is an odd function of that variable.
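The recursion relation (I-14) is easy to iterate by machine. The sketch below is only an illustration under stated assumptions: the function name and its arguments are hypothetical, exact rational arithmetic is used so that a vanishing coefficient is exactly zero, and the sample values of $\beta/\alpha$ are chosen for demonstration. For an odd-integer value such as $\beta/\alpha = 5$ the recursion produces a zero coefficient and the even series stops, whereas for a value such as $\beta/\alpha = 4$ it does not stop.

```python
from fractions import Fraction


def series_coefficients(beta_over_alpha, a0, a1, max_power):
    """Coefficients a_0 .. a_max_power of H(u) generated by recursion (I-14).

    beta_over_alpha must be an int or a Fraction so the arithmetic is exact.
    a0 and a1 are the two arbitrary constants of the general solution.
    """
    a = [Fraction(0)] * (max_power + 1)
    a[0] = Fraction(a0)
    if max_power >= 1:
        a[1] = Fraction(a1)
    for l in range(max_power - 1):
        # a_{l+2} = -(beta/alpha - 1 - 2l) / ((l+1)(l+2)) * a_l
        factor = Fraction(beta_over_alpha - 1 - 2 * l, (l + 1) * (l + 2))
        a[l + 2] = -factor * a[l]
    return a


if __name__ == "__main__":
    # Even series with beta/alpha = 5: only a_0 and a_2 survive.
    print(series_coefficients(5, 1, 0, 8))
    # Even series with beta/alpha = 4: every even coefficient is non-zero.
    print(series_coefficients(4, 1, 0, 8))
```

With $a_0 = 1$ and $\beta/\alpha = 5$ the surviving terms are $1 - 2u^2$; with $a_1 = 1$ and $\beta/\alpha = 7$ they are $u - \tfrac{2}{3}u^3$.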
The reason why (I-7) cannot be directly solved by application of the power series technique is that it leads to a recursion relation involving more than two coefficients. The student can show this immediately by applying the technique. If he then attempts to write an equation analogous to (I-15), he will see that the technique fails because there can be only two arbitrary constants in the solution of an equation containing a second derivative. We were able to circumvent the difficulty by transforming the problem to one of solving (I-12). Essentially the same trick is successful for the differential equations that arise from the time-independent Schroedinger equation for the Coulomb potential, $V(r) \propto r^{-1}$, of a one-electron atom. There are other potentials for which the trick does not work, and there is no analytical solution. Of course, any potential can be treated by the numerical techniques of Appendix G.

For an arbitrary value of $\beta/\alpha$, both the even and the odd series of (I-15) will contain an infinite number of terms. As we shall see, this will not lead to acceptable eigenfunctions. Consider either series, and evaluate the ratio of the coefficients of successive powers of $u$ for large $l$. This gives

$$\frac{a_{l+2}}{a_l} = -\frac{(\beta/\alpha - 1 - 2l)}{(l+1)(l+2)} \simeq \frac{2l}{l^2} = \frac{2}{l}$$

Let us compare it with the same ratio for the power series expansion of the function $e^{u^2}$, which is

$$e^{u^2} = 1 + u^2 + \frac{u^4}{2!} + \frac{u^6}{3!} + \cdots + \frac{u^l}{(l/2)!} + \frac{u^{l+2}}{(l/2+1)!} + \cdots$$

For large $l$, the ratio of the coefficients of successive powers of $u$ is

$$\frac{1/(l/2+1)!}{1/(l/2)!} = \frac{(l/2)!}{(l/2+1)!} = \frac{(l/2)!}{(l/2+1)(l/2)!} = \frac{1}{l/2+1} \simeq \frac{1}{l/2} = \frac{2}{l}$$

The two ratios are the same. This means that the terms of high power in $u$ in the series for $e^{u^2}$ can differ from the corresponding terms in the even series of $H(u)$ by nothing more than a multiplicative constant $K$. They can only differ from the terms in the odd series of $H(u)$ by $u$ times another constant $K'$.
But, for $|u| \to \infty$, the terms of low power in $u$ are not important in determining the value of any of these series. Consequently, we conclude that

$$H(u) = a_0Ke^{u^2} + a_1K'ue^{u^2} \qquad |u| \to \infty$$

According to (I-11), the solutions to the time-independent Schroedinger equation are

$$\psi(u) = Ae^{-u^2/2}H(u)$$

Thus, if the series of $H(u)$ contain an infinite number of terms, the behavior of these solutions for $|u| \to \infty$ is

$$Ae^{-u^2/2}H(u) = a_0AKe^{u^2/2} + a_1AK'ue^{u^2/2}$$

But this increases without limit as $|u| \to \infty$, which is not acceptable behavior for an eigenfunction.

Acceptable eigenfunctions can be obtained, however, for certain values of $\beta/\alpha$. We set either the arbitrary constant $a_0$, or the arbitrary constant $a_1$, equal to zero. Then we force the remaining series of $H(u)$ to terminate by setting

$$\beta/\alpha = 2n + 1 \tag{I-16}$$

where

$$n = 1, 3, 5, \ldots \quad\text{if } a_0 = 0 \qquad\qquad n = 0, 2, 4, \ldots \quad\text{if } a_1 = 0$$

It is clear from (I-14) that such a choice of $\beta/\alpha$ will cause the series to terminate at the $n$th term since we shall have, for $l = n$

$$a_{n+2} = -\frac{(\beta/\alpha - 1 - 2n)}{(n+1)(n+2)}\,a_n = -\frac{(2n + 1 - 1 - 2n)}{(n+1)(n+2)}\,a_n = 0$$

The coefficients $a_{n+4}, a_{n+6}, a_{n+8}, \ldots$ will also be zero since they are proportional to $a_{n+2}$. The resulting solutions $H_n(u)$ are polynomials of order $u^n$, called Hermite polynomials. Each $H_n(u)$ can be evaluated from (I-15) by calculating the coefficients from the recursion relation with $\beta/\alpha$ given by (I-16) for that value of $n$. The first few Hermite polynomials can be seen in Table 6-1. They are the factors multiplying $A_ne^{-u^2/2}$ in the entries of the table. (In each case the arbitrary constant $a_0$ or $a_1$ has been chosen so that the coefficient of each power of $u$ can be written as a simple integer.)

For the polynomial solutions to the Hermite differential equation, (I-12), the corresponding eigenfunctions

$$\psi_n(u) = A_ne^{-u^2/2}H_n(u) \tag{I-17}$$

will always have the acceptable behavior of going to zero as $|u| \to \infty$. The reason is that, for large $|u|$, the exponential function $e^{-u^2/2}$ varies so much more rapidly than the polynomial $H_n(u)$ that it completely dominates the behavior of the eigenfunctions.

Evaluating $\alpha$ and $\beta$ from (I-4), we obtain immediately from (I-16)

$$\frac{\beta}{\alpha} = \frac{2mE}{\hbar^2}\,\frac{\hbar}{2\pi m\nu} = \frac{2E}{2\pi\hbar\nu} = \frac{2E}{h\nu} = 2n + 1$$

or

$$E = \left(n + \tfrac{1}{2}\right)h\nu \qquad n = 0, 1, 2, 3, \ldots \tag{I-18}$$

These are the eigenvalues of the simple harmonic oscillator potential, expressed in terms of its classical oscillation frequency $\nu$.

PROBLEMS

1. Determine the forms of the first five simple harmonic oscillator eigenfunctions by evaluating the coefficients of the polynomials from the recursion relation developed in Appendix I.

2. Carry through, as far as possible, an attempt to make a direct series solution of (I-7) of Appendix I. Explain clearly why the attempt fails.

Appendix J

TIME-INDEPENDENT PERTURBATION THEORY

The technique employed in Appendix I to solve the time-independent Schroedinger equation for the simple harmonic oscillator potential will not, in general, be of use in the case of a potential of arbitrary form $V(x)$. What happens is that the recursion relation is found to involve more than two coefficients, making it impossible to find analytical solutions to the differential equation. In such cases the equation can always be solved by numerical integration in the manner described in Appendix G. In addition, there are approximation techniques that are very useful for treating certain potentials. The study of one of these techniques forms the subject of time-independent perturbation theory, to which this Appendix is devoted.
TIME-INDEPENDENT PERTURBATIONS

Consider a potential $V'(x)$, for which it is either difficult or impossible to solve the time-independent Schroedinger equation analytically, but which can be decomposed as follows

$$V'(x) = V(x) + v(x) \tag{J-1}$$

where $V(x)$ is a potential for which the time-independent Schroedinger equation has been solved, and where $v(x)$ is a potential that is small compared to $V(x)$. We shall develop expressions from which it will be easy to obtain good approximations to the eigenvalues and eigenfunctions of the perturbed potential $V'(x)$, in terms of the perturbation $v(x)$ and the known eigenvalues and eigenfunctions of the unperturbed potential $V(x)$. An example of (J-1) is illustrated in Figure J-1. The potential $V'(x)$ has been decomposed into a square well potential $V(x)$, plus a perturbation $v(x)$ which is small compared to $V(x)$.

Figure J-1 Illustrating the decomposition of a perturbed potential into an unperturbed potential plus a perturbation.

Let us write some particular perturbed eigenfunction $\psi'_n(x)$ as a linear combination of the unperturbed eigenfunctions $\psi_l(x)$. That is, we write

$$\psi'_n(x) = \sum_l a_{nl}\psi_l(x) \tag{J-2}$$

The coefficients $a_{nl}$ specify how much of each of the $\psi_l(x)$ is contained in $\psi'_n(x)$. The summation runs over all the values of the quantum number $l$, including those in the continuum. The unperturbed eigenfunctions are solutions to the time-independent Schroedinger equation for the potential $V$, which is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_l}{dx^2} + V\psi_l = E_l\psi_l \tag{J-3a}$$

The perturbed eigenfunctions are solutions to the same equation for the potential $V'$, which is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi'_n}{dx^2} + V'\psi'_n = E'_n\psi'_n$$

Using (J-1) this can be written

$$-\frac{\hbar^2}{2m}\frac{d^2\psi'_n}{dx^2} + V\psi'_n + v\psi'_n = E'_n\psi'_n \tag{J-3b}$$

Here $E_l$ and $E'_n$ are, respectively, the unperturbed and perturbed eigenvalues.
Now substitute (J-2) into (J-3b), to obtain

$$\sum_l a_{nl}\left[-\frac{\hbar^2}{2m}\frac{d^2\psi_l}{dx^2} + V\psi_l\right] + \sum_l a_{nl}v\psi_l = \sum_l a_{nl}E'_n\psi_l$$

According to (J-3a) the bracket is equal to $E_l\psi_l$. Thus we have

$$\sum_l a_{nl}E_l\psi_l + \sum_l a_{nl}v\psi_l = \sum_l a_{nl}E'_n\psi_l$$

or

$$\sum_l a_{nl}(E'_n - E_l)\psi_l = \sum_l a_{nl}v\psi_l$$

Multiplying through by the complex conjugate of a certain unperturbed eigenfunction $\psi_m$, and integrating over all $x$, we have

$$\sum_l a_{nl}(E'_n - E_l)\int_{-\infty}^{\infty}\psi_m^*\psi_l\,dx = \sum_l a_{nl}\int_{-\infty}^{\infty}\psi_m^*v\psi_l\,dx \tag{J-4}$$

The unperturbed eigenfunctions are, necessarily, orthogonal. That is, they have the orthogonality property described by the equation

$$\int_{-\infty}^{\infty}\psi_m^*\psi_l\,dx = 0 \qquad m \neq l \tag{J-5}$$

This is true for any two different eigenfunctions of any particular potential. See Problem 27 of Chapter 6, Example 9-1a, and, particularly, Problem 10 of Chapter 9. We also assume the unperturbed eigenfunctions have been normalized. (This involves box normalization for the continuum eigenfunctions. In so doing, the continuum eigenvalues actually become discrete, although very closely spaced. This removes any difficulty of interpreting the summation $\sum_l$ for the continuum values of $l$.) With this assumption, the integral on the left side of (J-4) will be equal to zero if $l \neq m$, and equal to one if $l = m$. Thus there will be only one non-vanishing term in the summation on the left side of the equation, and

$$a_{nm}(E'_n - E_m) = \sum_l a_{nl}\int_{-\infty}^{\infty}\psi_m^*v\psi_l\,dx$$

Let us define the symbol

$$v_{ml} = \int_{-\infty}^{\infty}\psi_m^*(x)v(x)\psi_l(x)\,dx \tag{J-6}$$

Then we can write

$$a_{nm}(E'_n - E_m) = \sum_l a_{nl}v_{ml} \tag{J-7}$$

This equation is exact, but it is not very useful. In order to obtain one that is useful, we shall employ the condition that the perturbation $v(x)$ is small compared to the unperturbed potential $V(x)$, so that the perturbed potential $V'(x)$ differs only slightly from $V(x)$. For such a situation it is reasonable to assume that the perturbed eigenfunctions will differ only slightly from the unperturbed eigenfunctions. In terms of (J-2), this means that we assume

$$a_{nl} \ll 1 \quad l \neq n \qquad\qquad a_{nl} \simeq 1 \quad l = n \tag{J-8}$$

If we also require that $v(x)$ be small compared to the eigenvalues of $V(x)$, it is clear that the $v_{ml}$ must then all be small compared to the unperturbed eigenvalues because, according to (J-6), these quantities are just certain averages of $v(x)$. Now let us divide both sides of (J-7) by the unperturbed eigenvalue $E_m$. We have

$$a_{nm}\frac{(E'_n - E_m)}{E_m} = \sum_l a_{nl}\frac{v_{ml}}{E_m}$$

Every term in the summation, except the term $l = n$, is the product of two small quantities $a_{nl}$ and $v_{ml}/E_m$. We shall neglect such terms, keeping only the term for $l = n$. Then we have

$$a_{nm}\frac{(E'_n - E_m)}{E_m} \simeq a_{nn}\frac{v_{mn}}{E_m}$$

or

$$a_{nm}(E'_n - E_m) \simeq a_{nn}v_{mn} \tag{J-9}$$

Now take $m = n$. We obtain

$$a_{nn}(E'_n - E_n) \simeq a_{nn}v_{nn}$$

so

$$E'_n - E_n \simeq v_{nn} \tag{J-10}$$

If we take $m \neq n$, we obtain

$$a_{nm} \simeq a_{nn}\frac{v_{mn}}{E'_n - E_m}$$

Setting $a_{nn} \simeq 1$ because of (J-8), this becomes

$$a_{nm} \simeq \frac{v_{mn}}{E'_n - E_m} \tag{J-11}$$

Using (J-10) to evaluate $E'_n$, we find that

$$a_{nm} \simeq \frac{v_{mn}}{E_n - E_m + v_{nn}} = \frac{v_{mn}}{(E_n - E_m)\left(1 + \dfrac{v_{nn}}{E_n - E_m}\right)} = \frac{v_{mn}}{E_n - E_m}\left(1 + \frac{v_{nn}}{E_n - E_m}\right)^{-1} \simeq \frac{v_{mn}}{E_n - E_m}\left(1 - \frac{v_{nn}}{E_n - E_m}\right)$$

We have kept only the first-order term in the binomial expansion of the reciprocal of one plus the quantity $v_{nn}/(E_n - E_m)$. Next we shall drop the term involving the product of the two quantities $v_{mn}/(E_n - E_m)$ and $v_{nn}/(E_n - E_m)$. The validity of these two steps depends on the additional requirement that $v(x)$ be small even compared to the difference between $E_n$ and any other eigenvalue $E_m$ which enters into our calculations. We have finally

$$a_{nm} \simeq \frac{v_{mn}}{E_n - E_m} \qquad m \neq n \tag{J-12}$$

Equations (J-10) and (J-12) are the expressions which provide good approximations to the eigenvalues and eigenfunctions of the perturbed potential $V'(x)$. Consider (J-10), and evaluate $v_{nn}$ from (J-6). This yields

$$E'_n - E_n \simeq v_{nn} = \int_{-\infty}^{\infty}\psi_n^*(x)v(x)\psi_n(x)\,dx \tag{J-13a}$$

This gives an approximation to the $n$th perturbed eigenvalue in terms of the $n$th unperturbed eigenvalue and a certain integral involving the corresponding unperturbed eigenfunction and the perturbation $v(x)$. The integral is the expectation value of $v(x)$ for the $n$th unperturbed eigenstate. To see this, consider (5-29), with $V(x,t) = v(x)$ and $\Psi(x,t) = \Psi_n(x,t) = e^{-iE_nt/\hbar}\psi_n(x)$. That equation reads

$$\overline{v(x)} = \int_{-\infty}^{\infty}e^{iE_nt/\hbar}\psi_n^*(x)v(x)e^{-iE_nt/\hbar}\psi_n(x)\,dx$$

or

$$\overline{v(x)} = \int_{-\infty}^{\infty}\psi_n^*(x)v(x)\psi_n(x)\,dx \tag{J-13b}$$

Thus perturbation theory gives the very reasonable result that the shift in the energy of the $n$th eigenvalue, due to the presence of the perturbing potential $v(x)$, is approximately equal to the value of $v(x)$ averaged over the $n$th unperturbed eigenstate with a weighting factor equal to the probability density $\psi_n^*(x)\psi_n(x)$ for that eigenstate. Succinctly put, the energy shift in any state is approximately the expectation value for that state of the perturbing potential.

Next consider (J-12), and evaluate the symbol $v_{mn}$ to obtain

$$a_{nm} \simeq \frac{1}{E_n - E_m}\int_{-\infty}^{\infty}\psi_m^*(x)v(x)\psi_n(x)\,dx \qquad m \neq n \tag{J-14}$$

This equation gives the approximate value of the coefficients $a_{nm}$ which specify how much of each of the unperturbed eigenfunctions $\psi_m(x)$ is mixed in with the dominant unperturbed eigenfunction $\psi_n(x)$ to form the perturbed eigenfunction $\psi'_n(x)$. Then in the series (J-2), with $l$ replaced by $m$

$$\psi'_n(x) = \sum_m a_{nm}\psi_m(x) \tag{J-15}$$

we may use (J-14) to evaluate all the coefficients except $a_{nn}$. From (J-8) we know $a_{nn} \simeq 1$. Its exact value can be determined by requiring that $\psi'_n(x)$ be normalized.

Note that $a_{nm}$ is proportional to $1/(E_n - E_m)$. Thus the perturbation $v(x)$ will mix in with the unperturbed eigenfunction $\psi_n(x)$ only a negligibly small amount of any unperturbed eigenfunction $\psi_m(x)$ whose eigenvalue $E_m$ is very different from the eigenvalue $E_n$. This has the important consequence that a good approximation to the series (J-15) may be obtained by taking only the term for $m = n$, plus a few terms for $m$ not very different from $n$.
The coefficient $a_{nm}$ is also proportional to the quantity

$$v_{mn} = \int_{-\infty}^{\infty}\psi_m^*(x)v(x)\psi_n(x)\,dx$$

This is a certain average of $v(x)$, with a weighting factor $\psi_m^*(x)\psi_n(x)$ which depends on the eigenfunction for the $m$th unperturbed eigenstate as well as the eigenfunction for the $n$th unperturbed eigenstate. The quantities $v_{mn}$, for $m = n$ as well as $m \neq n$, are called the matrix elements of the perturbation $v$ taken between the state $n$ and the state $m$. This terminology is used because in advanced treatments of quantum mechanics it is convenient to consider a matrix in which each element is one of the quantities $v_{mn}$. Such a matrix

$$\begin{pmatrix} v_{11} & v_{12} & v_{13} & \cdots & v_{1n} & \cdots \\ v_{21} & v_{22} & v_{23} & \cdots & v_{2n} & \cdots \\ v_{31} & v_{32} & v_{33} & \cdots & & \\ \vdots & & & & & \\ v_{m1} & v_{m2} & \cdots & & & \end{pmatrix}$$

contains all possible information concerning the application of a perturbation $v(x)$ to a system whose unperturbed eigenfunctions are $\psi_1(x), \psi_2(x), \psi_3(x), \psi_4(x), \ldots$.

AN EXAMPLE

Let us illustrate the use of (J-13a) and (J-14) by doing a simple perturbation calculation. We shall evaluate the first eigenvalue and eigenfunction for the potential indicated in Figure J-2 and specified by the equation

$$V'(x) = \begin{cases} \delta\,\dfrac{|x|}{a/2} & -a/2 < x < +a/2 \\ \infty & x < -a/2 \ \text{or}\ x > +a/2 \end{cases} \tag{J-16}$$

Figure J-2 A V-bottom potential.

We consider this as the sum of an unperturbed potential

$$V(x) = \begin{cases} 0 & -a/2 < x < +a/2 \\ \infty & x < -a/2 \ \text{or}\ x > +a/2 \end{cases}$$

which is an infinite square well, plus a perturbation

$$v(x) = \delta\,\frac{|x|}{a/2}$$

According to (6-79), (6-80), and Example 5-10, the normalized unperturbed eigenfunctions can be written

$$\psi_m(x) = \begin{cases} \sqrt{2/a}\,\cos(m\pi x/a) & m = 1, 3, 5, \ldots \\ \sqrt{2/a}\,\sin(m\pi x/a) & m = 2, 4, 6, \ldots \end{cases}$$

According to (6-81), the unperturbed eigenvalues are

$$E_m = \pi^2\hbar^2m^2/2Ma^2 \qquad m = 1, 2, 3, 4, \ldots$$

where we use $M$ for the mass of the particle. If $\delta$ is small compared to the first eigenvalue

$$E_1 = \pi^2\hbar^2/2Ma^2$$

the perturbation technique should be applicable. To evaluate $\psi'_1(x)$, take $n = 1$ in (J-14).
This gives

$$a_{1m} = \frac{1}{E_1 - E_m}\int_{-\infty}^{\infty}\psi_m^*(x)v(x)\psi_1(x)\,dx \qquad m \neq 1$$

which is

$$a_{1m} = \frac{8M\delta}{\pi^2\hbar^2}\,\frac{1}{(1 - m^2)}\int_{-a/2}^{a/2}\cos\left(\frac{m\pi x}{a}\right)|x|\cos\left(\frac{\pi x}{a}\right)dx \qquad m = 3, 5, 7, \ldots$$

$$a_{1m} = \frac{8M\delta}{\pi^2\hbar^2}\,\frac{1}{(1 - m^2)}\int_{-a/2}^{a/2}\sin\left(\frac{m\pi x}{a}\right)|x|\cos\left(\frac{\pi x}{a}\right)dx \qquad m = 2, 4, 6, \ldots$$

For $m = 2, 4, 6, \ldots$ the integrand is an odd function of $x$. Since the integral is taken over a range symmetrical about $x = 0$, the integral will vanish. Thus we have

$$a_{1m} = 0 \qquad m = 2, 4, 6, \ldots$$

For $m = 3, 5, 7, \ldots$ the integrand is an even function of $x$; it gives

$$a_{1m} = \frac{16M\delta}{\pi^2\hbar^2}\,\frac{1}{(1 - m^2)}\int_0^{a/2}\cos\left(\frac{m\pi x}{a}\right)x\cos\left(\frac{\pi x}{a}\right)dx \qquad m = 3, 5, 7, \ldots$$

Let $Z = \pi x/a$; then this becomes

$$a_{1m} = \frac{8}{\pi^2}\,\frac{\delta}{E_1}\,\frac{1}{(1 - m^2)}\int_0^{\pi/2}\cos(mZ)\,Z\cos Z\,dZ$$

where we have introduced the convenient dimensionless ratio $\delta/E_1 = 2Ma^2\delta/\pi^2\hbar^2$. The integral can be evaluated easily by writing $\cos(mZ) = (1/2)(e^{+imZ} + e^{-imZ})$. The result is

$$a_{1m} = \frac{8}{\pi^2}\,\frac{\delta}{E_1}\,\frac{1}{(1 - m^2)}\left\{\frac{\cos[(m+1)\pi/2] - 1}{2(m+1)^2} + \frac{\cos[(m-1)\pi/2] - 1}{2(m-1)^2}\right\} \qquad m = 3, 5, 7, \ldots$$

The first few non-vanishing coefficients have the values

$$a_{13} = \frac{1}{32}\,\frac{8}{\pi^2}\,\frac{\delta}{E_1} \qquad a_{15} = \frac{1}{864}\,\frac{8}{\pi^2}\,\frac{\delta}{E_1} \qquad a_{17} = \frac{1}{1728}\,\frac{8}{\pi^2}\,\frac{\delta}{E_1} \qquad a_{19} = \frac{1}{8000}\,\frac{8}{\pi^2}\,\frac{\delta}{E_1}$$

It is not surprising that $a_{1m} = 0$ for $m = 2, 4, 6, \ldots$. The perturbed potential $V'(x)$ is symmetrical about the origin, and so its first eigenstate must be of even parity. Consequently there can be no odd parity unperturbed eigenfunctions mixed into the first perturbed eigenfunction, and the odd parity unperturbed eigenfunctions are precisely those for $m = 2, 4, 6, \ldots$.

The perturbed eigenfunction $\psi'_1(x)$ is obtained by substituting the $a_{1m}$ in the series (J-15). Since the $a_{1m}$ decrease rapidly with increasing $m$ (owing partly to the $1/(E_1 - E_m)$ term and partly to the $v_{m1}$ term), it is apparent that we can get a very good approximation to the series by taking only the terms for $m = 1$ and $m = 3$.
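The coefficients quoted above can be cross-checked by evaluating the dimensionless integral numerically rather than in closed form. The sketch below is an illustration only: it assumes units in which $E_1 = 1$, uses a composite Simpson rule written out by hand, and its function names are hypothetical. The last function applies the same quadrature to (J-13a) with $n = 1$, where the substitution $Z = \pi x/a$ reduces the integral to $(8\delta/\pi^2)\int_0^{\pi/2}Z\cos^2Z\,dZ$.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def mixing_coefficient(m, delta_over_e1=1.0):
    """a_1m from (J-14) for odd m >= 3, as a multiple of delta/E1."""
    integral = simpson(
        lambda z: z * math.cos(m * z) * math.cos(z), 0.0, math.pi / 2
    )
    return (8.0 / math.pi**2) * delta_over_e1 * integral / (1.0 - m * m)


def energy_shift_over_delta():
    """(E1' - E1)/delta from (J-13a) with the V-bottom perturbation."""
    integral = simpson(lambda z: z * math.cos(z) ** 2, 0.0, math.pi / 2)
    return (8.0 / math.pi**2) * integral


if __name__ == "__main__":
    scale = 8.0 / math.pi**2
    for m in (3, 5, 7, 9):
        # Print the reciprocal of a_1m / (8/pi^2) for comparison with
        # the denominators 32, 864, 1728, 8000 quoted in the text.
        print(m, scale / mixing_coefficient(m))
    print("shift / delta =", energy_shift_over_delta())
```

The rapid growth of the printed denominators is the numerical counterpart of the statement that only the $m = 1$ and $m = 3$ terms matter in practice.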
Thus

$$\psi'_1(x) \simeq a_{11}\psi_1(x) + \frac{1}{32}\,\frac{8}{\pi^2}\,\frac{\delta}{E_1}\,\psi_3(x) \tag{J-17}$$

Finally the coefficient $a_{11}$ must be adjusted so that $\psi'_1(x)$ is normalized, but we leave this as an exercise for the student. Figure J-3 illustrates (J-17). The relative amount of $\psi_3(x)$ has been exaggerated for the sake of clarity. Fixing our attention on $\psi'_1(x)$ and $\psi_1(x)$, we see that the second derivative of the perturbed eigenfunction is relatively small near the ends of the region $-a/2$ to $+a/2$, and relatively large near the center, compared to the second derivative of the unperturbed eigenfunction. Consideration of the form of the time-independent Schroedinger equation for the perturbed and unperturbed potentials will make it clear why this happens.

Figure J-3 Illustrating the composition of the first eigenfunction for a V-bottom potential.

Next let us evaluate $E'_1 - E_1$. Taking $n = 1$ in (J-13a), and inserting the appropriate unperturbed eigenfunction, we have

$$E'_1 - E_1 = \frac{4\delta}{a^2}\int_{-a/2}^{a/2}|x|\cos^2\left(\frac{\pi x}{a}\right)dx$$

$$E'_1 - E_1 = \frac{8\delta}{a^2}\int_0^{a/2}x\cos^2\left(\frac{\pi x}{a}\right)dx$$

$$E'_1 - E_1 = \frac{8\delta}{\pi^2}\int_0^{\pi/2}Z\cos^2Z\,dZ$$

$$E'_1 - E_1 = \frac{8\delta}{\pi^2}\left(\frac{\pi^2}{16} - \frac{1}{4}\right)$$

which is

$$E'_1 - E_1 = 0.297\,\delta$$

Figure J-4 shows the perturbed eigenvalue $E'_1$ in terms of the dimensionless ratio $(E'_1 - E_1)/E_1$, plotted as a function of the dimensionless ratio $\delta/E_1$. Perturbation theory predicts the straight line of slope 0.297. The points are the correct answer. They were calculated from the eigenvalues $E'_1$ obtained by an accurate (numerical integration) solution of the time-independent Schroedinger equation for the four potentials $V'(x)$ corresponding to the values of $\delta/E_1$ indicated. The shift in the energy of the first eigenvalue, as predicted by perturbation theory, is seen to be in error by about 10 percent for $\delta/E_1 \simeq 0.9$, which corresponds to $(E'_1 - E_1)/E_1 \simeq 0.25$. For $(E'_1 - E_1)/E_1 \simeq 0.05$, the error is about 0.5 percent. Now it is apparent that the error in the perturbation theory we have developed is of the order of the square of a small quantity since, throughout the development, the squares of small quantities were always neglected. The numbers just quoted indicate that, in the present case, an approximate measure of the size of this small quantity is the ratio $(E'_1 - E_1)/E_1$.

Figure J-4 A comparison between the first eigenvalues for several V-bottom potentials obtained from time-independent perturbation theory and from accurate solutions of the time-independent Schroedinger equation. (Horizontal axis: $\delta/E_1$, from 0 to 1.0.)

Note also that the eigenvalue $E'_1$ calculated by perturbation theory is always too large. It can be shown that this is true for any form of the perturbation $v(x)$, and it is easy to see why it happens. Perturbation theory uses the unperturbed eigenfunction $\psi_1(x)$ to evaluate $E'_1 - E_1 = \int_{-\infty}^{\infty}\psi_1^*(x)v(x)\psi_1(x)\,dx$. Comparing the plots of $\psi_1(x)$ and of $\psi'_1(x)$, we see that this procedure gives too much weight to the values of $v(x)$ near the ends of the region. But near the ends of the region $v(x)$ is largest, and therefore the contribution of $v(x)$ to the perturbed eigenvalue $E'_1$ is overestimated.

A comparison of the exact form of the eigenfunction $\psi'_1(x)$ of the potential $V'(x)$ with the form (J-17) predicted by perturbation theory shows that the error in the coefficient $a_{13}$ is also of the order of the square of the quantity $(E'_1 - E_1)/E_1$.

If more accurate estimates of $E'_n$ and $\psi'_n(x)$ are needed, it is possible to extend the perturbation theory to obtain expressions in which the error is of the order of the cube, or even of a higher power, of the appropriate small quantity. However, in practice (J-13a) and (J-14) are normally adequate.

THE TREATMENT OF DEGENERACIES

Consider the case of two different unperturbed eigenfunctions, which we label $\psi_1(x)$ and $\psi_2(x)$, whose corresponding unperturbed eigenvalues $E_1$ and $E_2$ happen to be exactly equal.
These eigenfunctions are said to be degenerate. There are a number of important examples of this situation that actually arise in the study of atomic and nuclear physics. For instance, many of the eigenfunctions are degenerate for an electron bound in the $1/r$ Coulomb potential of a hydrogen atom. When eigenfunctions are degenerate, we shall often be interested in studying the effect of a small perturbation which changes the potential in such a way as to remove the degeneracy. However, to apply perturbation theory in a case involving degenerate eigenfunctions, we must exercise care. This need is clearly indicated by (J-11), which makes the prediction $a_{12} \sim 1$ and $a_{21} \sim 1$ for the case $E_1 = E_2$. (Equation (J-11) states $a_{12} \simeq v_{21}/(E'_1 - E_2)$. Taking $E_1 = E_2$, and using (J-10), this becomes $a_{12} \simeq v_{21}/(E'_1 - E_1) = v_{21}/v_{11} \sim 1$, in general. Note that this result does not depend on the "additional requirement" that $v(x)$ be small compared to the difference between two unperturbed eigenvalues.) This really tells us only that the theory we have developed breaks down in this case. But it also provides some clue to the nature of the difficulty by showing that, when $E_1 = E_2$, the assumption $a_{nl} \ll 1$ for $n \neq l$ of (J-8) is not consistent with the results obtained from that assumption in the cases $n = 1, 2$ and $l = 1, 2$.

The difficulty is resolved when we realize that there is certainly no a priori basis for the assumption that $a_{12} \ll 1$ and $a_{21} \ll 1$, when $\psi_1(x)$ and $\psi_2(x)$ correspond to eigenvalues which are exactly equal. Under such circumstances it might very well be that, in contrast to the assumption, a small perturbation could have a big effect and thoroughly mix up the two degenerate eigenfunctions. To account for this situation we first investigate only the mixing, due to the presence of the perturbation $v(x)$, of the two degenerate unperturbed eigenfunctions $\psi_1(x)$ and $\psi_2(x)$ with each other.
In doing this we ignore the mixing with $\psi_1(x)$ and $\psi_2(x)$ of any of the non-degenerate unperturbed eigenfunctions $\psi_3(x), \psi_4(x), \psi_5(x), \ldots$. Now in many cases of physical interest the matrix elements have two symmetries. These are $v_{11} = v_{22}$ and $v_{12} = v_{21}$. In such cases the result of the investigation is that the perturbation mixes the $\psi_1(x)$ and $\psi_2(x)$ into the following two linear combinations

$$\psi^{(1)}(x) = \frac{1}{\sqrt{2}}\left[\psi_1(x) + \psi_2(x)\right] \tag{J-18a}$$

and

$$\psi^{(2)}(x) = \frac{1}{\sqrt{2}}\left[\psi_1(x) - \psi_2(x)\right] \tag{J-18b}$$

These particular linear combinations have a very useful property: If the perturbation is applied directly to either of them it will not cause one to mix with the other. This can be seen by evaluating the integrals that appear in the coefficients which, according to (J-14), determine the mixing. For instance

$$\int\psi^{(1)*}v\psi^{(2)}\,dx = \frac{1}{2}\int\left[\psi_1^* + \psi_2^*\right]v\left[\psi_1 - \psi_2\right]dx = \frac{1}{2}\left[\int\psi_1^*v\psi_1\,dx - \int\psi_1^*v\psi_2\,dx + \int\psi_2^*v\psi_1\,dx - \int\psi_2^*v\psi_2\,dx\right] = \frac{1}{2}\left[v_{11} - v_{12} + v_{21} - v_{22}\right] = 0$$

Similarly,

$$\int\psi^{(2)*}v\psi^{(1)}\,dx = 0$$

So the perturbation does not mix $\psi^{(1)}$ and $\psi^{(2)}$ among themselves, and non-degenerate perturbation theory can be applied directly to these particular linear combinations to calculate the energy shifts, even though they are degenerate before the application of the perturbation.

Figure J-5 Two independent degenerate vibrations of a circular drum head.

But how can we find, in a general case, the particular linear combinations of degenerate eigenfunctions that have the very desirable property of not mixing among themselves when the perturbation is applied? There is a mathematical procedure (the one used to obtain (J-18a) and (J-18b)), but it is rather complicated. Fortunately, there are also physical arguments that can be used, instead of mathematical ones, to simplify the application of time-independent perturbation theory to quantum mechanical systems that involve degeneracies.
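The cancellation in the integral above can be checked with a small numerical model. The sketch below is illustrative only: it represents each state by its two components in the basis $(\psi_1, \psi_2)$, takes an arbitrary sample matrix obeying $v_{11} = v_{22}$ and $v_{12} = v_{21}$, and uses hypothetical function names. The off-diagonal elements between the combinations (J-18a) and (J-18b) vanish, and the diagonal elements, to which (J-10) then applies, come out as $v_{11} + v_{12}$ and $v_{11} - v_{12}$.

```python
import math


def matrix_element(bra, ket, v):
    """sum_ij conj(bra_i) * v[i][j] * ket_j for two-component states."""
    return sum(
        complex(bra[i]).conjugate() * v[i][j] * ket[j]
        for i in range(2)
        for j in range(2)
    )


def adapted_combinations():
    """Components of (J-18a) and (J-18b) in the basis (psi_1, psi_2)."""
    s = 1.0 / math.sqrt(2.0)
    return (s, s), (s, -s)


if __name__ == "__main__":
    # Sample matrix elements (arbitrary numbers, assumed for illustration).
    v = [[0.30, 0.12],
         [0.12, 0.30]]
    plus, minus = adapted_combinations()
    print("mixing element :", matrix_element(plus, minus, v))
    print("shift of psi(1):", matrix_element(plus, plus, v))
    print("shift of psi(2):", matrix_element(minus, minus, v))
```

Because the two diagonal elements differ whenever $v_{12} \neq 0$, the perturbation removes the degeneracy in this model.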
Before considering a quantum mechanical system, it is informative to look at an example of a physical argument which can be used in a classical system. In one of the higher frequency modes of a circular drum head, the drum head vibrates with a nodal line lying along a diameter. This mode is degenerate because the same frequency is obtained for all orientations of the nodal line. But there is only a two-fold degeneracy because there are only two independent vibrations: the vibrations whose nodal lines are perpendicular. These independent degenerate vibrations are indicated in Figure J-5. All other vibrations at this frequency can be obtained by linear combinations of these two. In particular, other sets of two independent degenerate vibrations, with perpendicular nodal lines of different orientation, can be obtained by appropriate linear combinations. In the absence of a perturbation, all these sets are equivalent.

Now imagine applying a perturbation by fixing a small weight to the drum head at some position other than its center, as indicated in Figure J-6. Because of the asymmetry introduced by the perturbation, the two previously independent vibrations are mixed together to form two new vibrations, as indicated in Figure J-7. Also the perturbation removes the degeneracy because the weight lies along the nodal line for one vibration and therefore has no effect on the frequency of that vibration, while it does affect the frequency of the other vibration. After gaining some experience with these problems, it is possible to tell from physical arguments what the form of the perturbed vibrations must be. This allows the set of independent degenerate unperturbed vibrations to be chosen as the particular set for which one nodal line runs through the weight.
Then the application of the perturbation does not mix the vibrations because they have the same form both before and after its application, and the nondegenerate classical perturbation theory can be used in the calculation of the frequency shifts produced by the perturbation.

Figure J-6 Applying a perturbation to a circular drum head.

Figure J-7 The results of applying a perturbation to a circular drum head.

There are several implicit examples in the text of applying physical arguments to quantum mechanical systems to find the particular linear combinations of degenerate eigenfunctions that are not mixed by the application of a perturbation. The first is found in Section 8-6, where the energy shifts produced by the spin-orbit interaction in a hydrogen atom are evaluated. To clarify the point in question, we begin by observing that there is a redundancy of quantum numbers in the one-electron atom if the spin-orbit interaction is neglected. That is, $n$, $l$, $m_l$, $m_s$, $j$, and $m_j$ would all be "good" quantum numbers but, since there are only three spatial coordinates and one spin coordinate, only four quantum numbers are needed. In other words, if we ignore the spin-orbit interaction there are solutions to the time-independent Schroedinger equation for the hydrogen atom which can be written as $\psi_{n l m_l m_s}$. But in these circumstances there are also solutions to the equation which can be written as $\psi_{n l j m_j}$. The latter are certain linear combinations of the former. (It is not appropriate to use $s$ as a label since it has only the single value 1/2.) If we use the $\psi_{n l m_l m_s}$ to evaluate the spin-orbit energy shifts in perturbation theory there is a difficulty. These unperturbed eigenfunctions are degenerate since the total energy of the state specified by the quantum numbers $n$, $l$, $m_l$, $m_s$ depends only on the quantum number $n$.
Instead, in Section 8-6 we use the set of degenerate unperturbed eigenfunctions $\psi_{n l j m_j}$. The reason is that since $J$ and $J_z$, the quantities specified by $j$ and $m_j$, have definite values whether or not the spin-orbit interaction is present, it follows that the application of this perturbation cannot change their values. (This is not true of $L_z$ and $S_z$, the quantities specified by $m_l$ and $m_s$.) Consequently, the perturbation cannot produce a large mixing of the $\psi_{n l j m_j}$, even though they are degenerate. So they must be the set of degenerate unperturbed eigenfunctions analogous to those in (J-18), to which nondegenerate perturbation theory can be applied directly as is done in obtaining (8-35). Thus in Section 8-6 the quantum numbers used to specify the state are precisely those that must be used to justify evaluating the spin-orbit energy by calculating its expectation value according to nondegenerate perturbation theory. The forms $\psi_{n l j m_j}$ of the eigenfunctions that must be used in (8-35) are not shown explicitly in that equation because the expectation value occurring in it is written in the compact notation $\overline{(1/r)\,dV(r)/dr}$. But they are if the expectation value is written in an expanded notation analogous to (J-13b). Note that the required forms are found by applying a physical argument, not a mathematical argument.

Explicit use is made in Section 17-8 of equations completely equivalent to (J-18).

PROBLEMS

1. Use time-independent perturbation theory to calculate the first eigenvalue $E_1$ and the first eigenfunction $\psi_1(x)$ of the potential

$$V(x) = \begin{cases} \infty & x < -a/2 \ \text{or} \ x > +a/2 \\ \delta\,\dfrac{x}{a/2} & -a/2 < x < +a/2 \end{cases}$$

where $\delta$ is small relative to $E_1$. Compare with the results obtained in the example treated in Appendix J.

2. Use time-independent perturbation theory to calculate the first eigenvalue $E_1$ for the potential in Problem 3 of Appendix H.
Compare your results with those obtained by the analytical treatment in Problem 3 of Appendix H, and also with those obtained in Problem 4 of Appendix G, which applied numerical integration to the potential.

3. Except for certain pathological cases, no degeneracies arise in problems involving one particle moving in one dimension. In order to obtain a simple example of the application of degenerate time-independent perturbation theory, consider one particle moving in the two-dimensional infinite square well potential

$$V(x,y) = \begin{cases} \infty & x < -a/2 \ \text{or} \ x > +a/2 \ \text{or} \ y < -a/2 \ \text{or} \ y > +a/2 \\ 0 & -a/2 < x < +a/2 \ \text{and} \ -a/2 < y < +a/2 \end{cases}$$

Use the techniques of Section 7-2 to set up the time-independent Schroedinger equation for the potential. Separate this partial differential equation into two ordinary differential equations by the usual method, making use of the fact that $V(x,y)$ can be written as $V(x) + V(y)$. Since these equations, and the conditions on $\psi$ at the edges of the well, have the same form as for a one-dimensional infinite square well, their solutions can be written immediately. Note that there are degeneracies in almost all the eigenfunctions.

4. Consider the application of the perturbation

$$v(x,y) = \begin{cases} \delta & x > 0 \ \text{and} \ y > 0 \\ 0 & x < 0 \ \text{or} \ y < 0 \end{cases}$$

to the particle in the two-dimensional infinite square well of Problem 3. Investigate the effect of this perturbation on the first pair of eigenfunctions that are degenerate, as follows. Evaluate their four matrix elements with the perturbation. Use the results to justify the applicability of the linear combinations of these eigenfunctions quoted in (J-18a) and (J-18b). Then use these linear combinations to evaluate the energy shifts that the perturbation produces in the eigenvalues.

Appendix K

TIME-DEPENDENT PERTURBATION THEORY

Here we extend the theory of Appendix J to the case of perturbations which are functions of both position and time.
This is an important case for several reasons, one being that time-dependent perturbation theory provides the only nonnumerical method for solving the Schroedinger equation for a time-dependent potential $V(x,t)$. (One exception is a time-dependent potential of the form $V(x,t) = V_1(x) + V_2(t)$. For this form only, the Schroedinger equation can be separated in the manner of Section 5-5 by assuming a solution $\Psi(x,t) = \psi(x)\varphi(t)$.) Thus we consider a time-dependent potential $V'(x,t)$ which can be decomposed as follows

$$V'(x,t) = V(x) + v(x,t) \qquad \text{(K-1)}$$

where $V(x)$ is a time-independent unperturbed potential and $v(x,t)$ is a small time-dependent perturbation. The solutions to the Schroedinger equation for $V(x)$ are the set of unperturbed wave functions

$$\Psi_n(x,t) = e^{-iE_n t/\hbar}\,\psi_n(x) \qquad \text{(K-2)}$$

where the $E_n$ and $\psi_n(x)$ are the unperturbed eigenvalues and eigenfunctions. Assume that a solution to the Schroedinger equation for $V'(x,t)$ can be written

$$\Psi'(x,t) = \sum_n a_n(t)\,\Psi_n(x,t) \qquad \text{(K-3)}$$

where the coefficients $a_n(t)$ are functions of time. Different solutions will have different sets of coefficients, but here we shall not use a second subscript to indicate this explicitly. Substitute (K-3) into the equation

$$-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi'}{\partial x^2} + V'\Psi' - i\hbar\frac{\partial \Psi'}{\partial t} = 0$$

which it is supposed to satisfy. This gives

$$\sum_n a_n\left[-\frac{\hbar^2}{2m}\frac{\partial^2 \Psi_n}{\partial x^2} + V\Psi_n - i\hbar\frac{\partial \Psi_n}{\partial t}\right] + \sum_n a_n v \Psi_n - i\hbar\sum_n \frac{da_n}{dt}\Psi_n = 0$$

The bracket vanishes because the $\Psi_n$ are solutions to the Schroedinger equation for the potential $V$. Multiply the remaining terms by the complex conjugate of some particular unperturbed wave function, $\Psi_m^* = e^{iE_m t/\hbar}\,\psi_m^*$, and integrate over all $x$. Then, evaluating $\Psi_n$, we have

$$\sum_n a_n\, e^{-i(E_n - E_m)t/\hbar}\int_{-\infty}^{\infty} \psi_m^* v\, \psi_n\, dx = i\hbar\sum_n \frac{da_n}{dt}\, e^{-i(E_n - E_m)t/\hbar}\int_{-\infty}^{\infty} \psi_m^* \psi_n\, dx$$

Since the $\psi_n$ are orthogonal as in (J-5), and normalized, this reduces to

$$\frac{da_m(t)}{dt} = -\frac{i}{\hbar}\sum_n a_n(t)\, e^{-i(E_n - E_m)t/\hbar}\, v_{mn} \qquad \text{(K-4)}$$

We have extended the definition of the matrix element $v_{mn}$ given in (J-6) to include time-dependent perturbations.
And we have obtained, in (K-4), a set of coupled first order ordinary differential equations, one for each $m$, which determine the $a_n(t)$. The details of the solution of these equations depend on the details of the particular problem at hand. We consider here a simple but illustrative case. Assume a perturbation of the form

$$v(x,t) = \begin{cases} 0 & t < 0 \\ v(x) & t > 0 \end{cases} \qquad \text{(K-5)}$$

This is a perturbation $v(x)$ which is "switched on" at $t = 0$. For this case the set of unperturbed wave functions (K-2) are exact solutions for $t < 0$. Next assume that the wave function for the particle is known to be equal to a single one of these wave functions, say $\Psi_k(x,t)$, for $t < 0$. This amounts to assuming that the total energy of the particle is known to be precisely $E_k$ for $t < 0$. This does not conflict with the uncertainty principle

$$\Delta E\,\Delta t \geq \hbar/2 \qquad \text{(K-6)}$$

because in the infinite time before $t = 0$ it would be possible to measure the energy of the particle with perfect precision. In terms of (K-3), this assumption provides the following set of initial conditions for the $a_n(t)$ at $t = 0$.

$$a_n(0) = \begin{cases} 1 & n = k \\ 0 & n \neq k \end{cases} \qquad \text{(K-7)}$$

(We assume that the $a_n(t)$ do not change discontinuously at $t = 0$. This assumption will be justified by the results of the calculation.)

We would like to find the perturbed wave function $\Psi'(x,t)$ for the particle at a time $t > 0$. To do this we shall evaluate the $a_n(t)$ for $t > 0$. Let us require that the perturbation $v(x)$ be small enough, or that the time $t$ be short enough, that

$$a_n(t) \begin{cases} \ll 1 & n \neq k \\ \simeq 1 & n = k \end{cases} \qquad t > 0 \qquad \text{(K-8)}$$

Then we may neglect all terms in the right side of (K-4) except for $n = k$. This gives

$$\frac{da_m(t)}{dt} \simeq -\frac{i}{\hbar}\, a_k(t)\, e^{-i(E_k - E_m)t/\hbar}\, v_{mk} \qquad \text{(K-9)}$$

To evaluate $a_k(t)$, set $m = k$. Then

$$\frac{da_k(t)}{dt} \simeq -\frac{i}{\hbar}\, a_k(t)\, v_{kk}$$

or

$$\frac{da_k(t)}{a_k(t)} \simeq -\frac{i}{\hbar}\, v_{kk}\, dt$$

Integrate both sides from 0 to $t' > 0$, remembering that the $v_{mk}$ are all independent of $t$ for $t > 0$. This gives

$$\Big[\ln a_k(t)\Big]_0^{t'} \simeq \left[-\frac{i}{\hbar}\, v_{kk}\, t\right]_0^{t'}$$

which is

$$\ln\left[\frac{a_k(t')}{a_k(0)}\right] \simeq -\frac{i}{\hbar}\, v_{kk}\, t'$$

According to (K-7), $a_k(0) = 1$.
So we find

$$a_k(t') \simeq e^{-i v_{kk} t'/\hbar} \qquad t' > 0$$

or

$$a_k(t) \simeq e^{-i v_{kk} t/\hbar} \qquad t > 0 \qquad \text{(K-10)}$$

where we have dropped the primes to simplify the notation. Next evaluate the $a_n(t)$, $n \neq k$, by setting $m = n$ in (K-9) and by making the additional approximation that $a_k(t) = 1$. We have

$$\frac{da_n(t)}{dt} \simeq -\frac{i}{\hbar}\, e^{-i(E_k - E_n)t/\hbar}\, v_{nk} \qquad n \neq k$$

or

$$da_n(t) \simeq -\frac{i}{\hbar}\, v_{nk}\, e^{-i(E_k - E_n)t/\hbar}\, dt \qquad n \neq k$$

Integrate from 0 to $t' > 0$ to obtain

$$\Big[a_n(t)\Big]_0^{t'} \simeq \frac{v_{nk}}{E_k - E_n}\Big[e^{-i(E_k - E_n)t/\hbar}\Big]_0^{t'}$$

From (K-7), $a_n(0) = 0$, so

$$a_n(t) \simeq \frac{v_{nk}}{E_k - E_n}\left[e^{-i(E_k - E_n)t/\hbar} - 1\right] \qquad n \neq k \qquad \text{(K-11)}$$

where we have again dropped the primes. Evaluating $\Psi'(x,t)$ from (K-3), (K-10), and (K-11) we find

$$\Psi'(x,t) \simeq e^{-i(E_k + v_{kk})t/\hbar}\,\psi_k(x) + \sum_{n \neq k} \frac{v_{nk}}{E_k - E_n}\left[e^{-i(E_k - E_n)t/\hbar} - 1\right] e^{-iE_n t/\hbar}\,\psi_n(x) \qquad \text{(K-12)}$$

Note that the energy $E_k + v_{kk}$ appearing in the exponential of the first term is exactly the perturbed energy $E_k' = E_k + v_{kk}$, which would be predicted for a completely time-independent perturbation equal to $v(x)$.

It is of interest to consider the quantity $a_n^*(t)a_n(t)$. This real function of $t$ is the square of the magnitude of the coefficient $a_n(t)$. Multiplying (K-11) into its complex conjugate, we find

$$a_n^*(t)\,a_n(t) \simeq \frac{v_{nk}^* v_{nk}}{\hbar^2}\,\frac{\sin^2\left[\dfrac{(E_n - E_k)t}{2\hbar}\right]}{\left[\dfrac{E_n - E_k}{2\hbar}\right]^2} \qquad \text{(K-13)}$$

This quantity oscillates in time between zero and $4v_{nk}^* v_{nk}/(E_n - E_k)^2$, with frequency $\nu = (E_n - E_k)/h$. We plot in Figure K-1 the factor $\sin^2[(E_n - E_k)t/2\hbar]/[(E_n - E_k)/2\hbar]^2$ as a function of $(E_n - E_k)/2\hbar$ for fixed $t$.

Now the wave function describing the particle initially contained only the wave function $\Psi_k(x,t)$ for its single quantum state with quantum number $k$. The perturbation $v(x,t)$ has the effect of mixing in contributions from other states over a whole range of the quantum number $n$. However, we see that the most important contributions come from those $n$ which correspond to eigenvalues $E_n$ lying within a range centered about $E_k$ and of width $\Delta E$, where

$$\frac{\Delta E}{2\hbar} \simeq \frac{\pi}{t}$$

or

$$\Delta E \simeq \frac{2\pi\hbar}{t} \qquad \text{(K-14)}$$

Now the value of $a_n^*(t)a_n(t)$ at any instant $t$ is equal to the probability of finding the particle in the quantum state $n$ at that instant.
(If this statement is not considered self-evident, it can be proven by using the second operator association of (5-32) to calculate the expectation value of the particle's total energy for the wave function of (K-3), and then interpreting the results in light of the fact that if the particle is in quantum state $n$ a measurement of its total energy can yield only $E_n$.)

Figure K-1 The plot of a function which arises in time-dependent perturbation theory. (Horizontal axis: $(E_n - E_k)/2\hbar$, with ticks at $0$, $\pm\pi/t$, $\pm 2\pi/t$, and $\pm 3\pi/t$.)

Thus at any time $t$ there is a certain probability of finding the particle in a final quantum state $n$ which is different from the initial quantum state $k$, and with total energy $E_n$ different from the initial total energy $E_k$. This appears to be a violation of the law of conservation of energy by an amount $E_n - E_k$, which may be large compared to the energy $v_{kk}$ supplied by the perturbation. However, in the time interval 0 to $t$ the probability of finding the particle with energy $E_n$ is important only when $E_n - E_k$ is at most equal to about $\Delta E$, where $t$ and $\Delta E$ are related by (K-14). According to the uncertainty principle (K-6), any measurement of the total energy of the particle which is carried out in this time interval must be uncertain by an amount of the order of $\hbar/t$, which is comparable to $\Delta E$. This removes the difficulty and provides an example of the uncertainty principle.

Consider (K-13) for small values of $t > 0$. The equation says that the probability of finding the particle in a particular quantum state $n$ is proportional to the square of $t$. This statement is in contrast to the linear dependence on $t$ that might be expected intuitively.
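The relation between (K-11) and (K-13), and the quadratic dependence on $t$ at small times, can be illustrated numerically. The sketch below is not from the original text; it assumes units in which $\hbar = 1$, and the function names and sample values of $v_{nk}$, $E_k$, and $E_n$ are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def a_n(t, v_nk, e_k, e_n):
    """First-order coefficient of (K-11), valid for n != k."""
    phase = cmath.exp(-1j * (e_k - e_n) * t / HBAR)
    return v_nk / (e_k - e_n) * (phase - 1.0)


def prob_k13(t, v_nk, e_k, e_n):
    """Probability a_n* a_n written in the sin^2 form of (K-13)."""
    w = (e_n - e_k) / (2.0 * HBAR)
    return abs(v_nk) ** 2 / HBAR ** 2 * (math.sin(w * t) / w) ** 2


# Hypothetical values: a weak coupling between two levels one unit apart.
v, ek, en = 0.01, 1.0, 2.0
for t in (0.5, 1.0, 2.0):
    print(t, abs(a_n(t, v, ek, en)) ** 2, prob_k13(t, v, ek, en))

# At small t the probability grows as t squared: doubling t multiplies it by about 4.
print(prob_k13(2e-3, v, ek, en) / prob_k13(1e-3, v, ek, en))
```

The two columns printed in the loop agree, which is the statement that (K-13) is the squared magnitude of (K-11).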
However, physical intuition is always based on our experience with systems in the classical limit. In that limit the resolution of any experimental apparatus is so large compared to the separation of the eigenvalues, or even to the width of the range $\Delta E$, that it is not possible to measure $a_n^*(t)a_n(t)$ for a single value of $n$. All that can be measured classically is the total probability of finding that the particle has made a transition from the initial quantum state $k$ to some other final quantum state $n$. We express this in terms of the transition probability $P_k$, which is defined as

$$P_k = \sum_{n \neq k} a_n^*(t)\,a_n(t) \qquad \text{(K-15)}$$

To evaluate it, we assume that there are a large number of closely spaced final quantum states in the range $\Delta E$; the number of final quantum states $dN_n$ per energy interval $dE_n$ is the density of final states $\rho_n = dN_n/dE_n$. Then the summation over $n$ can be approximated by an integral over $dN_n$. That is

$$P_k = \int_{-\infty}^{\infty} a_n^*(t)a_n(t)\, dN_n = \int_{-\infty}^{\infty} a_n^*(t)a_n(t)\,\frac{dN_n}{dE_n}\, dE_n = \int_{-\infty}^{\infty} a_n^*(t)a_n(t)\,\rho_n\, dE_n$$

Evaluating $a_n^*(t)a_n(t)$ from (K-13), we have

$$P_k \simeq \frac{1}{\hbar^2}\int_{-\infty}^{\infty} v_{nk}^* v_{nk}\,\rho_n\,\frac{\sin^2\left[\dfrac{(E_n - E_k)t}{2\hbar}\right]}{\left[\dfrac{E_n - E_k}{2\hbar}\right]^2}\, dE_n$$

Owing to the factor $\sin^2[(E_n - E_k)t/2\hbar]/[(E_n - E_k)/2\hbar]^2$, most of the contribution to the integral comes from the range $\Delta E$. If we assume that the matrix element $v_{nk}$ and the density of final states $\rho_n$ are both slowly varying functions of $n$ in that range, we can write

$$P_k \simeq \frac{v_{nk}^* v_{nk}}{\hbar^2}\,\rho_n \int_{-\infty}^{\infty} \frac{\sin^2\left[\dfrac{(E_n - E_k)t}{2\hbar}\right]}{\left[\dfrac{E_n - E_k}{2\hbar}\right]^2}\, dE_n$$

The quantum number $n$ now refers to a typical final quantum state in the neighborhood of the initial quantum state $k$. Let $Z = (E_n - E_k)t/2\hbar$; then

$$P_k \simeq \frac{v_{nk}^* v_{nk}}{\hbar^2}\,\rho_n\, 2\hbar t \int_{-\infty}^{\infty} \frac{\sin^2 Z}{Z^2}\, dZ$$

The integral equals $\pi$, which gives

$$P_k \simeq \frac{2\pi}{\hbar}\, v_{nk}^* v_{nk}\,\rho_n\, t \qquad \text{(K-16)}$$

The transition probability is proportional to $t$, as expected. The transition rate $R_k = dP_k/dt$ is independent of $t$, since

$$R_k \simeq \frac{2\pi}{\hbar}\, v_{nk}^* v_{nk}\,\rho_n \qquad \text{(K-17)}$$

This important formula is often called Golden Rule No. 2. It is very widely used in advanced work in quantum physics because it is of very general applicability.
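The step from the sum (K-15) to the linear growth in (K-16) can be checked with a toy model. The following sketch is an illustration under stated assumptions rather than anything given in the text: units with $\hbar = 1$, a real matrix element that is the same for every final state, and final levels equally spaced about $E_k$, so that $\rho_n$ is the reciprocal of the spacing. All names and numbers are hypothetical.

```python
import math

HBAR = 1.0  # assumption: units chosen so that hbar = 1


def transition_probability_sum(t, v, spacing, n_states):
    """Sum (K-13) over levels at E_k + j*spacing, j = +-1, ..., +-n_states."""
    total = 0.0
    for j in range(1, n_states + 1):
        w = j * spacing / (2.0 * HBAR)
        # Factor 2 counts the levels at +j and -j, which contribute equally.
        total += 2.0 * (v / HBAR) ** 2 * (math.sin(w * t) / w) ** 2
    return total


def golden_rule_probability(t, v, spacing):
    """P_k of (K-16) with density of final states rho_n = 1/spacing."""
    rho = 1.0 / spacing
    return 2.0 * math.pi / HBAR * v ** 2 * rho * t


# Hypothetical numbers: many levels fall inside the width 2*pi*hbar/t of (K-14).
t, v, spacing, n_states = 20.0, 1e-4, 1e-3, 20000
ratio = transition_probability_sum(t, v, spacing, n_states) / golden_rule_probability(t, v, spacing)
print(ratio)  # close to 1 when the levels are dense and the energy range is wide
```

The agreement degrades if the level spacing becomes comparable to $2\pi\hbar/t$, which is the regime where the integral approximation to the sum is no longer justified.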
In any situation in which transitions are made to an essentially continuous range of final states under the influence of a constant perturbation, the transition rate can be evaluated from this formula. Note that we have here a good example of the use of quantum mechanics in the evaluation of transition rates. The ability to do this is one of its most important advantages over the old quantum theory.

An equation in the text that is closely related to Golden Rule No. 2 is (8-43), giving the rate at which atoms make transitions from a higher energy quantum state to a lower energy one. Although it is not identified in the text as such, the basic equation in the treatment of beta decay of radioactive nuclei is actually Golden Rule No. 2. In this equation, (16-12), the beta decay matrix element $M$ plays the role of $v_{nk}$ in (K-17). And the term $(E - K_e)^2 p_e^2$, being proportional to the product of the number of quantum states per unit energy interval for the antineutrino and for the electron, plays the role of $\rho_n$. Appendix L is based entirely on Golden Rule No. 2.

PROBLEM

1. At $t < 0$ an electron is known to be in the $n = 1$ quantum state of a one-dimensional infinite square well potential which extends from $x = -a/2$ to $x = +a/2$. At $t = 0$ a uniform electric field is applied in the direction of increasing $x$. The electric field is left on for a short time $\tau$ and then removed. Use time-dependent perturbation theory to calculate the probability that the electron will be in the $n = 2$, 3, or 4 quantum states for $t > \tau$, in terms of the strength of the electric field. Make plots of these probabilities as a function of $\tau$. (Hint: Some of the results of Problem 1 of Appendix J can be used.)
Appendix L

THE BORN APPROXIMATION

In this appendix we develop a method, due to Born, for obtaining approximate quantum mechanical predictions for the differential cross section $d\sigma/d\Omega$ and cross section $\sigma$ that describe the way a potential $V(r)$ scatters a particle in three dimensions. It depends on material developed in Appendices J and K.

The first step is to give a quantum mechanical description of a particle in the beam that is incident upon the scattering potential. We do this by extending to three dimensions results that are familiar in one dimension. Equation (6-9) shows that a one-dimensional eigenfunction for a free particle of mass $m$ traveling with velocity $v$ in the positive direction along the $x$ axis is

$$\psi(x) = A e^{ikx} \qquad \text{(L-1a)}$$

where

$$k = 2\pi/\lambda = 2\pi p/h = mv/\hbar \qquad \text{(L-1b)}$$

and where $A$ is a constant. The student may show by substitution that the traveling wave eigenfunction (L-1a) is also a solution to the three-dimensional time-independent Schroedinger equation for a free particle

$$-\frac{\hbar^2}{2m}\left[\frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2}\right] = E\psi \qquad \text{(L-2a)}$$

where

$$E = p^2/2m = \hbar^2 k^2/2m \qquad \text{(L-2b)}$$

In three dimensions, (L-1a) describes a particle which is definitely known to be moving parallel to the $x$ axis with velocity $v$, whose $y$ and $z$ coordinates are entirely unknown since $\psi^*(x)\psi(x)$ is obviously independent of $y$ and $z$, and whose $x$ coordinate is also entirely unknown since

$$\psi^*(x)\psi(x) = A^* e^{-ikx} A e^{ikx} = A^* A \qquad \text{(L-3)}$$

Thus the particle is moving somewhere in a beam, parallel to the $x$ axis, of infinite transverse and longitudinal dimensions. Of course, this is not physically realistic since all beams are always limited in their transverse dimensions by diaphragms of finite aperture and in their longitudinal dimensions by the finite length of the apparatus. On the other hand, the dimensions of real beams are extremely large compared to the characteristic atomic or nuclear dimensions.
Therefore (L-1a) provides an accurate description of the incident particle in the region of importance where the atomic or nuclear potential which produces the scattering has any appreciable value.

The unrealistic aspects of (L-1a) are, however, the origin of certain problems concerning the normalization of the eigenfunction. In Section 6-2 we showed that these problems can always be handled and can usually be ignored. The present calculation provides an example of a case in which they cannot be ignored; we must use a three-dimensional extension of the technique of box normalization in a form called periodic boundary conditions. We set

$$A = L^{-3/2} \qquad \text{(L-4)}$$

where $L$ is the edge length of a very large cubical box surrounding the region of the scattering potential, and we restrict the range of the space variables to lie within the box. Then the eigenfunction is normalized because

$$\psi^*\psi = A^* A = A^2 = L^{-3} \qquad \text{(L-5)}$$

and

$$\int \psi^*\psi\, d\tau = L^{-3}\int d\tau = L^{-3} L^3 = 1$$

where $d\tau$ is the volume element, and where the integration is now taken only over the volume of the box. We furthermore demand that the eigenfunction and its space derivative in the direction normal to the wall have the same values at corresponding points of the opposing walls of the box. The real (or imaginary) part of $\psi$ will then typically have the behavior plotted in Figure L-1 as a function of one of the space variables, holding the other two constant.

Figure L-1 The space dependence of the real part of an eigenfunction in box normalization with periodic boundary conditions.

Using periodic boundary conditions, the eigenfunction will be completely periodic, with period $L$, in all three directions. Its behavior repeats indefinitely in adjacent boxes (just as a scene observed from within a cube with mirror walls repeats indefinitely), and we are justified in considering what happens only within a single box, that is, in restricting the range of variables to the box.
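The periodic boundary condition and the normalization (L-4) can be illustrated for a single space variable. This is a minimal sketch under assumptions not stated in the text: only the one-dimensional factor $L^{-1/2}e^{ikx}$ of the eigenfunction is examined, the wave number is taken to be $k = 2\pi n/L$ with integer $n$ (an integral number of wavelengths across the box), and the function names and numbers are hypothetical.

```python
import cmath
import math


def psi(x, k, box):
    """One-dimensional factor box**(-1/2) * exp(i k x) of the box-normalized wave."""
    return box ** -0.5 * cmath.exp(1j * k * x)


def allowed_k(n, box):
    """Wave number for which the box edge holds exactly n wavelengths."""
    return 2.0 * math.pi * n / box


box = 5.0   # hypothetical edge length L
x = 0.3     # arbitrary point inside the box

k_ok = allowed_k(3, box)     # satisfies the periodic boundary condition
k_bad = allowed_k(3.5, box)  # half-integral number of wavelengths, does not

print(abs(psi(x + box, k_ok, box) - psi(x, k_ok, box)))    # essentially zero
print(abs(psi(x + box, k_bad, box) - psi(x, k_bad, box)))  # clearly nonzero
print(abs(psi(x, k_ok, box)) ** 2 * box)                   # normalization over the edge
```

Because the derivative of $e^{ikx}$ is $ik$ times the function itself, matching the function at opposing walls also matches its normal derivative.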
In most cases of physical interest the scattering potential is a spherically symmetrical function $V(r)$. We assume this to be true, although it is not a necessary restriction. It is then obviously convenient to describe the incident particle in terms of the spherical coordinates $r$, $\theta$, $\varphi$ instead of the rectangular coordinates $x$, $y$, $z$. Define the origin to be at the center of the potential and the polar axis to be along the $x$ axis. This means $x = r\cos\theta$, and $\psi$ for a free particle of the incident beam can be written

$$\psi = L^{-3/2} e^{ikx} = L^{-3/2} e^{ikr\cos\theta} = L^{-3/2} e^{i\mathbf{k}\cdot\mathbf{r}} \qquad \text{(L-6)}$$

where $\mathbf{k}$ is a vector of magnitude $k$ directed along the beam direction, which is the direction of the $x$ axis, and where $\mathbf{r}$ is a vector from the origin to the point $(r,\theta,\varphi)$. In this form the normalized eigenfunction for a free particle traveling in some other direction can be written

$$\psi = L^{-3/2} e^{i\mathbf{k}'\cdot\mathbf{r}} \qquad \text{(L-7)}$$

where $\mathbf{k}'$ is a vector in the direction in question, of magnitude equal to the value of $k'$ appropriate to the mass and velocity of the particle. The validity of (L-7) can be verified by the same arguments as were used for (L-6).

Now consider a particle in the incident beam impinging upon the potential $V(r)$. We want to calculate the probability per unit time that the particle will be scattered in some direction. If $V(r)$ is not too strong, we can treat this as a perturbation problem: What is the rate at which a constant (in time) perturbation $V(r)$ induces transitions from the initial quantum state associated with the free particle eigenfunction (L-6) to a final quantum state associated with the free particle eigenfunction (L-7)? Since the final quantum state is in an essentially continuous range of final quantum states, because the possible eigenvalues $E' = \hbar^2 k'^2/2m$ are almost continuously distributed even with box normalization, the answer is given, approximately, by a three-dimensional extension of Golden Rule No. 2, developed in Appendix K.
It is

$$R_{\mathbf{k}} \simeq \frac{2\pi}{\hbar}\, v_{\mathbf{k}'\mathbf{k}}^*\, v_{\mathbf{k}'\mathbf{k}}\, \rho_{\mathbf{k}'} \qquad \text{(L-8a)}$$

where $R_{\mathbf{k}}$ is the rate we wish to calculate, where we have used the vectors $\mathbf{k}$ and $\mathbf{k}'$, instead of quantum numbers, to label the initial and final states, and where $v_{\mathbf{k}'\mathbf{k}}$ is the matrix element of the potential taken between these states. That is

$$v_{\mathbf{k}'\mathbf{k}} = \int \left(L^{-3/2} e^{i\mathbf{k}'\cdot\mathbf{r}}\right)^* V(r)\, L^{-3/2} e^{i\mathbf{k}\cdot\mathbf{r}}\, d\tau = L^{-3}\int V(r)\, e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\, d\tau = L^{-3}\, V_{\mathbf{k}'\mathbf{k}} \qquad \text{(L-8b)}$$

with

$$V_{\mathbf{k}'\mathbf{k}} = \int V(r)\, e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\, d\tau$$

The quantity $\rho_{\mathbf{k}'}$ of (L-8a) is the number of possible quantum states per unit energy interval for the particle associated with the final eigenfunction. As we have employed box normalization with periodic boundary conditions (required because the eigenfunctions appearing in $v_{\mathbf{k}'\mathbf{k}}$ must be normalized), the density of final states $\rho_{\mathbf{k}'}$ will have some finite value since the boundary conditions impose restrictions on the possible de Broglie wavelengths. As an example, consider $\mathbf{k}'$ parallel to one edge of the box. Then the real (or imaginary) part of $\psi$ would typically have the appearance shown in Figure L-1, with the distance $d$ equal to the de Broglie wavelength $\lambda' = 2\pi/k'$. The periodic boundary conditions can be satisfied for propagation parallel to one edge of the box only if $L$ contains exactly an integral number of wavelengths of the traveling waves.

Compare this with the case of free particle standing wave eigenfunctions in a box with impenetrable walls, which is treated in Section 6-8. In that case, the boundary conditions demand that the $\psi$ have nodes at the walls of the box. For the propagation direction parallel to one edge of the box, the condition can be satisfied if $L$ contains either an integral number of wavelengths or a half-integral number of wavelengths. Consequently, in every wavelength or energy interval there are two times as many allowed wavelengths in the standing wave case as there are in the traveling wave case.
However, for each possible wavelength there are two separate traveling waves, one propagating in one direction and another propagating in the opposite direction. The factors of 2 cancel out, not only for propagation in directions parallel to the edges of the box but also for propagation in all directions, and the number of possible quantum states per unit energy interval is therefore the same in both cases.

Example 1-3 calculates the number of electromagnetic standing waves that fit into an impenetrable-walled box, for each interval of wave frequency. Section 11-10 shows that the results of the calculation immediately yield (11-49), which specifies the number of standing wave eigenfunctions per unit energy interval that fit into such a box. Since the number of possible quantum states per unit energy interval is the same for the case of traveling wave eigenfunctions in box normalization with periodic boundary conditions, we may use (11-49) here. In our present notation, it is

$$\rho_{k'}\, dE' = \frac{m^{3/2} L^3 E'^{1/2}}{2^{1/2}\pi^2\hbar^3}\, dE' \qquad \text{(L-9)}$$

where $L^3$ is the volume of the box and where $E' = \hbar^2 k'^2/2m$. Therefore

$$\rho_{k'} = \frac{m^{3/2} L^3}{2^{1/2}\pi^2\hbar^3}\,\frac{\hbar k'}{2^{1/2} m^{1/2}} = \frac{m L^3 k'}{2\pi^2\hbar^2} \qquad \text{(L-10)}$$

This is not quite what we want because it is the density of all states associated with $k'$, whereas we want $\rho_{\mathbf{k}'}$, the density of states associated with $\mathbf{k}'$ when that vector lies within some certain range of directions. Now it is clear that for a spherically symmetrical potential $V(r)$ the scattering angular distribution will not depend on the azimuthal angle $\varphi$. Consequently, it is appropriate to consider together all final states associated with vectors $\mathbf{k}'$ whose directions lie anywhere within the angular range $\theta$ to $\theta + d\theta$. The density $\rho_{\mathbf{k}'}$ of these states is smaller than $\rho_{k'}$ by a factor equal to the ratio of the solid angle $d\Omega = 2\pi\sin\theta\, d\theta$ contained within the range $\theta$ to $\theta + d\theta$ to the total solid angle $4\pi$ contained within the entire range of $\theta$. (See Figure 4-8 for a definition of solid angle.)
That is

$$\rho_{\mathbf{k}'} = \rho_{k'}\,\frac{d\Omega}{4\pi}$$

so

$$\rho_{\mathbf{k}'} = \frac{m L^3 k'}{8\pi^3\hbar^2}\, d\Omega$$

Using this in (L-8a), we have

$$R_{\mathbf{k}} \simeq \frac{2\pi}{\hbar}\, L^{-6}\, V_{\mathbf{k}'\mathbf{k}}^*\, V_{\mathbf{k}'\mathbf{k}}\,\frac{m L^3 k'}{8\pi^3\hbar^2}\, d\Omega$$

Now let us calculate the probability per unit time that the particle in the initial quantum state associated with the vector $\mathbf{k}$ will cross a unit area normal to the direction of $\mathbf{k}$. This is the incident probability flux $I$. Its value is proven in the caption of Figure L-2 to be the product of the probability $\psi^*\psi$ of finding the particle in a unit volume and the velocity $v$ of the particle. That is

$$I = v\,\psi^*\psi \qquad \text{(L-13)}$$

Figure L-2 Proof that $I = v\psi^*\psi$. At time zero consider a rectangular parallelepiped with ends of area $da$ and length $v\,dt$ extending along the particle's direction of motion. If the particle is anywhere within its volume then by time $dt$ it will cross the end toward which it is moving. The probability that this will happen is the probability per unit volume $\Psi^*\Psi = \psi^*\psi$ of finding the particle in the parallelepiped multiplied by its volume $v\,dt\,da$, or $v\psi^*\psi\,dt\,da$. The probability per unit time per unit area is $v\psi^*\psi$. This is the quantity defined to be the probability flux $I$.

With (L-1b) and (L-5), (L-13) becomes

$$I = \frac{\hbar k}{m}\, L^{-3} \qquad \text{(L-14)}$$

Next, divide the transition rate $R_{\mathbf{k}}$ by the element of solid angle $d\Omega$ to obtain the rate of transitions per unit solid angle into the final states associated with the vector $\mathbf{k}'$. Then we have the probability per unit time of scattering into a unit solid angle at the angle $\theta$, which is $S(\theta)$, the scattered probability flux. Thus

$$S(\theta) \simeq \frac{m L^{-3} k'}{4\pi^2\hbar^3}\, V_{\mathbf{k}'\mathbf{k}}^*\, V_{\mathbf{k}'\mathbf{k}} \qquad \text{(L-15)}$$

Section 4-3 defined a differential cross section in terms of an incident beam containing many particles and a target containing many scattering centers, and used an arbitrary time interval in the definition. Here we deal with a single particle incident on a single scattering potential, and also consider a unit time interval.
We adapt the previous definition to the present need by writing

$$S(\theta) = \frac{d\sigma}{d\Omega}\, I \qquad \text{(L-16)}$$

Here $I$ and $S(\theta)$ are the incident and scattered probability fluxes, defined as above, and the differential scattering cross section $d\sigma/d\Omega$ is defined to be the proportionality constant relating the two. Solving (L-16) for $d\sigma/d\Omega$, and using (L-14) and (L-15), we obtain

$$\frac{d\sigma}{d\Omega} = \frac{S(\theta)}{I} \simeq \frac{m}{\hbar k L^{-3}}\,\frac{m L^{-3} k'}{4\pi^2\hbar^3}\, V_{\mathbf{k}'\mathbf{k}}^*\, V_{\mathbf{k}'\mathbf{k}}$$

But $k' = mv'/\hbar = mv/\hbar = k$ because the initial and final speeds of the particle are the same when it scatters from the potential $V(r)$ whose center remains fixed at the origin of coordinates. Therefore

$$\frac{d\sigma}{d\Omega} \simeq \left(\frac{m}{2\pi\hbar^2}\right)^2 V_{\mathbf{k}'\mathbf{k}}^*\, V_{\mathbf{k}'\mathbf{k}} \qquad \text{(L-17a)}$$

where

$$V_{\mathbf{k}'\mathbf{k}} = \int V(r)\, e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{r}}\, d\tau \qquad \text{(L-17b)}$$

with the integration taken over a very large box surrounding the scattering potential. This is the Born approximation for $d\sigma/d\Omega$. Note that the size of the box has dropped out since $L$ does not appear in (L-17a), and since contributions to the integral in (L-17b) will come only from the small region in which $V(r)$ has any appreciable value and therefore the value of the integral is independent of its limits. (We use this limit independence in writing (L-19).)

It is possible to carry out part of the integration of (L-17b) immediately. Define

$$\boldsymbol{\kappa} = \mathbf{k} - \mathbf{k}' \qquad \text{(L-18)}$$

which is, physically, $1/\hbar$ times the negative of the momentum transferred to the scattered particle by the scattering potential. Also define a set of spherical coordinates $r$, $\Theta$, $\Phi$ with an origin at the center of the potential and polar axis along the direction of $\boldsymbol{\kappa}$. (They should not be confused with the spherical coordinates $r$, $\theta$, $\varphi$ whose polar axis lies along the direction of $\mathbf{k}$.) Then

$$(\mathbf{k} - \mathbf{k}')\cdot\mathbf{r} = \boldsymbol{\kappa}\cdot\mathbf{r} = \kappa r\cos\Theta$$

and

$$d\tau = r^2\sin\Theta\, dr\, d\Theta\, d\Phi$$

so

$$V_{\mathbf{k}'\mathbf{k}} = \int_0^{\infty}\!\!\int_0^{\pi}\!\!\int_0^{2\pi} V(r)\, e^{i\kappa r\cos\Theta}\, r^2\sin\Theta\, dr\, d\Theta\, d\Phi \qquad \text{(L-19)}$$

or

$$V_{\mathbf{k}'\mathbf{k}} = \int_0^{\infty}\!\!\int_0^{\pi} V(r)\, e^{i\kappa r\cos\Theta}\, 2\pi r^2\sin\Theta\, dr\, d\Theta$$

The $\Theta$ integral can be evaluated by making the change of variable $Z = i\kappa r\cos\Theta$.
The result is

$$V_{k'k} = \int_0^\infty V(r)\left[\frac{e^{i\kappa r} - e^{-i\kappa r}}{i\kappa r}\right]2\pi r^2\,dr \tag{L-20a}$$

which is

$$V_{k'k} = \int_0^\infty V(r)\,\frac{\sin\kappa r}{\kappa r}\,4\pi r^2\,dr \tag{L-20b}$$

Finally, let us express $\kappa$ in terms of the scattering angle $\theta$. Consider the vector diagram of Figure L-3, which illustrates the relation (L-18). From this figure it is apparent that

$$\kappa = 2k\sin(\theta/2) \tag{L-21}$$

Figure L-3 Illustrating the relation between the vectors which enter in the Born approximation.

AN EXAMPLE

Consider a three-dimensional attractive square well potential

$$V(r) = \begin{cases} -V_0 & r < R \\ 0 & r > R \end{cases} \tag{L-22}$$

whose radial dependence is illustrated in Figure L-4. Here

$$V_{k'k} = -V_0\int_0^R \frac{\sin\kappa r}{\kappa r}\,4\pi r^2\,dr$$

and we obtain upon integration

$$V_{k'k} = -4\pi V_0R^3\,\frac{[\sin\kappa R - \kappa R\cos\kappa R]}{(\kappa R)^3}$$

Figure L-4 An attractive square well potential.

So

$$\frac{d\sigma}{d\Omega} = \left(\frac{m}{2\pi\hbar^2}\right)^2 16\pi^2V_0^2R^6\,\frac{[\sin\kappa R - \kappa R\cos\kappa R]^2}{(\kappa R)^6}$$

or

$$\frac{d\sigma}{d\Omega} = \frac{4m^2}{\hbar^4}\,V_0^2R^6\,\frac{\{\sin[2kR\sin(\theta/2)] - 2kR\sin(\theta/2)\cos[2kR\sin(\theta/2)]\}^2}{[2kR\sin(\theta/2)]^6} \tag{L-23}$$

The form of this differential scattering cross section is indicated in Figure L-5. At $\theta = 0$, $\kappa R = 0$ but $[\sin\kappa R - \kappa R\cos\kappa R]^2/(\kappa R)^6 = 1/9$. Consequently $d\sigma/d\Omega$ has a finite maximum at $\theta = 0$. It drops with increasing angle, reaching its first zero when $\sin\kappa R - \kappa R\cos\kappa R = 0$ has its first nonzero root. This is $\kappa R = 4.49$, or

$$2kR\sin(\theta'/2) = 4.49$$

At high energies, $kR \gg 1$, $\theta' \ll 1$, and the value of $\theta$ at the first zero of $d\sigma/d\Omega$ is

$$\theta' \simeq \frac{4.49}{kR} \tag{L-24}$$

For this scattering potential, or for any other with a moderately "sharp" edge, $d\sigma/d\Omega$ has the characteristic behavior of an optical diffraction pattern: it has consecutive maxima and minima with the largest maximum in the forward direction. The angle $\theta'$ decreases with increasing $k$ (increasing energy of the particle, or increasing frequency in the optical case), and the angular distribution becomes more strongly peaked forward.
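The numbers quoted for the square well can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `shape`, `first_zero`, and `first_zero_angle` are illustrative choices, and the argument `x` stands for the product of the momentum-transfer magnitude and the well radius R. It evaluates the angular factor of (L-23), confirms the forward limit of 1/9, and locates the first nonzero root near 4.49 by bisection.

```python
import math


def shape(x):
    """Angular factor [sin x - x cos x]^2 / x^6 appearing in (L-23)."""
    if abs(x) < 1e-3:
        # Series: sin x - x cos x = x^3/3 - x^5/30 + ..., so the ratio tends to 1/9.
        return (1.0 / 3.0 - x * x / 30.0) ** 2
    return (math.sin(x) - x * math.cos(x)) ** 2 / x ** 6


def first_zero(lo=4.0, hi=5.0, tol=1e-12):
    """First nonzero root of sin x - x cos x = 0, found by bisection on [lo, hi]."""
    def f(x):
        return math.sin(x) - x * math.cos(x)

    if f(lo) * f(hi) >= 0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def first_zero_angle(kR):
    """Angle (radians) solving 2 kR sin(angle/2) = root, or None if no zero is reached."""
    s = first_zero() / (2.0 * kR)
    return 2.0 * math.asin(s) if s <= 1.0 else None


if __name__ == "__main__":
    print("forward value:", shape(0.0))
    print("first zero   :", first_zero())
    for kR in (5.0, 10.0, 100.0):
        print(kR, first_zero_angle(kR), first_zero() / kR)
```

For large kR the exact angle and the small-angle estimate of (L-24) agree closely; for small kR no zero occurs at any physical angle, which the sketch signals by returning `None`.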
The separation in angle between adjacent minima has a value $\theta$ which is given approximately by

$$\theta \sim \theta' \sim \frac{4.49}{kR} = \frac{4.49}{2\pi}\,\frac{\lambda}{R} = \frac{4.49}{6.28}\,\frac{\lambda}{R}$$

or

$$\theta \sim \frac{\lambda}{R} \tag{L-25}$$

This result is used on several occasions in the text when discussing nuclear and particle physics.

The scattering cross section for the potential we have considered can be evaluated from its differential cross section by calculating

$$\sigma = \int \frac{d\sigma}{d\Omega}\,d\Omega \tag{L-26}$$

where the integral is taken over all solid angle. (This obvious equality follows from the definitions of the quantities involved.) We shall not actually carry out the integration because the important characteristics of the scattering cross section are easy to see qualitatively. That is, $\sigma$ decreases with increasing $k$ because the angular region in which $d\sigma/d\Omega$ has an appreciable value becomes smaller.

Figure L-5 The differential scattering cross section for an attractive square well potential.

In closing, we must discuss the range of applicability of the Born approximation. The condition of validity of the perturbation theory underlying the approximation is (K-8), which states that the amplitude of the wave function for the scattered particle is small compared to the amplitude of the wave function for the incident particle. It is also necessary that the free particle wave function of (L-6) be a reasonable representation of the incident wave function and that the free particle wave function of (L-7) be a reasonable representation of the scattered wave function, in the region of the scattering potential where $V_{k'k}$ is evaluated. These conditions will usually be met if the energy $E$ of the incident particle is large compared to the magnitude of the scattering potential, that is, if

$$E \gg |V(r)| \quad \text{for all } r \tag{L-27}$$

because then the scattering potential is a small perturbation which can usually produce only a small effect.

When the Born approximation is not applicable, a method called partial wave analysis can be applied to evaluate the scattering produced by a potential. A development of this mathematically complicated method can be found in most quantum mechanics textbooks.

PROBLEMS

1. Use the Born approximation to evaluate the differential scattering cross section for an attractive Gaussian potential

$$V(r) = -V_0\,e^{-(r/R)^2}$$

Define the "width" of the forward maximum in terms of the angle at which it falls to $1/e$ of its peak value. Then compare it to the width of the forward maximum for the attractive square well potential, defined as $\theta'$ in (L-24).

2. Use the Born approximation to calculate the differential scattering cross section for the screened Coulomb potential

$$V(r) = \frac{zZe^2}{4\pi\epsilon_0 r}\,e^{-r/d}$$

This provides a useful approximation to the potential between a charged particle and a neutral atom if $d$ is set equal to the radius of the atom. Then let $d \to \infty$, and show that $d\sigma/d\Omega$ approaches the Rutherford scattering differential cross section (4-9) when $V(r)$ approaches the normal Coulomb potential.

Appendix M

THE LAPLACIAN AND ANGULAR MOMENTUM OPERATORS IN SPHERICAL POLAR COORDINATES

THE LAPLACIAN OPERATOR

The Laplacian operator $\nabla^2$, which enters into the three-dimensional Schroedinger equation, is defined in rectangular coordinates as

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{M-1}$$

We show here how to transform the operator into the form it assumes in spherical polar coordinates, which is

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) \tag{M-2}$$

The most straightforward way to carry out the transformation is to make repeated applications of the "chain rule" of partial differentiation. This is a tedious procedure. But the first term of (M-2) can be obtained, without too much tedium, by considering a case in which the Laplacian operates on a function $\psi = \psi(r)$ of the radial coordinate alone.
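In this case, the derivatives in the last two terms of (M-2) yield zero, and we have

$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right)$$

We shall obtain this expression from the expression

$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}$$

which is the Laplacian in rectangular coordinates of (M-1), operating on $\psi(r)$. To do this, we use the relation

$$r = (x^2 + y^2 + z^2)^{1/2}$$

connecting the rectangular and the spherical polar coordinates (see Figure 7-2). We evaluate

$$\frac{\partial\psi}{\partial x} = \frac{\partial\psi}{\partial r}\frac{\partial r}{\partial x} = \frac{x}{r}\frac{\partial\psi}{\partial r}$$

and

$$\frac{\partial^2\psi}{\partial x^2} = \frac{\partial}{\partial x}\left(x\,\frac{1}{r}\frac{\partial\psi}{\partial r}\right) = \frac{1}{r}\frac{\partial\psi}{\partial r} + x\,\frac{\partial r}{\partial x}\,\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right) = \frac{1}{r}\frac{\partial\psi}{\partial r} + \frac{x^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right)$$

Similarly, the $y$ and $z$ derivatives yield

$$\frac{\partial^2\psi}{\partial y^2} = \frac{1}{r}\frac{\partial\psi}{\partial r} + \frac{y^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right)$$

and

$$\frac{\partial^2\psi}{\partial z^2} = \frac{1}{r}\frac{\partial\psi}{\partial r} + \frac{z^2}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right)$$

Adding these three expressions, we obtain

$$\nabla^2\psi = \frac{3}{r}\frac{\partial\psi}{\partial r} + \frac{(x^2 + y^2 + z^2)}{r}\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right)$$

or

$$\nabla^2\psi = \frac{3}{r}\frac{\partial\psi}{\partial r} + r\,\frac{\partial}{\partial r}\left(\frac{1}{r}\frac{\partial\psi}{\partial r}\right)$$

Now note that the expression we have obtained expands to

$$\nabla^2\psi = \frac{3}{r}\frac{\partial\psi}{\partial r} + r\left(-\frac{1}{r^2}\frac{\partial\psi}{\partial r} + \frac{1}{r}\frac{\partial^2\psi}{\partial r^2}\right)$$

or

$$\nabla^2\psi = \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{\partial^2\psi}{\partial r^2}$$

Also note that the first term of (M-2), that is

$$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right)$$

expands to

$$\nabla^2\psi = \frac{1}{r^2}\left(2r\frac{\partial\psi}{\partial r} + r^2\frac{\partial^2\psi}{\partial r^2}\right)$$

or

$$\nabla^2\psi = \frac{2}{r}\frac{\partial\psi}{\partial r} + \frac{\partial^2\psi}{\partial r^2}$$

Comparison shows that the expression we have obtained is identical to the first term of (M-2). The second and third terms can be obtained by taking $\psi = \Phi(\varphi)$, and then taking $\psi = \Theta(\theta)$.

THE ANGULAR MOMENTUM OPERATORS

In rectangular coordinates, the operators for the three components of orbital angular momentum are

$$L_{x_{op}} = -i\hbar\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)$$
$$L_{y_{op}} = -i\hbar\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) \tag{M-3}$$
$$L_{z_{op}} = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$$

When transformed to spherical polar coordinates, these operators assume the forms

$$L_{x_{op}} = i\hbar\left(\sin\varphi\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\frac{\partial}{\partial\varphi}\right)$$
$$L_{y_{op}} = i\hbar\left(-\cos\varphi\frac{\partial}{\partial\theta} + \cot\theta\sin\varphi\frac{\partial}{\partial\varphi}\right) \tag{M-4}$$
$$L_{z_{op}} = -i\hbar\frac{\partial}{\partial\varphi}$$

We shall show that these are equivalent, taking $L_{z_{op}}$ as the simplest example. To do this, we must use the relations

$$x = r\sin\theta\cos\varphi \qquad y = r\sin\theta\sin\varphi \qquad z = r\cos\theta \tag{M-5}$$

connecting the rectangular and spherical polar coordinates (see Figure 7-2). It is easiest if we start by applying the chain rule to $\partial\psi/\partial\varphi$, and obtain

$$\frac{\partial\psi}{\partial\varphi} = \frac{\partial\psi}{\partial x}\frac{\partial x}{\partial\varphi} + \frac{\partial\psi}{\partial y}\frac{\partial y}{\partial\varphi} + \frac{\partial\psi}{\partial z}\frac{\partial z}{\partial\varphi}$$

From (M-5), we have

$$\frac{\partial x}{\partial\varphi} = -r\sin\theta\sin\varphi = -y \qquad \frac{\partial y}{\partial\varphi} = r\sin\theta\cos\varphi = x \qquad \frac{\partial z}{\partial\varphi} = 0$$

Thus

$$\frac{\partial\psi}{\partial\varphi} = -y\frac{\partial\psi}{\partial x} + x\frac{\partial\psi}{\partial y}$$

As an operator equation, this reads

$$\frac{\partial}{\partial\varphi} = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y}$$

which verifies the equivalence of the two forms of $L_{z_{op}}$ quoted in (M-3) and (M-4). Similar calculations will do the same for $L_{x_{op}}$ and $L_{y_{op}}$.

In rectangular coordinates, the operator for the square of the magnitude of the orbital angular momentum is

$$L^2_{op} = L^2_{x_{op}} + L^2_{y_{op}} + L^2_{z_{op}} \tag{M-6}$$

By squaring $L_{x_{op}}$, $L_{y_{op}}$, and $L_{z_{op}}$, and adding, it is found after some manipulation of the sinusoidal functions that

$$L^2_{op} = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \tag{M-7}$$

Note the relation between (M-7) and the last two terms in (M-2). It forms the basis of an alternative way of obtaining those terms, which can be found in mathematical reference books.

PROBLEM

1. By using the techniques of Appendix M, show that $L_{x_{op}}$ has the form stated in (7-37).

Appendix N

SERIES SOLUTIONS OF THE ANGULAR AND RADIAL EQUATIONS FOR A ONE-ELECTRON ATOM

This appendix outlines the procedures used to obtain analytical solutions to (7-16) and (7-17), the differential equations that specify the angular and radial behavior of the one-electron atom eigenfunctions and also lead to the determination of the eigenvalues. These equations are

$$-\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m_l^2\,\Theta}{\sin^2\theta} = l(l+1)\Theta \tag{N-1}$$

and

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}[E - V(r)]R = l(l+1)\frac{R}{r^2} \tag{N-2}$$

The central feature of the procedures is essentially the same as that employed in Appendix I to obtain a power series solution to the time-independent Schroedinger equation for a simple harmonic oscillator potential. The treatment given in that appendix was quite detailed, while the one given here is brief.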
Thus the student should read Appendix I carefully before beginning this material.

THE ANGULAR EQUATION

The first step in solving (N-1) is to write it in a more concise form by changing to the independent variable

$$z = \cos\theta \tag{N-3}$$

After expressing the derivatives in terms of the new variable, and using the relation $\cos^2\theta + \sin^2\theta = 1$, it is easy to show that the equation assumes the form

$$\frac{d}{dz}\left[(1 - z^2)\frac{d\Theta}{dz}\right] + \left[l(l+1) - \frac{m_l^2}{1 - z^2}\right]\Theta = 0 \tag{N-4}$$

The solutions to this differential equation are called the associated Legendre functions, which we write as $\Theta_{lm_l}(z)$. But it is convenient to deal with the Legendre polynomials, written as $P_l(z)$, because they are more widely encountered and because they solve a simpler differential equation. The relation between the two functions is

$$\Theta_{lm_l}(z) = (1 - z^2)^{|m_l|/2}\,\frac{d^{|m_l|}P_l(z)}{dz^{|m_l|}} \tag{N-5}$$

and the differential equation satisfied by the $P_l(z)$ is

$$(1 - z^2)\frac{d^2P_l}{dz^2} - 2z\frac{dP_l}{dz} + l(l+1)P_l = 0 \tag{N-6}$$

To show that the relation between the two functions defined by (N-5) is consistent with the differential equations satisfied by each of them, (N-4) and (N-6), first differentiate the latter $|m_l|$ times, to obtain

$$(1 - z^2)\frac{d^{|m_l|+2}P_l}{dz^{|m_l|+2}} - 2(|m_l| + 1)z\,\frac{d^{|m_l|+1}P_l}{dz^{|m_l|+1}} + [l(l+1) - |m_l|(|m_l| + 1)]\frac{d^{|m_l|}P_l}{dz^{|m_l|}} = 0 \tag{N-7}$$

Next substitute $\Theta_{lm_l} = (1 - z^2)^{|m_l|/2}\,\Gamma$ into (N-4) to produce

$$(1 - z^2)\frac{d^2\Gamma}{dz^2} - 2(|m_l| + 1)z\frac{d\Gamma}{dz} + [l(l+1) - |m_l|(|m_l| + 1)]\Gamma = 0 \tag{N-8}$$

Comparison of (N-7) and (N-8) shows that $\Gamma = (d^{|m_l|}/dz^{|m_l|})P_l$, so that $\Theta_{lm_l} = (1 - z^2)^{|m_l|/2}(d^{|m_l|}/dz^{|m_l|})P_l$, in accord with (N-5).
A power series solution to (N-6) begins by assuming that the $P_l$ can be written as

$$P_l(z) = \sum_{k=0}^{\infty} a_kz^k \tag{N-9}$$

Substituting into (N-6), and gathering coefficients of common powers of $z$, yields

$$\sum_{k=0}^{\infty}\{k(k-1)a_kz^{k-2} - [k(k+1) - l(l+1)]a_kz^k\} = 0$$

After writing out explicitly a number of terms in this series, and again gathering coefficients of common powers of $z$, it is seen that the equation can be expressed as

$$\sum_{j=0}^{\infty}\{(j+2)(j+1)a_{j+2} - [j(j+1) - l(l+1)]a_j\}z^j = 0$$

In order that the equality be maintained for any value of $z$, it is necessary that the coefficient of each power of $z$ must vanish. Thus the recursion relation

$$a_{j+2} = \frac{j(j+1) - l(l+1)}{(j+2)(j+1)}\,a_j \tag{N-10}$$

must be satisfied. Because this relation connects the values of the constants $a$ whose indices differ by two, the series (N-9) breaks into two independent series; one involves even powers and the other involves odd powers. The even series contains as a common factor the single arbitrary constant $a_0$. All the other constants in that series are determined in terms of $a_0$ by the recursion relation. For the odd series the single arbitrary constant is $a_1$.

The recursion relation requires that $a_{j+2} \to a_j$ as $j \to \infty$. And consideration of (N-9) shows that this means both of the series will lead to the result $P_l(z) \to \infty$ at $z = \pm 1$ if they actually are infinite series. This, in turn, would lead to physically unacceptable behavior of the eigenfunctions constructed from the $P_l(z)$. But it can be prevented as follows. One of the series is suppressed by setting its arbitrary constant equal to zero. Then the other series is prevented from being an infinite series by requiring that $l$ be one of the integers

$$l = 0, 1, 2, 3, \ldots \tag{N-11}$$

The recursion relation shows that this terminates the series at the term with $j = l$, so that the Legendre polynomials are of degree $l$.
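The termination argument can be made concrete by iterating the recursion relation (N-10) directly. The sketch below is illustrative only: the function names `legendre_series` and `integer_form` are not from the text, and the choice of starting constant 1 followed by rescaling to integer coefficients is an assumption made to mimic the unnormalized convention the appendix uses.

```python
from fractions import Fraction
from math import gcd


def legendre_series(l):
    """Coefficients [a_0, ..., a_l] of the terminating series, built from (N-10).

    The series with the same parity as l starts from a_0 = 1 or a_1 = 1;
    the other series is suppressed by leaving its arbitrary constant at zero.
    """
    a = [Fraction(0)] * (l + 1)
    a[l % 2] = Fraction(1)
    for j in range(l % 2, l - 1, 2):
        a[j + 2] = Fraction(j * (j + 1) - l * (l + 1), (j + 2) * (j + 1)) * a[j]
    return a


def integer_form(coeffs):
    """Rescale by the least common denominator so every coefficient is an integer."""
    scale = 1
    for c in coeffs:
        scale = scale * c.denominator // gcd(scale, c.denominator)
    return [int(c * scale) for c in coeffs]


if __name__ == "__main__":
    for l in range(5):
        # Entry i of each list multiplies z**i.
        print(l, integer_form(legendre_series(l)))
```

The next coefficient after the last one stored would carry the factor l(l+1) - l(l+1) = 0, which is why the loop can stop at j = l - 2 without losing anything.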
It is straightforward to use the recursion relation to show that the first few have the forms

$$P_0 = 1, \quad P_1 = z, \quad P_2 = 1 - 3z^2, \quad P_3 = 3z - 5z^3 \tag{N-12}$$

For each polynomial the arbitrary constant $a_0$ or $a_1$ has been chosen so that the coefficients of all powers of $z$ are simple integers. This means that the polynomials are not normalized.

The associated Legendre functions are obtained immediately from the Legendre polynomials by employing (N-5). The first few are

$$\Theta_{00} = 1$$
$$\Theta_{10} = z, \quad \Theta_{1\pm1} = (1 - z^2)^{1/2} \tag{N-13}$$
$$\Theta_{20} = 1 - 3z^2, \quad \Theta_{2\pm1} = (1 - z^2)^{1/2}z, \quad \Theta_{2\pm2} = 1 - z^2$$
$$\Theta_{30} = 3z - 5z^3, \quad \Theta_{3\pm1} = (1 - z^2)^{1/2}(1 - 5z^2), \quad \Theta_{3\pm2} = (1 - z^2)z, \quad \Theta_{3\pm3} = (1 - z^2)^{3/2}$$

The arbitrary constants have, again, been adjusted to make these unnormalized polynomials look as simple as possible. Note that for a given value of $l$ the combined properties of (N-5) and (N-12) require that $m_l$ be one of the integers

$$m_l = -l, -l+1, \ldots, 0, \ldots, l-1, l \tag{N-14}$$

This is just the condition of (7-27), which Example 7-1 shows to be equivalent to the condition of (7-20).

By using (N-3) to convert from $z$ back to $\cos\theta$, and using also the relation $\cos^2\theta + \sin^2\theta = 1$, the $\Theta_{lm_l}$ can be written as polynomials involving $\sin\theta$ and $\cos\theta$. If the student does this, he will recognize that their general behavior is correctly described by (7-21). He will also recognize the specific behavior seen in the one-electron atom eigenfunctions of Table 7-2.

THE RADIAL EQUATION

Upon writing the potential energy as $V(r) = -Ze^2/4\pi\epsilon_0r$, the radial equation, (N-2), assumes the form

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}\left[E + \frac{Ze^2}{4\pi\epsilon_0r}\right]R = l(l+1)\frac{R}{r^2} \tag{N-15}$$

In terms of the new independent variable

$$\rho = 2\beta r \tag{N-16}$$

where

$$\beta^2 = -\frac{2\mu E}{\hbar^2} \tag{N-17}$$

and also using

$$\gamma = \frac{\mu Ze^2}{4\pi\epsilon_0\hbar^2\beta} \tag{N-18}$$

the equation becomes

$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{dR}{d\rho}\right) + \left[-\frac{1}{4} - \frac{l(l+1)}{\rho^2} + \frac{\gamma}{\rho}\right]R = 0 \tag{N-19}$$

The power series procedure cannot be applied directly to (N-19) because it leads to a recursion relation involving more than two of the constants appearing in the series. But it can be applied indirectly by first considering the form of the solutions $R(\rho)$ for very large values of $\rho$. For $\rho \to \infty$ the second and third terms in the brackets can be ignored in comparison to the first term and so (N-19) reduces to

$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{dR}{d\rho}\right) = \frac{R}{4} \qquad \rho \to \infty \tag{N-20}$$

It is easy to verify that

$$R(\rho) = e^{-\rho/2} \qquad \rho \to \infty \tag{N-21}$$

is a solution to (N-20) which remains finite. This suggests that we search for a solution to (N-19) of the form

$$R(\rho) = e^{-\rho/2}F(\rho) \tag{N-22}$$

Substitution of (N-22) into (N-19) leads, after some manipulation, to

$$\frac{d^2F}{d\rho^2} + \left(\frac{2}{\rho} - 1\right)\frac{dF}{d\rho} + \left[\frac{\gamma - 1}{\rho} - \frac{l(l+1)}{\rho^2}\right]F = 0 \tag{N-23}$$

This differential equation determines the functions $F(\rho)$.

A power series solution to (N-23) begins with the assumption

$$F(\rho) = \rho^s\sum_{k=0}^{\infty}a_k\rho^k \qquad a_0 \neq 0, \quad s \geq 0 \tag{N-24}$$

This form is used because it ensures that $F$ will be finite at $\rho = 0$, even though there are several terms in (N-23) which become infinite there. Substituting into (N-23), and gathering coefficients of common powers of $\rho$, produces

$$\sum_{k=0}^{\infty}\{[(s+k)(s+k+1) - l(l+1)]a_k\rho^{s+k-2} - (s+k+1-\gamma)a_k\rho^{s+k-1}\} = 0$$

After writing out explicitly a number of terms in this series, and again gathering coefficients of common powers of $\rho$, it is seen that the equation can be expressed as

$$[s(s+1) - l(l+1)]a_0\rho^{s-2} + \sum_{j=0}^{\infty}\{[(s+j+1)(s+j+2) - l(l+1)]a_{j+1} - (s+j+1-\gamma)a_j\}\rho^{s+j-1} = 0$$

In order that the equality be maintained for any value of $\rho$, it is necessary that two relations be satisfied. They are

$$s(s+1) - l(l+1) = 0 \tag{N-25}$$

and

$$a_{j+1} = \frac{s+j+1-\gamma}{(s+j+1)(s+j+2) - l(l+1)}\,a_j \tag{N-26}$$

The first determines the possible values of $s$; it is called the indicial equation. The second is the recursion relation connecting the values of the constants $a$ whose indices differ by one.
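The recursion relation (N-26) is easy to iterate by machine. The sketch below is a hedged illustration rather than the book's procedure: it assumes the root s = l of the indicial equation (N-25), which keeps F finite at the origin, and it assumes the constant in the numerator is a positive integer n larger than l, so that the numerator vanishes and the series stops. The function name `radial_series` and the integer rescaling are illustrative choices.

```python
from fractions import Fraction
from math import gcd


def radial_series(n, l):
    """Terminating series for F from (N-26), taking s = l and the constant equal to n.

    Returns {power of rho: integer coefficient}, rescaled so that the
    coefficients are the smallest integers with a positive leading term.
    """
    if not (isinstance(n, int) and isinstance(l, int) and 0 <= l < n):
        raise ValueError("need integers with 0 <= l < n")
    a = [Fraction(1)]
    # The numerator l + j + 1 - n vanishes at j = n - l - 1, ending the series there.
    for j in range(n - l - 1):
        numerator = l + j + 1 - n
        denominator = (l + j + 1) * (l + j + 2) - l * (l + 1)
        a.append(Fraction(numerator, denominator) * a[-1])
    scale = 1
    for c in a:
        scale = scale * c.denominator // gcd(scale, c.denominator)
    return {l + j: int(c * scale) for j, c in enumerate(a)}


if __name__ == "__main__":
    for n in (1, 2, 3):
        for l in range(n):
            print(n, l, radial_series(n, l))
```

With these assumptions the output reproduces the unnormalized polynomial forms the appendix quotes in (N-32), for example 6 - 6ρ + ρ² for n = 3, l = 0.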
The indicial equation, (N-25), is quadratic in $s$. Its two roots are easy to find; they are $s = l \geq 0$ and $s = -(l+1)$. The latter must be rejected because it violates the physical condition $s \geq 0$, imposed so that $F(\rho)$, or any eigenfunction constructed from it, is finite at $\rho = 0$. Thus we set $s = l$ in (N-26) and write the recursion relation as

$$a_{j+1} = \frac{j+l+1-\gamma}{(j+l+1)(j+l+2) - l(l+1)}\,a_j \tag{N-27}$$

Inspection of the recursion relation shows that for $j \to \infty$ it requires $a_{j+1} \to a_j/j$. This ratio of the successive constants in the series expansion of $F(\rho)$ is the same as in the series expansion for $e^{\rho}$. Thus $R(\rho) = e^{-\rho/2}F(\rho) \to \infty$ as $\rho \to \infty$ if the $F(\rho)$ series actually is an infinite series. To prevent such physically unacceptable behavior in the eigenfunctions containing $R(\rho)$, the series is terminated by requiring that $\gamma$ be one of the integers

$$\gamma = n \tag{N-28}$$

where

$$n = l+1, l+2, l+3, \ldots \tag{N-29}$$

with

$$l = 0, 1, 2, 3, \ldots \tag{N-30}$$

Consideration of (N-27) verifies that doing so causes the series to terminate at the $[n - (l+1)]$-th term. And inspection of (N-24) shows this makes the $F(\rho)$ be polynomials of order $n - 1$.

The condition (N-29) is identical to (7-26), which expresses the possible values of the quantum number $n$ for a given value of the quantum number $l$. The one-electron atom energy quantization equation, (7-27), is obtained from (N-17), (N-18), and (N-28), as follows

$$E = -\frac{\beta^2\hbar^2}{2\mu} = -\frac{\mu^2Z^2e^4\hbar^2}{(4\pi\epsilon_0)^2\hbar^4n^2\,2\mu}$$

or

$$E_n = -\frac{\mu Z^2e^4}{(4\pi\epsilon_0)^2\,2\hbar^2n^2} \qquad n = 1, 2, 3, \ldots \tag{N-31}$$

Schroedinger's very first substantial application of his new theory was to the one-electron atom. When he obtained (N-31), which he knew to be in accurate agreement with experiment, he knew the theory must be taken seriously.

The functions expressed by (N-24) are written as $F_{nl}$ to indicate that their specific dependences on $\rho$ are determined by the values of $n$ and $l$. By using (N-27) and (N-28), it is straightforward to determine their forms. The first few are

$$F_{10} = 1$$
$$F_{20} = 2 - \rho, \quad F_{21} = \rho \tag{N-32}$$
$$F_{30} = 6 - 6\rho + \rho^2, \quad F_{31} = 4\rho - \rho^2, \quad F_{32} = \rho^2$$

For each of these unnormalized polynomials, the arbitrary constant has been adjusted to give it the simplest appearance. They are closely related to what are called the associated Laguerre polynomials.

According to (N-22), the functions specifying the radial dependence of the one-electron atom eigenfunctions can be written as

$$R_{nl} = e^{-\rho/2}F_{nl} \tag{N-33}$$

If the student uses (N-16), (N-18), and (N-28) to express the $R_{nl}$ as functions of $r$, instead of $\rho$, he will then recognize the general behavior described by (7-24) as well as the specific behavior seen in Table 7-2.

PROBLEMS

1. Fill in all the details leading from (N-1) to (N-6), the differential equation satisfied by Legendre polynomials. Also make the comparison between (N-7) and (N-8).

2. Carry out in detail the power series solution to (N-6), the differential equation satisfied by Legendre polynomials, to the point of obtaining the recursion relation (N-10).

3. Use the Legendre polynomial recursion relation, (N-10), and the condition that $l$ be an integer, to show that the first few polynomials have the forms quoted in (N-12). Then verify the forms quoted in (N-13) for the first few associated Legendre functions, and use them to show that (7-21) and the entries in Table 7-2 have the correct dependence on $\theta$.

4. Fill in all the details leading from (N-15) to (N-23), the differential equation for the function $F(\rho)$ which determines in part the radial dependence of the one-electron atom eigenfunctions.

5. Carry out in detail the power series solution to (N-23), the differential equation for the function $F(\rho)$ which determines in part the radial dependence of the one-electron atom eigenfunctions, to the point of obtaining the indicial equation (N-25) and the recursion relation (N-26).

6. Use the indicial equation, (N-25), and the recursion relation, (N-26), to verify that the first few functions $F_{nl}$, which determine in part the radial dependence of the one-electron atom eigenfunctions, have the forms quoted in (N-32). Then use these forms to show that (7-24) and the entries in Table 7-2 have the correct dependence on $r$.

Appendix O

THE THOMAS PRECESSION

The relativistic effect which introduces the factor of 1/2 in (8-25) for the spin-orbit orientational potential energy is called the Thomas precession. It is not difficult to understand if we keep the geometry sufficiently simple. For this purpose, let us assume that the electron moves about the nucleus in a circular Bohr orbit, as illustrated in Figure O-1. The figure shows the situation as seen by an observer in the nuclear rest frame $xy$. The electron is momentarily at rest in the frame $x_1y_1$ at the instant $t_1$, and momentarily at rest in the frame $x_2y_2$ at the slightly later instant $t_2$. Both the axes of $xy$ and of $x_2y_2$ have been constructed parallel to the axes of $x_1y_1$, as seen by an observer in $x_1y_1$. Nevertheless, we shall show that the observer in $xy$ sees the axes of $x_2y_2$ rotated slightly relative to his own axes. He sees the axes of the $x_3y_3$ frame rotated even more, etc. Thus he sees that the set of axes in which the electron is instantaneously at rest are precessing, relative to his own set of axes, as the electron goes around the nucleus, even though the observers instantaneously at rest relative to the electron contend that each set of axes $x_{n+1}y_{n+1}$ is parallel to the preceding set $x_ny_n$. By using a sequence of reference frames $x_ny_n$ in which the electron is momentarily at rest, and which are each moving with constant velocity relative to the others and relative to the $xy$ frame, we can apply special relativity theory to the problem even though the electron is accelerating relative to the $xy$ frame.
Figure O-2 shows $xy$, $x_1y_1$, and $x_2y_2$ from the point of view of the observer in $x_1y_1$. Since the electron is moving with velocity $v$ relative to the nucleus, the axes $xy$ are moving with velocity $-v$ in the direction of the negative $x_1$ axis relative to $x_1y_1$. As seen in $x_1y_1$, the electron is accelerating toward the nucleus with acceleration $a$ in the direction of the positive $y_1$ axis. If the time interval $(t_2 - t_1)$ is very small, the change in velocity of the electron in that interval is

$$dv = a(t_2 - t_1) = a\,dt \tag{O-1}$$

and this will be the velocity of $x_2y_2$ as seen by $x_1y_1$.

Figure O-1 The frames of reference used in calculating the Thomas precession.

Figure O-2 The frames of reference used in calculating the Thomas precession, as seen in the $x_1y_1$ frame.

Now let us use the relativistic velocity transformation equations of Appendix A to evaluate the components of $\mathbf{u}_a$, the velocity of $x_2y_2$ as seen by $xy$. With $v_x = -v$ the velocity of $xy$ relative to $x_1y_1$, and $dv_x = 0$, $dv_y = dv$ the velocity components of $x_2y_2$ relative to $x_1y_1$, these give

$$u_{ax} = \frac{dv_x - v_x}{1 - \dfrac{v_x\,dv_x}{c^2}} = \frac{0 + v}{1 + \dfrac{v\cdot 0}{c^2}} = v$$

$$u_{ay} = \frac{dv_y\sqrt{1 - \dfrac{v_x^2}{c^2}}}{1 - \dfrac{v_x\,dv_x}{c^2}} = dv\sqrt{1 - \frac{v^2}{c^2}}$$

Using the same transformation equations to evaluate the components of $\mathbf{u}_b$, the velocity of $xy$ as seen by $x_2y_2$, we have (with $v_y = 0$)

$$u_{bx} = \frac{v_x\sqrt{1 - \dfrac{dv_y^2}{c^2}}}{1 - \dfrac{dv_y\,v_y}{c^2}} = -v\sqrt{1 - \frac{dv^2}{c^2}}$$

$$u_{by} = \frac{v_y - dv_y}{1 - \dfrac{dv_y\,v_y}{c^2}} = -dv$$

Next we calculate the angle between the vector $\mathbf{u}_a$ and the $x$ axis of the $xy$ frame. It is

$$\theta_a = \frac{u_{ay}}{u_{ax}} = \frac{dv}{v}\sqrt{1 - \frac{v^2}{c^2}}$$

The angle between the vector $\mathbf{u}_b$ and the $x$ axis of the $x_2y_2$ frame is

$$\theta_b = \frac{u_{by}}{u_{bx}} = \frac{dv}{v\sqrt{1 - \dfrac{dv^2}{c^2}}}$$

Figure O-3 shows the $x_2y_2$ and $xy$ frames from the point of view of $xy$. Because of the equivalence of inertial frames, $\mathbf{u}_a$ and $\mathbf{u}_b$ must be exactly opposite in direction. Since the angles between the $x$ axes and the relative velocity vectors are not the same, the $x_2y_2$ frame appears to be rotated relative to the $xy$ frame.
The angle of rotation is

$$d\theta = \theta_b - \theta_a = \frac{dv}{v\sqrt{1 - \dfrac{dv^2}{c^2}}} - \frac{dv}{v}\sqrt{1 - \frac{v^2}{c^2}}$$

As $dv$ is a differential, we may neglect $dv^2/c^2$ and obtain

$$d\theta \simeq \frac{dv}{v}\left[1 - \sqrt{1 - \frac{v^2}{c^2}}\right]$$

As the velocity of an electron in a one-electron atom is relatively small compared to the velocity of light, $v^2/c^2 \ll 1$. (This is also true for the electrons responsible for the optical spectra in other atoms.) Thus we may obtain an excellent approximation to $d\theta$ by making a binomial expansion of the square root, keeping only the first two terms. That is

$$d\theta \simeq \frac{dv}{v}\left[1 - \left(1 - \frac{v^2}{2c^2}\right)\right] = \frac{dv}{v}\,\frac{v^2}{2c^2} = \frac{v\,dv}{2c^2} = \frac{va\,dt}{2c^2}$$

where we have evaluated $dv$ from (O-1).

Figure O-3 An exaggerated illustration of the Thomas precession.

The axes in which the electron is instantaneously at rest appear to precess, relative to the nucleus, with the so-called Thomas frequency

$$\omega_T = \frac{d\theta}{dt} = \frac{va}{2c^2}$$

Inspection of the figures will verify that the sense of precession is given by the vector equation

$$\boldsymbol{\omega}_T = -\frac{1}{2c^2}\,\mathbf{v}\times\mathbf{a} \tag{O-2}$$

Relative to frames in which the electron is at rest, its spin magnetic dipole moment precesses in the magnetic field it experiences at the Larmor frequency $\boldsymbol{\omega}$. But these frames are themselves precessing with frequency $\boldsymbol{\omega}_T$ relative to the frame in which the nucleus is at rest. Consequently, the dipole moment is seen in the nuclear rest frame to precess with angular frequency

$$\boldsymbol{\omega}' = \boldsymbol{\omega} + \boldsymbol{\omega}_T \tag{O-3}$$

Using an equation analogous to (8-14), plus (8-24), and evaluating $g_s$ and $\mu_b$, we have

$$\boldsymbol{\omega} = -\frac{g_s\mu_b}{\hbar c^2}\,\mathbf{v}\times\mathbf{E} = -\frac{2}{\hbar c^2}\,\frac{e\hbar}{2m}\,\mathbf{v}\times\mathbf{E} = -\frac{e}{mc^2}\,\mathbf{v}\times\mathbf{E} \tag{O-4}$$

To evaluate $\boldsymbol{\omega}_T$ in similar terms, we may use Newton's law to express the acceleration of the electron as a function of the electric field: $\mathbf{a} = \mathbf{F}/m = -e\mathbf{E}/m$.
With this, (O-2) yields

$$\boldsymbol{\omega}_T = \frac{e}{2mc^2}\,\mathbf{v}\times\mathbf{E} \tag{O-5}$$

Thus, the precessional frequency in the nuclear rest frame is

$$\boldsymbol{\omega}' = -\frac{e}{mc^2}\,\mathbf{v}\times\mathbf{E} + \frac{e}{2mc^2}\,\mathbf{v}\times\mathbf{E} = -\frac{e}{2mc^2}\,\mathbf{v}\times\mathbf{E} \tag{O-6}$$

Comparing (O-4) and (O-6), we see that the effect of transforming the spin magnetic dipole precession frequency, from the frames in which the electron is at rest to the normal frame in which the nucleus is at rest, is to reduce its magnitude by exactly a factor of 1/2. The same is true of the orientational potential energy $\Delta E$, since the magnitude of that quantity is proportional to the magnitude of the precession frequency $\omega$. This can be seen from equations analogous to (8-13) and (8-14)

$$\Delta E = -\boldsymbol{\mu}_s\cdot\mathbf{B} = \frac{g_s\mu_b}{\hbar}\,\mathbf{S}\cdot\mathbf{B}$$

and

$$\boldsymbol{\omega} = \frac{g_s\mu_b}{\hbar}\,\mathbf{B}$$

Thus we have completed our verification of the factor of 1/2 in (8-25).

PROBLEM

1. The Thomas precession can also be described in terms of a time dilation between the reference frame in which the nucleus is at rest and the reference frames in which the electron is instantaneously at rest, which leads to a disagreement between an observer at the nucleus and the observers at the electron concerning the time required for each to make a complete revolution about the other. Work out the details of this description, and compare with the results of Appendix O.

Appendix P

THE EXCLUSION PRINCIPLE IN LS COUPLING

If an atom contains two or more electrons that have common values of the quantum numbers $n$ and $l$, because they are in the same subshell, the exclusion principle imposes restrictions on the possible values of the remaining quantum numbers. In the Hartree approximation, these are the $m_l$ and $m_s$ quantum numbers of each electron. In this case the exclusion principle says simply that no two electrons can have the same set of all four quantum numbers.

In LS coupling, the quantum numbers that are used, in addition to $n$ and $l$ for each electron, are $l'$, $s'$, $j'$, $m_j'$. These quantum numbers specify the way the electrons interact in LS coupling.
The restrictions imposed by the exclusion principle on the possible values of these quantum numbers are more complicated, but they can be determined as follows. Working first in the Hartree approximation, the possible values of $m_l$ and $m_s$ are used to determine the possible values of the quantum numbers $m_l'$, $m_s'$, $m_j'$. From these the possible values of $l'$, $s'$, $j'$, $m_j'$ are then determined. Although in LS coupling the $z$ components of $\mathbf{L}'$ and $\mathbf{S}'$, which are specified by $m_l'$ and $m_s'$, are changed by the residual Coulomb and spin-orbit interactions, $L'$, $S'$, $J'$, $J_z'$ are not changed. Therefore, the restrictions that are found in the Hartree approximation concerning the associated quantum numbers, $l'$, $s'$, $j'$, $m_j'$, also apply in LS coupling.

As an example, we determine the LS coupling quantum numbers which satisfy the exclusion principle for two electrons in the 2p subshell. Referring to Table P-1, we first list all the possible sets of values of $m_l$ and $m_s$ for the two electrons, which satisfy the exclusion principle. There are 15 different sets of $m_l$ and $m_s$ for the two electrons which satisfy the exclusion principle, and a number of others, such as $m_{l_1} = +1$, $m_{s_1} = +1/2$, $m_{l_2} = +1$, $m_{s_2} = +1/2$, which are ruled out because they violate it.

Table P-1 Possible Quantum Numbers for an np^2 Configuration

Entry   m_l1   m_s1   m_l2   m_s2   m_l'   m_s'   m_j'
  1      +1    +1/2    +1    -1/2    +2      0     +2
  2      +1    +1/2     0    +1/2    +1     +1     +2
  3      +1    +1/2     0    -1/2    +1      0     +1
  4      +1    +1/2    -1    +1/2     0     +1     +1
  5      +1    +1/2    -1    -1/2     0      0      0
  6      +1    -1/2     0    -1/2    +1     -1      0
  7      +1    -1/2    -1    +1/2     0      0      0
  8      +1    -1/2    -1    -1/2     0     -1     -1
  9       0    +1/2    +1    -1/2    +1      0     +1
 10       0    +1/2     0    -1/2     0      0      0
 11       0    +1/2    -1    +1/2    -1     +1      0
 12       0    +1/2    -1    -1/2    -1      0     -1
 13      -1    +1/2     0    -1/2    -1      0     -1
 14      -1    +1/2    -1    -1/2    -2      0     -2
 15      -1    -1/2     0    -1/2    -1     -1     -2

Table P-2 Possible Quantum Numbers for an np^6 Configuration

Entry   m_l1  m_s1   m_l2  m_s2   m_l3  m_s3   m_l4  m_s4   m_l5  m_s5   m_l6  m_s6   m_l'  m_s'  m_j'
  1      +1   +1/2    +1   -1/2     0   +1/2     0   -1/2    -1   +1/2    -1   -1/2     0     0     0

For each set the corresponding values of the quantum numbers $m_l'$, $m_s'$, $m_j'$ are evaluated from the relations $m_l' = m_{l_1} + m_{l_2}$, $m_s' = m_{s_1} + m_{s_2}$, $m_j' = m_l' + m_s'$, which represent $z$ components of the angular momentum addition equations, (10-6), (10-8), and (10-10).

The problem now is to identify the allowed quantum states, specified in Table P-1 in terms of $m_l'$, $m_s'$, $m_j'$, with the specification of these states in terms of $l'$, $s'$, $j'$. We begin by using (10-14), which represent other requirements of angular momentum conservation. Setting $l_1 = l_2 = 1$, we find that the possible combinations of $l'$, $s'$, $j'$, expressed in spectroscopic notation, are as follows: $^1S_0$, $^1P_1$, $^1D_2$, $^3S_1$, $^3P_0$, $^3P_1$, $^3P_2$, $^3D_1$, $^3D_2$, $^3D_3$. The $^3D_3$ states are immediately ruled out because for these states there would be $m_j'$ values of $+3$ and $-3$, but we see that there are none listed in Table P-1. Since there are no $^3D_3$ states, there can be no $^3D_2$ or $^3D_1$ states; all these states correspond to $\mathbf{S}'$ and $\mathbf{L}'$ vectors of the same magnitude in the same multiplet and they stand or fall together.

Now, entry number 1 in the table says there must be states with $s' \geq 0$ and $l' \geq 2$, since $m_s' = -s', \ldots, s'$ and $m_l' = -l', \ldots, l'$. These requirements can be satisfied only by the states $^1D_2$. There are five such states corresponding to the five values $m_j' = -2, -1, 0, 1, 2$. Entry number 2 says that there must be states with $s' \geq 1$ and $l' \geq 1$. This requires the presence of the states $^3P_0$, $^3P_1$, $^3P_2$. For $^3P_0$ there is one state corresponding to $m_j' = 0$. For $^3P_1$ there are three states corresponding to $m_j' = -1, 0, 1$. For $^3P_2$ there are five corresponding to $m_j' = -2, -1, 0, 1, 2$. The number of states we have identified so far is $5 + 1 + 3 + 5 = 14$. Only a single state is left, and this must be a state with $m_j' = 0$ because all the other $m_j'$ values of the table have been used. It is clear then that this must be the single quantum state $^1S_0$.
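The counting argument above can be reproduced mechanically. The following sketch is an illustration, not the book's method verbatim: it enumerates every set of distinct single-electron pairs of orbital and spin projections, tabulates the summed projections, and then repeatedly removes the multiplet implied by the largest remaining orbital projection (and, within it, the largest spin projection), which is the same reasoning applied to entries 1 and 2 of Table P-1. The names `microstates`, `ls_terms`, and `symbol` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

HALF = Fraction(1, 2)


def microstates(l, n_electrons):
    """All sets of distinct (m_l, m_s) pairs, i.e. those the exclusion principle allows."""
    single = [(ml, ms) for ml in range(l, -l - 1, -1) for ms in (HALF, -HALF)]
    return list(combinations(single, n_electrons))


def ls_terms(l, n_electrons):
    """Peel multiplets (l', s') off the table of summed projections."""
    counts = Counter(
        (sum(ml for ml, _ in state), sum(ms for _, ms in state))
        for state in microstates(l, n_electrons)
    )
    terms = []
    while counts:
        big_l = max(ml for ml, _ in counts)
        big_s = max(ms for ml, ms in counts if ml == big_l)
        terms.append((big_l, big_s))
        # Remove one state for every projection pair belonging to this multiplet.
        for ml in range(-big_l, big_l + 1):
            ms = -big_s
            while ms <= big_s:
                counts[(ml, ms)] -= 1
                if counts[(ml, ms)] == 0:
                    del counts[(ml, ms)]
                ms += 1
    return terms


def symbol(big_l, big_s):
    """Spectroscopic symbol without the j' subscript, e.g. '3P'."""
    return f"{int(2 * big_s + 1)}{'SPDFGHI'[big_l]}"


if __name__ == "__main__":
    print(len(microstates(1, 2)))
    print([symbol(*t) for t in ls_terms(1, 2)])
    print([symbol(*t) for t in ls_terms(1, 6)])
```

For two p electrons the sketch counts 15 allowed sets and returns the multiplets 1D, 3P, and 1S; for six p electrons it returns the single multiplet 1S. The same functions can be called with l = 2 to explore d-subshell configurations.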
We have found that in the Hartree approximation the only possible quantum states for two electrons with the configuration $2p^2$ are those associated with the symbols $^1S_0$, $^1D_2$, $^3P_{0,1,2}$. This is equally true for an $np^2$ configuration with any $n$. Since these restrictions are expressed in terms of the quantum numbers $l'$, $s'$, $j'$, they are also valid in LS coupling. Note that these results agree with the states that are observed to be present in the $^6$C energy-level diagram of Figure 10-8.

As a second example, consider six electrons in the same p subshell, that is, consider the configuration $np^6$, with any $n$. Table P-2 lists the allowed quantum states for this case, in analogy to the listing for the $np^2$ configuration, but in the present case the table has only one entry. The entry is obviously the single state $^1S_0$. Of course, six electrons represents the maximum number that can occupy a p subshell. Thus we conclude that when this subshell is filled, its total spin angular momentum, total orbital angular momentum, and total angular momentum, are all zero. Furthermore, it is apparent that the same conclusion will be obtained for any completely filled subshell. The conclusion is confirmed by the analysis of the optical spectra of noble gas atoms. Also, if a completely filled subshell has no net spin or orbital angular momentum, there can be no net magnetic dipole moment. This is confirmed by Stern-Gerlach experiments on noble gas atoms.

Table P-3 lists the quantum states allowed by the exclusion principle for some configurations containing several electrons in the same subshell. Each symbol gives the $l'$ and $s'$ values of an allowed multiplet. The possible values of $j'$ and $m_j'$ for the states of that multiplet can be determined in terms of $l'$ and $s'$ from (10-13) and (10-14). Entries are given for configurations ranging from no electrons in the subshell up to the maximum number of electrons consistent with the exclusion principle.
For no electrons, $l' = s' = j' = 0$, which is described by the symbol $^1S_0$. For one electron in any subshell, $s' = 1/2$, and the allowed states are necessarily $^2S_{1/2}$, or $^2P_{1/2,3/2}$, etc. The allowed states for other configurations are determined by the calculations in the examples above, or by similar calculations. The allowed states can also be obtained from more elegant calculations based on the mathematical theory of groups. It is particularly interesting to note the symmetries in Table P-3 about the half-filled subshell configurations. The number of states is greatest for this configuration, and the states for a configuration in which a subshell is filled except for a certain number of electrons are exactly the same as the states for the configuration in which there are just that number of electrons in the subshell. This result can also be expressed by saying that the allowed states for electrons are the same as the allowed states for holes, a fact that has important consequences in solid state and nuclear physics, as well as atomic physics. The symmetries are a striking demonstration of the effect of the exclusion principle because, if it were not for this principle, the number of states would increase monotonically as the number of electrons in the subshell increased.

Configuration   Allowed multiplets
$ns^0$          $^1S$
$ns^1$          $^2S$
$ns^2$          $^1S$
$np^0$          $^1S$
$np^1$          $^2P$
$np^2$          $^1S$, $^1D$; $^3P$
$np^3$          $^2P$, $^2D$; $^4S$
$np^4$          $^1S$, $^1D$; $^3P$
$np^5$          $^2P$
$np^6$          $^1S$
$nd^0$          $^1S$
$nd^1$          $^2D$
$nd^2$          $^1S$, $^1D$, $^1G$; $^3P$, $^3F$
$nd^3$          $^2D$, $^2P$, $^2D$, $^2F$, $^2G$, $^2H$; $^4P$, $^4F$
$nd^4$          $^1S$, $^1D$, $^1G$, $^1S$, $^1D$, $^1G$, $^1F$, $^1I$; $^3P$, $^3F$, $^3P$, $^3D$, $^3F$, $^3G$, $^3H$; $^5D$
$nd^5$          $^2D$, $^2P$, $^2D$, $^2F$, $^2G$, $^2H$, $^2S$, $^2D$, $^2F$, $^2G$, $^2I$; $^4P$, $^4F$, $^4D$, $^4G$; $^6S$
$nd^6$          $^1S$, $^1D$, $^1G$, $^1S$, $^1D$, $^1G$, $^1F$, $^1I$; $^3P$, $^3F$, $^3P$, $^3D$, $^3F$, $^3G$, $^3H$; $^5D$
$nd^7$          $^2D$, $^2P$, $^2D$, $^2F$, $^2G$, $^2H$; $^4P$, $^4F$
$nd^8$          $^1S$, $^1D$, $^1G$; $^3P$, $^3F$
$nd^9$          $^2D$
$nd^{10}$       $^1S$
Table P-3 Possible Quantum Numbers for Configurations Containing Several Electrons in the Same Subshell

Appendix Q CRYSTALLOGRAPHY

An ideal crystal consists of a large number of identical groups of atoms positioned to form a regular array in three dimensions. The group of atoms which is repeated is known as the basis of the crystal and may contain a single atom, several atoms, or as many as several thousand atoms, depending on the crystal. Each of the replicas, throughout the crystal, contains the same kinds of atoms at the same positions relative to each other and all the replicas have exactly the same orientation. Placement of the basis replicas is described by giving a regular array of points, called a lattice, such that the disposition of atoms about any lattice point is the same as about any other lattice point. The idea, for a two dimensional crystal, is illustrated in Figure Q-1, which shows lattice points and a basis of two atoms, labeled with the symbols O and •. A particular three dimensional lattice is defined by three vectors, a, b, and c, not in the same plane, such that the positions of the lattice points are given by $n_1\mathbf{a} + n_2\mathbf{b} + n_3\mathbf{c}$, where $n_1$, $n_2$, and $n_3$ are integers (positive, negative, or zero). The vectors a, b, and c are called the fundamental translation vectors for the lattice and linear combinations with integer coefficients are called lattice translation vectors. It is usually convenient, though not necessary, to position the lattice so that atoms are at lattice points. For a particular crystal, a can be chosen to be one of the shortest displacement vectors from some atom in one basis replica to the analogous atom in a neighboring replica, then b can be chosen as another such vector, not collinear with a, and finally c can be chosen as another, not coplanar with a and b. If the N atoms of the basis are labeled i = 1, 2, ...
N and the origin is placed at one of the lattice points, then the atomic positions are given by vectors of the form $n_1\mathbf{a} + n_2\mathbf{b} + n_3\mathbf{c} + \mathbf{p}_i$. The first three terms locate a lattice point while the last locates an atom relative to that point. The periodicity of the atomic positions can also be described by means of a unit cell. This is a geometric figure, such as a cube or rectangular solid, constructed so that when a large number of them are placed with the same periodicity as the lattice points they fill the space with no overlap and without any space between. One way to construct a unit cell is shown in Figure Q-2. The cell is a parallelepiped. Two opposite sides are parallelograms with a and b as edges, two other opposite sides are parallelograms with b and c as edges, and the final two sides are parallelograms with a and c as edges. There is one unit cell for each lattice point and the atoms in the unit cell may be taken as the basis.

Figure Q-1 Part of a two dimensional crystal structure. Lattice points are marked by •, atoms of one type by O, and atoms of another type by •. The arrows labeled a and b are fundamental lattice vectors; the displacement vectors joining lattice points all have the form $n_1\mathbf{a} + n_2\mathbf{b}$, where $n_1$ and $n_2$ are integers. The arrows labeled $\mathbf{p}_1$ and $\mathbf{p}_2$ are basis vectors which give the positions of the basis atoms relative to a lattice point.

Figure Q-2 A parallelepiped unit cell with lattice points at the corners. The faces are parallelograms, with edges along fundamental lattice vectors.

If atoms lie at the corners of the cell, they are linked by lattice translation vectors and only one of them can be included in the basis. If an atom lies on one of the faces, there must be an identical atom on the opposite face, with a lattice translation vector joining them, and only one of this pair can be included in the basis.
Similarly, if an atom lies on a cell edge, there must be identical atoms on three other edges, separated by lattice translation vectors, and only one of these four can be included in the basis. For any given crystal the lattice, basis, and unit cell are not unique. It is always possible, for example, to use a basis and unit cell which are twice as large as the originals. Then the lattice consists of half the points of the original lattice. If the basis is the smallest possible group of atoms which repeats throughout the crystal, then the associated lattice and unit cell are said to be primitive. Lattice vectors and unit cells for a primitive lattice are also not unique. A look at Figure Q-1 should convince the student that there are other choices for the vectors a and b such that vectors of the form $n_1\mathbf{a} + n_2\mathbf{b}$ give the positions of all lattice points. Crystal lattices are categorized according to the symmetry they display and the symmetry, in turn, is evident in the shape of the conventional unit cell. There are 14 different lattice types, a typical lattice of each type being called a Bravais lattice. The 14 Bravais lattices are arranged in 7 lattice systems, as shown in Figure Q-3. Notation for the cell edges and angles is defined in the diagram of a general cell, shown at the top of the figure. For the simple or primitive (P) cubic lattice, a cube is the primitive unit cell and there are lattice points only at the corners. A cube is not primitive for the body centered (I) or face centered (F) lattices. In addition to primitive lattice points at the cube corners, the first of these has a primitive lattice point at the cube center while the second has a primitive lattice point at the center of each face. The tetragonal unit cell has two square and four rectangular faces. In addition to the primitive cell there is a body centered cell in the tetragonal system.
If, instead of the cells shown, new cells are constructed using the square formed by base diagonals of four adjoining original cells, the primitive cell becomes base centered and the body centered cell becomes face centered. These are not new lattice types. The orthorhombic unit cell has six rectangular faces. In addition to the primitive cell there are base centered, body centered, and face centered cells in the system. Primitive lattice points are shown in the diagrams. The base of a monoclinic cell is an oblique parallelogram and the sides are rectangles, perpendicular to the base. A triclinic cell also has an oblique parallelogram for a base, but at least two sides and perhaps all four are not perpendicular to the base. In the base plane of a hexagonal lattice, the points are at the vertices and center of a regular hexagon. The primitive unit cell has a base which is a parallelogram with equal edges, and interior angles of 60° and 120°, as shown in the diagram. The sides are rectangles, perpendicular to the base. The edges of a trigonal cell are of equal length and the three edges which meet at a corner make equal angles with each other. They are symmetrically arranged around the body diagonal, shown as a dashed line in the diagram.

Figure Q-3 The 7 lattice systems and the 14 Bravais lattices. Cell parameters given in the figure:
Cubic: $a = b = c$, $\alpha = \beta = \gamma = \pi/2$
Tetragonal: $a = b \neq c$, $\alpha = \beta = \gamma = \pi/2$
Orthorhombic: $a \neq b \neq c$, $\alpha = \beta = \gamma = \pi/2$
Monoclinic: $a \neq b \neq c$, $\alpha = \beta = \pi/2 \neq \gamma$
Triclinic: $a \neq b \neq c$, $\alpha \neq \beta \neq \gamma \neq \pi/2$
Hexagonal: $a = b \neq c$, $\alpha = \beta = \pi/2$, $\gamma = 2\pi/3$
Trigonal: $a = b = c$, $\alpha = \beta = \gamma \neq \pi/2$

In general the crystalline structure of a particular material is determined by the interaction between the constituent particles and, at low temperature, the most stable configuration is the one for which the total energy is a minimum. In many cases the difference in energy for two or more structures is slight and the material may have a different structure at higher temperatures. A few simple structures are discussed as examples. Most elemental metals crystallize in one of the close packed structures: the face centered cubic (FCC) structure with a face centered cubic lattice and a primitive basis of one atom or the hexagonal close packed (HCP) structure with a hexagonal lattice and a basis of two atoms. The HCP structure is shown in Figure Q-4. The base of the unit cell can be divided into two equilateral triangles, inverted with respect to each other. Then, if one of the basis atoms of the HCP structure is placed at a lattice point, the other is at the midpoint of the line which joins the center of one of the triangles to the center of the triangle directly above it on the top face. The similar line through the center of the other triangle marks an open channel through the crystal.

Figure Q-4 The hexagonal close packed structure. Dots represent atomic positions. The smaller primitive unit cell is also shown.

The close packed structures can be generated by arranging layers of spheres, packed together as tightly as possible. In any layer the sphere centers form the base plane of a hexagonal lattice, as shown in Figure Q-5. The next layer above is identical in structure but it is shifted so that its spheres fit snugly into the wells formed by spheres below. There are two sets of wells, marked by small crosses and by small dots on the diagram, and either set may be used. These wells are at the centers of the triangles formed by the lines joining sphere centers. Spheres of the third layer fit into the wells of the second layer and, in different structures, may be either directly over spheres of the first layer or directly over wells of the first layer. The layer pattern is then repeated and, in the first case, an HCP structure is formed while, in the second case, an FCC structure is formed. For the HCP structure the centers of first and third layer spheres form a hexagonal lattice.
Centers of second layer spheres are along the line joining wells of the first layer to wells of the third layer directly above. For the FCC structure the layer shown in Figure Q-5 cuts obliquely across the cube so that three neighboring spheres lie respectively at a cube corner and two neighboring face centers, as shown in Figure Q-6. Successive layers form parallel planes through primitive lattice points.

Figure Q-5 Close packing of spheres with centers on a plane. One set of wells between spheres is marked by dots and the other by crosses. The base of the hexagonal primitive unit cell is also shown.

For both the HCP and FCC structures each atom is surrounded by twelve neighboring atoms. If, for either structure, the atoms are replaced by spheres as described above, the spheres would occupy 74% of the volume, the highest occupation (or packing) fraction of any crystalline structure. At room temperature 16 of the chemical elements, including calcium, nickel, platinum, copper, silver, gold, and aluminum, have FCC structures. Iron is FCC between 906°C and 1401°C. The rare gases neon, argon, krypton, and xenon bond via van der Waals forces and, when they crystallize at low temperatures, they also form FCC structures. Twenty-two of the chemical elements form HCP structures at room temperature. These include magnesium, titanium, cobalt, zinc, zirconium, cadmium, thallium, and many of the rare earth metals. For most of these the model of close packed spheres closely predicts the ratio of cell height to hexagonal edge. For some, however, the hexagonal layers have greater separation than the close packed model and the packing fraction is less than for ideal HCP. Zinc and cadmium belong to this group. The body centered cubic structure (BCC), with a body centered cubic lattice and a primitive basis of one atom, is slightly less tightly packed than the FCC and HCP structures.
Every atom has only eight nearest neighbors, each a distance $(\sqrt{3}/2)a$ away, but there are six other neighbors a distance a away and, if the atoms were replaced by the largest spheres consistent with the cube size, they would occupy 68% of the volume. At room temperature 14 chemical elements, including lithium, sodium, potassium, rubidium, cesium, tungsten, and iron, are BCC. Many intermetallic compounds, such as CuPd, CuZn (called β brass), AgMg, AlNi, and BeCu, as well as some ionic compounds, including many of the halides of cesium and thallium, crystallize with a cesium chloride (CsCl) structure. This structure may be characterized by a cubic cell with atoms of one type at the corners and an atom of the other type at the cube center. The lattice is simple cubic and the primitive basis contains one atom of each type, separated by half the cube diagonal or $(\sqrt{3}/2)a$. Each atom sits at the center of a cube with eight atoms of the other type at the corners. If the two atoms of the basis were identical this structure would be BCC. Many covalently bonded materials have diamond or zinc blende structures. Both of these have face centered cubic lattices and a primitive basis of two atoms. In the diamond structure the two atoms are of the same type while in the zinc blende structure they are of different types. Otherwise the two structures are the same. The two atoms of the basis are displaced from each other along a line which is parallel to one of the body diagonals of the cubic cell and their separation is one fourth the diagonal length or $(\sqrt{3}/4)a$. Figure Q-7 shows a diagram of the structure. Each atom sits at the center of a regular tetrahedron with four atoms at the vertices. In zinc blende the surrounding atoms are of a different type than the central atom. The structures are loosely packed. A diamond structure composed of spheres which touch along the body diagonal has only 34% of its volume occupied by spheres.
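The packing fractions quoted for the FCC, BCC, and diamond structures follow from the nearest-neighbor distances and the number of atoms in the conventional cube. The short Python sketch below is an illustration added here, not part of the original text. The function name `packing_fraction` is hypothetical, the cube edge is taken as a = 1, and the atom counts per conventional cube (4 for FCC, 2 for BCC, 8 for diamond) are the standard values, which this appendix does not list explicitly.

```python
import math


def packing_fraction(atoms_per_cube, nearest_neighbor_distance):
    """Fraction of a cube of edge a = 1 filled by equal touching spheres.

    Spheres touch along the nearest-neighbor direction, so the sphere
    radius is half the nearest-neighbor distance (in units of a).
    """
    radius = nearest_neighbor_distance / 2
    return atoms_per_cube * (4 / 3) * math.pi * radius ** 3


# (atoms per conventional cube, nearest-neighbor distance in units of a)
structures = {
    "FCC": (4, 1 / math.sqrt(2)),      # half a face diagonal
    "BCC": (2, math.sqrt(3) / 2),      # half a body diagonal
    "diamond": (8, math.sqrt(3) / 4),  # one fourth of a body diagonal
}

if __name__ == "__main__":
    for name, (count, distance) in structures.items():
        print(f"{name}: {packing_fraction(count, distance):.2f}")
```

Rounded to two figures, the results reproduce the 74%, 68%, and 34% quoted in the text.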
The elemental semiconductors silicon and germanium have diamond structures. Each of these atoms has four electrons in its outer shell and can form four covalent bonds with neighboring atoms. The diamond structure results when these bonds are of equal length and are symmetrically arranged. Carbon has a diamond structure only if formed at high temperature and pressure. At room temperature, its stable form is graphite, with a complex hexagonal structure. Many compound semiconductors with equal numbers of two types of atoms crystallize with a zinc blende structure. If one of the atoms has N electrons in its outer shell and the other has $8 - N$, then in the crystal each atom can form covalent bonds with four neighbors of the other type. Some examples are GaAs, ZnSe, SiC, CdS, and ZnS, which is zinc blende itself.

Figure Q-6 A close packed plane in the FCC structure. Only atoms in the plane are shown. Other close packed planes are parallel to the one shown and pass through the other atomic positions. A two-dimensional hexagonal cell is also pictured.

Figure Q-7 (a) Perspective and (b) plan views of the zinc blende structure. Atoms of each type are arranged with a face centered cubic lattice and the two lattices are displaced from each other by one fourth the cube body diagonal. The diamond structure is the same except that all atoms are of the same type. Elevations are in units of the cube edge a.

Many ionic crystals have the structure of sodium chloride. This structure has a face centered cubic lattice with a primitive basis of two atoms, separated by half the cube edge, as shown in Figure Q-8. There are four atoms of each type per cube and each atom has six nearest neighbors, all of the other type. Most of the alkali halides, and most of the sulphides, selenides, and tellurides of the alkaline rare earths have NaCl structures.
So do many nitrides, phosphides, and hydrides. Crystals formed by most of the chemical elements on the right side of the periodic table are less symmetric than the examples given above. For example, gallium and indium are tetragonal, iodine, oxygen, and one form of sulfur are orthorhombic, and arsenic, antimony, bismuth, mercury, and another form of sulfur are trigonal. For many of these the primitive basis is large and the structure is quite complicated. The structure of a crystal is most apparent in the external shape of the sample. Crystals tend to cleave along planes with high densities of atoms and these planes form the outer surfaces. In general the sample does not have the same shape as the unit cell since many of these cleavage planes are not parallel to cell faces. Nevertheless, the angles between sample faces are determined by the crystalline structure and measurement of these angles is often a first step in identifying the structure. Physical properties depend on the crystal structure. The electrical conductivity of a tetragonal or hexagonal crystal, for example, is different for an electric field parallel to the rectangular cell faces than for an electric field parallel to the cell base.

Figure Q-8 The NaCl structure. Atoms of each type are arranged in a face centered cubic lattice and the two lattices are displaced from each other by one half the cube edge.

Most methods for investigating the crystal structure involve the scattering of x rays from crystal samples. Although it is the electrons which scatter x rays, the periodic arrangement of the atoms leads to a formulation of the scattered amplitude in terms of reflections from planes which pass through atomic positions. At each plane the angle of reflection is the same as the angle of incidence and waves reflected by all planes interfere to produce the scattered wave. In general, the scattered wave is diffuse and has a small amplitude. If, however, the angle of incidence for any set of parallel planes satisfies the Bragg relation of (3-3) for n = 1, that is

$$2d \sin\theta = \lambda$$

then waves from all planes in the set add constructively and a large amplitude reflected wave is obtained. Here $\lambda$ is the x-ray wavelength, d is the distance between adjacent planes of the set, and $\theta$ is the angle between the propagation direction of the incident or reflected wave and one of the planes. This is exactly as described in Section 3-1 for electron waves.

Figure Q-9 Some planes in simple cubic lattices. (a) The (100), (010), and (001) planes. (b) A (110) plane. (c) A (111) plane.

A set of parallel crystal planes is identified by means of three integers, called Miller indices and related to the intercepts of the planes on the crystal axes, along the fundamental translation vectors a, b, and c. To find the indices of a plane, its intercept on a is measured in units of a, its intercept on b is measured in units of b, and its intercept on c is measured in units of c. The reciprocals of these numbers are multiplied by a common factor so that the result is three integers with no common integer divisor, except 1. These integers are the Miller indices. They are displayed by placing them in parentheses: (hkl). All planes in the set have the same indices. If an index is negative a bar is placed above its magnitude. If a plane is parallel to a crystal axis its intercept on that axis is taken to be at infinity and the corresponding index is 0. The geometry for cubic crystals is particularly easy to deal with. For these materials (hkl) planes are perpendicular to vectors with components h, k, and l, respectively, along three mutually perpendicular cube edges. Some planes are shown in Figure Q-9. The (100), (010), and (001) planes are perpendicular to a, b, and c respectively. They are parallel to cube faces.
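The recipe for Miller indices (take intercepts in units of a, b, and c, form the reciprocals, and clear fractions) is easy to mechanize. The following Python sketch is an added illustration rather than part of the original appendix. The function name `miller_indices` is hypothetical, and representing an intercept at infinity by `math.inf` is an assumption made for this example. A plane through the origin has a zero intercept and is not handled; in practice a parallel plane of the same set would be used instead.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, inf


def miller_indices(intercepts):
    """Miller indices (h, k, l) from intercepts measured in units of a, b, c.

    Use math.inf for an axis the plane never crosses (index 0).
    Negative results correspond to barred indices.
    """
    # Reciprocals of the intercepts, kept exact with Fraction.
    reciprocals = [
        Fraction(0) if x == inf else 1 / Fraction(x) for x in intercepts
    ]
    # Multiply by the least common multiple of the denominators ...
    common_multiple = reduce(
        lambda p, q: p * q // gcd(p, q),
        (r.denominator for r in reciprocals),
        1,
    )
    integers = [int(r * common_multiple) for r in reciprocals]
    # ... then divide out any common integer divisor.
    divisor = reduce(gcd, (abs(i) for i in integers))
    return tuple(i // divisor for i in integers)


if __name__ == "__main__":
    print(miller_indices((1, inf, inf)))  # a plane parallel to b and c
    print(miller_indices((2, 3, 6)))      # intercepts 2a, 3b, 6c
```

For example, intercepts of 2a, 3b, and 6c have reciprocals 1/2, 1/3, and 1/6, which become the indices (321) after multiplication by 6.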
The $(110)$, $(101)$, $(011)$, $(\bar{1}10)$, $(\bar{1}01)$, and $(0\bar{1}1)$ planes cut through diagonals on opposite cube faces. The $(111)$, $(\bar{1}11)$, $(1\bar{1}1)$, and $(11\bar{1})$ planes are perpendicular to cube body diagonals. For simple cubic lattices, adjacent planes with indices (hkl) are separated by the distance d, whose value is

$$d = \frac{a}{\sqrt{h^2 + k^2 + l^2}}$$

For example, (100) planes are separated by a cube edge a, (110) planes are separated by half a face diagonal or $a/\sqrt{2}$, and (111) planes are separated by one third of a body diagonal or $a/\sqrt{3}$. For face centered and body centered cubic lattices there are planes between these planes and the separation is less. In an x-ray diffraction experiment, Bragg reflection angles are measured for scattering from a large number of differently oriented planes, then the Bragg relation is used to compute interplanar separations. A lattice type is assumed and Miller indices are assigned to the various planes so that ratios of experimentally determined interplanar separations match the values predicted. If a match is obtained, cell dimensions can then be calculated.

Appendix R GAUGE INVARIANCE IN CLASSICAL AND QUANTUM MECHANICAL ELECTROMAGNETISM

The discussion which follows is more quantitative than that in Section 18-6 because it is assumed that the student is familiar with Maxwell's equations in differential form and the vector potential, and has at least heard of Hamilton's equations of mechanics. We shall treat gauge invariance first from a classical standpoint and then add more quantitative material to the discussion in Section 18-6 of gauge invariance in quantum mechanics. In 1868 Maxwell had available to him four equations of electromagnetism which were (in the simplest form, since units will be of no concern here)

$$\nabla \cdot \mathbf{E} = \rho, \qquad \nabla \times \mathbf{E} = -\partial \mathbf{B}/\partial t, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \text{and} \qquad \nabla \times \mathbf{B} = \mathbf{j}$$

where E is the electric field, B the magnetic field, $\rho$ the charge per unit volume, and j the current per unit area.
Maxwell noticed that taking the divergence of the last equation gave

$$\nabla \cdot (\nabla \times \mathbf{B}) = \nabla \cdot \mathbf{j} = 0$$

since the divergence of a curl is zero. This result was in conflict with the continuity equation for electric charge

$$\nabla \cdot \mathbf{j} = -\partial \rho/\partial t$$

if the charge density $\rho$ is not a constant in time, so he modified Ampere's law to be

$$\nabla \times \mathbf{B} = \mathbf{j} + \partial \mathbf{E}/\partial t$$

This insured the local conservation of charge, since the continuity equation says that no net charge can be created or destroyed in an arbitrarily small volume. Global charge conservation does not help here, since creating a charge at point $x_1$ while destroying a similar charge at point $x_2$ will not satisfy the continuity equation if $x_1$ and $x_2$ are not both inside the volume considered. To understand the deeper significance of Maxwell's addition to Ampere's law, it is easier to deal with the vector and scalar potentials A and V instead of the fields, so we use

$$\mathbf{B} = \nabla \times \mathbf{A} \qquad \text{and} \qquad \mathbf{E} = -\nabla V - \partial \mathbf{A}/\partial t$$

The origin of gauge invariance lies in the fact that A and V are not unique for given physical fields E and B. That is to say, gauge transformations on A and V leave E and B unaltered. The associated invariance of the Maxwell equations is called gauge invariance. As an example of a gauge transformation, let $V \to V' = V - \partial \chi/\partial t$, where $\chi$ is arbitrary. To leave E unchanged there must be the simultaneous transformation $\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla \chi$. That is,

$$\mathbf{E} \to -\nabla V + \nabla(\partial \chi/\partial t) - \partial \mathbf{A}/\partial t - \partial(\nabla \chi)/\partial t = \mathbf{E}$$

by changing the order of space and time derivatives. Note that this leaves B unchanged also, since the curl of a gradient is zero, so that

$$\mathbf{B} \to \nabla \times \mathbf{A} + \nabla \times \nabla \chi = \nabla \times \mathbf{A} = \mathbf{B}$$

The important point is that the global symmetry of the electric field (and global charge conservation) has been converted into a local symmetry with local charge conservation because of the addition of a new field, the magnetic field.
In other words, V can now be made different at any point, not just changed everywhere at once, by introducing a compensating change in A. The result is still the symmetry that E and B, the only physical observables, remain unchanged. It is interesting to note that the above process can be turned around. The local invariance requirement forces a relationship between V and A and hence between E and B fields. With the aid of Lorentz invariance, Maxwell's equations can be derived from this local symmetry requirement. This approximates the procedures to be used in obtaining gauge theories: A global symmetry is turned into a local symmetry by the addition of one or more new fields, and from the resulting relations the field equations are obtained. As explained in Section 18-6, the related problem in quantum mechanics is turning a global phase invariance into a local one, and this requires the addition of the electromagnetic field to compensate the local phase change. If Q is the charge of the particle involved, the required local phase transformation, as given in (18-14), is

$$\Psi(x,t) \to \Psi'(x,t) = e^{iQ\chi(x,t)}\,\Psi(x,t)$$

There needs to be simultaneously a correlated change in the electromagnetic quantities, which will be just the previously discussed gauge transformations

$$V \to V' = V - \partial \chi(x,t)/\partial t \qquad \text{and} \qquad \mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla \chi(x,t)$$

Now the Schroedinger equation will be satisfied. However, as discussed in Section 18-6, this is not the free particle Schroedinger equation, but rather one which includes the electromagnetic field. It may be obtained by using the fact that classically the Lorentz force F on a particle of charge Q moving at velocity v, which is

$$\mathbf{F} = Q\mathbf{E} + Q\mathbf{v} \times \mathbf{B}$$

can be obtained from Hamilton's equations of mechanics using the Hamiltonian H of the form

$$H = \frac{1}{2m}(\mathbf{p} - Q\mathbf{A})^2 + QV$$

where p is the particle's momentum. The Hamiltonian is then converted to an operator equation by using the quantum mechanical replacement $\mathbf{p} \to -i\hbar\nabla$, which is a three-dimensional extension of (5-32).
By allowing the operator equation to operate on the wavefunction $\Psi(x,y,z,t)$, we obtain

$$\left[\frac{1}{2m}(-i\hbar\nabla - Q\mathbf{A})^2 + QV\right]\Psi(x,y,z,t) = i\hbar\,\frac{\partial \Psi(x,y,z,t)}{\partial t}$$

This is the desired Schroedinger equation with the full spatial dependence displayed. Comparing this with the free-particle Schroedinger equation, we see that this equation results from substituting

$$\partial/\partial t \to \partial/\partial t + iQV/\hbar \qquad \text{and} \qquad \nabla \to \nabla - iQ\mathbf{A}/\hbar$$

These same substitutions work in the Klein-Gordon equation (Section 17-4) and in the Dirac equation (Section 5-2). Thus this prescription for converting a global symmetry into a local one works relativistically as well.

Appendix S ANSWERS TO SELECTED PROBLEMS

Answers to approximately one half of those problems that are not self-answering, and do not involve graphing.

Chapter 1 (7) 5466 A (4) 7.51 W. (5a) 4.09 x 10 9 kg (5b) 6.5 x 10 -14 2.14 (15c) 1.00 (15a) 2.50 (15b) (10b) 280°K (22a) 1410°C (22b) 1.26 cm (21) 1.8152., 0.6142maX (24) 18,020°K Chapter 2 (2a) 2.0 eV (2b) zero (2c) 2.0 V (2d) 2950 A (2e) 2.0 x 10 14 /cm2 -sec (8a) 3.1 keV (8b) 14.4 keV (10) 3.6 x 10 -17 W (4) 3820 A (20) 300% (12) 1.235 x 10 2Ô Hz, 2.427 x 10 -2 A, 2.731 x 10 -22 kg-m/sec (23) 2.64 x 10 -5 A (26a) 5.725 keV (26b) 0.870 A, 2.170 A (21) 44° (30a) 5.46 x 10 -22 kg-m/sec (29a) 2.022 MeV (29b) 29.7 % (31) c/3 (30b) 2.71 eV, yes Chapter 3 Chapter 4 Chapter 5 Chapter 6 (14a) 1.287 A (4) neutron (6) 1.096 x 10 -6 A (2) 4.34 x 10 -6 eV (18) 37.7 KV A (17) 41.3° (15) 1.596 (14b) 11.6° (27) 1.40 x 104 A3 (28a) 0.987 keV/c, yes (28b) 9.87 MeV/c, no (30) 4.17 x 10 - 8 eV (28c) 9.87 MeV/c, yes (6a) 4.29 x 10 -14 m (6b) 3.72 x 10 -14 m (3) Z 1 /3 RH (10) 4240, 11.4 (13) 7 (7) 1.58 x 10 -14 m (9) 4000 A (18) 1.2 km/sec (14) Fgrav /F coul = 4.4 x 10 -40 , yes (25a) 23.2 eV (19) 13.46 eV, 13.46 eV/c, 921.2 A, 4.30 m/sec (31) 1.50 x 10 6 m/sec (34) 26.7 A (30) 4.90 A (25b) 36.8 eV (38a) 6, 4 (38b) smaller (38c) 2.68 A (35) n = 5 (39a) A A) = 3647n 2/(n 2 - 16), n = 5, 6, 7, . . .
(39b) visible, infrared (39c) 3647 A (39d) 54.4 eV (40) 2.38 A (7a) 0.1955 (7b) 0.3333 (5) 0.84 (4) (C/mit 2) 1 "2 (11) zero, 7.067 x 10 -2 a2 (9b) 2n 2 h2 /ma2 = 4E0 (29a) 0.4 A (26) smaller (25) E, will increase (33a) c i ci E 1 + c 2c2E 2 (12) zero, (h/a)2 (8a) 0.62 (8b) 1.07 x 10 -56 (8c) 2.1 x 10 -6 (l0a) 4.32 MeV (9b) proton: 3.07 x 10 -5 , deuteron: 2.51 x 10 -7 (10b) 2 x 10 -3 VO (10c) 0.0073 (15a) [1 + (sin 2 k2 a)/4x(x - 1 )] -1 , x = E/VO (15b) n2n 2h 2 /2ma2 (25a) zero (25b) zero (21a) 2.05 MeV (20a) 9 eV (20b) 1 eV (32a) 0.5 Hz (29b) 1 0 36 (25c) 0.0777a 2 (25d) 88.826(h/a) 2 (32b) 0.049 joule (32c) 1.5 x 10 32 (32d) 3.3 x 10 - 34 j oule 1.3 x 10 -33 m (32e) S- 1 ANSWERS TOSELECTED PROBLEMS Chapter 7 (11a) 4.147% (lib) 11.44% (9a) 2E 2 (9b) 2E 2 (7a) 4a0 (7b) 5a0 (13a) -0.85 eV (13b) 9.52 A (12b) 54.7°, 125.3° (12c) 35.3°, 144.7° (16a) hcot Be`'V21-1 (13c) 3.46h (13d) 2h (13e) zero (13f) zero (26a) mh (26b) m2h2, rn2h2, mh Chapter 8 (3a) 6.51 x 10 -24 nt -m (3b) 1.89 x 10 -22 nt (3c) 1.48 x 10 -5 eV (5) 29 tesla/m (7) 0.019 eV (10a) 74.5° (10b) 74.5° (10c) 25.2° (19) An = ±1, +3, +5,... 
(21) 27 Chapter 9 (25) 870 V (15a) 0.48 A (15b) 1.6 A (14a) 2.4 (27a) Co: 8.50 keV, Fe: 7.83 keV (26a) 8.65 x 106 m -1 , 1.7 (27b) 8.50 keV (28) 2.44 x 10 -16 sec Chapter 10 (8) 10.0° (17a) 12 (20c) 1.8 x 10 -3 A (la) 6700 A (lb) 0.152 A (22a) 1.4 eV (22b) 104 tesla (22c) no (20d) 2000 A Chapter 11 (10b) vm = v\3N °/2ta, B = (hv/k)J3N °/Ica (5a) 0.418 eV (5b) 4410°K (21) 1.28 x 10 16 sec -1 (27) 3.1 eV (20a) none (20b) 51.4 joule (29a) .itr2 h2 /32m1 2 (29b) 4 /3 (28) 10.3 eV Chapter 12 (1) 4.64 eV (2) 18 A (5b) r = 4 (6) 120°K (l0a) 1 (10b) 1 (lla) 1/72 (llb) 210/1 (10c) 2 (lOd) 2 (10e) 2 (10f) 2 (17) 2900 cm -1 , 40 cm -1 (20) D2: 0.375 eV, HD: 0.460 eV (15) 0.190 A (22a) 2.49 x 10 14 Hz (22b) 3650 nt/m (25a) 2.91 (25b) 2.88 (26a) 8.7 x 10 -47 kg-m 2, 6.9 x 10 -47 kg-m 2 (26b) 0.1 eV (29) 3/2 Chapter 13 (4a) metallic (4b) covalent (semiconductor) (4c) ionic (4d) covalent (insulator) (4e) molecular (6) 10 1° V/m (9a) 0.47 mm/sec (9b) 1.2 x 10 5 m/sec (9c) 1.6 x 10 6 m/sec (11a) 65.4 m (lib) 4.4 x 104 A (13b) ,N5 (Vi) + /) (15a) 6.95 eV (19) 0.756 eV -1 (20) 5.5 x 10 -3 (24) 377°K (33) 1.834 x 10 -5 amp Chapter 14 (1) 1.3 x 10 4 A (11a) 8.4 x 10 -5 amp/m (11b) 700 amp/m (12a) 0.549 (12b) 1.43 x 10 -23 joule/tesla (17a) 5.4 x 10 8 amp/m (17b) 1.73 x 10 6 amp/m (17c) 1200 joule (18b) 310 Chapter 15 (1) 3/2 (3a) 5.8 x 10 -37 MeV (3b) 0.72 MeV (5) 2.4 F (7) 3.02 cm (10a) 5.95 MeV (11) 23.0 MeV (14a) 23.8 MeV (14b) 0.48 MeV (16a) 2.764 MeV (16b) 3.44 F (18a) 7.275 MeV (18b) 14.44 MeV (23a) 5/2 (23b) even (23c) negative (23d) zero (25a) 1.09 (25b) 6.0526 F (25e) 6.31 F, 5.79 F Chapter 16 (4) 1(1 - e -R")/R (7a) 4 x 10 9 yr (7b) 23 g (7c) 4 x 10 -7 g (9a) 13.1 g (9b) 3.61 g (lla) 4.0 x 104 m/sec, (ila) allowed, Gamow -Teller (11b) forbidden, 10 -6 supression (11c) allowed, Fermi or Gamow-Teller (lid) forbidden, 10 -3 supression (15a) 2.9 x . 
10 -62 joule-m 3 (22b) a= -3.1 x 10 -35 m4/sec, b = 2.5 x 103 mm/sec, p3 = 8.0267 x 10 34 m -3 , P4 = g.0248 x 10 34 m 3 (24) 78° (26c) 1/2k1 (28) 0.67 bn/sr (29) 0.074 rad (32a) 0.154 MeV (32b) 154 eV (32c) 0.065 eV (32d) 99 (34a) 3.27 MeV (34b) 2.53 x 10 3 kg Chapter 17 (5a) 0.16 sin (0.90r), r < 2; 0.24e -°.23. , r > 2, r in F (8a) 10 (8b) 33° (12a) 5 x 10 -24 sec (12b) 1 (12c) 3 (13) 2.2 x 10 -8 m (15) 6m0 c2 = 5360 MeV (16a) -10 -43 cm2 (16b) 10 18 cm (5a) arid, uus, ûd (5b) =1 So , p = 3S 1 (2a) 1.7 x 10 -6 (2b) —10 5 (14) +2x 2/3r (6) 6 x 10 34 m -2-sec -1 (12) 4 Appendix A (5a) 3.965 x 107 m/sec (Sb) 2.522 x 10 - 6 sec (7) 0.946c (12) 2.991 x 108 m/sec, 0.9975c (8) (c2/v)(1 — ^/1 — v 2/c 2) (16a) 2.696 x 10 14 joules (16b) 1.783 x 10 7 kg (16c) 5.94 x 10 6 ANSWERS TO SELECTED PROBLE MS Chapter 18 INDEX A and B coefficients, 394, 395 Abelian transformation, 690 Absorption, stimulated, 393 Absorption edge, 342 Absorption spectra, 98 and emission spec tra, 104 Absorptivity, 6 Acceptor impurity, 469 Acoustic radiation, 399 Actinide, 334 Action, 111 Adiabatic demagnetization, 506 Age: of earth, 561 of universe, 608 Alkali, 336 spectra of, 349 Allowed band, 447 Allowed beta decay, 572 Alpha decay, 206, 555 energy of, 556 Alpha particle model, 552 Alpha particle scattering, 88 Alternation of intensities, 436 Angstrom unit, 5 Angular correlation experiment, 465 Angular frequency, 129 Angular momentum, see specific types Angular momentum operator, 255, M-1 Annihilation, 44, 464 Anomalous Zeeman effect, 364 Antiferromagnetism, 503 Antineutrino, 566 detection of, 575 Antiscreening, 699 Antisymmetric eigenfunction, 305 Associated Laguerre polynomial, N-5 Associated Legendre function, N-1 Asymmetry term, 527 Asymptotic freedom, 685, 698 Atomic eigenfunction, 323 Atomic mass unit, 520 Atomic number, 94, 342, 511 Atomic radius, 86, 327 Atomic spectra, 96 Atomic stability, 95 Attenuation coefficient, 50 Attenuation length, 50 Azimuthal quantum number, 115, 240 Balmer 
formula, 97
Balmer series, 97, 98
Band, conduction, 450; valence, 450
Band spectra, 430
Band theory, 445
Band width, 454
Barn unit, 517, 597
Barrier penetration, 201, 206, 558
Barrier potential, 199
Baryon, 640, 649
Baryon number, 640, 649, 651
BCS theory, 487
Beta decay, 562; coupling constant, 569, 574; energy, 564; interaction, 569, 572 (see also Weak interaction); matrix element, 568; rate, 570; spectrum, 567
Big bang theory, 20, 608, 710
Binding energy, 102, 524; per nucleon, 524, 530
Blackbody, 3
Blackbody radiation: and Big bang theory, 20; and cavity radiation, 5; energy density of, 5; and photon gas, 34, 399; Planck spectral formula for, 17; Planck theory of, 13, 398; Rayleigh-Jeans theory of, 7; spectral measurements, 3; and thermometry, 19
Bloch eigenfunctions, 457
B meson, 682
Bohr magneton, 269
Bohr microscope, 67
Bohr model, 100; and hydrogen energy levels, 286
Bohr quantization postulate, 98; and de Broglie postulate, 112; and Wilson-Sommerfeld rules, 114
Bohr radius, 100, 246
Boltzmann constant, 12, 740
Boltzmann distribution, 13, 104, 377, 384, C-1; and quantum systems, 391
Boltzmann factor, 391, 392
Bombarding particle, 521
Bond: covalent, 418; ionic, 416; metallic, 444; molecular, 444
Born approximation, L-1
Born postulate, 64, 135
Bose condensation, 399, 402
Bose distribution, 382, 384; for photons, 398
Boson, 310, 378
Box normalization, 182
Brackett series, 98
Bragg scattering condition, 58, 459
Bravais lattice, Q-2
Breeder reactor, 606
Breit-Wigner formula, 596
Bremsstrahlung, 42
Brillouin zone, 460
Broken symmetry, 674
Brueckner theory, 529

Cabibbo angle, 703
Carbon atom, energy levels of, 361
Carbon cycle, 610
Cascade hyperon, 649
Causality and quantum theory, 79, 139
Cavity radiation, see Blackbody radiation
Centrifugal potential, 345, 536
Chain reaction, 602
Charge conjugation, 655
Charge density: atomic, 323; nuclear, 516
Charge independence, 618, 621
Charm, 678; quantum number, 678
Charmonium, 680
Classical limit: for orbital angular momentum, 259; of quantum theory, 117, 184; for simple harmonic oscillator, 21, 136, 165; for step potential, 198
Classically excluded region, 213
Collective model, 545, 549
Color, 683
Color charge, 684, 699
Color force field, 686
Comparative lifetime, 571
Complementarity principle, 63
Complex conjugate, 135, F-1
Complex exponential, F-2
Complex number, F-1
Compound nucleus, 591, 595
Compound nucleus resonance, 595
Compton effect, 34; theory of, 36; and uncertainty principle, 68
Compton scattering cross section, 49
Compton shift, 35, 37
Compton wavelength, 37
Conduction band, 450
Conduction electron, 32, 191, 215, 405
Conductivity, 450, 463
Conductors, 449
Configuration, 332
Conservation laws: for nuclear reactions, 588; for observed interactions, 654
Contact potential, 27, 407
Continuity of eigenfunction and derivative, 155, 214
Continuum energy states, 110; and Schroedinger theory, 163
Contraction, Lorentz, A-8
Control rod, 606
Cooper pair, 487, 546
Copenhagen interpretation, 79
Correlation angle, 465
Correspondence principle, 117
Cosmic rays, 42, 44
Coulomb potential, 234; screened, L-7
Coulomb scattering, 90, 591, E-1; cross section for, 95
Coulomb term, 527
Coupling constant, 682; beta, 569, 573; electromagnetic, 639; nuclear, 638
Covalent bond, 418
Covalent solid, 444
CP operation, 657
CPT theorem, 658
Critical field, 485
Critical temperature, 484
Cross section, 48; Compton scattering, 49; Coulomb scattering, 95; pair production, 49; photoelectric, 49; total photon, 49
Crystal lattice, 443
Crystallography, 448, Q-1
Curie law, 494
Curie temperature, 497; ferromagnetic, 497
Curve of stability, 563

Daughter nucleus, 556
Davisson-Germer experiment, 57
De Broglie postulate, 56; and Bohr quantization postulate, 112; and infinite square well, 218; and Schroedinger equation, 129; and uncertainty principle, 72
De Broglie wave, 56, 69; and Schroedinger equation, 134
De Broglie wavelength, 56
Debye specific heat theory, 389
Debye temperature, 390
Decay energy: alpha, 556; beta, 564
Decay law, 558
Decay rate, 558; alpha, 207; beta, 570; gamma, 579
Deep-inelastic scattering, 669
Degeneracy, 115, 239, 240, 327; of atomic eigenfunctions in applied field, 252; for Coulomb potential, 536; exchange, 305; perturbation theory of, J-8
Degeneracy effect for gases, 401
Delayed neutron emission, 606
Delta particle, 651
Density of states, in band, 455; and effective mass, 463; for free particle, 453; for photons, 398
Detailed balancing, 381, 639
Deuterium, 107
Deuteron, 619
Diamagnetism, 493
Differential cross section, 94, L-4
Differential equation, 127
Differential operator, 144
Diffraction: general formula for, 57; of particles, 58, 76; and uncertainty principle, 67, 77
Dilation, time, A-8
Dirac theory: and beta decay, 566; and hydrogen energy levels, 286; and pair production, 47; and Schroedinger theory, 132
Direct interaction, 591, 593
Directional bond, 422
Distance of closest approach, 91
Distribution function, 3. See also specific types
D meson, 679
Domains, 500
Donor impurities, 468
Doping, 467
Doppler shift: and Mössbauer effect, 586; relativistic, 46
Drift speed, 450
Dual nature of radiation, see Wave-particle duality
Dulong-Petit law, 388
Dynamical quantity, 143

Effective mass, in crystal lattice, 461; in nuclei, 533
Effective Z, 325
Eigenfunction, 154, 166, 242, 262; degenerate, J-8; required properties of, 155
Eigenvalue, 165, 239, 262
Eigenvalue equation, 259, 262
Einstein A and B coefficients, 394, 395
Einstein photon hypothesis, 30, 63
Einstein relativity postulate, A-5
Einstein specific heat theory, 388
Elastic scattering, 593, 668
Electric dipole radiation, B-3
Electric dipole transition, 289, 580
Electric quadrupole moment, 514, 546, 600
Electromagnetic interaction, 574, 653, 655
Electromagnetic spectrum, 33
Electron, 59
Electron affinity, 336
Electron capture, 564
Electron emission, 564
Electron gas, 404, 406
Electron molecular spectra, 429
Electronic neutrino, 642
Electronic specific heat, 406
Electron-positron annihilation, 464
Electron-positron pair, 43
Electron radius, 277
Electron spin resonance, 369
Electron volt unit, 29
Electroweak gauge theory, 699, 701
Elements: abundances of, 510; origin of, 607; periodic table of, 330
Emission: spontaneous, 291, 393; stimulated, 291, 393
Emission spectrum, 98
Emissivity, 6
End point, 565
Energy band, 446
Energy gap, 489
Energy level diagram, 20; x-ray, 339
Energy quantization: of one-electron atom, 101; Planck postulate of, 14; of radiation, 30; in Schroedinger theory, 157; and uncertainty principle, 68; by Wilson-Sommerfeld rules, 110
Enhancement factor, 380
Entropy, 410
Equilibrium decay, 559
Equipartition of energy, 12
Eta meson, 651
Ether frame, A-3
Even function, 140
Exchange: of particle labels, 306; of phonons, 487; of pions, 634
Exchange degeneracy, 305
Exchange force, 316
Exchange interaction, 498
Exchange operator, 624
Excited state, 102
Exclusion principle, 308, 319; and atomic structure, 337; in LS coupling, 363, P-1; and nuclear structure, 531
Exhaustion region, 481
Expectation value, 141; general prescription for, 146, 171
Exponential attenuation, 50
Exponential decay law, 558
Extrinsic conductivity, 467
Extrinsic region, 481

Fermi distribution, 383, 384
Fermi energy, 385; for metals, 406; for nucleus, 531; in semiconductors, 471
Fermi gas, 405
Fermi gas model, 531, 549
Fermi momentum, 465, 480, 671
Fermion, 310, 378, 382
Fermi selection rules, 571
Fermi temperature, 480
Fermi unit, 94, 511
Fermi velocity, 479
Fermi-Yang model, 673
Ferrimagnetism, 503
Ferromagnetism, 493, 497
Feynman diagram, 669
Filled subshell, 252, 363
Fine structure, 114, 276; in hydrogen atom, 287; Landé interval rule for, 359
Fine structure constant, 116, 286, 639, 682
Finiteness of eigenfunction and derivative, 155
Fission, 525, 602
Fission fragment, 602
Flavors, 678
Flux, probability, 196
Flux quantization, 491
Fock calculation, 322
Forbidden band, 447
Forbidden beta decay, 572
Forward bias, 473
Fourier integral, D-1
Franck-Condon principle, 432
Franck-Hertz experiment, 107
Free electron gas, 404
Free electron model, 452
Free particle: density of states for, 453; quantum mechanical behavior of, 178
Frustrated total internal reflection, 205
FT value, 571
Fundamental translation vectors, Q-1
Fusion, 525
Fusion reactor, 607

Galilean transformation, A-1
Gamma decay, 578; selection rules for, 580; transition rate, 579
Gamma ray, 32, 578
Gamow-Teller selection rules, 572
Gas degeneration, 401
Gauge fields, 691
Gauge invariance, 655, R-1
Gauge invariant, 689
Gauge theories, 688
Gauge transformation, R-1
Gaussian distribution, D-3
Gaussian potential, L-7
Geiger-Marsden experiment, 89
Gell-Mann-Nishijima relation, 646, 681
Generation, quark-lepton, 705
g factor, Landé, 368; orbital, 269; spin, 274
GIM mechanism, 704
Global gauge symmetry, 688
Glueballs, 692
Gluons, 684, 692; mass of, 697
Golden Rule No. 2, K-5
Goldstone boson, 701
Goudsmit-Uhlenbeck postulate, 276
Grand unification theories, 706
Gravitational interaction, 574, 654
Gravitational red shift, 588
Graviton, 654
Ground state, 102
Group velocity, 72
Group wave function, 182, 192
Group of waves, 70

Hadron, 649
Half-life, 559
Hall coefficient, 451, 479
Hall effect, 451
Halogen, 336
Hamiltonian, 262
Handedness, see Helicity
Harmonic oscillator, see Simple harmonic oscillator
Hartree theory, 319
Heat capacity, 388
Heisenberg matrix mechanics, 261
Heisenberg principle, see Uncertainty principle
Helicity, 577, 642, 657
Helium energy levels, 317
Hermite polynomials, I-5
Heteropolar bond, 418
Heusler alloy, 499
Hidden variables, 79
Hierarchy problem, 708
Higgs particles, 702
Hole, 451; in filled band, 464; and positron, 47; and x-ray spectra, 338
Homopolar bond, 422
Hydrogen energy levels, 101, 286
Hydrogen molecular ion, 418
Hypercharge, 674
Hyperfine splitting, 288, 363, 512
Hyperon, 648
Hysteresis, 501

Identical particles, 302
Imaginary number, 131, F-1
Imaginary part, F-1
Impact parameter, 90
Independent particle motion: in atoms, 320; in nuclei, 531
Indeterminacy principle, see Uncertainty
principle
Indicial equation, N-4
Indistinguishability, 303; and quantum statistics, 377
Induced fission, 603
Inelastic scattering, 593
Inertial frame, A-2
Infinite square well potential, 214; ground state of, 147
Inhibition factor, 378
Insulator, 448
Interactions, comparison of properties, 574, 653
Interatomic force, 416
Intermediate boson, 643, 653
Internal conversion, 581; coefficient of, 582
Intensity, of radiation, 63
Interval rule, 359; in hyperfine splitting, 514
Intrinsic conductivity, 467
Intrinsic parity, 639
Inversion of NH$_3$, 209
Ionic bond, 416
Ionization energy, 110, 335, 336
Irreducible, 674
Isobar, 601, 632
Isobaric analogue levels, 633
Isolated band, 448, 449
Isomer shift, 587
Isospin, 631
Isotope, 521
Isotope effect, 486
Isotopic abundance, 428, 437
Isotopic spin, see Isospin

Jastrow potential, 627
Jet, 693
JJ coupling: atomic, 356; nuclear, 540
J meson, 679
Josephson effect, 491

Kirchhoff law, 6
Klein-Gordon equation, 639
K meson, 644, 649; decay of, 658
Kronig-Penney model, 457
Kurie plot, 569

Laguerre polynomials, associated, N-5
Lamb shift, 288
Lambda particle, 644
Lambda point, 402
Landé g-factor, 368
Landé interval rule, 359, 514
Lanthanide, 334
Laplacian operator, 235, 236, M-1
Larmor frequency, 270
Larmor precession, 270
Laser, 291, 392
Lattice translation vector, Q-1
Laue diffraction pattern, 61
Legendre functions, associated, N-1
Legendre polynomials, N-1
Lenz law, 493
Lepton, 641
Lepton number conservation, 642
Leptoquark, 707
Level densities, of band, 463
Lifetime, 292, 558
Linearity of Schroedinger equation, 132, 166
Line spectrum, 97; formation of, 102, 348
Line width, 76
Liquid drop model, 526, 549
Liquid helium, 402
Local gauge symmetry, 688
Lorentz contraction, A-8
Lorentz transformation, A-11
LS coupling, 356; exclusion principle in, P-1; selection rules for, 364
Lyman series, 98

Magic numbers, 530, 561
Magnetic dipole moment: atomic, 365; nuclear, 512, 543; orbital, 267, 268; spin, 274
Magnetic field strength, 492
Magnetic
induction, 492
Magnetic quantum number, 240
Magnetic resonance, nuclear, 392
Magnetic susceptibility, 493
Magnetization, 492
Majorana neutrino, 709
Many body effects: in nuclei, 545; in solids, 484
Many particle states, 595
Maser, 393
Mass deficiency, 523
Mass formula, 528
Mass number, 511
Mass spectrometry, 519
Mass unit, 520
Mass width, 652
Matrix element: beta decay, 568; electric dipole, 290; electric quadrupole, 581; magnetic dipole, 581; nuclear, 569; perturbation, 771; and selection rules, 292
Matrix mechanics, 261
Matter waves, 56, 69
Maxwell distribution, 3, 14, 377
Mean free path, 450
Meissner effect, 484
Meson, 650. See also specific types
Meson theory, 634
Metallic bond, 445
Metallic solid, 445
Metastable state, 295, 393
Michelson-Morley experiment, A-4
Miller indices, Q-7
Mirror nuclei, 552, 601
Mobility, 451
Models and theories, 509, 545
Moderator, 606
Molecular bond, 444
Molecular solid, 444
Momentum spectrum, 567
Moseley formula, 341
Mössbauer effect, 584
Multiple scattering, 89
Multiplet, 359
Multipolarity, 579
Muon, 641
Muonic atom, 106
Muonic neutrino, 641

Natural line width, 76
Negative resistance, 477
Net potential: atomic, 320; nuclear, 531, 541
Neutral current process, 703
Neutrino, 566; electronic, 642; muonic, 641; production of, 667; tauonic, 642
Neutrino oscillations, 709
Neutron, 512
Neutron number, 526
Neutron-proton scattering, 622
Noble gas, 335
Normal Zeeman effect, 364
Normalization, 138, 149; in box, 182
n-type semiconductor, 468
Nuclear abundance, 526
Nuclear binding energy, 524
Nuclear charge density, 517
Nuclear electric quadrupole moment, 514, 546
Nuclear force, 511; coupling constant, 638; see also Nucleon force
Nuclear interaction, 574; parity conservation in, 595; see also Strong interaction
Nuclear magnetic dipole moment, 512, 543
Nuclear magnetic resonance, 392
Nuclear magneton, 512
Nuclear mass, 519
Nuclear mass density, 518
Nuclear mass formula, 528
Nuclear matrix element, 569
Nuclear pairing interaction, 541
Nuclear
parity, 542
Nuclear potential scattering, 591
Nuclear radius, 518
Nuclear reaction, 588; energy balance in, 521
Nuclear reactor, 602
Nuclear spin, 434, 512, 542
Nuclear spin-orbit interaction, 537
Nuclear spin quantum number, 435
Nuclear symmetry character, 434, 512
Nucleon, 512
Nucleon force, 618. See also Nuclear force
Nucleon potential, 619
Nucleon resonances, 651
Nucleus, discovery of, 90
Numerical integration, G-7
Numerical solution of Schroedinger equation, G-1

Observed interactions, 653
Odd function, 142
Old quantum theory, 2; critique of, 118, 295
Omega meson, 652
Omega particle, 648
One-electron atom: eigenfunctions, 243; eigenvalues, 239; Schroedinger equation, 235
Operator: angular momentum, 255, M-2; Laplacian, 235, M-1; linear momentum, 145
Operator equation, 145
Optical excitation, 348
Optically active electron, 349
Optical model, 592
Optical pumping, 396
Optical pyrometer, 3, 19
Optical spectra, 348
Orbital angular momentum, 254; and parity, 294; quantization of, 99; quantum mechanical conservation law for, 259; quantum numbers, 253; total, 355
Orbital g-factor, 269
Orbital magnetic dipole moment, 268
Orthogonality, 230, 307, 344, J-2
Ortho-molecule, 435

Pair annihilation, 43, 45
Pairing: in covalent bonds, 421; in nuclei, 541; in superconductivity, 487
Pairing energy, 542
Pairing term, 527
Pair production, 43; cross section for, 49; Dirac theory of, 47
Paramagnetism, 493
Para-molecule, 435
Parent nucleus, 556
Parity, 220, 294, 576; conservation in electromagnetic interaction, 576; conservation in nuclear interaction, 595; intrinsic, 639; nonconservation in beta decay, 576; nuclear, 542; operation, 294; and orbital angular momentum, 294; and selection rules, 295, 572, 580
Partial band, 499
Partial derivative, 127
Particle in a box, 215
Particle-wave duality, see Wave-particle duality
Parton, 667
Paschen-Back effect, 370
Paschen series, 98
Pauli principle, see Exclusion principle
Penetration of classically excluded region, 189
Penetration distance, 190
Periodic table, 330, 331
Permanent magnetism, 501
Perturbation theory: time dependent, K-1; time independent, J-1
Pfund series, 98
Phase integral, 111
Phase space, 111, 409
Phi meson, 652
Phipps-Taylor experiment, 273
Phonon, 399, 484; and superconductivity, 487
Phonon wing, 585
Phosphorescence, 295
Photoconductivity, 467
Photoelectric effect, 27; cross section for, 49; Einstein theory of, 29
Photoelectron, 28
Photon, 40, 650, 653; momentum of, 35; rest mass of, 35
Photon gas, 34, 398
Pi meson, see Pion
Pickering series, 123
Pion, 634, 653
Pion field, 634
Pion resonances, 651
Planck blackbody spectrum, 17; theory of, 13, 398
Planck constant, 16, 31
Planck energy quantization, 20, 410; and Schroedinger theory, 222; and Wilson-Sommerfeld rules, 111
Planck postulate, 20
Plasma, 609
p-n junction, 472
Polar molecule, 418
Population inversion, 396
Positron, 43, 464
Positron emission, 564
Positronium, 45, 106, 466
Pound-Rebka experiment, 588
Power series technique, I-3
Poynting vector, 63, B-2
Preons, 710
Primitive unit cell, Q-2
Principal quantum number, 115, 240, 535
Probability density, 135, 244; average, 252; directional, 249; radial, 244
Probability flux, 196
Product particle, 521
Prompt fission neutron, 605
Proper length, A-8
Proper time, A-8
Proton, 51
Proton-proton cycle, 609
Psi meson, 679
p-type semiconductor, 469

Quantization: of action, 111; of energy, see Energy quantization; of magnetic flux, 491; of orbital angular momentum, 99, 254; space, 273; of spin angular momentum, 274
Quantum chromodynamics, 691
Quantum electrodynamics, 288, 291, 295, 635, 639, 685, 690
Quantum number, 20, 100, 238. See also specific types
Quantum state, 20, 166
Quantum statistics, 377
Quark, 673, 676, 678; mass of, 682
Quark quantum number, 682
Q-value, 522, 589

Rad, unit, 616
Radial node quantum number, 534
Radial probability density, 244
Radiancy, 4
Radiation: by accelerated charge, B-1; by atoms and Bohr model, 99; by atoms and Schroedinger theory, 167; intensity, 63
Radioactive series, 560
Radioactivity,
555
Radius: atomic, 86, 327; Bohr, 100, 246; nuclear, 518
Raman effect, 432
Ramsauer effect, 202, 229, 592
Range of interaction: beta, 574, 653; electromagnetic, 636, 653; gravitational, 574, 653; nuclear, 635, 653
Rare earth, 334
Rayleigh-Jeans blackbody theory, 6
Rayleigh-Jeans spectrum, 12
Rayleigh scattering, 38, 49, 55, 432
Reaction, nuclear, 588
Reactor: fusion, 607; nuclear, 602
Real part, F-1
Reciprocal wavelength, 70
Reciprocity property, 197
Recombination current, 473
Rectifiers, 472
Recursion relation, I-4
Reduced mass, 105, 233
Reflection coefficient, 188, 196
Regeneration, 660
Reines-Cowan experiment, 575
Relativistic energy, A-15
Relativistic mass, 523, A-14
Relativity theory, A-1; and electron spin, 277; and hydrogen atom, 116, 286
Renormalization, 700
Repulsive core, 627, 629
Residual Coulomb interaction, 353
Residual nucleus, 521
Resistance, 450, 464; negative, 477
Resistivity, 450
Resonances, pion-nucleon, 651
Resonant absorption, 584
Rest mass, A-14
Rest mass energy, A-16
Rho meson, 652
Rigid rotator, 264, 299, 423, 599
Rotational quantum number, 424
Rotational spectra: molecular, 423; nuclear, 599; selection rules, 424
Ruby laser, 396
Russell-Saunders coupling, 356
Rutherford model, 90
Rutherford scattering, 90, E-1; cross-section for, 95, 591
Rydberg constant: for finite nuclear mass, 105; for hydrogen, 97; for infinite nuclear mass, 102

Saturation: in molecular binding, 422; of nuclear forces, 524, 618, 629
Scattering, nuclear, 88, 593
Scattering probability flux, L-4
Schmidt line, 543
Schottky specific heat, 413, 506
Schroedinger equation, 132; and de Broglie postulate, 129; and differential operators, 145; and Dirac theory, 132; and Newton law, 184; plausibility argument for, 128
Screened Coulomb potential, L-7
Selection rules: for alkali atoms, 351; for beta decay, 572; and correspondence principle, 117; for gamma decay, 580; for LS coupling, 364; for matrix elements, 292; for one-electron atoms, 288; x-ray, 340
Self-conjugate, 641
Self-consistency, 320
Semiconductor, 450, 467
Semiempirical mass formula, 528
Separation constant, 152
Separation of variables, 151; in one-electron atom Schroedinger equation, 235
Serber potential, 624
Series limit, 97
Series solution of Schroedinger equation, I-1
Shell, 246, 325
Shell model, 534, 549; excited states of, 599; predictions of, 540
Sigma particle, 648
Simple harmonic oscillator: classical limit of, 117, 136, 165; eigenfunctions of, 223; eigenvalues of, 222; energy levels in old quantum theory, 20; ground state probability density, 136; ground state wave function, 133; phase diagram, 111; potential for, 221; series solution of, I-1
Simultaneity, A-5
Single particle state, 592
Singlet state, 312
Single-valuedness: of eigenfunction and derivative, 155; of one-electron atom eigenfunction, 237
Size resonance, 202, 592
Slater determinant, 309
Solar cell, 27
Solar constant, 23
Solid angle, 95
Sommerfeld model, 114; and hydrogen energy levels, 286
Space quantization, 273
Specific heat, 388; Debye, 390; Einstein, 388; electronic, 406; Schottky, 413
Spectral line, 97, 102
Spectral radiancy, 3
Spectroscopic notation, 331, 339, 358
Spectroscopy, 97
Spherical polar coordinates, 235, M-1
Spin: electron, 272, 274; nuclear, 434, 512, 542; total, 355
Spin dependence of nucleon potential, 621
Spin eigenfunction, 311
Spin g-factor, 274
Spin magnetic dipole moment, 274
Spin-orbit interaction, 278; in alkali atoms, 350, 372; general formula for, 285; in multielectron atoms, 353; in nuclear potential, 537; in nucleon potential, 629; and Thomas precession, O-1
Spin quantum number: electron, 274; nuclear, 435, 512, 542; total, 358
Spin resonance, electron, 369
Spontaneous emission, 291, 393
Spontaneous fission, 560, 603
Spontaneous symmetry breaking, 700
Square well potential, 209; analytical solution of, H-1; numerical solution of, G-1
Standing waves, 8, 113
Stefan-Boltzmann constant, 4
Stefan law, 4; and Planck spectrum, 19
Stellar formation, 609
Step potential: ($E < V_0$), 184; ($E > V_0$), 193
Steradian, 597
Stern-Gerlach experiment, 272
Stimulated absorption, 393
Stimulated emission, 291, 393
Stopping potential, 28
Strangeness, 643, 644
Strange particles, 643
Strong coupling constant, 699
Strong interaction, 641, 653, 655. See also Nuclear interaction
Subshell, 252, 329; properties when filled, 252, 363
Superconducting state, 484
Superconductor, 484; type II, 491
Superfluid, 402
Supergravity, 710
Superheavy elements, 561
Supernova, 611
Superposition principle, 64
Supersymmetry theory, 710
Surface term, 527
Susceptibility, 493; paramagnetic, 495
SU(2) theory, 673
SU(3) theory, 674, 678
Symmetric eigenfunction, 305
Symmetry character, 310; nuclear, 435, 512

Target nucleus, 521
Tau particle, 647
Tauonic neutrino, 642
Tauons, 642
Taylor experiment, 77
Thermal current, 473
Thermal equilibrium, 381, C-1
Thermal radiation, 2. See also Blackbody radiation
Thermionic emission, 407
Theta particle, 647
Thomas frequency, O-3
Thomas precession, O-1
Thomson experiment, 58
Thomson model, 86
Time, flow of, 660
Time dilation, A-8
Time-independent Schroedinger equation, 150; and classical wave equation, 203; and energy quantization, 156; plausibility argument for, 154
Time reversal, 657
Total angular momentum, 281, 355
Total internal reflection, 203
Total magnetic dipole moment, 365
Total orbital angular momentum, 355
Total radial probability density, 323
Total relativistic energy, A-16
Total spin angular momentum, 312, 355
Transistor, 474
Transition group, 336
Transition probability, K-4
Transition rates: for alpha decay, 207; for beta decay, 570; for electric dipole radiation, 290; for gamma decay, 579; and selection rules, 288, 289
Transmission coefficient, 196
Triangle anomaly, 705
Triplet state, 312
Tritium, 571
Tunnel diode, 209, 475
Tunneling, 199, 201, 558, 603
Type II superconductor, 491

Ultraviolet catastrophe, 13
Uncertainties, 150
Uncertainty principle, 65; consequences of, 77; and de Broglie postulate, 72; and infinite square well, 150;
interpretation of, 66; and stability of atom, 248; and statistical nature of quantum theory, 139; verification of, 586; and wave-particle duality, 191; and zero-point energy, 217
Unitary group, 701
Unitary symmetry, 673
Unit cell, 448, Q-1; primitive, Q-2
Universal 3 °K blackbody radiation, 20, 609

Vacuum polarization, 699
Valence, 336
Valence band, 450
Van Allen belts, 42
Van der Waals attraction, 444
Vector meson, 652
Vector model, 258, 283
Vector potential, 689
Vibrational quantum number, 426
Vibrational spectra, 427; molecular, 426; nuclear, 600
Vibration-rotation spectra, 426
Virial theorem, 263
Virtual particle, 634
Volume term, 527

W± particles, 702
Wave function, 64, 134, 166; interpretation of, 64, 134; and probability density, 135
Wave group, 70
Wave number, 129
Wave velocity, 72
Wave-particle duality, 62; and matter, 56; and radiation, 40
Weak interaction, 641, 647, 653. See also Beta decay
Weak isospin, 702
Weak mixing angle, 702
Width of energy levels, 583
Wien displacement law, 4, 5; and Planck spectrum, 19
Wilson-Sommerfeld quantization rules, 111
Work function, 30, 408
Wu experiment, 575

Xi particle, 648
X-ray, 32, 40
X-ray continuum spectrum, 41
X-ray line spectrum, 337
X-ray production, 40, 42, 337
X-ray selection rules, 340
X-ray tube, 41

Yang-Mills theory, 690
Yukawa potential, 638
Yukawa theory, 634

Z$^0$ particle, 702
Zeeman effect, 274, 364
Zero point energy, 217, 429; of electromagnetic field, 291; and stability of atom, 248
Zero potential, 178
Zweig forbidden, 679

Useful Constants and Conversion Factors
Quoted to a useful number of significant figures.
Speed of light in vacuum: $c = 2.998 \times 10^{8}$ m/sec
Electron charge magnitude: $e = 1.602 \times 10^{-19}$ coul
Planck's constant: $h = 6.626 \times 10^{-34}$ joule-sec; $\hbar = h/2\pi = 1.055 \times 10^{-34}$ joule-sec $= 0.6582 \times 10^{-15}$ eV-sec
Boltzmann's constant: $k = 1.381 \times 10^{-23}$ joule/°K $= 8.617 \times 10^{-5}$ eV/°K
Avogadro's number: $N_0 = 6.023 \times 10^{23}$/mole
Coulomb's law constant: $1/4\pi\epsilon_0 = 8.988 \times 10^{9}$ nt-m$^2$/coul$^2$
Electron rest mass: $m_e = 9.109 \times 10^{-31}$ kg $= 0.5110$ MeV/$c^2$
Proton rest mass: $m_p = 1.672 \times 10^{-27}$ kg $= 938.3$ MeV/$c^2$
Neutron rest mass: $m_n = 1.675 \times 10^{-27}$ kg $= 939.6$ MeV/$c^2$
Atomic mass unit (C$^{12}$ = 12): $u = 1.661 \times 10^{-27}$ kg $= 931.5$ MeV/$c^2$
Bohr magneton: $\mu_b = e\hbar/2m_e = 9.27 \times 10^{-24}$ amp-m$^2$ (or joule/tesla)
Nuclear magneton: $\mu_n = e\hbar/2m_p = 5.05 \times 10^{-27}$ amp-m$^2$ (or joule/tesla)
Bohr radius: $a_0 = 4\pi\epsilon_0\hbar^2/m_e e^2 = 5.29 \times 10^{-11}$ m $= 0.529$ Å
Bohr energy: $E_1 = -m_e e^4/(4\pi\epsilon_0)^2 2\hbar^2 = -2.17 \times 10^{-18}$ joule $= -13.6$ eV
Electron Compton wavelength: $\lambda_C = h/m_e c = 2.43 \times 10^{-12}$ m $= 0.0243$ Å
Fine-structure constant: $\alpha = e^2/4\pi\epsilon_0\hbar c = 7.30 \times 10^{-3} \simeq 1/137$
kT at room temperature: $k \cdot 300$ °K $= 0.0258$ eV $\simeq 1/40$ eV

1 eV $= 1.602 \times 10^{-19}$ joule
1 joule $= 6.242 \times 10^{18}$ eV
1 Å $= 10^{-10}$ m
1 F $= 10^{-15}$ m
1 barn (bn) $= 10^{-28}$ m$^2$
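The derived entries in the table (ħ, Bohr radius, Bohr energy, Compton wavelength, fine-structure constant, the two magnetons, and kT at 300 °K) follow from the base constants through the formulas quoted beside them. The sketch below is only a consistency check under stated assumptions: the base values are copied from the table in SI units, and the function name `derived_constants` and the dictionary keys are hypothetical labels chosen for illustration, not anything defined in the book.

```python
import math

# Base values copied from the constants table (SI units).
C = 2.998e8            # speed of light in vacuum, m/sec
E_CHARGE = 1.602e-19   # electron charge magnitude, coul
H = 6.626e-34          # Planck's constant, joule-sec
K_B = 1.381e-23        # Boltzmann's constant, joule/K
COULOMB_K = 8.988e9    # 1/(4*pi*eps0), nt-m^2/coul^2
M_E = 9.109e-31        # electron rest mass, kg
M_P = 1.672e-27        # proton rest mass, kg


def derived_constants():
    """Recompute the derived table entries from the base values.

    The names and keys here are illustrative assumptions; the formulas
    are the ones printed next to each entry in the table.
    """
    hbar = H / (2 * math.pi)
    coulomb_e2 = COULOMB_K * E_CHARGE**2   # e^2 / (4*pi*eps0), joule-m
    return {
        "hbar": hbar,                                        # joule-sec
        "bohr_radius": hbar**2 / (M_E * coulomb_e2),         # m
        "bohr_energy_eV": -M_E * coulomb_e2**2 / (2 * hbar**2) / E_CHARGE,
        "compton_wavelength": H / (M_E * C),                 # m
        "fine_structure": coulomb_e2 / (hbar * C),           # dimensionless
        "bohr_magneton": E_CHARGE * hbar / (2 * M_E),        # joule/tesla
        "nuclear_magneton": E_CHARGE * hbar / (2 * M_P),     # joule/tesla
        "kT_300K_eV": K_B * 300 / E_CHARGE,                  # eV
    }


if __name__ == "__main__":
    for name, value in derived_constants().items():
        print(f"{name:20s} {value:.4g}")
```

Because the inputs carry only four significant figures, the recomputed values should be expected to agree with the table to about three figures rather than exactly.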
