Adams S. Foundations of Physics

Foundations of Physics
Second Edition
FP.CH00_FM_4PP.indd 1 3/17/2023 12:51:18 PM

LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY
By purchasing or using this book (the “Work”), you agree that this license grants permission
to use the contents contained herein, but does not give you the right of ownership to any of
the textual content in the book or ownership to any of the information or products contained
in it. This license does not permit uploading of the Work onto the Internet or on a network (of
any kind) without the written consent of the Publisher. Duplication or dissemination of any
text, code, simulations, images, etc. contained herein is limited to and subject to licensing
terms for the respective products, and permission must be obtained from the Publisher or
the owner of the content, etc., in order to reproduce or network any portion of the textual
material (in any media) that is contained in the Work.
Mercury Learning And Information (“MLI” or “the Publisher”) and anyone involved
in the creation, writing, or production of the companion disc, accompanying algorithms,
code, or computer programs (“the software”), and any accompanying Web site or software of
the Work, cannot and do not warrant the performance or results that might be obtained by
using the contents of the Work. The author, developers, and the Publisher have used their
best efforts to ensure the accuracy and functionality of the textual material and/or programs
contained in this package; we, however, make no warranty of any kind, express or implied,
regarding the performance of these contents or programs. The Work is sold “as is” without
warranty (except for defective materials used in manufacturing the book or due to faulty
workmanship).
The author, developers, and the publisher of any accompanying content, and anyone involved
in the composition, production, and manufacturing of this work will not be liable for damages
of any kind arising out of the use of (or the inability to use) the algorithms, source code,
computer programs, or textual material contained in this publication. This includes, but is not
limited to, loss of revenue or profit, or other incidental, physical, or consequential damages
arising out of the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to replacement of the
book, and only at the discretion of the Publisher. The use of “implied warranty” and certain
“exclusions” varies from state to state, and might not apply to the purchaser of this product.

Steve Adams, PhD
MERCURY LEARNING AND INFORMATION

Dulles, Virginia
Boston, Massachusetts
New Delhi

Reprint and Revision Copyright ©2023 by Mercury Learning and Information LLC.
All rights reserved.
Original title and copyright: Principles of Physics. Copyright ©2022 by The Pantaneto Press. All rights
reserved. Published by The Pantaneto Press.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in
a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display,
including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission
in writing from the publisher.
Publisher: David Pallai

Mercury Learning and Information
22841 Quicksilver Drive
Dulles, VA 20166
[email protected]
www.merclearning.com
800-232-0223
Steve Adams. Foundations of Physics, Second Edition.

ISBN: 9781683929703
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a
means to distinguish their products. All brand names and product names mentioned in this book are trademarks
or service marks of their respective companies. Any omission or misuse (of any kind) of service marks
or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2022952345
23 24 25 3 2 1
This book is printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional
information, please contact the Customer Service Dept. at 800-232-0223 (toll free).
All of our titles are available in digital format at authorcloudware.com and other digital vendors. The sole
obligation of Mercury Learning and Information to the purchaser is to replace the book, based on
defective materials or faulty workmanship, but not based on the operation or functionality of the product.

For Alison
Contents
Preface
CHAPTER 1: THE LANGUAGE OF PHYSICS
1.0 Introduction
1.1 The SI System of Units
1.1.1 Derived Units
1.1.2 Energy
1.1.3 Viscosity
1.2 Dimensions
1.2.1 Method of Dimensions
1.3 Scientific Notation, Prefixes, and Significant Figures
1.4 Uncertainties
1.4.1 Types of Uncertainty
1.4.2 Combining Uncertainties
1.5 Dealing with Random and Systematic Experimental Errors
1.5.1 Random Errors
1.5.2 Systematic Errors
1.6 Differential Calculus
1.6.1 Derivatives and Rates of Change
1.6.1.1 Second Derivatives
1.6.2 Maximum and Minimum Values
1.7 Differential Equations
1.8 Integral Calculus
1.9 Vectors and Scalars
1.9.1 Adding Vectors
1.9.2 Resolving Vectors into Components
1.9.3 Multiplying Vectors
1.9.3.1 Scalar Product
1.9.3.2 Vector Product
1.10 Symmetry Principles
1.11 Exercises
CHAPTER 2: REPRESENTING AND ANALYZING DATA
2.0 Introduction
2.1 Experimental Variables
2.2 Recording Data
2.3 Straight-Line Graphs
2.3.1 Interpreting Straight-Line Graphs
2.3.2 Analyzing Straight-Line Graphs
2.4 Plotting Graphs and Using Error Bars
2.4.1 Plotting Graphs by Hand
2.4.2 Finding a Gradient from a Straight-Line Graph
2.4.3 Using a Spreadsheet Program (e.g., Excel)
2.4.4 Using Error Bars
2.5 Logarithms
2.5.1 Logarithmic Scales and Logarithms
2.5.2 Using Logarithms
2.6 Testing Mathematical Relationships between Variables
2.6.1 Direct Proportion
2.6.2 Inverse Proportion
2.6.3 Inverse-Square Law
2.6.4 Power Law
2.6.5 Exponential Decay or Growth
2.7 Exercises
CHAPTER 3: CAPTURING, DISPLAYING, AND ANALYZING MOTION
3.0 Introduction
3.1 Motion Terminology
3.2 Graphs of Motion
3.3 Equations of Motion for Constant Acceleration: The Suvat Equations
3.3.1 Derivation 1: From Graphs of Motion
3.3.2 Derivation 2: Using Calculus
3.4 Projectile Motion
3.4.1 Independence of Horizontal and Vertical Components of Motion
3.4.2 Parabolic Paths
3.4.3 The Range of a Projectile
3.5 Equation of Motion
3.6 Methods to Capture and Display Graphs of Motion
3.6.1 Motion Sensors and Dataloggers
3.6.2 Light Gates
3.6.3 Mobile Phones and Tablets
3.6.3.1 Accelerometer Sensor
3.6.3.2 Video Capture
3.7 Exercises
CHAPTER 4: FORCES AND EQUILIBRIUM
4.1 Force as a Vector
4.1.1 Free-Body Diagrams
4.1.2 Resolving Forces
4.1.3 Finding a Resultant Force
4.2 Mass, Weight, and Center of Gravity
4.2.1 Mass
4.2.2 Weight
4.2.3 Center of Gravity
4.3 Equilibrium of Coplanar Forces
4.3.1 Using the Triangle of Forces to Solve Equilibrium Problems
4.3.2 Resolving Forces to Solve Equilibrium Problems
4.4 Turning Effects of a Force: Moments, Torques, and Couples
4.4.1 Moments and Torques
4.4.2 Resultant Moment
4.4.3 Couples
4.4.4 The Principle of Moments
4.5 Stability
4.5.1 Types of Mechanical Equilibrium
4.5.2 Degrees of Stability
4.6 Frictional Forces
4.6.1 The Origin of Frictional Forces Between Surfaces in Contact
4.6.2 Static and Dynamic (Kinetic) Friction
4.6.3 The Coefficients of Friction
4.6.4 Measuring the Coefficient of Static Friction
4.6.5 Measuring the Coefficient of Dynamic (Kinetic) Friction
4.7 Exercises
CHAPTER 5: NEWTONIAN MECHANICS
5.0 Introduction
5.1 Newton’s Laws of Motion
5.1.1 Newton’s First Law of Motion
5.1.2 Galilean Relativity
5.1.3 Newton’s Second Law of Motion
5.1.4 Free Fall
5.1.5 Newton’s Third Law of Motion
5.2 Linear Momentum
5.2.1 Newton’s Second Law in Terms of Linear Momentum
5.2.2 Impulse and Change of Momentum
5.2.3 Conservation of Linear Momentum
5.3 Work, Energy, and Power
5.3.1 Work
5.3.2 Gravitational Potential Energy Changes (Uniform Field)
5.3.3 Kinetic Energy
5.3.4 The Law of Conservation of Energy
5.3.5 Energy and Momentum in a 2D Collision
5.3.6 Energy Transfers
5.3.7 Power
5.4 Energy Resources
5.5 Propulsion Systems
5.5.1 Jet Propulsion
5.5.2 Rockets
5.5.3 Radiation Pressure
5.6 Frames of Reference
5.6.1 The Center of Mass Frame
5.6.2 The Galilean Transformation
5.7 Theoretical Mechanics
5.7.1 Force and Energy
5.7.2 Lagrangian Mechanics
5.8 Exercises
CHAPTER 6: FLUIDS
6.0 Introduction
6.1 Hydrostatic Pressure
6.1.1 Excess Pressure Caused by a Column of Fluid
6.1.2 Atmospheric Pressure
6.1.3 Using a Manometer to Measure Pressure Differences
6.1.4 Barometers
6.1.5 Dams
6.2 Buoyancy and Archimedes’ Principle
6.2.1 Buoyancy Forces
6.2.2 Archimedes’ Principle
6.2.3 Flotation
6.3 Viscosity
6.3.1 The Coefficient of Viscosity
6.4 Fluid Flow
6.4.1 Laminar and Turbulent Flow
6.4.2 The Equation of Continuity
6.4.3 Drag Forces in a Fluid
6.4.4 Stokes’ Law
6.4.5 Turbulent Drag
6.4.6 The Bernoulli Equation
6.4.7 The Bernoulli Effect
6.4.8 Viscous Flow Through a Horizontal Pipe – The Poiseuille Equation
6.4.9 Measuring the Coefficient of Viscosity
6.5 Measuring Fluid Flow Rates
6.5.1 A Venturi Meter
6.5.2 A Pitot Tube
6.6 Exercises
CHAPTER 7: MECHANICAL PROPERTIES
7.1 Density
7.2 Inter-atomic Forces
7.3 Stretching Springs
7.3.1 The Spring Constant
7.3.2 Springs in Series and in Parallel
7.3.3 Elastic Potential Energy (Strain Energy)
7.4 Stress and Strain
7.4.1 The Young’s Modulus
7.4.2 Experimental Measurement of Young’s Modulus for a Metal Wire
7.4.3 Stress Versus Strain Graph for a Ductile Metal
7.4.4 Rubber Hysteresis
7.5 Material Terminology
7.6 Material Types
7.7 Exercises
CHAPTER 8: THERMAL PHYSICS
8.1 Thermal Equilibrium
8.2 Measuring Temperature
8.3 Temperature Scales
8.4 Heat Transfer Mechanisms
8.4.1 Conduction
8.4.2 Convection
8.4.3 Radiation
8.5 Black Body Radiation
8.6 Heat Capacities
8.6.1 Specific Heat Capacity
8.6.2 Molar Heat Capacities of Gases
8.6.3 Measuring Specific Heat Capacity
8.7 Specific Latent Heat
8.8 Exercises
CHAPTER 9: GASES
9.1 The Gas Laws
9.1.0 Introduction
9.1.1 Boyle’s Law
9.1.2 Charles’s Law
9.1.3 Gay-Lussac’s Law (The Pressure Law)
9.2 The Ideal Gas Equation
9.3 The Kinetic Theory of Gases
9.3.1 Assumptions of the Kinetic Theory
9.3.2 Explaining Gas Pressure
9.3.3 Molecular Kinetic Energy and Temperature
9.3.4 Molar Heat Capacities of an Ideal Monatomic Gas
9.3.5 Equipartition of Energy
9.3.6 The Law of Dulong and Petit
9.3.7 Graham’s Law of Diffusion
9.3.8 The Speed of Sound in a Gas
9.4 The Maxwell Distribution
9.5 The Boltzmann Factor and Activation Processes
9.6 The First Law of Thermodynamics
9.6.1 Internal Energy
9.6.2 Heating, Working, and the First Law of Thermodynamics
9.6.3 Work Done by an Ideal Gas
9.6.4 Thermodynamic Changes
9.7 Heat Engines and Indicator Diagrams
9.7.1 What Is a Heat Engine?
9.7.2 Indicator Diagrams
9.7.3 The Otto Cycle
9.7.4 The Diesel Cycle
9.8 Exercises
CHAPTER 10: STATISTICAL THERMODYNAMICS AND THE SECOND LAW
10.1 Reversible and Irreversible Processes
10.2 The Second Law of Thermodynamics as a Macroscopic Principle
10.2.1 Macroscopic Statements of the Second Law
10.2.2 Heat Transfer and Entropy
10.2.3 Entropy and Maximum Efficiency of a Heat Engine
10.3 Entropy and Number of Ways
10.3.1 Macro-state and Micro-states
10.3.2 Entropy and Number of Ways
10.3.3 Poincaré Recurrence
10.4 What Is Temperature?
10.5 Absolute Zero and Absolute Entropy
10.5.1 Entropy at Absolute Zero
10.5.2 Calculating Absolute Entropy
10.5.3 Entropy Changes for an Ideal Gas
10.6 Refrigerators and Heat Pumps
10.6.1 Refrigerators
10.6.2 Heat Pumps
10.7 Implications of the Second Law
10.7.1 The Second Law, the Arrow of Time, and the Universe
10.7.2 The Second Law and Living Things
10.7.3 Entropy and Energy Availability
10.8 Exercises
CHAPTER 11: OSCILLATIONS
11.0 Oscillations
11.1 Capturing and Displaying Oscillatory Motion
11.1.1 Graphs and Equations of Displacement, Velocity, and Acceleration
11.1.2 Phase and Phase Difference
11.2 Simple Harmonic Motion
11.2.1 Equation of Motion for Simple Harmonic Motion
11.2.2 Physical Conditions for Simple Harmonic Motion
11.3 The Mass-Spring Oscillator
11.4 The Simple Pendulum
11.5 Energy in Simple Harmonic Motion
11.5.1 Variation of Energy with Time
11.5.2 Variation of Energy with Position
11.5.3 Damping
11.6 Forced Oscillations and Resonance
11.7 Exercises
CHAPTER 12: ROTATIONAL DYNAMICS
12.1 Angles
12.1.1 Measuring Angles in Radians
12.1.2 Small Angle Approximations
12.2 Describing Uniform Circular Motion
12.2.1 Angular Displacement, Angular Velocity, and Angular Acceleration
12.3 Centripetal Acceleration and Centripetal Force
12.3.1 Centripetal Acceleration
12.3.2 Centripetal Force
12.3.3 Centripetal Not Centrifugal
12.3.4 Moving in Uniform Circular Motion
12.4 Circular Motion, Simple Harmonic Motion, and Phasors
12.5 Rotational Kinematics
12.5.1 Equations for Uniform Angular Acceleration
12.5.2 Rotational Kinetic Energy
12.5.3 Angular Momentum
12.5.4 The Second Law of Motion for Rotation
12.5.5 Conservation of Angular Momentum
12.6 Deriving Expressions for Moments of Inertia
12.6.1 Moment of Inertia of One or More Point Masses
12.6.2 Moment of Inertia of a Rod
12.6.3 Moment of Inertia of a Cylindrical Shell and a Uniform Cylinder
12.6.4 Moment of Inertia of a Uniform Sphere
12.7 Torque, Work, and Power
12.8 Rotational Oscillations, the Compound Pendulum
12.9 Exercises
CHAPTER 13: WAVES
13.1 Describing and Representing Waves
13.1.1 Basic Wave Terminology
13.1.2 Transverse and Longitudinal Waves
13.1.3 Graphs of Wave Motion
13.1.4 Equation for a One-Dimensional Traveling Wave
13.1.5 Amplitude and Intensity
13.2 Reflection
13.3 Refraction
13.3.1 Refraction at a Boundary Between Two Different Media
13.3.2 Snell’s Law of Refraction
13.3.3 Absolute and Relative Refractive Indices
13.3.4 Total Internal Reflection
13.3.5 Optical Fibers
13.3.6 Dispersion
13.4 Polarization
13.4.1 What Is Polarization?
13.4.2 Polarizing Filters
13.4.3 Rotation of the Plane of Polarization
13.4.4 Polarization by Reflection and Scattering
13.5 Exercises
CHAPTER 14: LIGHT
14.1 Light as an Electromagnetic Wave
14.1.1 Waves or Particles?
14.1.2 Electromagnetism
14.1.3 Electromagnetic Waves
14.1.4 Measuring the Speed of Light
14.1.5 Maxwell’s Equations and the Speed of Light
14.1.6 Defining Speed, Time, and Distance
14.2 Ray Optics
14.2.1 Thin Lenses
14.2.2 Predictable Rays for Thin Lenses
14.2.3 Images
14.2.4 Image Formation with a Convex Lens
14.2.5 Image Formation with a Concave Lens
14.2.6 Object at Infinity
14.2.7 The Lens Equation
14.2.8 Virtual Image Formed by a Plane Mirror
14.2.9 Real and Apparent Depth
14.3 Optical Instruments
14.3.1 An Astronomical Refracting Telescope
14.3.2 An Astronomical Reflecting Telescope (Newtonian Telescope)
14.3.3 A Compound Microscope
14.4 The Doppler Effect
14.4.1 The Doppler Effect for Electromagnetic Waves
14.4.2 “Red Shift” and “Blue Shift”
14.5 Exercises
CHAPTER 15: SUPERPOSITION EFFECTS
15.0 Superposition Effects
15.1 Two-Source Interference
15.1.1 Demonstrating Superposition Effects with Sound
15.1.2 Demonstrating Superposition Effects with Light
15.1.3 Using the Double Slit Experiment to Find the Wavelength of Light
15.1.4 Superposition of Harmonic Waves
15.2 Diffraction Gratings
15.2.1 The Diffraction Grating Formula
15.2.2 Spectroscopy
15.2.3 Spectrometers
15.3 Diffraction by Slits and Holes
15.3.1 Diffraction by a Narrow Slit
15.3.2 Analysis of the Single Slit Diffraction Pattern
15.3.3 Diffraction Through a Circular Hole
15.3.4 Resolving Power and the Rayleigh Criterion
15.4 Standing (Stationary) Waves
15.4.1 Standing Waves on a String (Melde’s Experiment)
15.4.2 The Mathematics of Standing Waves
15.5 Exercises
CHAPTER 16: SOUND
16.1 The Nature and Speed of Sound
16.2 The Decibel Scale
16.3 Standing Waves in Air Columns
16.4 Measuring the Speed of Sound
16.5 Ultrasound
16.6 Analysis and Synthesis of Sound
16.7 Exercises
CHAPTER 17: ELECTRIC CHARGE AND ELECTRIC FIELDS
17.1 Electric Charge
17.2 Electrostatics
17.2.1 Charging by Friction
17.2.2 The Gold Leaf Electroscope
17.2.3 Using a Coulomb Meter
17.3 Electrostatic Forces
17.3.1 Coulomb’s Law
17.3.2 Investigating Electrostatic Forces
17.4 The Electric Field
17.4.1 Electric Field Strength
17.4.2 Electric Field Strength of a Point Charge
17.4.3 Gauss’s Law
17.4.4 Using Gauss’s Theorem
17.5 Electric Potential Energy and Electric Potential
17.5.1 Electric Potential and Potential Difference
17.5.2 Electric Potential Gradient and Electric Field Strength
17.5.3 Accelerating Charged Particles in an Electric Field
17.5.4 Deflecting Charged Particles in an Electric Field
17.5.5 The Absolute Electric Potential of a Point Charge
17.6 Exercises
CHAPTER 18: DC ELECTRIC CIRCUITS
18.0 Direct Current (DC) Circuits and Conventional Current
18.1 Charge and Current
18.1.1 Charge Carriers and Charge Carrier Density
18.1.2 Measuring Current
18.1.3 Currents in Circuits – Kirchhoff’s First Law
18.2 Measuring Potential Difference
18.2.1 EMF, Potential Difference, and Voltage
18.2.2 Kirchhoff’s Second Law
18.3 Resistance
18.3.1 Measuring Resistance
18.3.2 Current–Voltage Characteristics
18.3.3 Resistors in Series and in Parallel
18.3.4 Resistivity
18.4 Electrical Energy and Power
18.4.1 EMF and Internal Resistance of a Real Cell
18.4.2 Measuring the Internal Resistance and emf of a Cell
18.4.3 Power Transfer from a Real Cell to a Load Resistor
18.5 Resistance Networks
18.5.1 Potential Dividers
18.5.2 Using Kirchhoff’s Laws to Solve Resistance Networks
18.6 Semiconductors and Superconductors
18.6.1 Semiconductors
18.6.2 Variation of Resistance of a Metal with Temperature
18.7 Exercises
CHAPTER 19: CAPACITANCE
19.1 What Is a Capacitor?
19.1.1 Capacitors and Charge
19.1.2 Capacitance
19.1.3 Energy Stored on a Charged Capacitor
19.1.4 Efficiency of Charging a Capacitor
19.2 The Parallel Plate Capacitor
19.3 Capacitor Charging and Discharging
19.3.1 Equations for Capacitor Discharge
19.3.2 Equations for Capacitor Charging
19.4 Capacitors in Series and Parallel
19.4.1 Capacitance of Capacitors in Series
19.4.2 Capacitors in Parallel
19.5 The Capacitance of a Charged Sphere
19.6 Exercises
CHAPTER 20: MAGNETIC FIELDS
20.0 The Magnetic Field
20.1 Permanent Magnets
20.2 Magnetic Forces on Electric Currents and Moving Charges
20.2.1 The Magnetic Force on an Electric Current
20.2.2 The Force on a Moving Charge
20.2.3 The Path of a Moving Charged Particle in a Magnetic Field
20.2.4 The Velocity-Selector: Crossed Electric and Magnetic Fields
20.3 The Magnetic Fields Created by Electric Currents
20.3.1 The Biot–Savart Law
20.3.2 The Magnetic Field at the Center of a Narrow Coil
20.3.3 The Magnetic Field of a Long Straight Current-Carrying Wire
20.3.4 The Magnetic Field Along the Axis of a Solenoid
20.3.5 Ampère’s Theorem
20.4 Electric Motors
20.4.1 The Turning Effect on a Coil in a Uniform Magnetic Field
20.4.2 A Simple DC Electric Motor
20.5 Exercises
CHAPTER 21: ELECTROMAGNETIC INDUCTION
21.1 Induced emfs
21.1.1 What Is Electromagnetic Induction?
21.1.2 Electromagnetic Induction Experiments
21.2 The Laws of Electromagnetic Induction
21.2.1 Magnetic Flux and Magnetic Flux Linkage
21.2.2 Faraday’s Law of Electromagnetic Induction
21.2.3 Changing the Flux-Linkage in a Coil
21.3 Inductance
21.3.1 Self-inductance
21.3.2 The Rise of Current in an Inductor
21.3.3 The Energy Stored in an Inductor
21.3.4 Mutual Inductance
21.4 Transformers
21.4.1 An Ideal Transformer
21.4.2 Transmission of Electrical Energy
21.4.3 Real Transformers
21.5 A Simple AC Generator
21.6 Electromagnetic Damping
21.7 Induction Motors
21.8 Exercises
CHAPTER 22: AC
22.1 AC and DC
22.1.1 AC Power and rms Values
22.2 Resistance and Reactance
22.2.1 Resistors in AC Circuits
22.2.2 Capacitors in AC Circuits
22.2.3 Inductors in AC Circuits
22.3 Resistance, Reactance, and Impedance
22.3.1 Phasor Diagrams for AC Series Circuits
22.3.2 Impedance
22.4 AC Series Circuits
22.4.1 RC Series Circuit
22.4.2 RL Series Circuit
22.4.3 RCL Series Circuit
22.4.4 Parallel Circuits Containing Resistors, Capacitors, and Inductors
22.5 Electric Oscillators
22.5.1 A Mechanical Analogy
22.6 Exercises
CHAPTER 23: THE GRAVITATIONAL FIELD
23.1 Gravitational Forces and Gravitational Field Strength
23.1.1 Newton’s Law of Gravitation
23.1.2 Gravitational Field Strength
23.1.3 The Gravitational Field Strength of the Earth
23.2 Gravitational Potential Energy and Gravitational Potential
23.2.1 Change in Gravitational Potential Energy
23.2.2 Gravitational Potential
23.2.3 Gravitational Field Lines and Equipotentials
23.2.4 Gravitational Potential Energy in the Earth’s Field
23.2.5 Escape Velocity
23.3 Orbital Motion
23.3.1 Early Ideas About Planetary Motion
23.3.2 Circular Orbits
23.3.3 Artificial Satellites
23.4 Tidal Forces
23.4.1 The Origin of Tidal Forces
23.4.2 The Earth’s Ocean Tides
23.5 Einstein’s Theory of Gravitation
23.5.1 Space–Time Curvature
23.5.2 The Equivalence Principle
23.5.3 Gravitational Time Dilation
23.5.4 Gravitational Waves
23.6 Exercises
CHAPTER 24: SPECIAL RELATIVITY
24.1 The Postulates of Special Relativity
24.1.1 Absolute Space
24.1.2 Einstein’s Ideas About the Laws of Physics
24.2 Time in Special Relativity
24.2.1 Time Dilation
24.2.2 The “Twin Paradox”
24.2.3 The Relativity of Simultaneity
24.3 Length Contraction
24.4 The Lorentz Transformation
24.4.1 The Lorentz Transformation Equations
24.4.2 The Velocity Addition Equation
24.5 Mass, Velocity, and Energy
24.5.1 Mass and Velocity
24.5.2 Mass and Energy
24.6 Special Relativity and Geometry
24.6.1 Invariants
24.6.2 Space–Time
24.6.3 Mass, Energy, and Momentum
24.7 Exercises
CHAPTER 25: ATOMIC STRUCTURE AND RADIOACTIVITY
25.1 The Nuclear Atom
25.1.1 The Rutherford Scattering Experiment
25.1.2 Closest Approach and Nuclear Size
25.1.3 Using Electron Diffraction to Measure Nuclear Diameter
25.1.4 The Nuclear Atom
25.2 Ionizing Radiation
25.2.1 Types of Ionizing Radiation Emitted by Radioactive Sources
25.3 Attenuation of Ionizing Radiation
25.3.1 Inverse-Square Law of Absorption
25.3.2 Exponential Absorption and the Attenuation Coefficient
25.3.3 Absorption of Beta Radiation
25.3.4 Absorption of Alpha Particles
25.4 The Biological Effects of Ionizing Radiation
25.4.1 The Natural Background Radiation
25.4.2 Measuring Radiation Dose
25.4.3 The Effect of Radiation Dose on Human Health
25.4.4 Reducing Risks in the Laboratory
25.5 Radioactive Decay and Half-Life
25.6 Nuclear Transformations
25.6.1 Alpha Decay
25.6.2 Beta-Minus Decay
25.6.3 Gamma Emission
25.6.4 Beta-Plus Emission
25.6.5 Electron-Capture
25.7 Radiation Detectors
25.7.1 The Spark Counter
25.7.2 The Geiger Counter
25.7.3 Using a Geiger Counter to Measure Count Rates
25.8 Using Radioactive Sources
25.8.1 Radiological Dating
25.8.2 Radiological Dating of Rocks
25.9 Exercises
CHAPTER 26: NUCLEAR PHYSICS
26.1 Nuclear Energy Changes
26.1.1 Nuclear Binding Energy
26.1.2 Atomic Mass Units (amu)
26.1.3 Energy Released by Nuclear Decays
26.2 Nuclear Stability
26.2.1 Nuclear Configuration and Stability
26.2.2 Nuclear Binding Energy and Stability
26.3 Nuclear Fission and Nuclear Fusion
26.3.1 Nuclear Fission
26.3.2 The Principle of the Atomic Bomb
26.3.3 Nuclear Reactors
26.3.4 Plutonium
26.3.5 Nuclear Fusion
26.3.6 Nucleosynthesis
26.3.7 Thermonuclear Weapons
26.3.8 Fusion Reactors
26.4 Particle Physics
26.4.1 Leptons
26.4.2 Hadrons and Quarks
26.4.3 The Fundamental Interactions
26.4.4 The Conservation Laws
26.4.5 The Standard Model
26.4.6 Dark Matter and Dark Energy
26.5 Exercises
CHAPTER 27: QUANTUM THEORY
27.1 Problems in Classical Physics
27.1.1 Planck and the Black Body Radiation Spectrum
27.1.2 Explaining Heat Capacities
27.1.3 Explaining the Photoelectric Effect
27.1.4 Characteristics of Photoelectric Emission
27.1.5 Measuring the Planck Constant
27.2 Matter Waves
27.2.1 The de Broglie Relation
27.2.2 Electron Diffraction
27.2.3 The Compton Effect
27.3 Wave-Particle Duality
27.3.1 Young’s Double Slit Experiment Revisited
27.3.2 Interpreting Wave-Particle Duality
27.3.3 The Schrödinger Equation
27.4 The Quantum Atom
27.4.1 Bohr’s Model of the Hydrogen Atom
27.4.2 Explaining the Hydrogen Line Spectrum
27.4.3 Electron Waves in Atoms
27.4.4 The Schrödinger Atom
27.5 Interpretations of Quantum Theory
27.5.1 The Copenhagen Interpretation
27.5.2 Heisenberg’s Uncertainty (Indeterminacy) Principle
27.5.3 The Sum-Over-Histories Approach
27.5.4 The Many-Worlds Theory
27.5.5 Schrödinger’s Cat
27.6 Exercises
CHAPTER 28: ASTROPHYSICS
28.0 Physics, Astrophysics, and Cosmology
28.1 Stars
28.1.1 Mass
28.1.2 Stars as Black Bodies
28.1.3 Stellar Spectra and the Hertzsprung–Russell Diagram
28.2 Distances
28.2.1 Trigonometric Parallax
28.2.2 The Inverse-Square Law and Cepheid Variables
28.2.3 Hubble’s Law
28.3 Cosmology
28.3.1 The Origin and Age of the Universe
28.3.2 Evidence for the Big Bang
28.4 Exercises
CHAPTER 29: MEDICAL PHYSICS
29.1 Ultrasound
29.1.1 Overview of Ultrasound
29.1.2 Ultrasound and the Eye
29.1.3 Doppler Ultrasound for Blood Flow Measurements
29.1.4 Using Ultrasound to Break Kidney Stones
29.2 X-rays
29.2.1 Overview of Medical X-rays
29.2.2 Generating X-Rays
29.2.3 Attenuation of X-Rays in Matter
29.2.4 Creating X-Ray Images
29.3 Magnetic Resonance Imaging (MRI)
29.3.1 Overview of MRI
29.3.2 The Physics of MRI
29.4 Radioactive Tracers
29.4.1 Overview of the Use of Radioactive Tracers
29.4.2 The Gamma-Camera
29.5 Positron Emission Tomography (PET Scans)
29.5.1 The Physics of PET Scans
29.6 Exercises
APPENDIX A: ESTIMATIONS AND FERMI QUESTIONS
A.0 Fermi and the Trinity Test
A.1 Making Estimations
A.1.1 How Many Air Molecules in the Earth’s Atmosphere?
A.1.2 What Is the Minimum Area for a Parachute?
A.2 Useful Values
A.3 Fermi Questions
A.4 The Drake Equation
A.5 Try These: Estimates and Fermi Questions
APPENDIX B: EXPERIMENTAL INVESTIGATIONS
B.0 Introduction: The Nature of Science
B.1 Carrying Out an Experiment
B.1.1 Variables
B.1.2 Selecting Measuring Equipment
B.1.3 Planning a Procedure
B.1.4 Risk Assessments
B.1.5 Writing Up an Experiment
B.2 Investigations
APPENDIX C: UNITS, CONSTANTS, AND EQUATIONS
C.1 SI Units
C.2 Simple Approximate Combinations of Uncertainties
C.3 Useful Derivatives
C.4 Differential Equations
C.5 Differentials and Integrals
C.6 Equations
C.7 Constants
APPENDIX D: SOLUTIONS TO EXERCISES
GLOSSARY
INDEX

Preface
The aim of this book is to draw on the essential physical principles that typical
physics courses use to provide a strong conceptual base for the further study
of more advanced topics. As such, this book provides support both for introductory
(calculus-based) courses and for readers interested in a basic review
of key topics in physics. It will also be a useful reference work for instructors.
The focus is on physical principles. Applications are used to exemplify the
physics but do not divert attention from the underlying concepts. Mathematics
is the language of physics, and a mathematical approach is taken throughout,
drawing on mathematical techniques including basic calculus. The approach
here acknowledges this and helps to secure a foundation of relevant mathematical
skills in the context of real physical problems.
Practical techniques, including the collection, presentation, analysis, and evaluation
of data, are discussed in the context of key experiments linked to the
theoretical spine of the work. There are also sections on testing mathematical
relationships, the analysis of uncertainties, and how to approach, carry out,
and write up experimental investigations.
Every chapter concludes with a set of exercises, and an appendix on Fermi
problems provides an open-ended challenge that allows readers to practice
their skills in unfamiliar contexts.
How to use the book

Although the order of topics in this book mirrors that of most physics courses,
it is not intended to be read in order or from cover to cover, and most chap-
ters can be read in isolation. The book is there to be consulted, as and when a
topic is studied or revised. The early sections on the language of physics and
representing and analyzing data can be referred back to from any of the other
sections. The appendices contain summary lists of units and useful data, equations,
solutions to problems, and a guide to planning, carrying out, and writing
up experiments. There is also an extensive glossary of terms.
Note added for the 2nd edition

Somehow, since the publication of this book, I find myself back teaching physics.
I am also Head of STEM Outreach and train instructors through the NMAPS
Initial Teacher Training (SCITT) program. It is a pleasure and a privilege
to teach and to share the joy of trying to understand and communicate the
elegant beauty of physics. It has also been a pleasure to revisit Principles of
Physics and make a few minor changes that make it easier to use. Enjoy!
S. Adams
February 2023

CHAPTER 1
The Language of Physics
1.0 INTRODUCTION
NASA’s Mars Climate Orbiter was launched in 1998 and should have gone
into orbit around Mars 286 days later. Instead, it passed too close to the
planet and broke up in the atmosphere. The mission had cost upwards of
$100 000 000. Why did this happen? Because Lockheed Martin, who was
calculating the thrust to maneuver the spacecraft, used English units (pound-
seconds) and NASA, who controlled the thrusters, was expecting metric units
(newton-seconds).
1.1 THE SI SYSTEM OF UNITS

The international system of units is based on seven base units:
Quantity | SI base unit name | Symbol
Length | Meter | m
Mass | Kilogram | kg
Time | Second | s
Electric current | Ampere | A
Temperature | Kelvin | K
Amount of substance | Mole | mol
Luminous intensity | Candela | cd
The definition of each base unit is related to the experimental method that
is used to establish the unit in the laboratory. All mechanical quantities can
be expressed in terms of just three base units: length (the meter), mass (the
kilogram), and time (the second).
Base unit | Definition
Meter | The distance traveled by light in a vacuum in a time of 1/299 792 458 seconds.
Kilogram | Since 2019 the kilogram has been defined in terms of the fixed value of the Planck constant; it is realized electromagnetically using a Kibble balance.
Second | The duration of 9 192 631 770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of a cesium-133 atom.
You will notice that the meter is actually defined in terms of the speed of
light. This has been the case since 1983 when the speed of light, which had
been measured with ever-increasing precision, was defined to have the value
299 792 458 m s⁻¹.
1.1.1 Derived Units

Many physical quantities are measured in derived units. These are combinations
of base units such as m³ for volume or kg m⁻³ for density. Some common
derived units which are given their own names are shown in the table.
Derived quantity | SI derived unit name | Symbol | SI base units
Force | Newton | N | kg m s⁻²
Pressure | Pascal | Pa | kg m⁻¹ s⁻²
Energy | Joule | J | kg m² s⁻²
Power | Watt | W | kg m² s⁻³
When you encounter a new physical quantity that is unfamiliar you can work
out its SI units by making sure that an equation containing the new quantity
is balanced. All equations in physics must balance in terms of units. Here are
two examples.
1.1.2 Energy
Any equation involving energy will suffice, for example, E = ½mv². The units
of the right-hand side are kg m² s⁻² so these must be the units of energy. The
name “joule” is a convenient alternative for such a common physical unit so
1 J = 1 kg m² s⁻².
1.1.3 Viscosity
There is an equation for the viscous drag on a sphere moving through a fluid
called “Stokes’ law.” This has the form: F = 6πηrv, where r and v are the radius
and velocity of the sphere, respectively, and η is the viscosity of the fluid. What are
the SI units of viscosity? First, rearrange the equation to give η = F/(6πrv), and
then balance the units (6π has no units, it is simply a number). The right-hand
side has units of N/(m × m s⁻¹) = N m⁻² s, or Pa s (because 1 N m⁻² = 1 Pa). This can
be reduced to base units by substituting for N (= kg m s⁻²). The base units for
viscosity are therefore kg m⁻¹ s⁻¹.
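The unit balancing above can also be done mechanically. The following is a minimal sketch, not a method from the text: it assumes each unit is stored as a dictionary of base-unit exponents, and the helper names `multiply` and `invert` are hypothetical.

```python
from collections import Counter

# Base-unit exponents (SI base units kg, m, s) for the quantities in Stokes' law.
NEWTON = {"kg": 1, "m": 1, "s": -2}     # force: kg m s^-2
METRE = {"m": 1}                         # radius
METRE_PER_SECOND = {"m": 1, "s": -1}     # velocity


def multiply(*units):
    """Multiply units together by adding the exponents of each base unit."""
    total = Counter()
    for unit in units:
        for base, power in unit.items():
            total[base] += power
    # Drop any base unit whose exponent has cancelled to zero.
    return {base: power for base, power in total.items() if power != 0}


def invert(unit):
    """Return the reciprocal of a unit by negating every exponent."""
    return {base: -power for base, power in unit.items()}


# eta = F / (6 pi r v); the factor 6 pi is a pure number and carries no units.
viscosity = multiply(NEWTON, invert(METRE), invert(METRE_PER_SECOND))
print(viscosity)
```

The printed exponents should correspond to kg m⁻¹ s⁻¹, in agreement with the result obtained by hand.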
1.2 DIMENSIONS
All mechanical quantities depend on mass (M), length (L), and time (T).
These are the fundamental dimensions of mechanics and are measured in
terms of the base units of kg, m, and s. All equations in physics must balance
in terms of numerical value, units, and dimensions. When they do they are
said to be “homogeneous.” This can be very useful for checking your work
during a derivation (if you made a mistake in the algebra then the dimensions
might not balance) and for testing proposed equations to see if they are viable.
It can also be used to construct possible equations if you have an idea of the
relevant parameters.
To indicate that we are dealing with dimensions we use square brackets
so that [E] means “the dimensions of” energy and [v] means “the dimensions
of” velocity. The fundamental dimensions are related in a similar way to base
units so it is possible to work them out by balancing simple equations. For
example, if we want to find the dimensions for energy, we can use the equa-
tion E = mgh. The dimensions of E will be the same as the dimensions of mgh:
[E] = [m][g][h] = M × LT⁻² × L = ML²T⁻²
If the dimensions are now replaced by base units, we see that the SI unit for
energy is kg m² s⁻² as before.
Five basic dimensions can be used to express most quantities in physics.
Dimension | Mass | Length | Time | Current | Temperature
Symbol | M | L | T | I | θ
Dimensionless numbers play an important role in many areas of physics.
These are combinations of physical quantities where the dimensions cancel
so that the result is a pure number. These often have (or seem to have) great
significance. The fine structure constant α in quantum electrodynamics is a
good example:
$\alpha = \frac{e^2}{2\varepsilon_0 hc} = 0.007\,297\,352\,5664$
where e is the charge on an electron, ε0 is the permittivity of free space, h is
the Planck constant, and c is the speed of light. This dimensionless constant
determines the strength of the interactions between electrons and photons. It
is also, approximately, the speed of the electron in the ground state of the
hydrogen atom (in the Bohr model) in units of the speed of light. Since α << 1,
the motion of electrons in light atoms does not have to be treated
relativistically.
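As a quick numerical check, the expression for α can be evaluated directly. This is only a sketch: the constant values below are the CODATA 2018 figures, which are an assumption here because the text quotes only the final value of α.

```python
# CODATA 2018 values (assumed; the text does not list them at this point).
e = 1.602176634e-19           # elementary charge / C (exact by definition)
h = 6.62607015e-34            # Planck constant / J s (exact by definition)
c = 299792458.0               # speed of light / m s^-1 (exact by definition)
epsilon_0 = 8.8541878128e-12  # permittivity of free space / F m^-1 (measured)

# Fine structure constant: the units cancel, leaving a pure number.
alpha = e**2 / (2 * epsilon_0 * h * c)

print(f"alpha   = {alpha:.10f}")
print(f"1/alpha = {1 / alpha:.3f}")
```

The result should agree with the quoted value to the precision of the constants used; the reciprocal, close to 137, is the form in which α is often remembered.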
Another interesting dimensionless number is the ratio of the mass of a proton
to the mass of an electron:
$\mu = \frac{m_p}{m_e} = 1\,836.152\,673\,89$
Physicists think that we should be able to derive numbers such as µ from a
fundamental physical theory. The fact that, so far, we have been unable to do
this suggests that there is more new physics to be discovered.
1.2.1 Method of Dimensions

The method of dimensions uses the requirement that all physical equations
must balance to construct possible equations for physical quantities. The
starting point must be some physical intuition about the system. For example,
a few experiments with a mass-spring oscillator should convince you that the
time period of the oscillator depends only on the mass m and spring constant
k. If this is correct and if the equation for time period has the form:
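$T = \text{constant} \times m^x \times k^y$
we can find x and y by balancing the dimensions (the constant is assumed to
be dimensionless).
$[T] = [m]^x \times [k]^y$
The spring constant is a force per unit extension, so the dimensions of k are
MLT⁻²/L = MT⁻²:
$\mathrm{T}^1 = \mathrm{M}^x (\mathrm{M\,T^{-2}})^y = \mathrm{M}^{x+y}\,\mathrm{T}^{-2y}$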
Equating powers of each dimension:
For the dimension of time: 1 = −2y, so y = −½
For the dimension of mass: 0 = x + y, so x = +½
This suggests that a coherent form of the equation could be:
$T = \text{constant} \times m^{1/2} \times k^{-1/2}$
or
$T = \text{constant} \times \sqrt{\frac{m}{k}}$
Theory shows that the equation is actually:
$T = 2\pi\sqrt{\frac{m}{k}}$
The method of dimensions cannot determine the numerical value of the con-
stant (because it is dimensionless) but it can show a possible coherent form
for the equation.
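Equating powers of each dimension amounts to solving a pair of simultaneous equations, which can be sketched in a few lines. The representation below (a tuple of mass and time exponents for each quantity, and Cramer's rule for the 2 × 2 system) is an illustrative assumption rather than a procedure given in the text.

```python
from fractions import Fraction

# Dimensions written as (mass exponent, time exponent).
# Length does not appear in [m], [k], or [T], so it is omitted.
dim_m = (1, 0)    # [m] = M
dim_k = (1, -2)   # [k] = M T^-2
dim_T = (0, 1)    # [T] = T (the time period)

# We need x and y such that  x * dim_m + y * dim_k = dim_T, i.e.
#   a*x + b*y = p   (powers of mass)
#   c*x + d*y = q   (powers of time)
a, c = dim_m
b, d = dim_k
p, q = dim_T

# Cramer's rule for a 2x2 linear system, using exact fractions.
det = a * d - b * c
x = Fraction(p * d - b * q, det)
y = Fraction(a * q - p * c, det)

print(f"x = {x}, y = {y}")
```

The exact fractions returned should match the exponents found by hand, x = ½ and y = −½. As in the text, the dimensionless constant cannot be found this way.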
1.3 SCIENTIFIC NOTATION, PREFIXES, AND SIGNIFICANT FIGURES
Physicists have to deal with a huge range of numerical values. For example,
the magnitude of the charge on an electron is 0.000 000 000 000 000 000 160 C and the
Young modulus of steel is 210 000 000 000 Pa. Writing numbers out in full like this
is tedious and makes them hard to manipulate, so physicists use scientific notation,
which reduces them to a number between 1 and 10 multiplied by a power
of 10. In scientific notation:
Charge on an electron = 1.60 × 10⁻¹⁹ C
Young modulus of steel = 2.10 × 10¹¹ Pa
Another way to deal with very large and very small values is to use prefixes.
These represent multiplication by a power of ten. For example, the prefix
“milli” represents multiplication by 10⁻³ so 1 mm = 10⁻³ m = 0.001 m. Here is
a list of common prefixes and their multiplication factors.
Name | Symbol | Multiplier | Name | Symbol | Multiplier
Milli | m | 10⁻³ | Kilo | k | 10³
Micro | µ | 10⁻⁶ | Mega | M | 10⁶
Nano | n | 10⁻⁹ | Giga | G | 10⁹
Pico | p | 10⁻¹² | Tera | T | 10¹²
Femto | f | 10⁻¹⁵ | Peta | P | 10¹⁵
Atto | a | 10⁻¹⁸ | Exa | E | 10¹⁸

These increase (or decrease) in multiples of 10³ but there are some prefixes
in common use that do not fit this pattern: “centi” multiplies by 10⁻²
and “deci” multiplies by 10⁻¹. In chemistry, it is quite common to state volumes
in decimeters cubed. One decimeter is 10 cm or 0.10 m, so one cubic decimeter is a
volume of 1000 cm³ or 1 liter. In SI base units, it is 0.001 m³.
When presenting data, it is important to use an appropriate number of significant
figures, even if some of these are zeroes. The number of significant figures
used represents the precision of the data, so a length of 1.20 m is more precise
than 1.2 m. In principle, 1.2 m could have been rounded from anything
between 1.15 m and 1.25 m so has an uncertainty of ±0.05 m, whereas a length of
1.20 m must really lie between 1.195 m and 1.205 m, an uncertainty of ±0.005 m.
Quoting the third significant figure has increased the precision by a factor of 10.
However, it is also important not to quote data to too many significant figures
if this is not justified by the measurements that were used to obtain the data.
For example, if you were trying to calculate the density of a block of wood and
had measured a mass of 40.5 g and a volume of 24.2 cm³, the calculated value for density
is 40.5/24.2 = 1.673553719 g cm⁻³. The result must be rounded off so that it
is consistent with the data used to calculate it. In this case, data were given to
three significant figures so an appropriate result is 1.67 g cm⁻³.
A useful rule of thumb: quote calculated values to the same number of
significant figures as the least precise piece of data used in the calculation.
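The rule of thumb is easy to apply in a calculation script. The sketch below is illustrative only: `round_sig` is a hypothetical helper, not a standard library function, and it simply shifts the rounding position according to the size of the number.

```python
import math


def round_sig(value, figures):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 0 for 1.67 and -19 for 1.60e-19.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, figures - 1 - exponent)


density = 40.5 / 24.2              # g cm^-3, from the block-of-wood example
print(round_sig(density, 3))       # three significant figures, like the data

# A float cannot store trailing zeros, so use string formatting
# when a value such as 1.60e-19 must be displayed with all its figures.
print(f"{1.60e-19:.2e}")
```

Note the design choice in the last line: rounding controls the stored value, but only formatting controls how many figures are shown, which matters for values such as 1.20 m.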
1.4 UNCERTAINTIES
In 2012, the discovery of the Higgs boson was announced at the Large Hadron
Collider at CERN. Physicists had tuned the collider to the energy range in
which the particle was expected to be found and sure enough, it appeared.
This allowed physicists to measure the mass of the Higgs: 125.09 ± 0.21 GeV/c²
(the GeV/c² is a convenient mass unit used in particle physics). The quoted
uncertainty is less than 0.2% of the mass. This pins the mass of the Higgs into
a small enough range so that its properties can be compared with theoretical
predictions.
If the uncertainty in a measured value is too large it is not very useful. For
example, if you were asked to measure the acceleration due to gravity in a
laboratory and you got a value of 9.8 m s⁻² you might be quite happy and feel
that you had done a good job. However, if the uncertainty associated with that
value was ±1.0 m s⁻² then the acceleration due to gravity as measured in your
experiment could lie anywhere between 8.8 m s⁻² and 10.8 m s⁻², so the fact
that the measured value turns out to be close to the true value was probably just
by chance. If you repeated the experiment in the same way you would probably
get a very different value.
Whenever you calculate a value from experimental data you should
include the estimated uncertainty. This is a measure of the reliability of
the measured value.
While a detailed analysis of uncertainties and their effect on calculated values
involves a lot of statistics there are some simple methods that will give a rea-
sonable estimate and that can be used quite easily.
1.4.1 Types of Uncertainty

The table below defines types of uncertainty. In each case, the measured
quantity is a variable x.
Type | Symbol | Example
Absolute uncertainty | δx | g = 9.81 m s⁻² ± 0.05 m s⁻²
Fractional uncertainty | δx/x | Fractional uncertainty in g is 0.05/9.81 = 0.0051
Percentage uncertainty | (δx/x) × 100% | % uncertainty in g is 0.0051 × 100 = 0.51%
1.4.2 Combining Uncertainties

It is often important to be able to calculate the uncertainty in a calculated
quantity that depends on several other measured quantities each with its own
uncertainties. The simplest way to do this is to calculate the maximum and
minimum possible values of the quantity using the known uncertainties in the
measured values and then to use these extreme values to find the average and
range. For example, in an experiment to measure the acceleration of free fall
the following results are obtained:
Distance fallen from rest: s = 2.500 ± 0.005 m
Time taken: t = 0.710 ± 0.020 s
From theory: g = 2s/t²
g_max = (2 × 2.505)/(0.690)² = 10.52 m s⁻²
g_min = (2 × 2.495)/(0.730)² = 9.36 m s⁻²
Taking the mean of these extremes as the value and half their difference as the
uncertainty:
g = 9.94 ± 0.58 m s⁻²

While this method can be used there are also some simple ways in which
uncertainties can be combined mathematically. Here are some simple rules.
Combination | Rule | Formula
Uncertainty in a sum: y = a + b | Add absolute uncertainties | δy = δa + δb
Uncertainty in a product: y = ab | Add fractional uncertainties | δy/y = δa/a + δb/b
Uncertainty in a quotient: y = a/b | Add fractional uncertainties | δy/y = δa/a + δb/b
Uncertainty in a power: y = aⁿ | Multiply fractional uncertainty by power | δy/y = n δa/a
Using the data from the example above (acceleration of free fall) and applying
the rules from the table we have:
g = 2s/t² = (2 × 2.500)/(0.710)² = 9.92 m s⁻²
$\frac{\delta g}{g} = \frac{\delta s}{s} + 2\frac{\delta t}{t} = \frac{0.005}{2.500} + 2 \times \frac{0.020}{0.710} = 0.058$
δg = 0.058 × 9.92 m s⁻² = 0.58 m s⁻²
This gives a result of g = 9.92 ± 0.58 m s⁻² (almost identical to the first
method).
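Both approaches to the free-fall example are simple enough to script, which makes it easy to repeat them for other data. The sketch below uses the values from the example; the function name `g_from` and the variable names are illustrative choices.

```python
# Measured values and absolute uncertainties from the free-fall example.
s, ds = 2.500, 0.005   # distance fallen / m
t, dt = 0.710, 0.020   # time taken / s


def g_from(distance, time):
    """Acceleration of free fall from g = 2s / t^2."""
    return 2 * distance / time**2


# Method 1: extreme values. The largest g needs the largest s and smallest t.
g_max = g_from(s + ds, t - dt)
g_min = g_from(s - ds, t + dt)
g_mid = (g_max + g_min) / 2        # mean of the extremes
half_range = (g_max - g_min) / 2   # half the range as the uncertainty

# Method 2: combine fractional uncertainties (t is squared, so it counts twice).
g = g_from(s, t)
fractional = ds / s + 2 * dt / t
dg = fractional * g

print(f"Extreme values:   g = {g_mid:.2f} ± {half_range:.2f} m s^-2")
print(f"Fractional rules: g = {g:.2f} ± {dg:.2f} m s^-2")
```

The two methods give slightly different central values because g_max and g_min are not symmetric about g, but the uncertainties agree to two significant figures.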
1.5 DEALING WITH RANDOM AND SYSTEMATIC EXPERIMENTAL ERRORS
In 2011, results from CERN suggested that neutrinos might be traveling faster
than the speed of light. This made the front page of newspapers around the
world. However, the scientists whose data had led to the claim realized that, if
it was true, it would turn physics upside down so they invited other scientists to
check their work and repeat the measurements. A year later it was confirmed
that the original experiment had introduced measurement errors that had not
been accounted for. When they were included, it was clear that neutrinos had
not traveled faster than the speed of light. It is always important to consider
possible sources of error in your experiments and to try to reduce them as much
as possible. Of course, it will be impossible to reduce them to zero, so you must
try to estimate their size in order to calculate the uncertainty in your results.

1.5.1 Random Errors

If a quantity is measured repeatedly the results are likely to vary as a result of
measurement errors. If these errors are random the results will be scattered
above and below the actual value of the quantity being measured.
The best way to reduce the impact of random errors is to use a large num-
ber of repeats and to take an average.
Repeated measurements are also helpful for estimating experimental uncer-
tainties. A useful estimate is half the range of the data (but it is important to
eliminate any anomalous results before calculating the range).
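As an illustration of averaging repeats and estimating the uncertainty from half the range, here is a minimal sketch. The readings are hypothetical values invented for the example, and it is assumed that any anomalous results have already been removed.

```python
from statistics import mean

# Hypothetical repeated timings of the same event / s (not data from the text).
readings = [0.71, 0.69, 0.73, 0.70, 0.72]

best_estimate = mean(readings)                       # average of the repeats
half_range = (max(readings) - min(readings)) / 2     # estimate of uncertainty

print(f"t = {best_estimate:.2f} ± {half_range:.2f} s")
```

Half the range is a quick estimate suited to small data sets; with many repeats a statistical measure such as the standard deviation would normally be preferred.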
1.5.2 Systematic Errors

A systematic error is one that introduces a consistent bias to all of the meas-
ured data, usually making it all too big or too small. Taking an average will
not help to reduce the impact of a systematic error. However, if a systematic
error is known then it can be corrected by adjusting the measurements. For
example, if there is a zero error this must be subtracted from all subsequent
measurements.
If a systematic error is present but unidentified the final result might seem
very precise and yet disagree with an accepted value. It is always important
to compare your results with accepted values (where these exist). If you have
carried out a good experiment, your value with its uncertainties should over-
lap the accepted value for the quantity measured.
1.6 DIFFERENTIAL CALCULUS

1.6.1 Derivatives and Rates of Change
Calculus was developed by Isaac Newton, in order to solve physics problems, and independently by Gottfried Leibniz.
Newton realized that he needed to be able to deal with quantities that were
continuously changing, so he needed a technique that could cope with changes
that occurred in vanishingly small (infinitesimal) intervals of time. Differential
calculus deals with instantaneous rates of change. For example, to find the
velocity of a moving object we need to measure its change in displacement
δs during a time δt but if velocity is changing continuously the value of δs/δt
will be an average during the interval δt and the instantaneous velocity will
vary during the time interval. In order to find the instantaneous velocity at a
particular moment, we would need to use an infinitesimally short time interval,
δt → 0. This is not possible in an experiment but mathematically the derivative
of displacement with respect to time (ds/dt) represents this instantaneous
value. It is a mathematical limit of the ratio δs/δt as δt → 0:
$v = \frac{ds}{dt} = \lim_{\delta t \to 0} \frac{\delta s}{\delta t}$
This approach is used for rates of change throughout physics:
Velocity = ds/dt,
Acceleration = dv/dt,
Electric current = dQ/dt,
Rate of change of flux-linkage = d(NΦ)/dt, and so on.
The “operator” (d/dt) can be read as “rate of change of.”
All of the above examples are rates of change with time but rates of change
with respect to other variables are also common. For example, dV/dx can rep-
resent a potential gradient in an electric field.
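The idea of a derivative as the limit of δs/δt can be seen numerically by shrinking the time interval. The sketch below assumes a hypothetical motion s = 5t² (in meters, with t in seconds), chosen only because its exact derivative, ds/dt = 10t, is easy to check.

```python
def displacement(t):
    """Hypothetical motion s = 5 t^2 (metres); chosen only for illustration."""
    return 5.0 * t**2


def average_velocity(t, delta_t):
    """The ratio delta_s / delta_t over the interval from t to t + delta_t."""
    return (displacement(t + delta_t) - displacement(t)) / delta_t


t = 2.0
for delta_t in (1.0, 0.1, 0.01, 0.001):
    v_avg = average_velocity(t, delta_t)
    print(f"delta_t = {delta_t:<6} average velocity = {v_avg:.4f} m/s")
```

As δt is reduced the average velocity should approach the instantaneous value ds/dt = 10t = 20 m/s at t = 2 s.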
The process of deriving a rate of change is called “differentiation” and the
result is a “derivative.” Derivatives can be represented graphically as the gra-
dient of a graph. The graph below shows how the charge builds up on a capac-
itor as it is charged. The gradient is dQ/dt which is equal to the current that
flows at a particular instant as the capacitor charges.
[Figure: graph of charge against time for a charging capacitor. The gradient at any instant represents the current at that instant: I = dQ/dt.]

Some common derivatives (rates of change) that often occur in physics are
listed in the table below.
Function y | Derivative (rate of change) dy/dx
Constant value, for example, y = 8 | $\frac{dy}{dx} = 0$
Power law, for example, $y = Ax^n$ | $\frac{dy}{dx} = nAx^{n-1}$
Exponential function: $y = e^x$ | $\frac{dy}{dx} = e^x$
Exponential relationship, for example, $y = Ae^{bx}$ | $\frac{dy}{dx} = bAe^{bx}$
Sine function: $y = \sin x$ | $\frac{dy}{dx} = \cos x$
Cosine function: $y = \cos x$ | $\frac{dy}{dx} = -\sin x$
Sinusoidal variation, for example, $y = A\sin(bx)$ | $\frac{dy}{dx} = bA\cos(bx)$
Cosinusoidal variation: $y = A\cos(bx)$ | $\frac{dy}{dx} = -bA\sin(bx)$
1.6.1.1 Second Derivatives

A derivative is a rate of change, so the derivative of displacement with respect
to time is velocity and the derivative of velocity with respect to time is acceleration.
This means that acceleration is the rate of change of the rate of change of
displacement:
$a = \frac{dv}{dt} = \frac{d}{dt}\left(\frac{ds}{dt}\right)$
This is called the “second derivative” of displacement and is written like this:
$a = \frac{d^2 s}{dt^2}$
Whereas first derivatives are equal to gradients, second derivatives are related
to the sharpness of curvature of a graph (the rate of change of the gradient).
Acceleration is related to the sharpness of curvature of a graph of displace-
ment against time.

1.6.2 Maximum and Minimum Values

Calculus can be used to find where something has its maximum or minimum
value. Look at the graph below, which shows how the power transferred from
a supply to a resistor in an electric circuit depends on the value of the load
resistor.
[Figure: graph of power against resistance, rising to a peak Pmax at resistance Rpeak and then falling.]
The maximum power Pmax occurs at some resistance Rpeak. This is where the
curve has a gradient of zero. Since gradients are equal to derivatives, we
can find the maximum value if we can find where the derivative is zero. To do
this, we need an equation for power in terms of the resistance, P(R). It turns
out that this is easy to find (see Section 18.5.1). Once we have the equation
we differentiate it, set the derivative equal to zero, and solve the resulting
equation to find the value of R:
$\frac{dP(R)}{dR} = 0$
To find a maximum or minimum of some function y(x):
Differentiate to find the derivative dy/dx;
Set the derivative equal to zero: dy/dx = 0;
Solve for x.
Calculus can be used to show the nature of the stationary point:
Find d²y/dx²;
If d²y/dx² is positive the gradient is increasing: minimum position;
If d²y/dx² is negative the gradient is decreasing: maximum position;
If d²y/dx² is zero the test is inconclusive. The position may be a saddle point
(a stationary point of inflection, neither a maximum nor a minimum), as for
y = x³ at x = 0, but it can also be a maximum or a minimum, as for y = x⁴ at x = 0.
[Figure: curve of y against x showing a maximum (gradient decreasing through zero, d²y/dx² negative), a minimum (gradient increasing through zero, d²y/dx² positive), and a saddle point (gradient stationary, d²y/dx² zero).]
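The procedure for locating and classifying a stationary point can be tried numerically on the power-transfer example. The sketch below rests on assumptions: it takes the standard result P(R) = ε²R/(R + r)² for a supply of emf ε and internal resistance r (the text defers the derivation to Section 18.5.1), and the values of ε and r are hypothetical.

```python
EMF = 12.0         # emf of the supply / V (hypothetical value)
R_INTERNAL = 4.0   # internal resistance of the supply / ohm (hypothetical value)


def power(R):
    """Power delivered to a load R, assuming P = emf^2 R / (R + r)^2."""
    return EMF**2 * R / (R + R_INTERNAL) ** 2


def derivative(f, x, h=1e-4):
    """Central-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of d2f/dx2."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


# Scan load resistances from 0.1 ohm to 20.0 ohm and pick the one with most power.
resistances = [0.1 * n for n in range(1, 201)]
R_peak = max(resistances, key=power)

print(f"R_peak = {R_peak:.1f} ohm, P_max = {power(R_peak):.2f} W")
print(f"dP/dR at R_peak   = {derivative(power, R_peak):.2e}")
print(f"d2P/dR2 at R_peak = {second_derivative(power, R_peak):.3f}")
```

Under these assumptions the peak falls where the load resistance equals the internal resistance; the first derivative there is (numerically) zero and the second derivative is negative, confirming a maximum.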
1.7 DIFFERENTIAL EQUATIONS

An algebraic equation such as s = ut + ½at² can be solved to find the value
of one of the variables if all of the others are known. For example, if we know
the initial velocity, acceleration, and time for which an object accelerates we
can use the equation to find its displacement. Differential equations include
derivatives and their solution is not a value but a function.
For example, the simple equation for constant velocity v:
$\frac{ds}{dt} = v$
is an example of a first-order differential equation. It is called first order
because it only involves first derivatives; there are no second or higher-order
derivatives in the equation. Its solution is an expression for displacement as a
function of time s(t) that satisfies the equation. It turns out that there are an
infinite number of such expressions. One of them is:
$s = vt$
where v is a constant (the constant velocity).
We can show that this is a solution by differentiating it:
$\frac{ds}{dt} = \frac{d}{dt}(vt) = v$
However, the expression
$s = vt + k$
where k is any constant, is also a solution:
$\frac{ds}{dt} = \frac{d}{dt}(vt + k) = v$
The general solution to this differential equation is:
$s = vt + s_0$
where s₀ is an arbitrary constant (equal to the initial displacement at t = 0).

Differential equations are used to express many of the fundamental relation-
ships in physics including Newton’s laws of motion, Maxwell’s equations of
electromagnetism, and Schrödinger’s equation in quantum theory. If you
study physics at a higher level you will need to get to grips with a range of
methods for solving differential equations. The table below lists some exam-
ples of differential equations that will feature later in this book along with
some useful solutions.
Topic | Differential equation | Solution | Conditions
Radioactive decay | $\frac{dN}{dt} = -\lambda N$ | $N = N_0 e^{-\lambda t}$ | N = N₀ at t = 0
Capacitor discharge | $\frac{dQ}{dt} = -\frac{Q}{RC}$ | $Q = Q_0 e^{-t/RC}$ | Q = Q₀ at t = 0
Newton’s second law (constant acceleration)* | $\frac{d^2 x}{dt^2} = \frac{F}{m}$ | $x = ut + \frac{1}{2}\left(\frac{F}{m}\right)t^2$ | F, m, u constants; x = 0 at t = 0
Simple harmonic motion* | $\frac{d^2 x}{dt^2} = -\omega^2 x$ | $x = A\cos(\omega t + \phi)$ | A, ω, φ constants (ω = 2πf)
* The equations for Newton’s second law and for simple harmonic motion are second-order differential
equations because they involve second derivatives. Whereas first-order differential equations generate
one arbitrary constant, second-order differential equations generate two.
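A differential equation can also be solved approximately by stepping forward in small time intervals, which gives a useful check on a proposed solution. The sketch below applies this (Euler's method, which is not discussed in the text) to the radioactive decay equation dN/dt = −λN; the decay constant, initial number, and step size are hypothetical values.

```python
import math

DECAY_CONSTANT = 0.1   # lambda / s^-1 (hypothetical value)
N0 = 1000.0            # number of undecayed nuclei at t = 0 (hypothetical)
STEP = 0.001           # time step / s
T_END = 10.0           # total time simulated / s

# Euler's method: over each short step the change in N is (dN/dt) * step.
n = N0
for _ in range(int(round(T_END / STEP))):
    n += -DECAY_CONSTANT * n * STEP

# Analytic solution from the table: N = N0 exp(-lambda t).
exact = N0 * math.exp(-DECAY_CONSTANT * T_END)

print(f"Numerical: N = {n:.2f}")
print(f"Analytic:  N = {exact:.2f}")
```

The two values should agree closely, and the agreement improves as the time step is made smaller, in the same way that δs/δt approaches ds/dt.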

1.8 INTEGRAL CALCULUS

Integration is the inverse process to differentiation and is related to the limit
of a sum. The graph below shows how velocity varies with time for a par-
ticular object. The area under the graph represents displacement and can be
approximated by adding up a large number of thin rectangular strips, each of
width δt.
[Figure: graph of velocity v(t) against time, with a thin shaded strip of width δt at time t, between the times t₁ and t₂.]
The area of the small shaded strip, v(t) δt, approximates to the extra displace-
ment during a short time δt at time t. The area between any two times t1 and
t2 is equal to the displacement during that time interval and is given, approxi-
mately, by the sum of the areas of all such strips between those two times:
$s \approx \sum_{t_1}^{t_2} v(t)\,\delta t$
This becomes a better approximation to the actual area if we take thinner and
thinner strips by making δt smaller and smaller. It would be a precise value
in the limit that δt approached zero: δt→0. In this limit, the sum becomes a
continuous process called an integral:
$s = \lim_{\delta t \to 0} \sum_{t_1}^{t_2} v(t)\,\delta t = \int_{t_1}^{t_2} v(t)\,dt$
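The way the sum of strips approaches the integral can be checked numerically. This sketch assumes a hypothetical velocity v(t) = 2 + 3t (in m/s) between t₁ = 1 s and t₂ = 4 s, for which the exact displacement is 28.5 m; the function names are illustrative.

```python
def velocity(t):
    """Hypothetical velocity v = 2 + 3t (m/s); chosen only for illustration."""
    return 2.0 + 3.0 * t


def strip_sum(v, t1, t2, n_strips):
    """Sum of the strip areas v(t) * delta_t, using the start of each strip."""
    delta_t = (t2 - t1) / n_strips
    return sum(v(t1 + i * delta_t) * delta_t for i in range(n_strips))


exact = 28.5   # integral of (2 + 3t) dt from t = 1 to t = 4, in metres
for n_strips in (10, 100, 1000, 10000):
    approx = strip_sum(velocity, 1.0, 4.0, n_strips)
    print(f"{n_strips:>6} strips: s = {approx:.4f} m (error {exact - approx:.4f} m)")
```

Each tenfold increase in the number of strips should reduce the error by about a factor of ten, illustrating the limit δt → 0.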

The table below gives some derivatives and related integrals that will be used
in this book.
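Context | Differential form | Integral form
Dynamics | $v = \frac{ds}{dt}$ | $s = \int v\,dt$
Dynamics | $a = \frac{dv}{dt}$ | $v = \int a\,dt$
Newton’s laws | $F = \frac{d(mv)}{dt}$ | $mv - mu = \int F\,dt$
Electric circuits | $I = \frac{dQ}{dt}$ | $Q = \int I\,dt$
Radioactivity | $\frac{dN}{dt} = -\lambda N$ | $\int \frac{dN}{N} = -\int \lambda\,dt$
Capacitors | $\frac{dQ}{dt} = -\frac{Q}{RC}$ | $\int \frac{dQ}{Q} = -\int \frac{dt}{RC}$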
1.9 VECTORS AND SCALARS

Scalars are physical quantities that have magnitude but no direction, for example,
distance, speed, mass, energy, power, and temperature. Scalar quantities
simply add together, so if 3.0 kg is added to a body of mass 7.0 kg the resulting
body has a mass of 10 kg. Scalars do not have a direction but can have a sign,
and this must be taken into account when they are added.
Vectors are physical quantities that have magnitude and direction, for exam-
ple, displacement, velocity, force, and momentum. Vectors can be represented
by arrows; the length of the arrow represents magnitude, and the direction of
the arrow is the direction of the vector. In print, vectors are often distinguished
from scalar quantities by underlining them or setting them in bold type; for example,
the underlined (or bold) symbol v is a velocity vector and the plain symbol v is its magnitude.
1.9.1 Adding Vectors

When vectors are combined, we need to consider their direction as well as
their magnitude, so we cannot simply add the magnitudes. For example, if
we walk 10 m north and then 10 m east, we have walked a distance of 20 m
(scalar) but our displacement is 14 m NE (vector). This can be shown on a
vector diagram.
[Figure: vector diagram in which a displacement of 10 m north is followed by a displacement of 10 m east, giving a resultant displacement of 14 m northeast.]
A displacement vector 10 m north has been added to a displacement vector

10 m east to obtain a resultant displacement vector of 14 m northeast. The
resultant vector is found by placing all the vectors to be added end to end and
then connecting the start of the first vector to the end of the last vector.
This diagrammatic method of adding vectors can be used to solve prob-
lems if all the vectors are drawn to the same scale.
Pythagoras’s theorem and trigonometry can be used to find the resultant
vector from the vector diagram.
Vector subtraction is carried out by adding the negative of the vector (i.e.,
reversing its direction).
1.9.2 Resolving Vectors into Components

If a projectile is fired at some angle to the ground its velocity will be partly
vertical and partly horizontal. The velocity has a “horizontal component” and
a “vertical component.”
Any vector can be resolved into two perpendicular components using
trigonometry.
[Figure: vector A at an angle θ to the x-axis, resolved into components A cos θ along the x-axis and A sin θ perpendicular to it.]

The vector A has been resolved into two components:

x-component has magnitude Ax = A cos θ;
y-component has magnitude Ay = A sin θ.
The components can also be used to reconstruct the original vector:
The magnitude of A is, by Pythagoras’s theorem: A² = Ax² + Ay²;
The angle θ is found from: tan θ = Ay/Ax.
Vector addition can also be carried out by adding components along a com-
mon set of perpendicular axes:
If C = A + B then Cx = Ax + Bx and Cy = Ay + By.
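Adding vectors by components is straightforward to code. The sketch below repeats the walking example (10 m north followed by 10 m east); it assumes the x-axis points east and the y-axis points north, and the helper names are illustrative.

```python
import math


def to_components(magnitude, angle_deg):
    """Resolve a vector into (x, y) components; angle measured from the x-axis."""
    angle = math.radians(angle_deg)
    return magnitude * math.cos(angle), magnitude * math.sin(angle)


def from_components(x, y):
    """Rebuild magnitude (Pythagoras) and angle from the x-axis (degrees)."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))


# Assumed axes: x points east, y points north.
north = to_components(10.0, 90.0)   # 10 m north
east = to_components(10.0, 0.0)     # 10 m east

# Add the vectors component by component: Cx = Ax + Bx, Cy = Ay + By.
resultant = (north[0] + east[0], north[1] + east[1])
magnitude, angle = from_components(*resultant)

print(f"Resultant: {magnitude:.1f} m at {angle:.0f} degrees from east")
```

Using `atan2` rather than a plain arctangent of Ay/Ax is a deliberate choice: it returns the correct angle in all four quadrants, including when Ax is zero.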
1.9.3 Multiplying Vectors

There are two different ways to multiply two vectors: the scalar product and
the vector product.
1.9.3.1 Scalar Product

The work done by a force is the product of the force and the displacement
of the point of action of the force parallel to the force. Since both force and
displacement are vectors but work is a scalar, this is an example where the
product of two vectors results in a scalar. For this reason, it is called a scalar
product (sometimes referred to as a “dot product”).
The value of a scalar product is equal to the product of the magnitudes
of the two vectors multiplied by the cosine of the angle θ between them, for
example:
Scalar product = F·s = Fs cos θ
[Figure: force vector F and displacement vector s with the angle θ between them.]
The scalar product is zero when the vectors are perpendicular.

1.9.3.2 Vector Product

The force on a moving charged particle in a magnetic field has a magnitude
given by F = qvB sin θ, where B is the magnetic field strength, v is the veloc-
ity of the particle, and θ is the angle between v and B. Since force, magnetic
field strength, and velocity are all vectors this is an example where the product
of two vectors results in another vector. For this reason, it is called the vector
product (sometimes referred to as the “cross-product”). The direction of the
resulting force is found by using a right-hand rule and is perpendicular to both
of the vectors v and B.
A vector product is written like this: vector product of v and B = v ∧ B (also
written v × B).
The force on a moving charge in a magnetic field is therefore: F = q (v ∧ B).
The magnitude of the vector product is equal to the product of the magnitudes
of the two vectors multiplied by the sine of the angle between them.
The direction of the vector product is perpendicular to the plane defined by
the two vectors being multiplied together. To work out the direction of the
product vector imagine rotating from the first vector to the second vector and
then the resultant is in the direction of movement of a right-handed screw!
The vector product is zero if the vectors are parallel (θ = 0).
[Figure: the vectors v and B, with angle θ between them, and the direction of the resultant v ∧ B.]
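Both products can be computed from components. The sketch below uses plain tuples for three-dimensional vectors; the force, displacement, charge, velocity, and field values are hypothetical numbers chosen to make the results easy to check.

```python
def dot(a, b):
    """Scalar (dot) product of two 3D vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    """Vector (cross) product of two 3D vectors given as tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)


# Work done: only the displacement parallel to the force contributes.
force = (3.0, 0.0, 0.0)          # N (hypothetical)
displacement = (2.0, 2.0, 0.0)   # m (hypothetical)
work = dot(force, displacement)  # J

# Magnetic force on a moving charge: F = q (v x B).
q = 1.6e-19                      # C (hypothetical positive charge)
v = (2.0e6, 0.0, 0.0)            # m/s along x
B = (0.0, 0.5, 0.0)              # T along y
F = tuple(q * component for component in cross(v, B))

print(f"Work done = {work} J")
print(f"Magnetic force = {F} N")
```

With v along x and B along y the force comes out along z, perpendicular to both, as the right-hand rule requires; its scalar product with v is zero, so the magnetic force does no work on the charge.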
1.10 SYMMETRY PRINCIPLES

Symmetry in nature is often linked to beauty. Apparently, human beings with
more symmetrical faces are more attractive to others. But in physics sym-
metry is a powerful tool that can be used as a guide to the underlying laws of
nature and that makes the equations of physics both powerful and beautiful.
A geometrical symmetry leaves a shape unchanged under certain rotations or
reflections. For example, if an equilateral triangle is rotated through 120° about
its center or reflected about a line passing through a vertex and perpendicularly
through the opposite side, it is unchanged. The diagram shows three symmetry axes
for an equilateral triangle. A circle would have an infinite number of these, all
lying along diameters.
The shape of the triangle is “invariant” under these rotations and reflections.
In Newtonian mechanics the laws of physics are the same in all uniformly
moving (inertial) reference frames, so Newton’s laws are invariant under
a change of velocity. Einstein’s theory of special relativity goes further and
includes all of the laws of physics (see Chapter 24). Hermann Minkowski real-
ized that Einstein’s equations for relativity were similar to those for a geo-
metrical rotation and identified physical quantities that are the same for all
inertial observers—these are four-dimensional quantities called invariants
and are constructed from space and time components.
Emmy Noether showed that symmetry principles are linked to conservation
laws. This is not really surprising because a conservation law identifies some
quantity that stays the same (is invariant) when other things change. For
example, the total linear momentum of a collection of colliding bodies is the
same before and after the collisions, and the total energy of the Universe is
the same before and after an explosion. Noether showed that the conservation
of momentum is linked to the laws of physics remaining the same under trans-
lation, the conservation of angular momentum is linked to the laws staying the
same under rotation, and the law of conservation of energy is linked to the
laws staying the same at all times. This link between mathematical symmetries
and conservation laws is a powerful idea in theoretical physics.
In particle physics, the search for symmetry in mathematical equations can
lead to the prediction and discovery of new types of particles. Paul Dirac
realized that the relativistic equation for the energy of an electron is a
quadratic equation, one that has both positive and negative solutions:
$$E^2 = p^2 c^2 + m_0^2 c^4$$
When p = 0 we can write:
$$E^2 = m_0^2 c^4$$
which has the solutions:
$$E = \pm m_0 c^2$$
The positive solutions correspond to normal electrons. By taking the negative
energy solutions seriously he predicted the existence of anti-electrons
or positrons. The symmetry between matter and antimatter is now built into
the Standard Model and is linked to other fundamental symmetries such as
time reversal. There is even a sense in which anti-matter can be regarded as
ordinary matter traveling backward in time.
The time-reversal symmetry in particle physics is one of three symmetries
that are fundamentally linked:
T: Time reversal symmetry—changing the sign of the time coordinate.
P: Parity symmetry—changing the sign of all three spatial coordinates.
C: Charge (charge conjugation) symmetry—changing the signs of all of the
charges.
It turns out that none of these symmetries are observed in all interactions, but
if C, P, and T are all applied the symmetry is observed. No violation of CPT
symmetry has ever been seen.
The Standard Model also contains some apparent symmetries that have not
been explained. There are three pairs of leptons and three pairs of quarks
(see Section 26.4.5) but no known physical process that links them or that will
allow the transformation of quarks into leptons or vice versa. Attempts to
construct a theory that would allow this have led physicists to predict the
existence of additional “super-symmetric” particles for all the existing
particles. So far none have been discovered.
Symmetry principles have also been used to investigate alternative approaches
to physics. Richard Feynman and John Wheeler, for example, realized that
Maxwell’s equations for electromagnetism could be solved both forward and
backward in time. This sounds odd, but by including both solutions they were
able to construct a version of the theory that worked just as well as the original
one and did not neglect half of the solutions. However, the interpretation of
the so-called “absorber theory” requires us to take the idea of future events
influencing the present seriously.
1.11 EXERCISES
1. Express:
(a) 267 g in kg, (b) 25 km in mm, (c) 5.0 m³ in cm³,
(d) 80 km/h in m s⁻¹, (e) 45 cm² in m²
2. Light travels at 3.0 × 10⁸ m s⁻¹ and it takes about 8 minutes for light to
travel from the Sun to the Earth.
(a) How far away is the Sun in km?
(b) How far is a light year? Give your answer in meters.
(c) The distance from the Earth to the Moon is 380 000 km.
How far is this in light seconds?
3. Round off the following to three significant figures:
(a) 2.000009 (b) 0.0020900 (c) 0.009502 (d) π
4. Newton’s equation for gravitational forces is $F = Gm_1 m_2/r^2$. Use the
method of dimensions to find the correct SI units for the universal
gravitational constant G.
5. The lengths of the sides of a rectangular box are measured using a ruler
marked with an mm scale. The measurements give side lengths of 85,
62, and 20 mm, respectively. The uncertainty in each measurement is
± 1 mm.
(a) Calculate the fractional uncertainty in the length of each side.
(b) Calculate the percentage uncertainty in the length of each side.
(c) What is the area of the largest face of the box? Include its absolute
uncertainty.
(d) What is the volume of the box? Include its absolute uncertainty.
6. The time period of a mass-spring oscillator is given by:
$$T = 2\pi\sqrt{\frac{m}{k}}$$
where m is the mass oscillating and k is the spring constant (N m⁻¹).
In an experiment, the time period is measured to be 0.68 s ± 0.04 s and the
spring constant is 20 N m⁻¹ ± 2 N m⁻¹.
(a) Calculate the mass of the oscillator including its absolute
uncertainty.
(b) Which value (k or T) contributed most to the uncertainty in m?
7. (a) Explain the difference between a systematic error and a
random error.
(b) How would you reduce the size of random errors when carrying out
an experiment?
(c) A micrometer screw gauge reads 0.02 mm when it should read zero.
The diameter of a wire is measured to be 0.34 mm using the same
micrometer screw gauge. What value should be recorded?
8. Convert these numbers to scientific notation:
(a) 5500 (b) 0.0000000007
(c) 1 200 000 000 000 000 000 000 000
9. Evaluate the expressions below (do not use a calculator):
(a) 2.0 × 10⁶ × 4.5 × 10⁹ (b) 2.0 × 10⁶ × 4.5 × 10⁻⁹
(c) (6.0 × 10¹²)/(3.0 × 10⁴) (d) (6.0 × 10¹²)/(3.0 × 10⁻⁴)
10. The Apollo spacecraft traveled to the Moon at an average speed of
1.5 km s⁻¹. How long would it take at this speed to get to:
(a) Mars (b) Pluto
(c) The nearest star? (d) To cross our galaxy?
Speed of light = 300 000 000 m s⁻¹ = 3.0 × 10⁸ m s⁻¹
Distance to Mars (varies) = 50 000 000 000 m = 5.0 × 10¹⁰ m
Distance to Pluto (varies) = 6 000 000 000 000 m = 6.0 × 10¹² m
Distance to the nearest star (about) = 4.5 light-years
Diameter of our galaxy (about) = 100 000 light-years = 1 × 10⁵ light-years

CHAPTER
2
Representing and Analyzing Data
2.0 INTRODUCTION
Experiments generate data and it is important to know how to select, record
and process this data in order to find out what the experiment has revealed.
This is a huge problem for large experiments. According to the CERN
website, the four experimental stations on the Large Hadron Collider each
generate between 750 Mbs−1 and 4 Gbs−1 of data when the accelerator is oper-
ating! The task of selecting and analyzing relevant data is carried out by the
Worldwide LHC Computing Grid which has to reject the majority of results
in order to focus on those that might reveal interesting new physics. Writing
the algorithms for this system is extremely challenging.
When you carry out experiments in a laboratory you will not face such a
daunting task. However, presenting your data clearly and to an appropriate
precision, rejecting anomalous results, deciding how to process it, and then
extracting information from the processed data, is an essential part of physics.
2.1 EXPERIMENTAL VARIABLES

In most physics experiments you will vary one parameter and measure another.
The variable you change is called the “independent variable” and the one that
responds to this is called the “dependent variable.” For the experiment to be
a fair test you must keep all other parameters that might affect the dependent
variable constant; these are called “control variables.”
For example, imagine you are investigating how the acceleration of a trolley
along a runway depends on the force applied to it. The dependent variable
is acceleration, which might be measured using light gates. The independent
variable is applied force, which might be varied using falling masses attached
to the trolley via a pulley. There are several control variables, all of which must
be kept constant. The obvious one is the total mass that is being accelerated;
this includes the falling masses. If total mass was not constant then changes
in acceleration might be because of changes in mass rather than force, so
the results of the experiment would not be clear. Another control variable is
the angle of the runway; it should be horizontal in each trial. It would also
be important to control frictional forces (e.g., by using the same trolley and
runway and measuring the acceleration over the same distance on the same
part of the runway).
Independent variable: the one you deliberately vary.
Dependent variable: the one that changes in response to the change in the
independent variable.
Control variables: the ones that you have to keep constant so that they do
not also affect the dependent variable.
When you display results on a graph it is conventional for the dependent vari-
able to go on the y-axis and the independent variable to go on the x-axis.
dependent
variable
independent
variable
2.2 RECORDING DATA

Here are some guidelines for recording experimental data:
Record all of the raw data: If you repeat measurements and calculate an
average, record the repeats, not just the average.
Record data to the precision of the instrument: If you are measuring time
periods using a clock that reads to 0.01 s then all times should be recorded
to 0.01 s (e.g., 3.54 s, 3.57 s, etc.).
Record calculated values to an appropriate number of significant figures;
usually the smallest number of significant figures used in the calculation.
If velocity v is calculated from a displacement of 4.26 m and a time of
0.45 s a calculation gives v = 9.466666… m s⁻¹. This should be rounded to
two significant figures: 9.5 m s⁻¹.
Include units at the head of the table and not with the numerical values.
The correct way to label a column is quantity/unit. This means quantity
divided by the unit, and that is why just numbers appear in the column.
For example, displacement/m labels a column of displacements measured
in meters.
Here is an example (showing just two rows in a table).
A student carried out an experiment to find out how the time period of a
mass oscillating on a spring depends on the mass. In order to reduce the
effect of timing errors at the start and end of the timing period he measured
20 oscillations. In order to reduce the effect of random errors he repeated
the experiment 3 times. The table includes all of his raw data, averages, and
calculated values.
Mass (kg)   Times for 20 oscillations (s)             Time period (s)
            Trial 1   Trial 2   Trial 3   Average
0.100       15.78     15.57     15.69     15.68       0.7840
0.150       X         X         X         X           X
Etc.        X         X         X         X           X
Note that, while the time period has been given to four significant figures
(consistent with the raw data) the uncertainty is likely to make one or more of
the final figures meaningless.
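The averaging and rounding used in this section can be checked with a short script. The sketch below is illustrative only: the helper name `round_sig` is an assumption introduced here, not something defined in the text. It uses the first row of the table and the velocity example given earlier.

```python
import math

def round_sig(value, sig_figs):
    """Round value to sig_figs significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 0 for 9.47 and -3 for 0.0095
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig_figs - 1 - exponent)

# Raw data from the first row of the table: times for 20 oscillations (s)
trials = [15.78, 15.57, 15.69]
mean_time = sum(trials) / len(trials)   # average time for 20 oscillations
period = mean_time / 20                 # time for a single oscillation

print(f"average time = {mean_time:.2f} s")
print(f"time period  = {period:.4f} s")

# Velocity example: 4.26 m in 0.45 s, limited to two significant figures
velocity = round_sig(4.26 / 0.45, 2)
print(f"v = {velocity} m/s")
```

Keeping the raw trial values in a list, rather than only the average, mirrors the first guideline above: the repeats remain available if the average ever needs to be rechecked.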
Spreadsheets (such as Excel) are often used to record and analyze data. They
have some strong advantages over simple written tables. The main one is that
the data, once it is recorded, can be processed within the spreadsheet. For
example, if you have calculated displacement and time values it is simple to
use the spreadsheet to calculate velocities or accelerations. Mean values and
standard deviations can also be calculated. Once the data has been processed
it can be selected and displayed graphically in a wide variety of ways (scatter
graphs, bar charts, pie charts, etc.). It is also possible to insert error bars, to
extrapolate a line backward or forward, and to fit a line or curve to plotted data
and display an equation for the fit.
Much of the analysis described above is done “behind the scenes” but when
you need to present your data in a report it is important to make sure that you
include the correct table headings (and units) and that you round data to an
appropriate number of significant figures.
2.3 STRAIGHT-LINE GRAPHS

A linear relationship means that the rate of change of the dependent variable
with respect to the independent variable is constant. Linear graphs are impor-
tant in physics and it is also important to interpret them correctly.
2.3.1 Interpreting Straight-Line Graphs

Here are three linear graphs each representing a different relationship
between the variables.
[Figure: three straight-line graphs of y against x.
Left: straight line through the origin; y is directly proportional to x.
Center: straight line with positive intercept; y increases linearly with x.
Right: straight line with negative intercept; y decreases linearly with x.]
Beware—a common mistake is to think that a linear graph implies direct pro-
portion. This is only the case if the graph passes through the origin.
It is also important to realize that a linear graph with negative gradient does
not imply inverse proportion.
2.3.2 Analyzing Straight-Line Graphs

A straight-line graph can be represented by the equation y = mx + c where:
y is the dependent variable,
x is the independent variable,
m is the gradient, and
c is the intercept on the y axis when x = 0.
[Figure: straight-line graph of y against x with gradient m and intercept c on
the y-axis.]
If there is a linear relationship between two physical variables x and y then a
graph of y against x can be used to find m and c and hence determine an
equation for this relationship. If c = 0 then y is directly proportional to x.
2.4 PLOTTING GRAPHS AND USING ERROR BARS

2.4.1 Plotting Graphs by Hand
Graphs are used to display data, to test mathematical relationships, and to
calculate physical values from intercepts and gradients. They can be plotted
by hand or using a spreadsheet program, such as Excel, but however they
are produced, it is important to present them properly and in such a way that
relationships are shown clearly and any significant values can be extracted
accurately.
Here are some guidelines for plotting graphs by hand:
Give the graph a heading.
Select scales such that the plotted data extends over at least half of the
width and height of the graph paper. This might require the use of a “false
origin,” that is a graph origin that is not at (0, 0).
The two graphs below display the same data. However, the range
of y-values is from 92 to 98. The use of a false origin makes the
relationship clearer and would make calculation of a gradient
from the graph more accurate.
[Figure: the same data plotted twice, once with the y-axis starting at 0 and
once with a false origin at 90.]
If the origin (0, 0) is included on the graph then mark it with a zero on the
scale on each axis.
Mark values onto each axis in equal simple intervals, for example, 0.0,
10.0, 20.0, 30.0, … or 0.02, 0.04, 0.06, … Avoid unusual or awkward
intervals such as 3, 5, 7, …
Label both axes with the physical quantity and unit in the same way as if
you were putting in a table heading, for example, velocity/m s⁻¹.
Mark each data point carefully with a cross or a point surrounded by a
circle using a sharp pencil. If you are plotting on millimeter graph paper
the points need to be within half a millimeter of the correct value.
Look carefully at the pattern of the data and decide whether any of the
points are anomalies. This means that they seem “out of place” compared
to the others. If possible, these data points should be checked by repeat-
ing the experiment. If not, they should be labeled and ignored when
drawing a best-fit line.
Look at the pattern of data and decide whether it is best represented by
a straight line or a smooth curve. Remember, this is experimental data, so
each point has a degree of uncertainty—this means that the line does not
have to pass through the points, it is there to represent the relationship
revealed by the data.
If the data is best represented by a straight line then use a ruler to draw it,
trying to balance approximately equal numbers of points on either side of
the line. It is also possible to find the gradient and y-intercept of a linear
graph directly from the data by using an algebraic method called linear
regression (omitting data from anomalous points).
If the data is best represented by a curve then draw a smooth curve, again
trying to balance the distribution of points on either side of the curve.
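The linear regression mentioned in the guidelines above can be sketched in a few lines. This is a minimal least-squares example, not a method taken from the text; the function name and the data values are hypothetical and chosen to lie close to y = 2x + 1.

```python
def linear_regression(xs, ys):
    """Return (gradient, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations in x and of cross-deviations in x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x   # the line passes through the mean point
    return gradient, intercept

# Hypothetical measurements (anomalous points would be left out before fitting)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

m, c = linear_regression(xs, ys)
print(f"gradient m = {m:.2f}, intercept c = {c:.2f}")
```

The fitted line always passes through the point (mean x, mean y), which is a useful check when drawing a best-fit line by eye as well.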
2.4.2 Finding a Gradient from a Straight-Line Graph

Many physical relationships can be tested by plotting straight-line graphs and
it is often important to find the gradient of the graph.
Here are some guidelines for finding an accurate gradient from a linear graph:
Choose two points on the line that are well-spaced from one another—at
least half the length of the line apart.
If possible, choose points at convenient x or y values so that their coordi-
nates can be read easily from the graph.
Use a ruler to draw lines across to the y-axis and down to the x-axis and
read off coordinates for each point $(x_1, y_1)$ and $(x_2, y_2)$.
Mark these coordinates on the graph beside the chosen points.
Calculate the gradient from the equation:
$$\text{gradient} = \frac{y_2 - y_1}{x_2 - x_1}$$
Don’t forget units! The units for the gradient are the units of y divided by
the units of x. For example, a graph of charge/C against time/s would have
units of C s⁻¹ or amps.
Here is an example of a gradient calculation using data for the extension of a
steel spring.
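[Figure: graph of extension/cm against load/N for a steel spring, with a
best-fit line passing through the points (1.6, 5.0) and (6.0, 20.2).]
$$\text{gradient of line} = \frac{20.2 - 5.0}{6.0 - 1.6} = 3.5\ \mathrm{cm\,N^{-1}}$$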
2.4.3 Using a Spreadsheet Program (e.g., Excel)
Whether you are plotting a graph by hand or using computer software your
aims are the same—to represent the relationship between two variables accu-
rately and clearly and to extract any useful physical information or values from
the graph as accurately and precisely as possible. The great advantage of a
spreadsheet program is that once data has been stored it can be used to gener-
ate a graph at the click of a button. However, this can also lead to problems.
The default settings of the program will determine how the graph appears and
this may not be the best way to display this particular data set. The ease and
rapidity of creating a graph can also mean you forget to do basic things such as
ensuring the axes are labeled or labeled correctly, having sensible scales with
appropriate numbers of significant figures, etc.
Here are some questions to ask yourself when using a spreadsheet to produce
a graph and analyze your data:
Could I do a better job by plotting the graph by hand?
Do I want the program to fit a line or curve to this data or shall I print out
the graph once the points have been plotted and then draw the line in by
hand?
Does the graph have an appropriate heading?
Do I want to include gridlines and if so with what divisions?
Are the axes correctly labeled with quantity and unit?
Are the scales marked correctly and are they easy to read?
Are values shown with the correct number of significant figures?
Do I need to use a false origin?
Should the line be extrapolated back or forward?
Should the line go through the origin?
What kind of fit do I want to use (e.g., linear, exponential, polynomial, etc.)?
Do I want to include error bars?
Do I want the program to display a formula and if so how should it be
formatted?
How large should the graph appear in my report?
Remember: you should be in control, not the computer.
2.4.4 Using Error Bars

All experimental measurements will have a degree of uncertainty. This can be
included on a graph by the use of error bars. These can then be used to work
out the uncertainty in an intercept or gradient that is calculated from the
graph.
If a data point has values x and y with uncertainties ± δx and ± δy
respectively then the error bars will be lines extending from x − δx to x + δx
parallel to the x-axis and from y − δy to y + δy parallel to the y-axis.
It is often the case that the error bars for one variable are far more
significant than for the other. When this is the case only one set of error
bars is drawn. This is usually the set for the dependent variable (y).
Once all of the error bars have been drawn onto the graph you can add a
trend line. If you are plotting a graph that is linear there will be a range of
possible lines that can be drawn that pass through all of the error bars. The
extreme lines (steepest and shallowest) are called the “worst acceptable” lines.
These can be used to find the range of possible gradients and the range of
possible intercepts.
The graph below uses the same data as in the previous example (for stretch-
ing a steel spring) but error bars (± 1 cm) have been added to the data for
extension. In addition to the original best fit, a “worst acceptable line” (WAL)
has also been drawn. This is slightly steeper than the best fit line so it gives a
greater value for the gradient:
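$$\text{gradient of worst acceptable line} = \frac{24.5 - 1.2}{7.0 - 0.6} = 3.6\ \mathrm{cm\,N^{-1}}$$
[Figure: graph of extension/cm against load/N with error bars of ± 1 cm on
extension, showing the best-fit line and a worst acceptable line passing
through (0.6, 1.2) and (7.0, 24.5).]
Using this value and the gradient of the best fit line gives:
gradient = 3.5 ± 0.1 cm N⁻¹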
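The two gradient calculations for the spring can be reproduced as follows. This is a sketch using the coordinates read from the two graphs; the function and variable names are assumptions made for illustration. Each gradient is rounded to one decimal place before the difference is taken, which is how the ± 0.1 above arises (the unrounded gradients differ by slightly more).

```python
def gradient(p1, p2):
    """Gradient of the straight line through points p1 and p2, each (x, y)."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)

# Coordinates in (load/N, extension/cm) read from the graphs
best_fit = gradient((1.6, 5.0), (6.0, 20.2))
worst_acceptable = gradient((0.6, 1.2), (7.0, 24.5))

# Round each gradient to one decimal place, then take the difference
best_rounded = round(best_fit, 1)
worst_rounded = round(worst_acceptable, 1)
uncertainty = round(abs(worst_rounded - best_rounded), 1)

print(f"best fit gradient         = {best_fit:.3f} cm/N")
print(f"worst acceptable gradient = {worst_acceptable:.3f} cm/N")
print(f"gradient = {best_rounded} ± {uncertainty} cm/N")
```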
2.5 LOGARITHMS
2.5.1 Logarithmic Scales and Logarithms
Many quantities in physics have values that spread over an extremely wide
range, so it is often convenient to represent them using a scale that increases
in multiples rather than equal amounts. Such a scale is called logarithmic, for
example,
1, 10, 100, 1000, etc.
1, 2, 4, 8, 16, etc.
Each step raises the power of some base quantity by 1:
$10^0, 10^1, 10^2, 10^3$, etc.
$2^0, 2^1, 2^2, 2^3, 2^4$, etc.
The logarithm of any number to a particular base is the power that the base
must be raised to in order to get the number. For example:
Base 10: logarithm to base 10 of 1000 = 3 or $\log_{10}(1000) = 3$.
Base 2: logarithm to base 2 of 8 = 3 or $\log_2(8) = 3$.
Another common base is the number e. Logarithms to base e are called
“natural logarithms” and are written using the prefix “ln.”
Base e: logarithm to base e of 10 = 2.3026 or ln (10) = 2.3026.
The values of logarithms to base 10 and of natural logarithms can be found
directly from your calculator.
If you are working with logarithms you will also need to be able to find
antilogarithms or inverse logarithms. For example, if you know that the
logarithm to base 10 of some physical quantity is 5, you can find the value of
the physical quantity by raising the base to the power 5:
That is, if $\log_{10}(x) = 5$ then $x = 10^{\log_{10} x} = 10^5 = 100\,000$
if $\log_{10}(x) = 2.3$ then $x = 10^{\log_{10} x} = 10^{2.3} = 199.5$
Antilogarithms (or inverse logarithms) can be found directly from your
calculator.
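The same values can be checked in software as well as on a calculator. The sketch below uses Python's standard `math` module; the variable names are chosen here for illustration.

```python
import math

# Logarithms to different bases
log10_of_1000 = math.log10(1000)   # base 10
log2_of_8 = math.log2(8)           # base 2
ln_of_10 = math.log(10)            # natural logarithm (base e)

# Antilogarithms: raise the base to the power of the logarithm
antilog_5 = 10 ** 5
antilog_2_3 = 10 ** 2.3

print(log10_of_1000, log2_of_8, round(ln_of_10, 4))
print(antilog_5, round(antilog_2_3, 1))
```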
2.5.2 Using Logarithms

Logarithms behave like powers, reducing multiplication to addition and divi-
sion to subtraction:
$$\log_{10}(6 \times 7) = \log_{10}(6) + \log_{10}(7)$$
$$\log_{10}(6 \div 7) = \log_{10}(6) - \log_{10}(7)$$
Powers are reduced to multiples:
$$\log_{10}(6^7) = 7 \log_{10}(6)$$
These relationships are useful when analyzing data for variables that are linked
by a power law (if the base is omitted it is assumed that everything is to base 10):
If $y = Ax^n$ then $\log(y) = \log(A) + n \log(x)$
and a graph of log (y) against log (x) (called a “log-log graph”) will be linear
with gradient n and intercept log (A).
Exponential relationships can also be analyzed using natural logarithms:
If $y = Ae^{bx}$ then $\ln(y) = \ln(A) + bx$
and a graph of ln (y) against x (called a “log-lin graph”) will be linear with
gradient b and intercept ln (A).
2.6 TESTING MATHEMATICAL RELATIONSHIPS BETWEEN VARIABLES
Graphs are often used to identify or test mathematical relationships between
two physical variables and it is important to understand how different rela-
tionships can be identified. In most cases, it involves plotting a suitable
straight-line graph and then interpreting the gradient and intercept. The
sections that follow give examples of common mathematical relationships that
can be analyzed using straight-line graphs.
2.6.1 Direct Proportion

If y is directly proportional to x then when x is multiplied by any number y will
be multiplied by the same number. For example, y will double when x doubles.
To test for this, you can plot a graph of y against x. The variables are directly
proportional if the graph is a straight line AND it passes through the origin (0, 0).
The relationship can be represented by y = kx where k is the gradient of the
graph of y against x.
2.6.2 Inverse Proportion

If y is inversely proportional to x then when x is multiplied by any number y
will be divided by the same number. For example, y will halve when x
doubles. To test for this, you can plot a graph of y against 1/x. The variables
are inversely proportional if the graph is a straight line AND it passes through
the origin (0, 0).
The relationship can be represented by $y = k/x$ where k is the gradient of the
graph of y against 1/x.
2.6.3 Inverse-Square Law

If y is related to x by an inverse-square law then when x is multiplied by any
number y will be divided by that same number squared. For example, y will
fall to ¼ of its initial value when x is doubled. To test for this, you can plot a
graph of y against $1/x^2$. The relationship is an inverse-square law if the
graph is a straight line AND it passes through the origin (0, 0).
The relationship can be represented by $y = k/x^2$ where k is the gradient of
the graph of y against $1/x^2$.
You can also test an inverse-square law by plotting a log-log graph as for any
power law. The gradient will be −2.
2.6.4 Power Law

If y is related to an unknown power of x by a relationship of the form
$y = Ax^n$ then this power can be found by plotting a graph of log (y) against
log (x). This can be shown by taking the logarithms of both sides of the equation:
$$y = Ax^n$$
$$\log(y) = \log(A) + n \log(x)$$
A graph of log (y) against log (x) should be linear with a gradient equal to n
(the power) and an intercept equal to log (A).
2.6.5 Exponential Decay or Growth

If y is related exponentially to x by a relationship of the form $y = Ae^{bx}$
then this can be tested by plotting a graph of ln (y) against x. This can be
shown by taking natural logarithms (base e) of both sides of the equation:
$$y = Ae^{bx}$$
$$\ln(y) = \ln(A) + bx$$
A graph of ln (y) against x will be linear if the relationship is exponential. The
gradient of this graph is b and the intercept on the ln (y) axis is ln (A). For
exponential growth the gradient b is positive and for exponential decay (e.g.,
capacitor discharge or radioactive decay) the gradient b will be negative.
Here is a summary of how to test the relationships described above, all of
which are tested using straight-line graphs.
Relationship         Equation       Plot                     Interpretation/comments
Direct proportion    y = kx         y against x              Gradient = k; passes through origin
Inverse proportion   y = k/x        y against 1/x            Gradient = k; passes through origin
Inverse-square law   y = k/x^2      y against 1/x^2          Gradient = k; passes through origin
Power law            y = Ax^n       log(y) against log(x)    Gradient = n; intercept = log(A)
Exponential          y = Ae^(bx)    ln(y) against x          Gradient = b; intercept = ln(A)
2.7 EXERCISES
1. Find the logarithms to base 10 of the following numbers:

(a) 1 000 000 (b) 0.001 (c) 56 (d) 0.325 (e) 1
2. Find the numbers whose logarithms to base 10 are:

(a) 4 (b) 2.7 (c) 0.05 (d) −2.7 (e) −0.05
3. When a thin convex lens is used to form the image of a bright object on
a screen the relationship between object distance from the lens u, image
distance from the lens, v, and the focal length of the lens, f, is:
$$\frac{1}{u} + \frac{1}{v} = \frac{1}{f}$$
A student carries out an experiment to find the focal length f. He varies
the object distance and measures pairs of values for u and v.
(a) Suggest a suitable graph that he can plot in order to find f.

(b) Explain how he would obtain a value for f from his graph.
4. The data below shows how the pressure and volume of a gas change when
it is compressed isothermally (at constant temperature).
Pressure (kPa)   100   128   155   180   215   250   280   305   335   345
Volume (cm³)     35.5  28.0  22.7  19.6  16.0  13.5  12.0  11.0  10.0  9.5
Assume that the uncertainty in pressure measurements is ± 10 kPa and the
uncertainty in volume measurements is ± 1.0 cm³.
Boyle’s law suggests that the pressure is inversely proportional to volume
under these conditions.
(a) Use a graphical method to show that Boyle’s law applies. Include
error bars.
(b) Find an equation to relate the pressure to volume and calculate the
value of any constants in this equation (stating their units and the
associated uncertainty).
5. The table below gives some data for the planets in the solar system.
Planet     Av. distance from the Sun (million km)   Orbit time in days (d) or years (y)
Mercury    58                                       88 d
Venus      108                                      ?
Earth      150                                      365 d
Mars       228                                      687 d
Jupiter    778                                      11.9 y
Saturn     1430                                     29.5 y
Uranus     2870                                     84 y
Neptune    4500                                     165 y
Kepler proposed that the orbital period of the planets T is linked to their
mean radius of orbit r by an equation of the form:
$$T = r^n$$
(a) Use a graphical method to verify that such a power law is valid and
determine the value of n.
(b) Use your graph (or the equation) to predict the orbital period of
Venus (check online to see if your value is acceptable).
6. When a capacitor discharges through a resistor the discharge current I
falls exponentially according to an equation of the form:
$$I = I_0 e^{-t/\tau}$$
where I0 is the initial value of the current and τ is the “time constant” for
the decay (in seconds).
The table below shows how current falls with time when a particular
capacitor is discharged through a resistor.
Current (mA) 48 38 30 24 19 15 12 9 8 6 5
Time (s) 0 10 20 30 40 50 60 70 80 90 100
Use a graphical method to find the time constant τ.
CHAPTER
3
Capturing, Displaying, and
Analyzing Motion
3.0 INTRODUCTION
Kinematics is the study of motion. Dynamics is the study of how forces affect
motion. In this chapter, we focus on how motion is described in space and time
and how we can capture data from moving objects in order to display their
motion graphically. In Chapter 5, we will see how Newton’s laws describe how
motion is affected by unbalanced or resultant forces.
3.1 MOTION TERMINOLOGY

The table below lists the key terms used in this chapter.
Term               Symbol   SI unit   Comment
Displacement       s        m         Distance moved in a particular direction: a vector.
Distance           d        m         How far something has traveled along the path taken: a scalar.
Initial velocity   u        m s⁻¹     Initial rate of change of displacement: a vector.
Final velocity     v        m s⁻¹     Final rate of change of displacement: a vector.
Speed              v        m s⁻¹     Rate of change of distance: a scalar.
Acceleration       a        m s⁻²     Rate of change of velocity: a vector.
Time               t        s         Duration of motion.
It is often important to distinguish between instantaneous and average values
of velocity:
$$\text{average velocity} = \frac{\Delta s}{\Delta t}$$
where ∆s is the change in displacement during a time interval ∆t. If the time
interval is small (but finite) this might be written:
$$\text{average velocity} = \frac{\delta s}{\delta t}$$
To obtain the instantaneous velocity we would need to take the ratio of δs to
δt in the limit that δt→0. This is the same as taking the derivative of
displacement with respect to time:
$$\text{instantaneous velocity} = \lim_{\delta t \to 0} \frac{\delta s}{\delta t} = \frac{ds}{dt}$$
or
$$v = \frac{ds}{dt}$$
The table below lists symbols used to describe changes and rates of change.
∆x A change in x
δx A small change in x
dx/dt The rate of change of x
Average and instantaneous acceleration are defined in a similar way:
$$\text{average acceleration} = \frac{\Delta v}{\Delta t}$$
$$\text{instantaneous acceleration} = \frac{dv}{dt}$$
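The limiting process that turns an average value into an instantaneous one can be illustrated numerically. The sketch below assumes an object starting from rest with a constant acceleration of 2 m s⁻², so that s = ½at²; the function names and numbers are hypothetical and chosen only for illustration.

```python
def displacement(t, a=2.0):
    """Displacement (m) after time t (s), starting from rest with acceleration a."""
    return 0.5 * a * t ** 2

def average_velocity(t, dt):
    """Average velocity over the interval from t to t + dt."""
    return (displacement(t + dt) - displacement(t)) / dt

t = 3.0   # instant of interest (s); the exact velocity here is a * t = 6 m/s

# As the interval shrinks, the average velocity approaches ds/dt
for dt in (1.0, 0.1, 0.001):
    print(f"dt = {dt:<6} average velocity = {average_velocity(t, dt):.4f} m/s")
```

For this motion the average velocity over the interval works out to 6 + δt, so the printed values close in on 6 m s⁻¹ as δt is reduced.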
3.2 GRAPHS OF MOTION

There are three key variables, displacement, velocity, and acceleration, so it
is important to be able to recognize and interpret graphs where any one of
these is plotted against time. It is also useful, when any one of these graphs is
known, to be able to work out the form of the others.
The most useful graph is usually one of velocity against time because:
The area under a velocity–time graph is equal to displacement.
The gradient of a velocity–time graph is equal to acceleration.
The relationships between the three graphs are summarized in the diagram
below which shows graphs of motion for an object which starts from rest and
accelerates at a constant rate.
[Figure: graphs of displacement, velocity, and acceleration against time for
an object accelerating from rest at a constant rate.
Differentiate: velocity is the gradient of the displacement–time graph
(v = ds/dt); acceleration is the gradient of the velocity–time graph
(a = dv/dt).
Integrate: displacement is the area under a velocity–time graph; velocity is
the area under an acceleration–time graph.]
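The gradient and area relationships can be tried out on sampled data. This is a minimal sketch, assuming a constant acceleration of 2.0 ms−2 and a sampling interval of 0.5 s (both illustrative values); the function names are hypothetical helpers, not part of any library.

```python
def gradient(ys, dt):
    """Slope between successive samples (forward differences)."""
    return [(y1 - y0) / dt for y0, y1 in zip(ys, ys[1:])]

def area(ys, dt):
    """Running area under the samples, using the trapezium rule."""
    total, out = 0.0, [0.0]
    for y0, y1 in zip(ys, ys[1:]):
        total += 0.5 * (y0 + y1) * dt
        out.append(total)
    return out

dt = 0.5                                  # assumed sampling interval in s
a = 2.0                                   # assumed constant acceleration in m/s^2
times = [i * dt for i in range(9)]        # 0.0 s to 4.0 s
velocity = [a * t for t in times]         # object starts from rest
acceleration = gradient(velocity, dt)     # gradient of the v-t graph
displacement = area(velocity, dt)         # area under the v-t graph
```

Every entry of `acceleration` comes out as 2.0 ms−2 and the final displacement is 16 m, matching the area of the triangle under the velocity–time line.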
3.3 EQUATIONS OF MOTION FOR CONSTANT ACCELERATION: THE SUVAT EQUATIONS
There are five variables: s, u, v, a, and t so it is possible to construct five differ-
ent equations, each of which omits one of these variables. We will do this in
two ways—first using the graphs of motion and second using calculus.
3.3.1 Derivation 1: From Graphs of Motion

The velocity–time graph below is for a particle accelerating from an initial
velocity u to a final velocity v in a time t.
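[Figure: velocity–time graph rising in a straight line from velocity u at
time 0 to velocity v at time t. The area under the line is divided into a
rectangle A of height u and a triangle B of height (v − u).]
Acceleration is the gradient of a velocity–time graph so:
$$a = \frac{v - u}{t - 0}$$
$$a = \frac{v - u}{t}$$
$$v = u + at \quad \text{(suvat equation 1)}$$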
Displacement is the area under a velocity–time graph and this can be config-
ured in various ways:
(i) $$s = \text{Area} = A + B = ut + \tfrac{1}{2}(v - u)t = \frac{(u + v)t}{2}$$
$$s = \frac{(u + v)t}{2} \quad \text{(suvat equation 2)}$$
(ii) If we use equation 1 we can eliminate v from equation 2:
$$s = \text{Area} = ut + \tfrac{1}{2}(u + at - u)t = ut + \tfrac{1}{2}at^2$$
$$s = ut + \tfrac{1}{2}at^2 \quad \text{(suvat equation 3)}$$
(iii) Alternatively, we can use equation 1 to eliminate u from equation 2
leading to:
$$s = vt - \tfrac{1}{2}at^2 \quad \text{(suvat equation 4)}$$
(iv) Finally, we can use equation 1 to eliminate t from equation 2 leading to:
$$v^2 = u^2 + 2as \quad \text{(suvat equation 5)}$$
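The substitutions in (iii) and (iv) are not written out in the text. One way to carry them out, starting from suvat equations 1 and 2, is sketched here:

```latex
\begin{align*}
\text{(iii) using } u = v - at:\quad
  s &= \frac{(u + v)t}{2} = \frac{(v - at + v)t}{2} = vt - \tfrac{1}{2}at^2 \\
\text{(iv) using } t = \frac{v - u}{a}:\quad
  s &= \frac{(u + v)}{2}\cdot\frac{(v - u)}{a} = \frac{v^2 - u^2}{2a}
  \quad\Rightarrow\quad v^2 = u^2 + 2as
\end{align*}
```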
3.3.2 Derivation 2: Using Calculus

We start with the definition of acceleration: $a = \frac{dv}{dt}$.
This is equivalent to the integral (a is constant):
$$\int_u^v dv = \int_0^t a\,dt$$
Which gives: $v - u = at$ or $v = u + at$ (suvat equation 1).
Now use the definition of velocity: $v = \frac{ds}{dt}$.
Which is equivalent to the integral (a is constant):
$$\int_0^s ds = \int_0^t (u + at)\,dt$$
Which gives: $s = ut + \tfrac{1}{2}at^2$ (suvat equation 3).
The other equations can be derived from these by a series of substitutions.
These equations are valid for motion at constant acceleration.
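A quick numerical check shows that the five equations are mutually consistent. The values of u, a, and t below are assumptions chosen for illustration, and `suvat_from_uat` is a hypothetical helper name.

```python
import math

def suvat_from_uat(u, a, t):
    """Return (v, s) from u, a, t using suvat equations 1 and 3."""
    v = u + a * t
    s = u * t + 0.5 * a * t**2
    return v, s

u, a, t = 3.0, 2.0, 4.0            # assumed values in SI units
v, s = suvat_from_uat(u, a, t)     # v = 11.0 m/s, s = 28.0 m

# The remaining three equations should agree with these values.
s_eq2 = (u + v) * t / 2                           # suvat equation 2
s_eq4 = v * t - 0.5 * a * t**2                    # suvat equation 4
eq5_holds = math.isclose(v**2, u**2 + 2 * a * s)  # suvat equation 5
```

Because each equation omits one variable, the choice of which to use in a problem depends on which quantity is neither given nor required.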
3.4 PROJECTILE MOTION

3.4.1 Independence of Horizontal and Vertical Components of Motion
A projectile is an object moving under the influence of gravity. In the simplest
case, this is an object moving in a uniform gravitational field when frictional
forces can be neglected. The resultant force acting on the object is its weight
and this acts vertically downwards. In order to analyze the motion, we con-
sider the horizontal and vertical components of the motion separately. This is
because:
Horizontal and vertical components of projectile motion are independent
of one another.
The horizontal motion is at constant velocity.
The vertical motion is at constant acceleration.
The graph below shows the trajectory of a particle launched horizontally at
8.0 ms−1 from a point 19.6 m above the Earth’s surface. Its vertical accelera-
tion is constant and equal to 9.81 ms−2.
[Graph: vertical position/m (axis from 0 to 25) against horizontal
position/m (axis from 0 to 20) for the projectile.]
Each data point represents the position of the projectile at 0.10 s intervals.
Notice that the horizontal displacement in each 0.10 s interval is always the
same (0.80 m).
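The plotted points can be regenerated from the values quoted above by treating the two components independently. This is a sketch only; the constant and function names are illustrative.

```python
G = 9.81      # vertical acceleration in m/s^2 (from the text)
UX = 8.0      # horizontal launch velocity in m/s (from the text)
H0 = 19.6     # launch height in m (from the text)
DT = 0.10     # time between plotted points in s (from the text)

def position(t):
    """Horizontal and vertical position at time t, treated independently."""
    x = UX * t                    # constant horizontal velocity
    y = H0 - 0.5 * G * t**2       # constant downward acceleration from rest
    return x, y

points = []
step = 0
while True:
    x, y = position(step * DT)
    if y < 0:                     # stop once the projectile is below ground
        break
    points.append((x, y))
    step += 1

# Horizontal distance covered between successive points
x_steps = [b[0] - a[0] for a, b in zip(points, points[1:])]
```

Each entry of `x_steps` is 0.80 m, while the vertical spacing between successive points grows as the projectile speeds up downward.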
3.4.2 Parabolic Paths

The path of a projectile in a uniform gravitational field (in the absence of
frictional forces) is parabolic. This can be shown by eliminating t from the
equations for horizontal and vertical displacement. While this can be done for
a projectile launched at any speed or direction, the example below is for the
simple case of a projectile launched from a point at x = 0, y = 0 at t = 0.
Horizontal motion: $x = ut$ where u is the initial horizontal velocity.
Vertical motion: $y = vt - \tfrac{1}{2}gt^2$ where v is the initial vertical velocity.
Substituting for t in terms of x in the equation for y gives:
$$y = -\frac{g}{2u^2}x^2 + \frac{v}{u}x$$
Which has the form $y = ax^2 + bx + c$ (a parabolic curve with c = 0).
The equation above is only valid if g is constant and there are no frictional
forces. This is approximately true for projectile motion over small distances
close to the surface of the Earth. However, when considering the motion of
rockets, variations in g and frictional forces must be considered.
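The substitution that leads to the parabola can be written out step by step. This sketch uses the same symbols as the equations of horizontal and vertical motion, with y measured upward from the launch point:

```latex
\begin{align*}
t &= \frac{x}{u} \\
y &= v\left(\frac{x}{u}\right) - \frac{1}{2}\,g\left(\frac{x}{u}\right)^2
   = -\frac{g}{2u^2}\,x^2 + \frac{v}{u}\,x \\
\text{horizontal launch } (v = 0):\quad y &= -\frac{g}{2u^2}\,x^2
\end{align*}
```

For the horizontally launched particle of Section 3.4.1 (u = 8.0 ms−1, g = 9.81 ms−2) the coefficient g/2u² is about 0.077 m−1.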
3.4.3 The Range of a Projectile

Imagine a projectile launched from a horizontal surface with an initial speed
of u at an angle θ to the horizontal. What is its horizontal range? In the
absence of friction, the horizontal range is equal to the constant horizontal
velocity u cos θ multiplied by the time of flight. To find the time of flight we
can analyze the vertical motion.
Initial vertical component of velocity = u sin θ

Final vertical component of velocity = − u sin θ (the motion is symmetric up
and down)
Vertical acceleration = −g
The displacement when the projectile returns to the surface is s = 0
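Using suvat equation 1 from Section 3.3:
$$-u \sin\theta = u \sin\theta - gt$$
so the time of flight is:
$$t = \frac{2u \sin\theta}{g}$$
And the range is:
$$R = u \cos\theta \times t = \frac{2u^2 \sin\theta \cos\theta}{g} = \frac{u^2 \sin 2\theta}{g}$$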
This has a maximum value when sin 2θ = 1 suggesting that maximum range
will be achieved when the projectile is launched at an angle of 45° to the
horizontal (in the absence of friction).
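The dependence of range on launch angle can be explored numerically. The sketch below assumes a launch speed of 20 ms−1 (an illustrative value) and scans angles in 5° steps; the function names are hypothetical.

```python
import math

def time_of_flight(u, theta_deg, g=9.81):
    """Time of flight on level ground, t = 2 u sin(theta) / g."""
    return 2 * u * math.sin(math.radians(theta_deg)) / g

def flight_range(u, theta_deg, g=9.81):
    """Range on level ground, R = u^2 sin(2 theta) / g, ignoring friction."""
    return u**2 * math.sin(2 * math.radians(theta_deg)) / g

u = 20.0   # assumed launch speed in m/s
ranges = {angle: flight_range(u, angle) for angle in range(5, 90, 5)}
best_angle = max(ranges, key=ranges.get)
```

The scan picks out 45°, and pairs of complementary angles such as 30° and 60° give the same range, as the sin 2θ factor suggests.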
3.5 EQUATION OF MOTION

The suvat equations cannot be used when the acceleration varies. However, it
is still possible to analyze the motion if the nature of the variation is known.
The general approach to analyzing these situations starts with an equation of
motion, that is, an equation for the variation of the acceleration.
For example, a simple harmonic oscillator (such as a mass on a spring) has an
acceleration a that is directly proportional to its displacement x from an equi-
librium position and directed back toward that position:
$$a = -kx$$
where k is a positive constant (for a mass-spring system it is the spring
constant divided by the mass).
This is a second-order differential equation:
$$\frac{d^2x}{dt^2} = -kx$$
This can be solved to find equations for the displacement x(t) and velocity v(t)
for the oscillator (see Section 11.2.1).
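An equation of motion like this one can also be stepped forward numerically. The sketch below is one possible approach (a semi-implicit Euler method, chosen here for simplicity and not taken from the text), with assumed values for k, the time step, and the starting displacement. It compares the result with the known solution x = x0 cos(√k t) for release from rest.

```python
import math

def simulate(k, x0, v0, dt, steps):
    """Step a = -k*x forward in time (semi-implicit Euler method)."""
    x, v = x0, v0
    for _ in range(steps):
        a = -k * x        # equation of motion
        v += a * dt       # update velocity from the acceleration
        x += v * dt       # update displacement from the new velocity
    return x, v

k = 4.0                   # assumed constant in s^-2
dt = 1e-4                 # assumed time step in s
t_end = 1.0               # total time simulated in s
x_num, v_num = simulate(k, x0=0.1, v0=0.0, dt=dt, steps=int(round(t_end / dt)))

# Known solution for release from rest: x = x0 cos(sqrt(k) t)
x_exact = 0.1 * math.cos(math.sqrt(k) * t_end)
```

The same loop works for any equation of motion: only the line that computes `a` needs to change.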
3.6 METHODS TO CAPTURE AND DISPLAY GRAPHS OF MOTION
There are many different ways to capture motion data. The simplest is to use
a stop clock and meter ruler and this method should not be neglected. The
main problem with this method is the uncertainty in time caused by human
reactions in starting and stopping the clock. A typical human reaction time
for a visual cue, such as watching someone release a ball, is about 0.25 s, so
this method is unlikely to be useful unless the measured time is significantly
longer than this.
3.6.1 Motion Sensors and Dataloggers

Motion sensors and dataloggers have the advantage that they can take rapid
repeated measurements and record the data electronically. This can then be
processed by a computer in order to plot graphs of displacement, velocity, and
acceleration. Many motion sensors work by sending out ultrasound pulses and
detecting their reflection from the moving object. The displacements are cal-
culated automatically using the known speed of ultrasound in air and the time
between the pulse being emitted and its reflection being detected.
[Figure: an ultrasound transducer, connected to a datalogger and then to a
computer, sends ultrasound pulses toward a reflector on a moving trolley.]
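The calculation performed by the sensor can be sketched as follows. The speed of sound (343 ms−1, roughly the value in air at 20 °C), the echo times, and the sampling interval are all assumed values for illustration; real sensors handle this internally.

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed value for air at about 20 °C

def distance_from_echo(echo_time, speed=SPEED_OF_SOUND):
    """Distance to the reflector; the pulse travels there and back."""
    return speed * echo_time / 2

def velocities(distances, sample_interval):
    """Average velocity between successive distance readings."""
    return [(d1 - d0) / sample_interval
            for d0, d1 in zip(distances, distances[1:])]

# Hypothetical echo times in s, one reading every 0.05 s
echo_times = [0.0040, 0.0042, 0.0044, 0.0046]
distances = [distance_from_echo(t) for t in echo_times]
speeds = velocities(distances, 0.05)
```

The factor of 2 accounts for the round trip of the pulse. With these readings the trolley is moving away from the sensor at about 0.69 ms−1.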
3.6.2 Light Gates

Light gates can also be used. Each light gate contains a light source and a
detector. When an object passes between these the beam is cut by a card
and a datalogger measures and records the time at which the beam was cut
and how long it is blocked. When an experiment is carried out the moving
object (e.g., a dynamics trolley) usually has a vertical card attached and the
card travels between the “jaws” of the light gate. The length of the card must
be measured and used as a parameter in the setup to enable the software to
calculate velocities.
The diagram below shows an enlarged view of a light gate and a typical experi-
mental setup.
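[Figure: enlarged view of a light gate, with a light beam across its jaws and
a lead to the datalogger; and a setup in which a card passes through light
gate A and then light gate B.]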
If two light gates (A and B) are used, the following measurements can be
made and recorded:
Time at A and time at B (from starting the experiment)
Time to move between A and B
Velocity at A and velocity at B
Acceleration from A to B.
Bear in mind that if the object is accelerating the velocity measurements at
each light gate will be average values over the time taken for all of the card to
pass through the beam.
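The processing carried out by the datalogger software can be sketched in a few lines. The card length and timings below are hypothetical readings, and the function names are illustrative.

```python
def gate_velocity(card_length, blocked_time):
    """Average velocity while the card cuts the beam."""
    return card_length / blocked_time

def acceleration_between_gates(card_length, blocked_a, blocked_b, time_a_to_b):
    """Average acceleration from gate A to gate B."""
    v_a = gate_velocity(card_length, blocked_a)
    v_b = gate_velocity(card_length, blocked_b)
    return (v_b - v_a) / time_a_to_b

# Hypothetical readings: a 0.10 m card blocks gate A for 0.20 s and
# gate B for 0.10 s, with 0.50 s between the two gates.
a = acceleration_between_gates(0.10, 0.20, 0.10, 0.50)
```

Here the velocities are 0.50 ms−1 at A and 1.0 ms−1 at B, giving an acceleration of 1.0 ms−2. A shorter card brings each velocity closer to an instantaneous value, at the cost of a shorter time to measure.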
3.6.3 Mobile Phones and Tablets

Mobile phones and tablets usually have a built-in camera with both slow
motion and video capabilities as well as a range of sensors, such as a three-axis
accelerometer. This makes them ideal for capturing, displaying, and analyz-
ing data, especially since there is a wide range of apps designed for just this
purpose.
3.6.3.1 Accelerometer Sensor

This detects the real-time acceleration of the mobile phone along three per-
pendicular axes. The accelerometer samples the acceleration (a sampling
interval of 20 ms is typical) and the captured data can be exported to a
spreadsheet program such as Excel. When used with a suitable app, tables of data
and graphs for acceleration, velocity, and displacement against time can be
displayed.
The acceleration due to gravity can be measured by dropping the phone (but
make sure it has a soft landing) and the acceleration of a dynamics trolley can
be measured by simply attaching the phone to the trolley. The portability of
the phone makes it ideal for investigating acceleration in real-world situations
such as cars or trains.
3.6.3.2 Video Capture

The high frame rates available on some cameras are ideal for investigating
short-lived, transient phenomena such as the impact of a water droplet with
the surface of water or the formation of a crater when a ball bearing falls into
sand. The cameras in mobile phones typically offer 240 frames per second
for slow-motion playback but more sophisticated cameras can record video at
several thousand frames per second.
Slow-motion cameras can also be used to make accurate measurements. For
example, if you are trying to measure the bounce height of a squash ball it is
difficult to judge this by eye. A slow-motion video of the bounce can be ana-
lyzed to locate the maximum height much more accurately.
A great deal of software is available for the analysis of videos of motion. One
of the best free packages is called Tracker. This is ideal for the analysis of
projectile motion or rotational motion. Once your video file has been loaded
into Tracker you can use the pointer to mark the position of the object you
wish to track and then advance the video a few frames at a time to mark sub-
sequent positions. This positional data is used by the software to generate
displacement, velocity, and acceleration data (and graphs) in two dimensions.
For rotational motion, you are able to track angular displacement and angular
velocity.
Here is a screen grab from an experiment to capture and analyze the motion
of a “Magnus glider,” a spinning object projected from the top left of the
image. You can see the individual tracking points behind the object, and the
data table and graph of the motion are on the right.
While these applications enable us to track and display complex motions they
do have limitations, and their accuracy will be limited by the quality of the
information put into them. Displacements in a video file are measured by the
number of pixels across the image; they are not absolute measurements. In
order to measure displacements in meters the user must calibrate the soft-
ware by identifying a known distance in the image. However, parallax effects
across the field of view can distort results so it is important for the experi-
menter to think hard about the measurements that are being taken.
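The calibration step amounts to finding a scale factor in meters per pixel. The sketch below is not Tracker's own code; it is a hypothetical illustration with made-up pixel positions, a 1.00 m reference length, and a 240 frames-per-second video.

```python
def meters_per_pixel(known_length_m, known_length_px):
    """Scale factor from a reference object of known size in the image."""
    return known_length_m / known_length_px

def track_to_velocity(pixel_positions, scale, frame_rate, frame_step=1):
    """Average velocity between marked frames, in m/s."""
    dt = frame_step / frame_rate
    return [(p1 - p0) * scale / dt
            for p0, p1 in zip(pixel_positions, pixel_positions[1:])]

# Hypothetical data: a 1.00 m ruler spans 500 pixels; positions are marked
# every 4th frame of a 240 frames-per-second video.
scale = meters_per_pixel(1.00, 500)
v = track_to_velocity([100, 110, 121, 133], scale, frame_rate=240, frame_step=4)
```

Any error in the reference length feeds directly into every displacement and velocity, which is why the reference should lie in the same plane as the motion.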
3.7 EXERCISES
1. A ball is thrown vertically upwards at 4.0 ms−1 from an initial position

1.2 m above the ground on Earth and is allowed to fall down to the ground.
Sketch graphs to show:
(a) displacement versus time
(b) velocity versus time
(c) acceleration versus time.
Take upwards as positive and label your axes with appropriate values.
2. An experiment performed on the Moon finds that a feather falls 20.75 m
from rest in 5 s.
(a) Calculate its speed as it hits the Moon’s surface.
(b) Calculate the acceleration of free fall at the Moon’s surface.
3. A sprinter reaches his maximum velocity of 12 ms−1 after running 30 m
from rest. How long does this take him and what is his average acceleration
(assume acceleration is constant)?
4. A car is moving at 50 kmh−1. What is this speed in ms−1?
(a) When the brakes are applied the car can stop in 4.0 s. What is its
braking distance?
(b) A cat runs out in front of your car when you are traveling at 50 kmh−1
and your reaction time is 0.6 s. What is the total distance the car will
travel before stopping (i.e., its “stopping distance”)?
5. A stone is dropped into a deep well. It strikes the surface of the water at
the bottom of the well 2.2 s after its release. How deep is the well?
6. A boat slows down from 6.0 ms−1 to 4.0 ms−1. During the deceleration, it
travels 150 m. Calculate the average deceleration.
7. Two dragsters line up for a 500 m race. Car A accelerates at 4.0 ms−2 and
then maintains a constant maximum speed of 50 ms−1. Car B accelerates
at 5.0 ms−2 and then maintains its maximum speed of 45 ms−1. Which car
wins the race and by how much?
8. A cricket ball is bowled horizontally at a speed of 20 ms−1 from a height of
2.5 m above the ground.
(a) How far from the bowler does it first hit the ground?
(b) What is the maximum distance from the cricketer that the ball hits
the ground if he throws the ball upwards at the same initial speed at
an angle of 45° to the horizontal? Ignore air resistance.
9. Describe an experiment to measure the acceleration of free fall on Earth.
You should include: a labeled diagram of the apparatus; a list of measure-
ments to be taken including the instruments used to take these measure-
ments; an explanation of how you will maximize precision and accuracy;
an explanation of how you will process the data to get an accurate value
of the acceleration.
CHAPTER 4
Forces and Equilibrium
4.1 FORCE AS A VECTOR

Forces have direction and magnitude: they are vector quantities and add and
subtract as vectors. When several forces act on the same body the vector sum
of these forces is called the “resultant force.” If this is zero the forces are said
to be in “equilibrium.” If there is a non-zero resultant force the object acceler-
ates in the direction of this force.
4.1.1 Free-Body Diagrams

Free-body diagrams show all the forces acting on a single object and are use-
ful when starting to solve a problem involving several forces. For example, if
a block rests on a rough slope there will be several forces acting on the block
and on the surface but what happens to the block is determined only by the
forces that act on it. Here is an example, showing the free-body diagram for
the block. If the resultant force is zero the block remains in equilibrium. If it
is non-zero the block will accelerate down the slope.
[Figure: free-body diagram for a block on a slope with a rough surface,
showing N (normal contact force), G (friction with surface), and W (weight
of block).]
4.1.2 Resolving Forces

It is often helpful to resolve forces along two perpendicular axes. In the exam-
ple above it might be helpful to consider the components of forces acting par-
allel (call this the x-direction) and perpendicular (call this the y-direction) to
the slope. The reason for doing this is that it is obvious that forces perpendicu-
lar to the slope must be in equilibrium because the block cannot accelerate
in this direction. It is constrained so that it is only free to move, if it moves at
all, parallel to the slope. If we are only considering forces (and not moments)
we can simplify the free-body diagram by treating the object (the block) as a
point when we are resolving forces.
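[Figure: the block treated as a point, with N along the y-axis, G along the
slope, and W acting vertically downward.]
The resultant component Fx in the x-direction is given by:
Fx = W sin θ − G
The resultant component in the y-direction is given by:
Fy = N − W cos θ = 0
where θ is the angle of the slope to the horizontal.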
4.1.3 Finding a Resultant Force

The resultant force on an object is the vector sum of the forces that act on it.
This can be found by:
Scale drawing
Constructing the resultant from its components
Here is an example. Three forces act in the same plane on a single object. We
need to find the resultant force.
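[Figure: three coplanar forces acting on one object: 9.0 N, 6.0 N at 20° to
the 9.0 N force, and 8.0 N at 120° to the 9.0 N force on its other side.]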
Method 1: Scale Drawing

Draw the vectors to the same scale, for example, 2.0 mm = 1.0 N, and place
them end to end in any order. The resultant vector can then be drawn from
the start of the first vector to the end of the final vector. Now measure its
length and use the scale to find its magnitude. Its direction relative to any of
the other vectors can be measured using a protractor.
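[Figure: the 9.0 N, 6.0 N, and 8.0 N vectors drawn to scale and placed end to
end, with the resultant drawn from the start of the first vector to the end
of the last.]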
Method 2: Constructing the Resultant from Its Components

Choose two convenient perpendicular axes. At least one of these should be
parallel to one of the existing forces because that force will not then need to
be resolved.
[Figure: x- and y-axes with the 9.0 N force along the x-axis, the 6.0 N force
at 20° to it, and the 8.0 N force at 120° to it.]
Resolving parallel to the x-axis: Fx = 9.0 + 6.0 cos (20°) − 8.0 sin (30°) = 10.64 N
Resolving parallel to the y-axis: Fy = 6.0 sin (20°) − 8.0 cos (30°) = −4.88 N
[Figure: the resultant's components, 10.64 N along the x-axis and 4.88 N in
the negative y-direction.]
Resultant magnitude is found using Pythagoras’s theorem:
$$F = \sqrt{10.64^2 + (-4.88)^2} = 11.7 \text{ N}$$
The angle between the resultant and the x-axis is found by:
tan φ = 4.88/10.64 = 0.459 so φ = 24.6° (below the x-axis, since Fy is negative)
An analytic approach like the one used here is preferable to scale drawing
because it will give a more accurate answer.
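The component method is easy to automate. This sketch reproduces the worked example; the function name and the convention of measuring angles counterclockwise from the x-axis are assumptions made for illustration.

```python
import math

def resultant(forces):
    """Sum forces given as (magnitude in N, angle in degrees from the x-axis)."""
    fx = sum(f * math.cos(math.radians(a)) for f, a in forces)
    fy = sum(f * math.sin(math.radians(a)) for f, a in forces)
    magnitude = math.hypot(fx, fy)
    angle = math.degrees(math.atan2(fy, fx))
    return fx, fy, magnitude, angle

# The three forces from the example, with the 9.0 N force along the x-axis.
# The 8.0 N force is entered at -120 degrees, which matches its components
# -8.0 sin(30°) and -8.0 cos(30°) in the calculation above.
fx, fy, magnitude, angle = resultant([(9.0, 0.0), (6.0, 20.0), (8.0, -120.0)])
```

The results agree with the hand calculation: Fx ≈ 10.64 N, Fy ≈ −4.88 N, a magnitude of about 11.7 N, and an angle of about −24.6° (that is, below the x-axis).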
4.2 MASS, WEIGHT, AND CENTER OF GRAVITY

4.2.1 Mass
Mass and weight are often confused. Mass is a scalar quantity related to the
amount of matter in an object and does not depend on the strength of
the gravitational field. A body of mass 1.0 kg has the same mass on Earth,
on the Moon, and in deep space. Mass determines the inertia of a body, that
is how hard it is to change its motion, and this will be discussed in much more
detail later when we look at forces and motion. This property of a body is
called its inertial mass.
Mass also determines how strongly an object responds to a gravitational field;
this is often called the gravitational mass of a body. The fact that all
objects fall with the same acceleration in the same gravitational field (see
Section 5.1.4) suggests that inertial and gravitational mass are directly
proportional to each other:
$$a = \frac{F}{m_\text{inertial}} = \frac{m_\text{gravitational}\, g}{m_\text{inertial}} = \left(\frac{m_\text{gravitational}}{m_\text{inertial}}\right) g$$
Einstein assumed that inertial and gravitational mass are equivalent. This
helped him to construct the general theory of relativity (see Section 23.6).
4.2.2 Weight
Weight is the gravitational force acting on a body and is a vector quantity. The
weight of a body depends on the strength of the gravitational field in which it
is placed and is given by the formula:
W = mg
where m is the mass in kilograms and g is the gravitational field strength meas-
ured in Nkg−1.
The gravitational field strength near the surface of the Earth is on average
about 9.81 Nkg−1 (standard gravity is 9.80665 Nkg−1) but varies by about 0.7 %
at different locations (from about 9.76 Nkg−1 on a mountain in Peru to about
9.83 Nkg−1 in Oslo).
Here are some values for gravitational field strength elsewhere in the solar
system.
                          Moon   Sun   Venus  Mars   Jupiter  Saturn
Surface gravity (Nkg−1)   1.62   275   8.87   3.69   24.8     10.5
Surface gravity (gEarth)  0.165  28.0  0.904  0.376  2.53     1.07
4.2.3 Center of Gravity

The center of gravity is the point through which the resultant gravitational
force on a body (its weight) can be considered to act. This point can lie inside
or outside the body itself depending on its shape and the distribution of mass
within it. For a body of uniform density and shape the center of gravity is at
its geometric center as shown in the diagrams below:
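[Figure: three uniform bodies, each with its center of gravity (CG) marked at
the geometric center and its weight acting downward through that point.]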
If an object is suspended from a point, it will align itself with the center of
gravity vertically below the point of suspension. This means that the loca-
tion of the center of gravity can be found by suspending the object sepa-
rately from two different points (A and B) and noting where the two lines
of suspension intersect inside the object. This is shown below for a two-
dimensional object.
[Figure: a two-dimensional object suspended first from point A and then from
point B. The center of gravity is where both lines cross.]
The reason that the center of gravity lies beneath the point of suspension is
that its line of action then passes through the point of suspension and so has
no moment or turning effect at that point. If the object is rotated slightly so
that the center of gravity lies on either side of this line then there would be
a resultant moment causing the body to rotate back toward the equilibrium
position. Equilibrium of moments (see Section 4.4.4) can be used to calculate
the position of a center of gravity for an extended body. The idea is simple.
The resultant moment of the body about any point must be equal to the
moment produced by its weight acting through the center of gravity. From
this equality, we can find the distance of the center of gravity from the point
about which we are taking moments. Here is an example of a uniform rod of
length l and mass m. We take moments about one end of the rod.
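[Figure: a uniform rod with one end at A. A thin strip of length δx lies at
distance x from A, and the weight mg acts at distance xCG from A.]
$$\text{Mass of strip of length } \delta x = \frac{m\,\delta x}{l}$$
$$\text{Moment about A of strip of length } \delta x = \frac{mgx\,\delta x}{l}$$
$$\text{Moment of entire rod about A} = \int_0^l \frac{mgx\,dx}{l} = \frac{mgl}{2}$$
This must equal the moment of the weight, mg, acting through the center of
gravity, about A.
$$mg\,x_{CG} = \frac{mgl}{2}$$
so $x_{CG} = \frac{l}{2}$: the center of gravity is at the midpoint of the rod, as expected.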
The terms “center of gravity” and “center of mass” are often used interchange-
ably, but they are only in the same position when the object concerned is placed
in a uniform gravitational field. If the field varies significantly across the object
then they will be in different positions. For example, for the rod above, if the
value of g increases from left to right along the bar then the contributions to
the resultant moment would be greater from strips near the right-hand end.
This would move the position of the center of gravity to the right of the center
of the bar, that is, xCG > l/2. The center of mass on the other hand would not
be affected and would remain at the center of the uniform bar. In practice, the
difference between the two positions is rarely significant.
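The strip argument can be repeated numerically, which also makes it easy to try a non-uniform field. In this sketch the rod length, mass, and the form of the varying field are assumptions, and the variation in g is greatly exaggerated so that the shift is visible.

```python
def center_of_gravity(length, mass, g_at, strips=10000):
    """Approximate x_CG of a uniform rod by summing moments of thin strips."""
    dx = length / strips
    strip_mass = mass * dx / length
    total_moment = 0.0
    total_weight = 0.0
    for i in range(strips):
        x = (i + 0.5) * dx                 # midpoint of the strip
        weight = strip_mass * g_at(x)      # weight of this strip
        total_moment += weight * x         # its moment about end A
        total_weight += weight
    return total_moment / total_weight

# Uniform field: the center of gravity is at the midpoint of a 2.0 m rod
uniform = center_of_gravity(2.0, 1.0, lambda x: 9.81)
# Hypothetical field that strengthens along the rod (greatly exaggerated)
varying = center_of_gravity(2.0, 1.0, lambda x: 9.81 * (1 + 0.1 * x))
```

In the uniform field the result is 1.0 m, the midpoint. In the exaggerated varying field it moves to about 1.03 m, toward the end where g is larger, while the center of mass stays at 1.0 m.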
4.3 EQUILIBRIUM OF COPLANAR FORCES

4.3.1 Using the Triangle of Forces to Solve Equilibrium Problems
Many problems in physics involve coplanar forces, often in a horizontal or
vertical plane. The forces will be in equilibrium if they add to give zero result-
ant force. When this is the case the forces themselves will, when placed end
to end in any order, form a closed shape. In the case of three forces this is
called a “triangle of forces” and if two of the forces are known the triangle can
be solved to find the third force that puts the system into equilibrium. Here
is an example showing a picture suspended from two strings that pass over a
nail in a wall. The three forces acting on the picture come from the tension in
each side of the string and from its weight. The system is in equilibrium so the
forces form a closed triangle of forces.
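[Figure: a picture hanging from a nail, acted on by tensions T1 and T2 and
weight W; alongside, the closed triangle formed by T1, T2, and W.]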
If there are more than three forces in equilibrium then they will form a closed
polygon of four or more sides.
Here is an example where the triangle of forces is used to calculate an unknown
force keeping the system in equilibrium. The diagram shows a pendulum held in
equilibrium by a horizontal force F. The problem is to find the magnitude of F.
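[Figure: a pendulum with its string at 25° to the vertical and a bob of
weight 8.0 N, held by a horizontal force F; alongside, the triangle of forces
formed by the 8.0 N weight, F, and the tension in the string.]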
From the triangle of forces: tan (25°) = F/8.0 so F = 8.0 tan (25°) = 3.7 N.
Using a triangle of forces is sometimes a convenient method to solve a simple
equilibrium problem, especially if two of the forces are perpendicular to
one another. However, for more complex problems we need a more general
method: resolving forces along perpendicular axes.
4.3.2 Resolving Forces to Solve Equilibrium Problems

If a system is in equilibrium there is no resultant force acting on it. This means
that the components of the resultant force along any set of axes will be zero.
We can use this fact to solve equilibrium problems—resolve all forces along
a common set of axes and set each component separately equal to zero. This
method can be illustrated using the example from Section 4.3.1. We will
resolve along horizontal and vertical axes since two of the forces act in these
directions (taking right and up as the positive directions). Let the tension in
the supporting string be T.
Resolving horizontally: F − T sin (25°) = 0 or T sin (25°) = F (1)
Resolving vertically: T cos (25°) − 8.0 = 0 or T cos (25°) = 8.0 (2)
Dividing (1) by (2): tan (25°) = F/8.0 so F = 8.0 tan (25°) = 3.7 N as before.
In this particular case, the method of resolving is a little more involved than
using the triangle of forces, but this method is much simpler if there are a
large number of forces and if none of them are perpendicular.
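The two resolved equations can be solved directly for both unknowns, which also gives the tension that the triangle-of-forces calculation did not need. The function name below is illustrative.

```python
import math

def pendulum_forces(weight, angle_deg):
    """Horizontal force F and string tension T that hold the string at
    angle_deg to the vertical, from the two resolved equations."""
    theta = math.radians(angle_deg)
    tension = weight / math.cos(theta)      # from T cos(theta) = W
    force = tension * math.sin(theta)       # from T sin(theta) = F
    return force, tension

force, tension = pendulum_forces(8.0, 25.0)
```

This gives F ≈ 3.7 N, as before, and a tension of about 8.8 N.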
4.4 TURNING EFFECTS OF A FORCE: MOMENTS, TORQUES, AND COUPLES
4.4.1 Moments and Torques
The moment of a force about a point P (regarded as the pivot) is defined as
the magnitude of the force multiplied by the perpendicular distance of its line
of action from that point, as shown below:
[Figure: a force F whose line of action is a perpendicular distance d from
the pivot P.]
Moment about P = Fd
The SI unit for moment is the Nm and the direction of the moment is usually
described as clockwise or counterclockwise about the pivot. In this example,
it is clockwise.
You might be tempted to think that this unit is equivalent to the joule, since 1
J = 1 Nm. However, when calculating a moment, the force and displacement
are perpendicular, whereas for work done they must be parallel.
Notice that d is the perpendicular distance from the pivot, not the distance
between the point where the force acts on the body and the pivot. This means
that we often have to resolve the force to calculate the moment. The example
below shows how to calculate the moment of an inclined force acting on a
uniform rod at a distance l from the pivot P.
This can be regarded as the magnitude of the force multiplied by the compo-
nent of l perpendicular to the line of action of the force (F × l sin θ) OR the
component of force perpendicular to l multiplied by l (F sin θ × l).
The term torque is often used in engineering or when studying rotational
dynamics. This is simply another name for a moment and is calculated in the
same way and measured in the same units.
If the line of action of a force acts through P its moment about P is zero.
4.4.2 Resultant Moment

If several coplanar forces act on an object the resultant moment (or torque)
is the sum of the moments created by each force taking into account their
direction (clockwise or counterclockwise). Here is an example in which several
blocks rest on a uniform beam of weight 3.0 N:
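[Figure: a uniform beam of weight 3.0 N balanced on a pivot P at its center.
An 8.0 N block sits 0.3 m from P on one side; a 12.0 N block at 0.5 m and a
5.0 N block at 0.7 m sit on the other side.]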
Resultant moment about P (taking clockwise as positive) = (0.5 × 12.0) +
(5.0 × 0.7) − (0.3 × 8.0) = 7.1 Nm clockwise. This would result in the beam
rotating clockwise with angular acceleration.
Note that the weight of the beam has zero moment about P because the
center of gravity of the beam is directly above P so the line of action of the
weight acts through P.
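The bookkeeping of signs is the main source of error in calculations like this. A minimal sketch, using a hypothetical `resultant_moment` helper with clockwise taken as positive, is:

```python
def resultant_moment(forces):
    """Sum of moments about the pivot, clockwise positive.

    Each entry is (force in N, perpendicular distance from pivot in m,
    +1 for a clockwise moment or -1 for a counterclockwise one).
    """
    return sum(sign * force * distance for force, distance, sign in forces)

blocks = [
    (12.0, 0.5, +1),   # clockwise
    (5.0, 0.7, +1),    # clockwise
    (8.0, 0.3, -1),    # counterclockwise
    (3.0, 0.0, +1),    # weight of the beam acts through the pivot
]
moment = resultant_moment(blocks)
```

The beam's weight is included with zero distance to make explicit that it contributes nothing, and the total is 7.1 Nm clockwise.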
4.4.3 Couples
When the line of action of a resultant force on a body acts through its center
of mass (CM) it causes a change in translational motion (e.g., an acceleration).
If it acts through another point in the body, it will change both its
translational and rotational motions.
[Figure: a force F acting through the center of mass (CM) produces
translational acceleration; a force F acting through another point produces
translational acceleration and rotational acceleration.]
A “couple” is a pair of parallel forces of equal magnitude acting in opposite
directions on the same body. They produce a resultant moment but no
resultant force.
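[Figure: a couple formed by two forces of magnitude F whose lines of action
lie at perpendicular distances x and y on either side of the center of mass
(CM) and are a distance d apart.]
Moment of a couple = Fd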
The perpendicular distances of the two lines of action from the center of
mass are x and y, respectively.
Moment of couple = Fx + Fy = F(x + y) = Fd
Notice that the moment of the couple is independent of the individual values
of x and y and depends only on their sum, in other words on the separation of
the two lines of action. This means that the moment of a couple has the same
value about any point in the body (or outside it).
Moment of a couple = magnitude of one force × perpendicular separation of
lines of action
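The claim that a couple has the same moment about any point can be checked by taking moments about several different points. In this sketch positions are measured along an axis perpendicular to the forces; the force, the separation, and the choice of points are assumed values.

```python
def couple_moment_about(point, force, line_1, line_2):
    """Moment about `point` of two equal, opposite, parallel forces.

    Positions are measured along an axis perpendicular to the forces.
    The force on line_1 has value +force and the one on line_2 has -force.
    """
    return force * (line_1 - point) - force * (line_2 - point)

F, d = 5.0, 0.4          # hypothetical force (N) and separation (m)
moments = [couple_moment_about(p, F, line_1=d, line_2=0.0)
           for p in (-1.0, 0.0, 0.1, 0.4, 3.0)]
```

The position of the point cancels out of the expression, so every entry equals Fd = 2.0 Nm, whether the point lies between the lines of action or outside them.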
4.4.4 The Principle of Moments

For an extended object to be in equilibrium there are two conditions:
No resultant force.
No resultant moment.
We have already seen how to determine if there is a resultant force. The prin-
ciple of moments states that there will be no resultant moment if, for a par-
ticular object:
The sum of clockwise moments about any point is equal to the sum of
counterclockwise moments about that same point.
Since this applies to any point, we are free to take moments about a conveni-
ent point—for example, one through which one or more of the applied forces
acts. This will simplify calculations. Here is an example.
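[Figure: a car hood of weight 300 N, hinged at one end and held open at 28°
by a strut that exerts a force S on the hood. The calculation below takes the
weight to act 0.80 m from the hinge and the strut force 0.45 m from the
hinge.]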
The hood of a car is held open in equilibrium by a strut as shown above. The
problem is to find the force from the strut. However, there is also an unknown
force from the hinge. The easiest way to solve this problem is to take moments
about the hinge position because this immediately eliminates the force that
acts through the hinge (because it will have no moment about this position).
Taking moments about the hinge position:
Clockwise moment from weight of hood = 300 cos (28°) × 0.80 = 212 Nm
Counterclockwise moment from strut force = S × 0.45 Nm
Applying the principle of moments: 0.45 S = 212 therefore S = 471 N
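[Figure: the forces on the hood: the 471 N force from the strut, the 300 N
weight, and the unknown force H from the hinge.]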
We could continue to solve for the unknown force H from the hinge by con-
sidering the equilibrium of forces. This gives us two more equations—one for
the horizontal components and one for the vertical components:
Resolving horizontally: H sin θ = 471 sin (28°) or H sin θ = 221 (1)
Resolving vertically (the vertical component of H acts downward):
H cos θ + 300 = 471 cos (28°) or H cos θ = 116 (2)
Dividing (1) by (2) gives tan θ = 221/116 = 1.91 and θ = 62.3°
Substituting back into (1) gives H = 250 N
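The whole calculation can be checked in a few lines. This sketch assumes, as the equations above imply, that the strut force acts at 28° to the vertical and that θ is the angle between H and the vertical; the variable names are illustrative.

```python
import math

angle = math.radians(28.0)
weight = 300.0          # N
d_weight = 0.80         # m, distance used for the weight in the moment equation
d_strut = 0.45          # m, distance used for the strut force

# Principle of moments about the hinge
strut = weight * math.cos(angle) * d_weight / d_strut

# Components of the hinge force needed for zero resultant force
h_horizontal = strut * math.sin(angle)
h_vertical = strut * math.cos(angle) - weight   # magnitude of downward component
hinge = math.hypot(h_horizontal, h_vertical)
theta = math.degrees(math.atan2(h_horizontal, h_vertical))
```

The vertical component of the strut force (about 416 N) exceeds the 300 N weight, so the hinge must pull the hood downward with about 116 N. The totals agree with the values above: S ≈ 471 N, H ≈ 250 N, and θ ≈ 62°.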
When coplanar forces act on an extended structure (as in the example above)
the principle of moments gives one equation and the equilibrium of forces
gives two more, so with three independent simultaneous equations, it is pos-
sible to solve for up to three unknowns.
4.5 STABILITY
4.5.1 Types of Mechanical Equilibrium
When a system is in equilibrium the forces and moments acting on it are bal-
anced. However, if the system is disturbed it might return to equilibrium or
depart from it. The behavior of the system when disturbed is determined by
its stability. Consider the three objects below, all of which are in equilibrium
and all of which are given a small clockwise displacement from equilibrium.
[Figure: objects A, B, and C, each shown in equilibrium and after a small
clockwise displacement.
A: moment returns object to equilibrium: STABLE EQUILIBRIUM.
B: moment takes object further from equilibrium: UNSTABLE EQUILIBRIUM.
C: no resultant moment: NEUTRAL EQUILIBRIUM.]

Another way to look at this is in terms of potential energy. In A, any disturbance
tends to raise the center of mass and increase the gravitational potential
energy (GPE), so forces act toward the equilibrium position because this is a
local minimum of potential energy. In B, the disturbance lowers the center of
mass and the GPE, so forces continue to act in the direction of the disturbance
to lower the potential energy of the system. In C, the disturbance has no effect
on the height of the center of mass, so the potential energy is unchanged.
[Figure: graphs of potential energy against position for A, B, and C.]
Equilibrium occurs when the gradient of potential energy with position is zero: $\frac{d\,\mathrm{PE}}{dx} = 0$.
Stable equilibrium occurs at a local minimum of potential energy: $\frac{d^2\,\mathrm{PE}}{dx^2}$ is positive.
Unstable equilibrium occurs at a local maximum of potential energy: $\frac{d^2\,\mathrm{PE}}{dx^2}$ is negative.
Neutral equilibrium occurs when $\frac{d\,\mathrm{PE}}{dx} = 0$ and $\frac{d^2\,\mathrm{PE}}{dx^2} = 0$.
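These conditions can be tested numerically for any potential energy curve. The sketch below is one possible illustration: the function name `classify_equilibrium`, the step size, the tolerance, and the three example curves are all assumptions chosen for the demonstration, and the derivatives are estimated by central differences rather than calculated exactly.

```python
def classify_equilibrium(pe, x0, h=1e-3, tol=1e-6):
    """Classify the point x0 using estimates of dPE/dx and d2PE/dx2."""
    first = (pe(x0 + h) - pe(x0 - h)) / (2 * h)
    second = (pe(x0 + h) - 2 * pe(x0) + pe(x0 - h)) / h**2
    if abs(first) > tol:
        return "not an equilibrium"   # gradient is not zero
    if second > tol:
        return "stable"               # local minimum
    if second < -tol:
        return "unstable"             # local maximum
    return "neutral"                  # flat to within the tolerance

# Hypothetical curves standing in for objects A, B, and C.
print(classify_equilibrium(lambda x: x**2, 0.0))    # bowl-shaped curve
print(classify_equilibrium(lambda x: -x**2, 0.0))   # hill-shaped curve
print(classify_equilibrium(lambda x: 5.0, 0.0))     # flat curve
```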
4.5.2 Degrees of Stability

If an object in stable equilibrium is displaced a small amount and released it
will return to its equilibrium position. However, a large displacement might
cause it to become unstable.

[Figure: an object shown with its weight W at equilibrium, after a small displacement, and after a larger displacement.]
The limit of stability is reached when the line of action of the weight passes
beyond the corner of the base which acts as a pivot. The resultant moment
then changes from counterclockwise to clockwise and when the object is
released it continues to rotate and falls over. The potential energy curve looks
something like this:
[Figure: graph of potential energy against angular displacement, with dotted lines marking the limits of stability.]
The dotted lines represent the limits of stability. These will move farther apart
if the object has a wide base and low center of gravity.
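The effect of base width and center of gravity height can be estimated with a little geometry. The sketch below assumes a symmetric object whose center of gravity lies directly above the middle of its base; the limit of stability is then the tilt angle at which the center of gravity is vertically above the pivot corner, so that tan(angle) = (half the base width) / (height of the center of gravity). The function name and the dimensions are hypothetical.

```python
import math

def tipping_angle_deg(base_width, cg_height):
    """Tilt angle at which the line of action of the weight passes through the pivot corner."""
    return math.degrees(math.atan2(base_width / 2, cg_height))

narrow_tall = tipping_angle_deg(0.20, 0.50)   # narrow base, high center of gravity
wide_low = tipping_angle_deg(0.60, 0.20)      # wide base, low center of gravity
print(f"{narrow_tall:.1f} degrees, {wide_low:.1f} degrees")
```

The second object can be tilted much further before it topples, which is the sense in which the limits of stability move farther apart.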
4.6 FRICTIONAL FORCES

Frictional forces can arise in a variety of ways, for example, surfaces rubbing
against one another, air resistance, or the drag on an object moving through a
liquid. However, all frictional forces oppose motion or the tendency to move
(e.g., when a car is parked on a hill frictional forces act up the hill). We will
discuss fluid friction when we consider viscosity (see Section 6.3) but here we
will concentrate on the frictional force between two surfaces in contact with
one another.
4.6.1 The Origin of Frictional Forces Between Surfaces in Contact

On a microscopic scale, all surfaces are uneven. This means that when they
are put together they only actually make contact at points and these are
under considerable pressure. The frictional force between the surfaces arises
because temporary bonds form at these points, and these must be broken in order
for one surface to slide over the other. Work must be done to break these
bonds, and this transfers energy to heat as the surfaces slide past one another.
[Figure: two surfaces in contact, showing the applied force, the frictional force, and the temporary bonds at the points of contact.]
The work done by the frictional force is W = Fs where s is the displacement
of one surface relative to the other.
In most cases, when two surfaces are in contact the frictional force depends on:
• The nature of the surfaces.
• The normal reaction force between the surfaces. The frictional force F between two surfaces is directly proportional to the normal reaction force, N: F ∝ N.
But it is independent of:
• The area of contact.
These simple rules work very well in many cases but there are exceptions. For
example, a pointed object might dig into a surface. Another common excep-
tion is for car tires in snow. Having wider tires does not affect the normal force
but it does increase the friction because the snow does not pack so much.
Another way of thinking about this is to say that the coefficient of friction is
dependent on the normal force.

4.6.2 Static and Dynamic (Kinetic) Friction

The frictional force between two surfaces is not constant; it depends on the
applied forces. For example, if a block rests on a rough surface and the hori-
zontal force applied to the block is gradually increased the block remains at
rest until the applied force reaches a limiting value and then it suddenly starts
to accelerate. This means that the frictional force must have been equal to
the applied force up until the point of slipping and must then have decreased
for the block to have an initial acceleration. This is shown in the graph below:
[Figure: a block on a rough surface with a gradually increasing applied force opposed by the frictional force, and a graph of frictional force against applied force showing the limit of static friction and the lower dynamic friction.]
The “limit of static friction” is the maximum frictional force acting between
the two surfaces when they are at rest. To move the block an applied force
greater than this limit must be used. Once the block moves and the surfaces
are sliding over one another the friction drops to a lower dynamic value.
4.6.3 The Coefficients of Friction

The limiting frictional force between two surfaces is directly proportional to
the normal reaction between them:
$F_{\text{limit}} = \text{constant} \times N$ or $F_{\text{limit}} = \mu N$
where µ is the “coefficient of friction,” a dimensionless constant dependent
upon the nature of the surfaces.
We have already seen that there is a difference between the limit of static
friction and dynamic friction so there are two coefficients of friction between
a pair of surfaces:
µS = coefficient of static friction
µK = coefficient of dynamic (kinetic) friction
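The behavior of the frictional force described in Sections 4.6.2 and 4.6.3 can be summarized as a simple piecewise model. The sketch below is an idealization under stated assumptions: the block starts at rest, the function name `friction_force` and the numerical values are hypothetical, and the friction is taken to switch abruptly from the static to the dynamic value once the limit is exceeded.

```python
def friction_force(applied, normal, mu_s, mu_k):
    """Magnitude of the friction on a block, initially at rest, for a given applied force."""
    limit = mu_s * normal      # limit of static friction
    if applied <= limit:
        return applied         # at rest: friction balances the applied force
    return mu_k * normal       # sliding: friction drops to the dynamic value

# Hypothetical block with N = 10 N, mu_s = 0.5, and mu_k = 0.4.
for applied in (2.0, 4.0, 5.0, 6.0):
    print(applied, friction_force(applied, 10.0, 0.5, 0.4))
```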

4.6.4 Measuring the Coefficient of Static Friction

There are two simple methods that can be used to measure µS.
Method 1: By measuring the minimum force needed to make the block slip.
Here is a diagram of the apparatus:
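[Figure: a block of mass M on a horizontal surface, connected by a string over a pulley to a hanger of mass m. The forces marked on the block are the normal reaction N, the weight Mg, and the frictional force F.]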
Masses are added to the hanger until the block just begins to slip. The limit of
static friction is then mg. The normal reaction is N = Mg.
Therefore mg = µSMg
so µS = m/M.
Method 2: By tilting the surface until the block just slips
[Figure: a block of mass M on a surface that is gradually lifted until the block starts to slip, with the angle at which the block just begins to slip marked.]

Here is a free-body diagram for the block at the limiting angle, when it is just
about to slip.
[Figure: free-body diagram of the block on the slope, with labels $F_{\text{limit}} = \mu_S N$ and W = Mg.]
At the limiting angle, the block is in equilibrium, so we can resolve forces
parallel and perpendicular to the surface:
Resolving parallel to the surface: µSN = Mg sin θ (1)
Resolving perpendicular to the surface: N = Mg cos θ (2)
Dividing (1) by (2) gives: µS = tan θ
The coefficient of static friction is equal to the tangent of the limiting
angle.
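Both methods reduce to a one-line calculation. The sketch below shows how readings might be converted into a value of µS; the function names and the sample readings are hypothetical and chosen only for illustration.

```python
import math

def mu_s_from_hanger(m_hanger, m_block):
    """Method 1: the limiting friction mg equals mu_s * Mg, so mu_s = m / M."""
    return m_hanger / m_block

def mu_s_from_tilt(limiting_angle_deg):
    """Method 2: mu_s is the tangent of the limiting angle."""
    return math.tan(math.radians(limiting_angle_deg))

# Hypothetical readings: 0.30 kg on the hanger for a 0.50 kg block, or slipping at 31 degrees.
print(round(mu_s_from_hanger(0.30, 0.50), 2))
print(round(mu_s_from_tilt(31.0), 2))
```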
4.6.5 Measuring the Coefficient of Dynamic (Kinetic) Friction

A method similar to method 2 mentioned above can be used. However, in this
case, the angle must be increased in small increments and each time the block
should be given a small push. When you reach the angle ϕ at which the block,
once pushed, continues to move down the plane at a small constant velocity
then µK = tan ϕ.
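This result follows from the same resolution of forces as in Method 2, on the assumption that the block slides at constant velocity so that the resultant force on it is zero. A short derivation, using N for the normal reaction and M for the mass of the block as before:

```latex
\begin{aligned}
\text{parallel to the surface:}\quad & \mu_K N = Mg\sin\phi \\
\text{perpendicular to the surface:}\quad & N = Mg\cos\phi \\
\text{dividing:}\quad & \mu_K = \tan\phi
\end{aligned}
```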

4.7 EXERCISES
1. In which of the following situations (if any) are the forces acting on a man
in equilibrium?
Lying still in his bed.
Sitting in a car seat when the car is traveling at constant velocity along
a motorway.
Standing in a lift that is moving upwards at constant velocity.
Floating, apparently weightless, inside an orbiting spacecraft.
Floating in a swimming pool.
2. Calculate the magnitude and direction of the resultant force acting on
each block in the free-body diagrams below:
[Figure: free-body diagrams (a) and (b), with force labels 5.0 N, 3.0 N, 6.0 N, 6.0 N, 4.0 N, 1.0 N, and 3.0 N and angles of 60° and 30°.]
3. A picture of mass 1.2 kg is suspended from a rigid support by two wires as
shown below.
[Figure: picture suspended by two wires, with dimensions of 40 cm, 15 cm, and 30 cm marked.]
Calculate the tension in each wire.

4. It is possible to balance the weight of a 1 meter ruler by placing the pivot
off-center as shown below. Assume that the center of gravity of the ruler
is at its geometric center.
[Figure: ruler balanced on an off-center pivot, with distances of 0.30 m and 0.15 m and a force of 1.0 N marked.]
Calculate the weight of the ruler.
5. The diagram below shows a paving slab of mass 300 kg resting on two
wooden supports, A and B. Assume that the center of gravity of the slab is
at its geometric center and that the forces between the supports act at the
centers of their areas of contact with the slab.
[Figure: slab resting on supports A and B, with dimensions of 2.00 m, 0.30 m, and 0.50 m marked.]

(a) Calculate the support forces from A and B.

(b) Calculate the minimum downward force required at the right-hand
end of the slab to cause it to tip.
6. A wooden block of mass 0.65 kg is at rest on a rough plank. One end of
the plank is slowly lifted and the block just begins to slip down the slope
when the angle to the horizontal is 25°. The plank is held at that angle and
the block moves with a constant acceleration of 0.20 ms−2 until it reaches
the bottom of the slope.
(a) Calculate the coefficient of static friction between the block and the
plane.
(b) Calculate the coefficient of kinetic friction between the block and
the plane.
(c) Sketch a graph to show how the frictional force between the block
and the plank varies from the time the plank is first lifted until
the block reaches the bottom of the plank. Include values on the
force axis.
CHAPTER 5
Newtonian Mechanics
5.0 INTRODUCTION
Newton’s masterwork, Philosophiæ Naturalis Principia Mathematica
(“Mathematical Principles of Natural Philosophy”) is probably the most
famous and important book (actually three books) in the history of physics. It
sets out the laws of motion and the law of gravitation and then applies these
laws to the motion of planets in the solar system. It also marks the beginning
of mathematical physics. This is not because mathematical arguments had not
been used in physics before Newton but because Newton provided a math-
ematical framework in which to tackle an enormous range of physical prob-
lems. Newton’s work had, and still has, a remarkably wide impact on science
and philosophy and is even used as a model in other apparently unrelated
disciplines (e.g., economics). If you want to be good at physics you need to be
good at mechanics.
5.1 NEWTON’S LAWS OF MOTION

5.1.1 Newton’s First Law of Motion
This is a deceptively simple law but, like many simple statements in physics, it
has deep significance. It describes the natural state of motion of objects when
they are not subjected to a resultant force. While we know it as Newton’s first
law it had already been stated by Galileo. He used a thought experiment to
explain it and this is described below.
Imagine releasing a small ball from the top of a U-shaped ramp. In the absence
of friction, we would expect it to rise up to the same height on the far side of
the U (A below). Now let the ramp on the far side slope up more gradually.
We would expect the ball to travel further along the ramp until, once again,
it reached its starting height (B below). Now let the ramp on the far side
continue horizontally. What will happen to the ball? Logically, it seems it must
continue to move at a constant velocity until such time as the ramp rises back
up again (C below). Galileo argued that this showed that there is no need for
an unbalanced force to keep things moving at constant velocity.
Newton’s First Law:
An object continues to remain at rest or move at constant velocity (in a straight
line) unless acted upon by a resultant force.
Comments
• Most moving objects are acted upon by many forces (e.g., thrust, gravity, contact forces, drag). If they are moving at constant velocity the resultant of all these forces must be zero.
• If we know that the forces acting on a particular object are in equilibrium, we cannot assume it is at rest; it might be moving at constant velocity.
• If an object is accelerating, decelerating, or changing direction the forces acting on it must be unbalanced: there is a resultant force. An example is an object moving at constant speed along a curved path: since the speed is unchanging there is no force component parallel to the motion but there must be a component of force perpendicular to it in order to cause the change of direction.
5.1.2 Galilean Relativity

Galileo thought deeply about the nature and causes of motion. He realized
that the law of inertia (effectively Newton’s first law) implied that the laws of
mechanics will be the same in a moving reference frame as they are in one at
rest. He used another thought experiment to show this:
Shut yourself up with some friend in the main cabin below decks on some large
ship, and have with you there some flies, butterflies, and other small flying
animals. Have a large bowl of water with some fish in it; hang up a bottle that
empties drop by drop into a wide vessel beneath it. With the ship standing
still, observe carefully how the little animals fly with equal speed to all sides
of the cabin. . . . When you have observed all these things carefully (though
doubtless when the ship is standing still everything must happen in this way),
have the ship proceed with any speed you like, so long as the motion is uniform
and not fluctuating this way and that. You will discover not the least change
in all the effects named, nor could you tell from any of them whether the ship
was moving or standing still.
Dialogue Concerning the Two Chief World Systems, translated by Stillman
Drake, University of California Press, 1953, pp. 186–187 (Second Day).
This was a profound observation. Einstein realized that if the laws of phys-
ics are the same in stationary and uniformly moving reference frames
then there is no fundamental difference between rest and motion; it just
depends on what you choose as your reference frame. It led Einstein to
the special theory of relativity (see Chapter 24) but in Galileo’s time, it
served a different purpose. Galileo was convinced that the Earth orbited
the Sun rather than the other way around as was believed by most people
at the time. They thought that it was obvious that the Earth was not moving
because we cannot feel the motion. Galileo’s thought experiment showed
that you would not expect to feel the motion – everything would happen
on Earth as if it was at rest so the argument against the Earth’s motion was
flawed.
Galilean relativity is the idea that:
The laws of mechanics are the same in all uniformly moving reference
frames.
5.1.3 Newton’s Second Law of Motion

The second law of motion deals with the effects of a resultant force. When a
resultant force acts on something it causes a change in the motion: accelera-
tion. The definition of acceleration is the rate of change of velocity and since
velocity is a vector this means that accelerations include speeding up, slowing
down, and changing direction. The second law can be stated in a number of
different ways. Here, we will state how it applies when a resultant force acts
on a body of constant mass. Later we will consider situations in which the
mass might not be constant.

Newton’s Second Law:

When a resultant force F acts on a body with constant mass m it produces an
acceleration a in the direction of the resultant force that is directly propor-
tional to the resultant force and inversely proportional to the mass:
$$a \propto \frac{F}{m}$$
We can replace proportionality with equality if we define the units in which
we measure resultant force in the following way.
Definition of the Newton
A resultant force of 1 N accelerates a mass of 1 kg at 1 m s$^{-2}$.
This makes the constant of proportionality 1 and the equation becomes:
$$a = \frac{F}{m}$$
Comments
• This is a vector equation: the resultant force and acceleration vectors are in the same direction.
• The term inertia is given to the property of mass that resists acceleration. The m in the equation above is sometimes called the “inertial mass.” The greater the mass of an object the greater its inertia.
• Newton argued that because objects in free fall accelerate toward the Earth, they must be acted upon by a resultant force directed toward the Earth – a gravitational force.
• Newton realized that because the planets follow curved paths they must also be acted upon by a resultant force. This is a gravitational force toward the Sun.
• Newton thought that the orbital motion of the Moon was caused by a gravitational force toward the Earth. This force has the same origin as the force that makes objects close to the surface fall downwards. By comparing the acceleration of an object near the surface with the acceleration of the Moon he was able to derive the famous inverse-square law of gravitation (see Section 23.1.1).
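The comparison in the last comment can be reproduced with modern data. The sketch below is an illustration under stated assumptions, not Newton's own calculation: it treats the Moon's orbit as a circle, uses the standard result that an object moving in a circle of radius r with period T has an acceleration of 4π²r/T² toward the center (a result not derived in this section), and uses approximate values for the radii and the period.

```python
import math

# Approximate modern values (assumptions, not taken from the text).
g_surface = 9.81            # acceleration of free fall at the Earth's surface / m s^-2
r_earth = 6.37e6            # radius of the Earth / m
r_orbit = 3.84e8            # mean radius of the Moon's orbit / m
period = 27.3 * 24 * 3600   # orbital period of the Moon / s

# Acceleration of the Moon toward the Earth for a circular orbit.
a_moon = 4 * math.pi**2 * r_orbit / period**2

acceleration_ratio = g_surface / a_moon
distance_ratio_squared = (r_orbit / r_earth) ** 2
print(f"g / a_moon = {acceleration_ratio:.0f}")
print(f"(r_orbit / r_earth)^2 = {distance_ratio_squared:.0f}")
```

The two ratios agree to within about one percent, which is what an inverse-square dependence on distance from the center of the Earth predicts.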
5.1.4 Free Fall

The ancient Greek philosopher Aristotle thought that more massive objects
should fall faster than less massive objects. Galileo proposed a thought
experiment that showed that they should fall at the same rate. It is simple but
compelling and it goes something like this.
1. Assume that more massive objects fall faster than less massive objects.
2. Now join them together and drop the composite object. How fast will it fall?
According to Aristotle the combined object should fall faster than either of
the original objects because it is more massive. However, shouldn’t attach-
ing the slower smaller mass to the larger faster mass slow the larger mass
down and shouldn’t attaching the larger mass to the smaller mass speed the
smaller mass up? This argument suggests that the composite body should
have a speed intermediate between the speeds of the small mass and large
mass alone. Aristotle’s idea leads to a contradiction. On the other hand, if all
objects fall at the same rate then there is no problem.
Galileo is said to have tested this idea by dropping two cannon balls of differ-
ent sizes from the top of the tower of Pisa and showing that they landed at the
same time. Many historians doubt that he actually did this but, in 1971, during
the Apollo 15 mission to the Moon, a hammer and a feather were dropped to
the Moon’s surface. They landed at the same time.
Newton’s second law can explain this. The resultant force on an object of
mass m in a gravitational field of strength g is its weight, mg. Using the equa-
tion above:
$$a = \frac{F}{m} = \frac{mg}{m} = g$$
The gravitational field strength is equal to the acceleration of free fall in that field.
Comment
It seems obvious that we can cancel the two m’s in the equation above.
However, if we think about this a little more deeply it is not so obvious that
they are the same thing. The m on the bottom of the equation is the “iner-
tial mass,” the m on the top is related to how strongly the mass responds
to a gravitational field, sometimes called the “gravitational mass.” They
are two different properties of mass, so the fact that all objects fall with
the same acceleration in the same gravitational field shows that they are
at least proportional to one another and possibly identical. This subtle
point was one of the clues that helped Einstein with his general theory of
relativity, a new theory of gravity.
5.1.5 Newton’s Third Law of Motion

The third law is the one that is most often misunderstood. It is usually stated as:
For every action there is an equal and opposite reaction.
However, to really understand this we must unpack it. What are “actions” and
“reactions”? Here is another statement of the third law.
Newton’s Third Law:
When A exerts a force on B, B exerts a force of equal magnitude on A. The
two forces are of the same type and act in opposite directions.
Comments
• Forces never arise by themselves; they always come in pairs as opposite ends of an interaction.
• The “action” and “reaction” forces act on different bodies (otherwise they would always cancel out and it would be impossible to change the motion of anything).
• The “action” and “reaction” forces are always of the same type – for example, both are contact forces, or both are gravitational forces.
• The fact that forces arise from interactions means that changing the motion of one body must cause an opposite change in the motion of another – this leads to the law of conservation of momentum (see Section 5.2.3).
Examples
1. The gravitational forces on the Earth and Moon are an “action–reaction” pair.
[Figure: the Earth and the Moon, showing the gravitational force from the Moon on the Earth and the gravitational force from the Earth on the Moon.]
2. When a ball rests on the ground there are two “action–reaction” pairs.

Notice that while it is true that the weight of the ball and the upward contact
force from the ground have equal magnitudes and act in opposite directions,
they do not form an action-reaction pair. They fail in three respects: they are
not the same type of force, they do not act on different bodies and they are
not part of the same interaction.
3. When a ball rests on an accelerating surface the contact force and the
weight are not equal. While each action-reaction pair remains balanced
there is now a resultant upward force on the ball, as shown by the free-
body diagram on the right in the diagrams below.
[Figure: a ball on ground that is accelerating upwards. Left: the two action-reaction pairs (contact force from ground acting on the ball and contact force from ball acting on the ground; gravitational force from Earth on ball and gravitational force from ball on Earth). Right: free-body diagram of the ball accelerating upwards, showing the contact force from ground acting on the ball and the gravitational force from Earth on ball.]

5.2 LINEAR MOMENTUM

Linear momentum is one of the most important physical quantities. It is con-
served in all closed systems and by all interactions.
Linear momentum p is defined as the product of mass and velocity:
p = mv
It is a vector quantity and its SI units are kg m s$^{-1}$ (equivalent to N s).
5.2.1 Newton’s Second Law in Terms of Linear Momentum

The statement of Newton’s second law in Section 5.1.3 leads to the equa-
tion F = ma. We can rewrite this in terms of the linear momentum of the
moving mass.
For a constant mass and force:
$$F = ma = \frac{m(v - u)}{t} = \frac{mv - mu}{t} = \frac{\text{change in momentum}}{\text{time}}$$
Or resultant force = rate of change of momentum

We can derive the same result using calculus:
$$F = m\frac{dv}{dt} = \frac{d(mv)}{dt} = \frac{dp}{dt}$$
Here we have used the fact that m is constant so that $m\frac{dv}{dt} = \frac{d(mv)}{dt}$, but the
final equation is valid more generally, even if m is changing. This allows us to
restate Newton’s second law of motion.
Newton’s Second Law
The resultant force acting on a body is equal to the rate of change of linear
momentum of that body:
$$F = \frac{dp}{dt}$$
The statement and equation (F = ma) in Section 5.1.3 is a special case, for
constant mass, of the general equation given here.

5.2.2 Impulse and Change of Momentum

The longer a force acts on a body the greater the change of momentum of the
body. This suggests that the product of force and time might be a useful physi-
cal quantity. It is called “impulse.” When the force and mass are constant we
can derive an equation for impulse:
$$F = \frac{mv - mu}{t} \quad \text{so} \quad Ft = (mv - mu)$$
Or impulse = change of momentum

The SI unit for impulse is the newton-second (N s) which is equivalent to the
unit for momentum, kg m s$^{-1}$.
We can derive a more general equation for impulse by integrating Newton’s
second law $F = \frac{dp}{dt}$:
$$\int F\,dt = \int dp = \Delta p$$
The term on the left-hand side is equal to the area under a graph of force
against time, and this is equal to the change of momentum. For example, the
force exerted on a football when it is kicked might vary as shown in the graph
below. The area under the graph will be equal to the momentum transferred
to the ball as it is kicked.
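When the force is known only as a set of readings, the area under the force-time graph can be estimated numerically. The sketch below uses the trapezium rule; the function name `impulse`, the force readings, and the ball mass are hypothetical values invented for the illustration.

```python
def impulse(times, forces):
    """Area under a force-time graph by the trapezium rule / N s."""
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (forces[i] + forces[i - 1]) * (times[i] - times[i - 1])
    return total

# Hypothetical readings during a kick: the force rises to a peak and falls back to zero.
t = [0.000, 0.002, 0.004, 0.006, 0.008, 0.010]   # time / s
F = [0.0, 400.0, 900.0, 900.0, 400.0, 0.0]        # force / N

J = impulse(t, F)      # momentum transferred to the ball / kg m s^-1
mass = 0.43            # hypothetical mass of the ball / kg
speed = J / mass       # speed gained by a ball that starts at rest / m s^-1
print(f"impulse = {J:.1f} N s, speed gained = {speed:.1f} m/s")
```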
5.2.3 Conservation of Linear Momentum

Consider an interaction between two bodies; this might be an attraction, a
repulsion, or a collision of some kind. From Newton’s third law, each body
will exert an equal but opposite force on the other and these forces will act in
opposite directions along the same line and will act for the same time.
Taking positive values in the x-direction the impulse on each body during a
short time dt is given by:
Impulse on A = $F_{B\,\text{on}\,A}\,dt$
Impulse on B = $F_{A\,\text{on}\,B}\,dt = -F_{B\,\text{on}\,A}\,dt$
During the interaction, the impulse given to each body, calculated from the
integral ∫Fdt , is also of equal magnitude but opposite in direction. Since
impulse is equal to change in momentum, the change in momentum of A is
equal and opposite to the change in momentum of B. The momentum change
of the complete system is zero:
$$\text{Change of momentum of system} = \int_0^t F_{A\,\text{on}\,B}\,dt + \int_0^t F_{B\,\text{on}\,A}\,dt = 0$$
Linear momentum is conserved in an interaction between two bodies.
Newton’s third law tells us that all forces arise as a result of interactions so the
argument above will apply many times over for a complex system of interacting
bodies, as long as we include all the pairs of forces. This can be stated as a law:
The Law of Conservation of Linear Momentum
The linear momentum of a closed system is constant.
Comments
• A “closed system” means that we include all pairs of forces within the system. Another way of saying this is that the linear momentum of a system is constant if no external resultant force acts on the system.
• Linear momentum is a vector quantity, so the conservation of linear momentum implies a separate conservation of each component of linear momentum.
• An object moving along a curved path has a continuously changing momentum (it is changing direction even if its magnitude is constant).

To solve problems involving the conservation of momentum it is useful to
consider the total momentum of the objects before and after they interact and
then to equate them. Here is a simple one-dimensional example from nuclear
physics.
Alpha decay: an unstable nucleus of mass M is initially at rest and decays by
emitting an alpha particle (this consists of 2 neutrons and 2 protons) of mass m.
The new nucleus recoils with velocity u. What is the velocity, v, of the emitted
alpha particle?
Taking velocities to the right to be positive, and noting that the new nucleus
has mass (M − m):
Momentum before = 0
Momentum after = mv − (M − m)u
Therefore mv − (M − m)u = 0 and $v = \frac{(M - m)}{m}u$
m
In alpha decay, the new nucleus is usually much more massive than the
alpha particle so the alpha particle has a much larger velocity and takes away
most of the energy released in the decay (kinetic energy is proportional to
velocity-squared).
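A quick calculation shows how uneven the split is. The sketch below is an illustration under stated assumptions: the function name `alpha_recoil` is hypothetical, the mass of the recoiling nucleus is passed in directly as `m_daughter`, the masses are represented by illustrative mass numbers (4 for the alpha particle and 234 for the new nucleus), and the recoil speed is set to 1 so that the alpha particle speed comes out as a multiple of u.

```python
def alpha_recoil(m_alpha, m_daughter, u_daughter):
    """Alpha particle speed and its share of the kinetic energy, from momentum conservation."""
    v_alpha = m_daughter * u_daughter / m_alpha    # m v = (mass of new nucleus) u
    ke_alpha = 0.5 * m_alpha * v_alpha**2
    ke_daughter = 0.5 * m_daughter * u_daughter**2
    return v_alpha, ke_alpha / (ke_alpha + ke_daughter)

v, share = alpha_recoil(4.0, 234.0, 1.0)
print(f"v = {v:.1f} u, alpha particle takes {100 * share:.1f}% of the kinetic energy")
```

For equal and opposite momenta the kinetic energies are in the inverse ratio of the masses, so the lighter particle always takes the larger share.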
Here is an example of a two-dimensional collision between two balls of masses
m1 and m2.

The x- and y-components of momentum must be conserved independently
of one another:
x-momentum before = $m_1 u + 0 = m_1 u$
x-momentum after = $m_1 v_1 \cos\theta + m_2 v_2 \cos\phi$
therefore $m_1 u = m_1 v_1 \cos\theta + m_2 v_2 \cos\phi$ (1)
y-momentum before = 0
y-momentum after = $m_2 v_2 \sin\phi - m_1 v_1 \sin\theta$
therefore $m_2 v_2 \sin\phi = m_1 v_1 \sin\theta$ (2)
Equations (1) and (2) can then be used to determine the values of up to two
unknown quantities.
Conservation of momentum can also be shown by drawing vectors to repre-
sent the momenta of the bodies before and after the collision.
The sum of the vector momenta after the collision must equal the total vector
momentum before the collision. Solving the triangle using trigonometry is
equivalent to taking components and deriving the equations (1) and (2) above.
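As an illustration of using equations (1) and (2), suppose the masses, the initial velocity, and both angles are known and the two final speeds are wanted. Eliminating one speed between the two equations gives v1 = u sin φ / sin(θ + φ) and v2 = m1 u sin θ / (m2 sin(θ + φ)). The sketch below evaluates these expressions and substitutes the results back into both equations as a check; the function name and the numerical values are hypothetical.

```python
import math

def speeds_after_collision(m1, m2, u, theta_deg, phi_deg):
    """Solve equations (1) and (2) for v1 and v2, given the two angles."""
    theta, phi = math.radians(theta_deg), math.radians(phi_deg)
    v1 = u * math.sin(phi) / math.sin(theta + phi)
    v2 = m1 * u * math.sin(theta) / (m2 * math.sin(theta + phi))
    return v1, v2

# Hypothetical collision: equal masses, u = 4.0 m/s, angles of 30 and 60 degrees.
m1, m2, u = 0.20, 0.20, 4.0
theta, phi = math.radians(30.0), math.radians(60.0)
v1, v2 = speeds_after_collision(m1, m2, u, 30.0, 60.0)

# Substitute back into the component equations.
x_after = m1 * v1 * math.cos(theta) + m2 * v2 * math.cos(phi)
y_after = m2 * v2 * math.sin(phi) - m1 * v1 * math.sin(theta)
print(f"v1 = {v1:.2f} m/s, v2 = {v2:.2f} m/s")
print(f"x-momentum after = {x_after:.3f}, y-momentum after = {y_after:.3f}")
```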
5.3 WORK, ENERGY, AND POWER

When a stone is kicked it moves for a while and then comes to rest. It appears
that the movement energy has disappeared. James Joule realized that the
energy of motion in the moving stone has not disappeared but has been trans-
ferred to other forms of energy in the objects with which it has interacted. If
you slide a book over the surface of a table it eventually stops moving, but the
motion energy it had has been transferred to the particles in the surfaces of
the tabletop and the book and when the book stops moving these particles are
vibrating more vigorously than before. Some thermal energy has been gener-
ated and there has been an increase in temperature. Count Rumford noticed
a similar effect when cannons were bored in the arsenal in Munich and even
carried out an experiment to show that if the cannons were bored underwater
the water could be brought to a boil. This suggested that the motion involved
in boring the cannons had produced thermal energy.
Joule realized that there is a mechanical equivalent of heat: in other words,
the book does work as it comes to rest and this transfers its kinetic energy to
thermal energy in the surroundings. Joule’s work set the stage for the idea that
energy is conserved. However, energy comes in a variety of different forms,
and while it cannot be created or destroyed, it can be transferred. When the
book stops moving, its kinetic energy has been transferred to thermal energy
which spreads into the surroundings.
5.3.1 Work
Work is the transfer of energy when the point of application of a force moves
in the direction of the force:
Work done (J) = Force applied (N) × displacement parallel to force (m)
The SI unit of energy is the joule (J).
1 joule of energy is transferred when a force of 1 N moves through 1 m.
(1 J = 1 Nm).
If the force is at an angle to the displacement then the component of force
parallel to the displacement must be used.
Work done to move block through displacement s is: W = Fs cos θ
If F and s are expressed as vectors the work done is the scalar product of the
vectors: $W = \mathbf{F} \cdot \mathbf{s}$
In many cases the force varies with time or position so the work done can be
found graphically or, if there is a formula for the variation of force, by using
calculus.
1. Work calculated from a graph of force against displacement (parallel to the force)

[Figure: graph of force / N against displacement / m. The area under a force against displacement graph is equal to the work done by the force.]
2. Work done calculated by integration

Consider a varying force F(x) acting along the x-axis. The work δW done as the
force moves from x to (x + δx) is given by:
$$\delta W = F(x)\,\delta x$$
The total work $W_{ab}$ done as the force moves from x = a to x = b is given by:
$$W_{ab} = \sum_{x=a}^{x=b} F(x)\,\delta x$$
And if we let the small increments in x tend to zero (δx → 0) this sum becomes
a continuous integral:
$$W_{ab} = \int_{x=a}^{x=b} F(x)\,dx$$
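The step from a sum to an integral can be seen numerically. The sketch below adds up F(x) times a small increment in x over many steps, evaluating the force at the middle of each step; the function name `work_done` and the force law F = kx (with a hypothetical k of 50 N/m) are assumptions for the illustration, chosen because the exact integral, ½kb² − ½ka², is easy to compare with.

```python
def work_done(force, a, b, steps=1000):
    """Approximate the work by summing F(x) * dx over small steps (midpoint rule)."""
    dx = (b - a) / steps
    return sum(force(a + (i + 0.5) * dx) * dx for i in range(steps))

k = 50.0                                   # hypothetical force constant / N m^-1
W = work_done(lambda x: k * x, 0.0, 0.20)  # numerical sum / J
exact = 0.5 * k * 0.20**2                  # exact integral of kx from 0 to 0.20 m / J
print(f"sum = {W:.4f} J, integral = {exact:.4f} J")
```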
5.3.2 Gravitational Potential Energy Changes (Uniform Field)

The strength and direction of the gravitational field g are almost constant
close to the Earth’s surface. g is given by the equation:
$$g = \frac{F_{\text{grav}}}{m}$$
and has a value of about 9.81 N kg$^{-1}$ close to the Earth’s surface.
The fact that the field is almost uniform makes it very easy to calculate the
work needed to lift a mass near the Earth’s surface.
The diagram above shows a mass m being lifted through a height h by two
different routes: vertically to A and along an inclined path to B. The vertical
force needed to lift the mass is always equal to mg since the applied force
must balance the weight of the mass. In the absence of frictional forces, there
is no horizontal component of force to consider.
Work done lifting mass vertically: W = mgh
Work done along the inclined path: W = mg (h/cos θ) cos θ = mgh
The work done is independent of the path taken and depends only on the
height through which the mass is lifted. If the mass is allowed to fall back to
the ground it will gain kinetic energy which could be used to do some useful
work (e.g., generating electricity). Lifting it in a gravitational field has given it
the potential to do work.
Gravitational potential energy (GPE) is the potential energy an object has
because of its position in a gravitational field.
In a uniform gravitational field change in GPE is given by: ΔGPE = mgh
If a mass is moved around a closed loop inside a gravitational field (any gravi-
tational field, not just a uniform field) the work done by an external agent in
lifting it is equal to the work done by the gravitational field as it comes back
down. If it is returned to its original position the work done in the loop is zero.
This is an example of what is called a “conservative field.” Gravitational fields
are conservative fields.
5.3.3 Kinetic Energy

Kinetic energy is the energy a body has because of its motion. We can use the
definition of work to derive an expression for kinetic energy by considering the
work that must be done to accelerate a mass m from velocity u to velocity v.
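Consider a mass moving at velocity v and acted upon by a resultant force F.
In a short time $\delta t$ the work done on the mass is given by: $\delta W = F\,\delta s = Fv\,\delta t$
Using $F = ma = m\,\dfrac{\delta v}{\delta t}$: $\delta W = m\,\dfrac{\delta v}{\delta t}\,v\,\delta t = mv\,\delta v$
Taking the limit of $\delta t \to 0$ and integrating between initial and final velocities
gives:
$$W = \int_{u}^{v} mv\,dv = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mu^2$$
The work done is equal to the change in kinetic energy so: $\mathrm{KE} = \tfrac{1}{2}mv^2$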
It is also possible to derive this relationship by considering a constant resultant
force acting on a constant mass to produce a constant acceleration. The
displacement during acceleration is given by:
$$s = \frac{v^2 - u^2}{2a}$$
Multiplying throughout by the mass m and acceleration a gives:
$$mas = \frac{m\,(v^2 - u^2)}{2}$$
Using F = ma we obtain: $W = Fs = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mu^2$ as before.
5.3.4 The Law of Conservation of Energy

The law of conservation of energy can be stated in several different ways.
Here are two of them:
Energy is never created or destroyed but one form can be transferred to
another.
The total energy of a closed system is constant.
Energy is a scalar quantity so working out how much energy is present in a
system is simply a matter of adding the energies of each part of the system.
Alpha decay is a good example. We have already seen that the total momen-
tum of the system is zero before and after the decay. However, nuclear energy
has been transferred to the kinetic energy of the new nucleus formed in the
decay and the emitted alpha particle.
Total momentum = $mv - (M-m)u = 0$
Total kinetic energy = $\tfrac{1}{2}mv^2 + \tfrac{1}{2}(M-m)u^2 \neq 0$ (direction does not affect the sign of KE)
Notice that, in this example, kinetic energy is not conserved. It is important to
realize that the law of conservation of energy is about total energy and not any
one kind of energy. In some interactions and collisions (usually on the atomic
scale) kinetic energy IS conserved. When this is the case the interaction is
described as an “elastic” interaction. In the vast majority of interactions, some
of the initial kinetic energy is transferred to other forms (such as thermal
energy) and these are described as inelastic interactions.
Type of interaction    Momentum     Kinetic energy
Elastic                conserved    conserved
Inelastic              conserved    not conserved
5.3.5 Energy and Momentum in a 2D Collision

We can use a two-dimensional collision to illustrate how the laws of conserva-
tion of momentum and energy are used in problems. In this example, a ball of
mass m1 coming in from the left strikes another stationary ball of mass m2, and
then the two balls move off in different directions.
We can apply the conservation of momentum along each axis:
x-axis: $m_1u_1 + 0 = m_1v_1\cos\theta + m_2v_2\cos\phi$   (1)
y-axis: $0 = m_2v_2\sin\phi + m_1v_1\sin\theta$   (2)
Kinetic energy before the collision: $E_K(\text{before}) = \tfrac{1}{2}m_1u_1^2$   (3)
Kinetic energy after the collision: $E_K(\text{after}) = \tfrac{1}{2}m_1v_1^2 + \tfrac{1}{2}m_2v_2^2$   (4)
If the collision is elastic then: EK (before) = EK (after).

If the collision is inelastic then: EK (before) - EK (after) = KE transferred to
other forms.
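Equations (1) to (4) can be solved together once the two directions are known. The sketch below assumes that both angles are signed and measured from the x-axis, so a ball deflected to the other side of the axis has a negative angle; this convention is an assumption chosen to match the plus sign in equation (2). The function name and the numerical values are illustrative only.

```python
import math

def solve_collision(m1, m2, u1, theta_deg, phi_deg):
    """Solve momentum equations (1) and (2) for v1 and v2, then find the KE transferred."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    # Eliminating v2 between (1) and (2) gives v1; substituting back gives v2.
    v1 = u1 * math.sin(phi) / math.sin(phi - theta)
    v2 = -m1 * u1 * math.sin(theta) / (m2 * math.sin(phi - theta))
    ke_before = 0.5 * m1 * u1**2                     # equation (3)
    ke_after = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2   # equation (4)
    return v1, v2, ke_before - ke_after

# Equal masses leaving at right angles to each other (30 degrees and -60 degrees).
v1, v2, ke_transferred = solve_collision(1.0, 1.0, 2.0, 30.0, -60.0)
print(v1, v2, ke_transferred)
```

In this example the kinetic energy transferred is zero to within rounding error, so the collision is elastic. Other pairs of angles generally give a non-zero value.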
5.3.6 Energy Transfers

There are many different types of energy.
Kinetic energy: the energy of a body as a result of its motion.
Potential energy: energy stored as a result of the position of a body in a
field, for example,
Gravitational potential energy
Electrical potential energy
Nuclear potential energy.
There are also other forms of potential energy which are related to elec-
trical potential energy because they relate to the bonds between particles
inside materials:
Chemical potential energy
Elastic potential energy (strain energy).
Thermal energy: sum of random kinetic energies of all particles in a body.
It differs from the kinetic energy above in that these motions average to
zero in the center of mass frame of the body.
Radiant energy: energy of electromagnetic waves (or photons).

Rest energy: the energy associated with mass through Einstein’s equation
$E = mc^2$.
Heat and work are not forms of energy. These are two important ways in
which energy is transferred:
Work is the energy transferred by a force when the point of action of the
force is displaced in the direction of the force.
Heat is the energy transferred because of a temperature difference
between two systems.
Devices that transfer energy from one form to another are called “transduc-
ers.” Energy transfers are often displayed using a flow or Sankey diagram. For
example, the diagram below is for a compact fluorescent lamp that converts
60% of the electrical energy supplied to it into visible radiant energy (light).
The efficiency of a transducer is the ratio of the useful output energy to the
total input energy, usually expressed as a percentage. In the case of the compact
fluorescent lamp above this would be an efficiency of 0.60 or 60%:
$$\text{efficiency} = \frac{\text{useful output energy}}{\text{total input energy}} \times 100\%$$
5.3.7 Power
Power is the rate of transfer of energy.
$$\text{Power} = \frac{\text{energy transferred}}{\text{time}}$$
For continuous energy transfers: $P = \dfrac{dE}{dt}$
The SI unit for power is the watt (W) and 1 W = 1 J s⁻¹.
Be careful not to confuse W for watt (a unit) with W for work (a physical
quantity).
If the energy is transferred as work this is the rate of doing work. If work W is
done in time t the power will be $P = \dfrac{dW}{dt} = \dfrac{d(Fs)}{dt}$
If the force is constant this becomes: $P = F\,\dfrac{ds}{dt} = Fv$
This is the scalar product of force and velocity: $P = \mathbf{F}\cdot\mathbf{v}$
Efficiency can also be expressed in terms of power:
$$\text{efficiency} = \frac{\text{useful output power}}{\text{total input power}} \times 100\%$$
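The scalar product form of the power equation and the efficiency ratio can both be sketched in a few lines. The function names and the numbers below are hypothetical and chosen only to show the arithmetic.

```python
def power_from_force(force, velocity):
    """P = F . v for a constant force; force and velocity are (x, y, z) components."""
    return sum(f * v for f, v in zip(force, velocity))

def efficiency_percent(useful_output, total_input):
    """Efficiency as a percentage of the total input (energy or power)."""
    return 100.0 * useful_output / total_input

# A 400 N force along the direction of motion at 15 m/s transfers 6000 W.
print(power_from_force((400, 0, 0), (15, 0, 0)))
# The same force at right angles to the motion does no work.
print(power_from_force((0, 400, 0), (15, 0, 0)))
# A device delivering 30 W of useful output from a 120 W input.
print(efficiency_percent(30, 120))
```

Only the component of the force along the direction of motion contributes to the power.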
5.4 ENERGY RESOURCES

It seems almost contradictory to say that we may face a global energy crisis
when energy is a conserved quantity. The problem is that when energy-dense
fuels are used to provide useful work or heating, the energy itself becomes
spread out among an ever-increasing number of particles and usually ends
up as thermal energy at a low temperature (see Chapter 10). This thermal
energy cannot be converted efficiently into useful work, so the availability
of energy to do work decreases. There is a limit to our ability to extract
work from heat set by the laws of thermodynamics (see Section 10.2.3) so we
continue to look for new sources of fuel (chemical or nuclear) and to improve
technologies that can extract energy from our environment. Primary energy
sources are usually used to generate electrical energy which can be easily
transferred over long distances by transmission lines. The table below gives
some information about some primary energy sources.
Primary source           Energy type       Primary energy transfer    Category
Fossil: oil, coal, gas   Chemical          Chemical → thermal         Non-renewable
Uranium, plutonium       Nuclear fission   Nuclear → thermal          Non-renewable
Isotopes of hydrogen     Nuclear fusion    Nuclear → thermal          Non-renewable
Solar                    Radiant           Radiant → electrical       Renewable
The wind                 Kinetic           Kinetic → electrical       Renewable
Geothermal               Thermal           Thermal → kinetic          Renewable
Hydroelectric            GPE               GPE → kinetic              Renewable
Tidal                    GPE               GPE → kinetic              Renewable
Biomass                  Chemical          Chemical → thermal         Renewable
Waves                    Kinetic           Kinetic → electrical       Renewable
Fossil and nuclear fuels are “non-renewable energy sources.” Once we use
them, they are not regenerated, and eventually, they will run out. However,
some non-renewable fuels, for example, those used for nuclear fusion, would
be capable of providing energy for millennia. Unfortunately, we have not yet
solved the technical problems associated with building an effective working
nuclear fusion reactor. However, a huge international research reactor, ITER,
is being constructed in France as a major step toward a commercial fusion
reactor and it is hoped that this will produce its first plasmas in the mid-2020s.
Renewable energy resources are those that are naturally regenerated in a short
period of time.
5.5 PROPULSION SYSTEMS

A propulsion system changes the way an object moves or maintains its motion
in the presence of opposing forces. In order to generate a driving force that
acts on the moving object Newton’s third law implies that the system must
exert an equal force in the opposite direction on something else. Newton’s
second law tells us that the magnitude of the driving force will be equal to the
magnitude of the rate of change of momentum of the material that the system
pushes against. Wheels, propellers, and jet engines exert forces on their sur-
roundings. Rockets eject burnt fuel and exert force on that.
5.5.1 Jet Propulsion

The simplest way to think of a jet engine is as a system that accelerates a gas
backward and in so doing generates a forward force on the jet engine. The
jet pushes back on the atmosphere and the atmosphere pushes forward on it.
Here is a schematic of a jet engine:
The force exerted on the jet engine will be equal in magnitude to the rate of
change of momentum of the gas passing through the engine. While it is true
that the combustion of fuel inside the engine adds mass to the air passing
through it this is in practice a very small contribution so we can simplify the
analysis by assuming that the mass flow in and out of the system is solely due
to the air. This means that the mass flow rate in and out of the jet is the same.
If u is the air speed of the aircraft and v is the speed of the exhaust gases then
the thrust will be given by a simple formula:
$$F = \frac{dm}{dt}\,(v - u)$$
Clearly, v must be greater than u for the jet to produce a forward thrust.
We can also see that there are two ways to increase the thrust:
Increase the mass flow rate (dm/dt).
Increase the exit velocity v of the gases, and so the velocity change (v - u).
Different types of jet use different methods.
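The thrust formula can be tried out with some numbers. The values below are hypothetical and are not data for any real engine; the sketch simply shows that either route to greater thrust has the expected effect.

```python
def jet_thrust(mass_flow_rate, exhaust_speed, air_speed):
    """Thrust in newtons: F = (dm/dt) * (v - u), with speeds in m/s and flow in kg/s."""
    return mass_flow_rate * (exhaust_speed - air_speed)

baseline = jet_thrust(100.0, 600.0, 250.0)
more_air = jet_thrust(200.0, 600.0, 250.0)       # double the mass flow rate
faster_exhaust = jet_thrust(100.0, 950.0, 250.0)  # double the velocity change (v - u)
print(baseline, more_air, faster_exhaust)
```

Doubling either the mass flow rate or the velocity change doubles the thrust.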
5.5.2 Rockets
A jet could not work in a vacuum because there would be no external material
against which to push. Rockets get around this problem by ejecting a large
amount of matter (in the form of burnt fuel) at very high velocity. The change
in momentum of the ejected matter is equal and opposite to the change in
momentum of the rocket. Once again, we can derive an equation for the
thrust of a rocket using Newton’s second law, but this time we are dealing with
an object of changing mass:
$$F = \frac{d(mv)}{dt}$$
If the rocket is in space with no external forces acting upon it then we can use
conservation of momentum to show that the change of velocity of the rocket
depends on the proportion of its mass expelled as burnt fuel. Consider a short
time dt during which the rocket expels a mass dm at a velocity u relative to the
rocket. At this time the rocket has a forward velocity v and a mass m.
Taking motion to the right to be positive and using conservation of momentum:

Change of momentum of exhaust gases = $-u\,dm$   (1)
Change of momentum of rocket = $(m - dm)\,dv = m\,dv - dm\,dv$   (2)
The final term in (2) can be neglected because it is the product of two small
quantities and is therefore second-order. Now, we use the conservation of
momentum before and after this mass of fuel is ejected:
$$-u\,dm + m\,dv = 0$$
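Now separate variables and integrate from an initial velocity $v_0$ to a final velocity
$v_f$ during which time the mass of the rocket falls from $m_0$ to $m_f$. Expelling a
mass dm reduces the mass m of the rocket by dm, which introduces a minus sign
when we integrate over the mass of the rocket:
$$-\int_{m_0}^{m_f} \frac{dm}{m} = \int_{v_0}^{v_f} \frac{dv}{u}$$
leading to:
$$v_f - v_0 = u \ln\frac{m_0}{m_f}$$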

The final velocity of the rocket can be increased by:

Increasing the velocity of the exhaust gases.
Increasing the ratio m0/mf.

The second condition requires that the final mass of the rocket is small com-
pared to its initial mass, so the majority of the rocket’s mass at launch will be
in its fuel.
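The logarithmic result can be checked against a direct step-by-step application of momentum conservation. In the sketch below the rocket throws out its fuel in many small parcels and gains a velocity u dm / m each time. The exhaust speed and the masses are hypothetical values.

```python
import math

def delta_v_analytic(u, m0, mf):
    """Change in rocket velocity from the integrated result u * ln(m0 / mf)."""
    return u * math.log(m0 / mf)

def delta_v_stepwise(u, m0, mf, steps=100000):
    """Apply -u dm + m dv = 0 repeatedly as the rocket ejects small masses dm."""
    dm = (m0 - mf) / steps
    m, v = m0, 0.0
    for _ in range(steps):
        v += u * dm / m  # velocity gained when mass dm leaves at relative speed u
        m -= dm          # the rocket is now lighter
    return v

u, m0, mf = 3000.0, 1000.0, 100.0  # m/s, kg, kg (illustrative values)
print(delta_v_analytic(u, m0, mf), delta_v_stepwise(u, m0, mf))
```

The two values agree closely, and the agreement improves as the number of steps increases.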
5.5.3 Radiation Pressure

Einstein’s photon theory assumes that electromagnetic radiation can only
transfer energy to and from matter in discrete quanta or photons of energy
E = hf, where h is the Planck constant. Einstein also showed that energy and
mass are equivalent or $E = mc^2$ where c is the speed of light. Combining these
two equations suggests that the absorption or emission of a photon of frequency
f also involves a momentum change of size:
$$p = \frac{E}{c}$$
If the light is absorbed by a surface of area A the rate of change of momentum
of the photons will be:
$$F = \frac{dp}{dt} = \frac{IA}{c}$$
where I is the intensity (number of photons per second per unit area × photon
energy). This will be equal and opposite to the force exerted on the surface
(by Newton’s third law).
When radiation is reflected from a surface (e.g., light from a mirror) the force
is doubled because the momentum change is doubled for each photon (from
a positive value to an equal negative value). While the radiation pressure from
ordinary light sources on human-sized mirrors is tiny (less than 10⁻⁶ Pa for 100
W of radiation falling onto a mirror of area 1 m²) it has been suggested that
high-intensity laser beams directed from Earth might be able to accelerate
reflective micro spacecraft up to very high speeds so that they can make trips
to the nearest stars within human lifetimes.
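The figure quoted above for a mirror can be checked from F = IA/c, since the pressure is the force per unit area, I/c, doubled for reflection. This is a minimal sketch; the function name is an assumption made for the example.

```python
C = 299_792_458.0  # speed of light in m/s

def radiation_pressure(intensity, reflecting=False):
    """Pressure in Pa exerted by radiation of the given intensity (W/m^2) on a surface."""
    pressure = intensity / C  # F/A = I/c for complete absorption
    return 2 * pressure if reflecting else pressure

# 100 W falling on a mirror of area 1 m^2 is an intensity of 100 W/m^2.
print(radiation_pressure(100.0, reflecting=True))
```

The result is a little under 10⁻⁶ Pa, consistent with the estimate in the text.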
5.6 FRAMES OF REFERENCE

In order to analyze the motion of an object we need to measure it against a
fixed set of axes (a coordinate system) using a reliable clock. Often the refer-
ence frame we choose is at rest with respect to the Earth, but the motions
would be very different if they were measured with respect to the Sun or the
Moon. Galileo used a thought experiment to show that the laws of mechanics
are the same in all uniformly moving reference frames:
Shut yourself up with some friend in the main cabin below decks on some
large ship, and have with you there some flies, butterflies, and other small
flying animals. Have a large bowl of water with some fish in it; hang up a
bottle that empties drop by drop into a wide vessel beneath it. With the ship
standing still, observe carefully how the little animals fly with equal speed to
all sides of the cabin. The fish swim indifferently in all directions; the drops
fall into the vessel beneath; and, in throwing something to your friend, you
need throw it no more strongly in one direction than another, the distances
being equal; jumping with your feet together, you pass equal spaces in every
direction. When you have observed all these things carefully (though doubt-
less when the ship is standing still everything must happen in this way), have
the ship proceed with any speed you like, so long as the motion is uniform and
not fluctuating this way and that. You will discover not the least change in all
the effects named, nor could you tell from any of them whether the ship was
moving or standing still. In jumping, you will pass on the floor the same spaces
as before, nor will you make larger jumps toward the stern than toward the
prow even though the ship is moving quite rapidly, despite the fact that during
the time that you are in the air the floor under you will be going in a direction
opposite to your jump. In throwing something to your companion, you will
need no more force to get it to him whether he is in the direction of the bow
or the stern, with yourself situated opposite. The droplets will fall as before
into the vessel beneath without dropping toward the stern, although while the
drops are in the air the ship runs many spans. The fish in their water will swim
toward the front of their bowl with no more effort than toward the back, and
will go with equal ease to bait placed anywhere around the edges of the bowl.
Finally, the butterflies and flies will continue their flights indifferently toward
every side, nor will it ever happen that they are concentrated toward the stern,
as if tired out from keeping up with the course of the ship, from which they
will have been separated during long intervals by keeping themselves in the
air. And if smoke is made by burning some incense, it will be seen going up in
the form of a little cloud, remaining still and moving no more toward one side
than the other. The cause of all these correspondences of effects is the fact that
the ship’s motion is common to all the things contained in it, and to the air
also. That is why I said you should be below decks; for if this took place above
in the open air, which would not follow the course of the ship, more or less
noticeable differences would be seen in some of the effects noted.

Dialogue Concerning the Two Chief World Systems, translated by Stillman
Drake, University of California Press, 1953, pp. 186–187 (Second Day).
This is an important observation. It means that there is no way to distinguish
rest from uniform motion, so there is no way to be sure that any particular
reference frame is at rest in space. All uniformly moving reference frames are
called “inertial reference frames.” It is often helpful when solving problems
in mechanics to select a particular inertial reference frame to simplify a prob-
lem, and often the best one to choose is the center of mass frame.
5.6.1 The Center of Mass Frame

The center of mass reference frame is at rest with respect to the center of
mass of the system. This is also the reference frame in which the sum of the
momenta of all particles in the system is zero. To transform from a laboratory
reference frame to the center of mass frame of the system we simply subtract
the velocity of the center of mass relative to the laboratory from all of the
velocities of particles in the system.
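The transformation described above can be sketched for motion along a single line. The masses and velocities are hypothetical, and the calculation is non-relativistic (an assumption made for this example).

```python
def to_center_of_mass_frame(masses, velocities):
    """Return the center of mass velocity and the particle velocities measured in that frame."""
    total_mass = sum(masses)
    v_cm = sum(m * v for m, v in zip(masses, velocities)) / total_mass
    # Subtract the center of mass velocity from every particle velocity.
    return v_cm, [v - v_cm for v in velocities]

masses = [2.0, 3.0]      # kg
velocities = [5.0, 0.0]  # m/s in the laboratory frame
v_cm, cm_velocities = to_center_of_mass_frame(masses, velocities)
total_momentum = sum(m * v for m, v in zip(masses, cm_velocities))
print(v_cm, cm_velocities, total_momentum)
```

The total momentum in the new frame is zero, as expected for the center of mass frame.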
Here is an example in which switching to the center of mass system makes
it much easier to understand the physics of interaction. Since matter and
antimatter annihilate to form gamma-rays it seems reasonable to think that a
single electron might annihilate with a positron to form a single gamma-ray
photon as shown below:
If this could occur, the photon would take away the energy and momentum of
the electron-positron pair. However, this is in fact impossible, as can be seen
if we transform to the center of mass frame by subtracting u/2 from the
velocities of the electron and positron:
The momentum before annihilation is zero since the electron and positron
have equal mass but opposite velocities. This means that a single photon would
have to carry away energy but have no momentum. We have seen (Section
5.5.3) that photons have momentum p = E/c so this is impossible. In reality,
such an annihilation results in a pair of photons emitted in opposite directions
in the center of mass frame.
This allows energy to be carried away and the total momentum to remain
zero. The creation of a pair of identical photons when an electron and posi-
tron annihilate is utilized in PET scanners (see Section 29.5).
5.6.2 The Galilean Transformation

A transformation between two inertial reference frames is called a Galilean
transformation. This is a set of equations that transforms the coordinates of
an object in one inertial reference frame to its coordinates in another. For
simplicity consider an object at rest at point P with coordinates (x, y, z) in a
particular inertial reference frame. Now consider a second inertial reference
frame that coincides with the first one at time t = 0 but which is moving in the
positive x-direction at a constant velocity v. What are the coordinates (x’, y’, z’)
of P in this reference frame?
The equations for this transformation are:
x’ = x - vt
y’ = y
z’ = z
t’ = t

The laws of Newtonian mechanics are invariant (do not change) under
Galilean transformations. These transformations assume that physics takes
place against a background of absolute space and absolute time which are the
same for all observers. However, early in the 20th century, Albert Einstein
realized that the laws of electromagnetism are not invariant under Galilean
transformations. This realization ultimately led him to the special theory of
relativity which postulates that the laws of physics should be the same in all
inertial reference frames. The only way that this could be true was if measure-
ments in space and time were all relative and not absolute. This is explored
further in Chapter 24.
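The Galilean transformation equations given above are simple to express in code. The sketch below applies them to a point at rest in the first frame and shows that it moves with velocity -v in the second frame. The function name and the numbers are illustrative assumptions.

```python
def galilean_transform(x, y, z, t, v):
    """Coordinates in a frame moving at constant velocity v along +x (frames coincide at t = 0)."""
    return x - v * t, y, z, t

# A point P at rest at x = 10 m in the first frame, viewed from a frame moving at 2 m/s.
v = 2.0
x1, _, _, t1 = galilean_transform(10.0, 0.0, 0.0, 1.0, v)
x2, _, _, t2 = galilean_transform(10.0, 0.0, 0.0, 3.0, v)
velocity_in_moving_frame = (x2 - x1) / (t2 - t1)
print(velocity_in_moving_frame)
```

The time coordinate is returned unchanged, which is the assumption of absolute time built into the transformation.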
5.7 THEORETICAL MECHANICS

In the century following the publication of Newton’s laws of motion physi-
cists and mathematicians developed alternative ways of solving problems
in Newtonian mechanics. These new methods were equivalent to the use
of equations such as F = ma but often simplified the solutions to complex
problems. They also provided new ways to think about physical processes.
One of the most significant approaches was developed by an Italian-French
mathematician called Joseph-Louis Lagrange in 1788. A great advantage of
the Lagrangian method is that it helps to show the links between Newtonian
mechanics and quantum mechanics.
5.7.1 Force and Energy

It is often possible to solve the same problem by a variety of different methods.
For example, consider how you might calculate the final velocity of an object
dropped vertically from rest through a height h in the Earth’s gravitational field.
Method 1: using forces

The downward acceleration for the object is a = F/m = mg/m = g
The displacement is h
Using a “suvat” equation: $v^2 = u^2 + 2gh$ so $v = \sqrt{2gh}$
Method 2: using energy
The change in GPE as the object falls is mgh
The gain in KE is ½ mv2
Energy is conserved so $\tfrac{1}{2}mv^2 = mgh$ so $v = \sqrt{2gh}$ as before.
In this case, both methods are equally simple but there is a fundamental difference
in what we have done. Method 1, using forces, is based on vector equations.
Method 2, using energy, is based on scalar equations. The Lagrangian
method is based on scalars and is related to Method 2. It works best for solving
problems involving conservative forces (i.e., where we can neglect frictional
forces so that all of the energies in the system are either kinetic or potential).
5.7.2 Lagrangian Mechanics

The Lagrangian method does not introduce any new physics but it does pro-
vide a different way to look at Newtonian mechanics. The Lagrangian func-
tion L is defined as:
L=T-V
where T is kinetic energy and V is potential energy.
The Lagrangian is a scalar quantity. In general, the kinetic energy will
depend on velocity (vx, vy, vz) and the potential energy will depend on posi-
tion (x, y, z). One of the beauties of the Lagrangian method is that it can
work equally well with polar or spherical coordinates (this is often difficult
when we start from F = ma) so we can set up the theory with an arbitrary set
of “generalized” coordinates:
Positions: $q_1, q_2, q_3$ (x, y, z in Cartesian coordinates).
Velocities: $\dot{q}_1, \dot{q}_2, \dot{q}_3$ ($v_x, v_y, v_z$ in Cartesian coordinates).
The dot above the symbol represents differentiation with respect to time
(e.g., $\dot{q} = \dfrac{dq}{dt}$)
Once the Lagrangian for the system is known we can derive the equation of
motion by using the Euler-Lagrange equation:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) = \frac{\partial L}{\partial q}$$
The method itself is quite simple:
1. Write down the Lagrangian for the system under consideration.

2. Use the Euler-Lagrange equation to derive the equation of motion.
Consider a simple example where a particle of mass m is released from rest
in a uniform gravitational field of strength g. We will set x = 0 at the surface
and consider only the x-direction (i.e., solve the problem in one dimension).
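1. $T = \tfrac{1}{2}m\dot{x}^2$ and $V = mgx$ so $L = \tfrac{1}{2}m\dot{x}^2 - mgx$
2. LHS of Euler-Lagrange equation: $\dfrac{d}{dt}\left(\dfrac{\partial L}{\partial \dot{x}}\right) = \dfrac{d}{dt}(m\dot{x})$
RHS of Euler-Lagrange equation: $\dfrac{\partial L}{\partial x} = -mg$
These are equal: $\dfrac{d}{dt}(m\dot{x}) = -mg$
Leading to: $\ddot{x} = -g$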
This is hardly surprising. It shows that the object accelerates in the - x direc-
tion with an acceleration of magnitude g and that this acceleration is inde-
pendent of the mass. This is exactly the same result as if we had started with
the forces and used F = ma. In fact, as you can see, the Euler-Lagrange equa-
tion generates this equation in the second line of (2).
It seems to be a complicated way to solve a trivial problem and in this case, it
is. However, in many complex problems, the Lagrangian method is far simpler
than any attempt to use F = ma directly.
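As a second illustration of the same two steps, here is a sketch for a mass m on a spring moving in one dimension. The spring, its force constant k, and the potential energy ½kx² are assumptions introduced for this example only.

```latex
% Step 1: write down the Lagrangian (k is an assumed spring force constant)
T = \tfrac{1}{2} m \dot{x}^2, \qquad V = \tfrac{1}{2} k x^2, \qquad
L = T - V = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2

% Step 2: apply the Euler-Lagrange equation
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right)
  = \frac{d}{dt}\left(m\dot{x}\right) = m\ddot{x},
\qquad
\frac{\partial L}{\partial x} = -kx

% Equation of motion
m\ddot{x} = -kx
```

This is the equation F = ma for a restoring force F = -kx, obtained without drawing a force diagram.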
5.8 EXERCISES
1. It is possible to lift a 1 kg mass slowly using a cotton thread. However, if
the thread is jerked it snaps. Why?
2. Use Newton’s laws of motion to explain why:
(a) passengers on a moving bus feel as if they are being thrown forwards
when the bus suddenly brakes,
(b) a gun recoils when it is fired,
(c) you feel “heavier” when you are standing inside a lift that is acceler-
ating upwards,
(d) the gravitational force pulling the Earth toward the Sun is equal to
the gravitational force pulling the Sun toward the Earth.
3. The diagrams below show forces acting on an object with a mass of 1600 kg.
Calculate the resultant acceleration in each case.
4. Work out the linear momentum of each of the following:

(a) A 1000 kg car traveling at 30 m/s
(b) A 200 g stone moving at 80 cm/s
5. A 65 kg rugby player traveling due east at 8 m/s collides head-on with a
60 kg rugby player traveling due west at 6 m/s. After the collision, the two
players are initially locked together.
Calculate the velocity (magnitude and direction) of the two players imme-
diately after the collision.
6. The diagram below shows a two-dimensional collision between two identical
pucks on an air table.
The initial velocity, u, is 0.80 m s⁻¹.

(a) Calculate the magnitudes of the final velocities v1 and v2.
(b) Show that this is an inelastic collision and calculate the fraction of
the initial energy that is transferred to other forms.
7. When a car of mass 1400 kg is traveling along a horizontal straight highway
at a constant speed of 30 m s⁻¹ its output power is 75 kW.
(a) Calculate the total drag force on the car (assume all the output
power does work against drag).
(b) The road begins to climb upwards at a constant angle of 5.0° to the
horizontal. The driver maintains the same constant speed of 30 m s⁻¹
and the drag on the car is unchanged.
i. Explain why the power output of the car must increase.

ii. Calculate the new power output.
8. A car of mass m can apply a maximum braking force B. The driver of the
car has a minimum reaction time of T.
(a) Derive a formula for the minimum stopping distance for this car and
driver when the car is traveling at speed u.
(b) Sketch a graph of the variation of stopping distance with u.
(c) Explain why even a small decrease in the speed limit in pedestrian
areas could significantly reduce the number of pedestrians hit by cars.

CHAPTER
6
Fluids
6.0 INTRODUCTION
The particles inside a solid vibrate at fixed positions unless the material is
placed under extreme stress. Particles inside a fluid, however, can move past
each other and change position; this allows them to flow when stresses are
applied to the fluid. Liquids and gases are fluids and their behavior can be
modeled using Newton’s laws.
Here are some key ideas used to describe the behavior of fluids:
Density (kg m⁻³): $\rho = m/V$ where m is the mass of the fluid (kg) and
V is the volume occupied by the fluid (m³).
Pressure (Pa): $p = F/A$ where F is the force (N) exerted by the
fluid perpendicular to an area A (m²). 1 Pa = 1 N m⁻².
Pressure in a fluid acts in all directions and the pressure
at the same level in a static fluid is constant.
Shear stress (Pa): When two parallel layers of area A are pulled in
opposite directions by a force F the shear stress acting
on the layers is $\sigma = F/A$.
Incompressible fluids: Liquids such as water do not compress easily so a
useful model assumes that they have constant volume
and so their density is constant.
Viscosity: When one layer of a fluid moves over another nearby layer
a frictional force between the layers opposes the flow. The
greater the resistance to flow, the greater the viscosity of
the fluid.
Inviscid fluid: For situations in which the viscous forces are negligi-
ble, or for fluids with very low viscosity, a useful model
assumes that the viscosity is zero. Such a fluid is said to be
“inviscid.”
Ideal fluid: The simplest model of a fluid is one which is incompressible
and has zero viscosity; this is called an ideal fluid.
Water can often be treated as an ideal fluid.
Ideal gas: An ideal gas is one that obeys the equation of state $pV = nRT$,
where p is pressure, V is volume, n is number of moles,
T is temperature in kelvin and R is the molar gas constant.
6.1 HYDROSTATIC PRESSURE
6.1.1 Excess Pressure Caused by a Column of Fluid

The increase in pressure that you feel when you go underwater in a swimming
pool is caused by the weight of the water above you. This is called hydrostatic
pressure. A useful equation for hydrostatic pressure can be derived by thinking
about the forces that support a horizontal layer of fluid of thickness dz and
density $\rho$ at height z in a column of cross-sectional area A.
The layer has weight $W = \rho A g\,dz$ but it is in equilibrium so there must be a
force F of equal magnitude supporting it. This means that the pressure under
the layer must be higher than the pressure above it by an amount dp so that:
$$F = -A\,dp = \rho A g\,dz$$
The negative sign is because the pressure is greater lower down (p decreases
as z increases).
For thin layers this gives a pressure gradient:
$$\frac{dp}{dz} = -\rho g$$
If the fluid is considered incompressible its density is constant and the expres-
sion above can be integrated to give the excess pressure at the base of a col-
umn of height h caused by the weight of the column.
$$\text{Excess pressure} = p_0 - p_h = \int_{p_h}^{p_0} dp = -\int_{h}^{0} \rho g\,dz = \rho g h$$
This result can also be derived by simply calculating the total weight of the
column and dividing it by the area of the base: $p_0 = \rho A g h / A = \rho g h$. However,
the approach above can be used when the density of the fluid changes with
depth, for example, to derive an expression for the atmospheric pressure at
altitude h.
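Both routes to the excess pressure can be sketched numerically. The first function uses ρgh directly; the second adds up the contributions ρg dz of thin layers, so it also works when the density changes with depth. The function names, the density of water (taken as 1000 kg m⁻³), and the 3 m depth are assumptions made for this example.

```python
def excess_pressure_uniform(density, depth, g=9.81):
    """Excess pressure in Pa at the base of a column of constant density: rho * g * h."""
    return density * g * depth

def excess_pressure_layers(density_at_depth, depth, n=1000, g=9.81):
    """Sum rho * g * dz over n thin layers; density_at_depth(d) gives rho at depth d."""
    dz = depth / n
    # Use the density at the middle of each layer.
    return sum(density_at_depth((i + 0.5) * dz) * g * dz for i in range(n))

print(excess_pressure_uniform(1000.0, 3.0))
print(excess_pressure_layers(lambda d: 1000.0, 3.0))
```

For a constant density the two functions agree, giving about 29 kPa at a depth of 3 m.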
6.1.2 Atmospheric Pressure

Atmospheric pressure is caused by the weight of the atmosphere. However,
air is highly compressible so the density of the air will vary with height having
a high density near the surface and lower density higher up. In order to model
this, we make two assumptions:
Air acts like an ideal gas.
The temperature of the air does not vary with height.
If air is an ideal gas then its density will be related to its pressure. For a mass
M containing n moles of molecules each of mass m:
pV = nRT (ideal gas equation)
$\rho = M/V = Mp/nRT$ (substitution from ideal gas equation)
Since the mass of one mole is $N_A m$, the total mass is $M = nN_A m$ ($N_A$ is the
Avogadro constant, $6.02 \times 10^{23}$ mol⁻¹), so we can express density in terms of pressure:
$\rho = N_A mp/RT = mp/kT$, where k is the Boltzmann constant ($k = R/N_A$).
We can now substitute this expression for density into the equation for hydro-
static pressure gradient in a fluid:
$$\frac{dp}{dz} = -\rho g = -\frac{mgp}{kT}$$
The expression on the right-hand side depends on p so we have to separate variables
before we can integrate between the ground (z = 0) and some height (z = x):
$$\int_{p_0}^{p_x} \frac{dp}{p} = -\frac{mg}{kT}\int_{0}^{x} dz$$
$$\ln\frac{p_x}{p_0} = -\frac{mg}{kT}\,x$$
$$p_x = p_0\, e^{-\frac{mg}{kT}x}$$
Atmospheric pressure (and therefore density) falls exponentially with height
x above the surface.
In reality, the temperature of the atmosphere is not independent of altitude
and falls significantly (from about 290 K to 220 K) in the first 10 km. These
variations can be built into the model to provide a better fit.
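The exponential model can be evaluated directly. The sketch below assumes a constant temperature of 290 K, a surface pressure of 101 325 Pa, and a mean molecular mass for air of 4.8 × 10⁻²⁶ kg; these values are assumptions chosen for illustration and do not come from the text. The quantity kT/mg is the height over which the pressure falls by a factor of e.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def pressure_at_height(x, p0=101325.0, m=4.8e-26, T=290.0, g=9.81):
    """Isothermal model: p_x = p0 * exp(-m g x / (k T)), with x in metres."""
    return p0 * math.exp(-m * g * x / (K_B * T))

def scale_height(m=4.8e-26, T=290.0, g=9.81):
    """Height in metres over which the pressure falls by a factor of e."""
    return K_B * T / (m * g)

print(scale_height())
for x in (0.0, 5000.0, 10000.0):
    print(x, pressure_at_height(x))
```

With these values the characteristic height is between 8 km and 9 km.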
6.1.3 Using a Manometer to Measure Pressure Differences

A manometer or U-tube can be used to measure pressure differences by con-
necting one side to the pressure source to be measured and leaving the other
side open to the atmosphere. The difference in liquid height can be used as a
direct measure of the pressure difference (e.g., mm of water or mm of mercury)
or can be used to calculate the pressure difference in pascals using $\Delta p = \rho g \Delta h$.
On the left-hand manometer above both tubes are open to the atmosphere at
pressure p0 so the levels are equal. On the right-hand manometer an excess
pressure Dp has been connected to one tube. The pressures at A and B inside
FP.CH06_3pp.indd 114 3/15/2023 12:31:51 PM

Fluids • 115
this manometer must be equal because these points are at the same level in the
same fluid. However, the pressure at B is also equal to the pressure p0 plus the
pressure caused by the column of fluid BC of height Dh.
pA = p0 + Dp = pB = p0 + rgDh
Dp = rgDh
The lower the density of the liquid used, the greater the sensitivity of the
manometer (i.e., the greater the change in height per unit change in pressure).
6.1.4 Barometers
Imagine a manometer with one end open to the atmosphere and the other
end attached to an effective vacuum pump. The pressure difference is equal
to atmospheric pressure $p_{At}$ so the height of the column can be used to
measure atmospheric pressure. This is the principle of the barometer. However,
instead of connecting one end of a manometer to a pump, one end is sealed
and the fluid is allowed to fall away from the sealed end so that a vacuum
forms above it.
Pressure at A is equal to the pressure at B because both points are at the same
level in the same liquid. Since A is on the surface of the liquid exposed to the
atmosphere the pressure at both points must be atmospheric pressure.
$$\text{Atmospheric pressure} = p_A = p_B = 0 + \rho g h$$
$$\text{Atmospheric pressure} = \rho g h$$
In practice, the vacuum above the mercury in a barometer
(called the Torricelli vacuum) is not perfect. This is because
some mercury atoms will leave the surface of the mercury.
However, the vapor pressure of mercury at room tempera-
ture is only about 1 Pa (0.001 % of atmospheric pressure)
so it makes no significant difference to measurements of
atmospheric pressure.
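As a quick numerical illustration of atmospheric pressure = $\rho g h$, the sketch below estimates the column height supported by the atmosphere for mercury and for water. It is an illustrative Python sketch only: the function name is hypothetical, and the input values (101 kPa, 13560 kg m$^{-3}$, 1000 kg m$^{-3}$) are typical figures assumed for the example.

```python
G = 9.81  # gravitational field strength, N/kg


def column_height(pressure, rho):
    """Height h of a liquid column supported by a pressure: p = rho * g * h."""
    return pressure / (rho * G)


P_ATM = 101e3          # assumed atmospheric pressure, Pa
RHO_MERCURY = 13560.0  # density of mercury, kg/m^3
RHO_WATER = 1000.0     # density of water, kg/m^3

if __name__ == "__main__":
    print(f"mercury column: {column_height(P_ATM, RHO_MERCURY):.3f} m")
    print(f"water column:   {column_height(P_ATM, RHO_WATER):.2f} m")
```

The much shorter mercury column is the practical reason mercury is preferred in barometers.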
Another type of pressure gauge is called a Bourdon gauge (see
the image on the right). This contains a hollow curved tube
that is closed at one end. When the pressure inside the tube
is changed it bends. This small motion is amplified mechani-
cally and used to move a needle against a calibrated scale.
6.1.5 Dams
The design of a dam must take into account all the forces that act on the struc-
ture. The most important of these is caused by hydrostatic pressure from the
trapped water. In a simple case, we can assume that this is the only force on
the dam and that the containing wall of the dam is vertical.
Pressure increases linearly with depth so the total horizontal force on the
containing wall will be equal to the average excess pressure ($\rho g h/2$)
multiplied by the area in contact with the water ($A = hl$, where $l$ is the
horizontal length of the dam wall):
$$F = \tfrac{1}{2}\rho g l h^2$$
The line of action of this force is at a height $h/3$ from the base of the dam.
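Both results can be checked by summing the force on thin horizontal strips of the wall. The sketch below is a hedged numerical check in Python, not the author's method: the function name and the dam dimensions used in the example are assumptions.

```python
G = 9.81  # gravitational field strength, N/kg


def dam_force_and_height(rho, h, l, n=10_000):
    """Sum the force on n horizontal strips of a vertical wall.

    Returns (total force, height of the line of action above the base).
    """
    dx = h / n
    force = 0.0
    moment_about_base = 0.0
    for i in range(n):
        x = (i + 0.5) * dx                    # depth of the strip midpoint
        strip_force = rho * G * x * l * dx    # excess pressure * strip area
        force += strip_force
        moment_about_base += strip_force * (h - x)
    return force, moment_about_base / force


if __name__ == "__main__":
    # ASSUMED example: water 20 m deep behind a wall 50 m long
    F, height = dam_force_and_height(rho=1000.0, h=20.0, l=50.0)
    print(f"F = {F:.3e} N, closed form = {0.5 * 1000.0 * G * 50.0 * 20.0**2:.3e} N")
    print(f"line of action = {height:.3f} m above base (h/3 = {20.0 / 3:.3f} m)")
```

The strip sum agrees with the closed-form force and places the line of action one third of the way up from the base.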
In reality, the situation is more complicated than this.
There could be a depth of water on the downstream side.
The containing wall might not be vertical.
There will be a hydrostatic pressure gradient under the dam because
water will penetrate the soil and rocks.
In addition to hydrostatic forces engineers must also consider the forces from
wind, seismic activity, and ice (if the water freezes). The dam must remain in
equilibrium under all possible conditions.

6.2 BUOYANCY AND ARCHIMEDES PRINCIPLE

6.2.1 Buoyancy Forces
When an object is wholly or partially submerged in a fluid there is an upward
force on the object called the “upthrust” or “buoyancy” force B. This force
arises because the pressure beneath the object is greater than the pressure
above, as shown in the diagrams below for a rectangular block.
Partial submersion: pressure at X = atmospheric pressure
pressure at Y = atmospheric pressure + $\rho g h$
Force $B = A\Delta p = \rho g h A$
Total submersion: pressure at R = atmospheric pressure + $\rho g h_1$
pressure at S = atmospheric pressure + $\rho g h_2$
Force $B' = A\Delta p = \rho g (h_2 - h_1) A$
6.2.2 Archimedes’ Principle

Note that the expressions for buoyancy force derived in Section 6.2.1 are both
equal to the weight of the fluid that has been displaced by the object:
Partial submersion: $hA$ = volume of displaced fluid so $\rho hA$ = mass of
displaced fluid and $\rho g hA$ = weight of displaced fluid.
Total submersion: $(h_2 - h_1)A$ = volume of displaced fluid so $\rho (h_2 - h_1)A$ =
mass of displaced fluid and $\rho g (h_2 - h_1)A$ = weight of displaced fluid.
This is an example of Archimedes’ Principle:
The buoyancy force is equal to the weight of the fluid displaced.
This result was derived using a rectangular object but is valid for an object of
any shape. This can be understood by considering the buoyancy force on each
small vertical column of material inside the object. For each column, we can
apply exactly the same reasoning as used for the rectangular block so the total
buoyancy force is always equal to the weight of the fluid displaced regardless
of the shape of the block.
Using the same reasoning as in Section 6.2.1, the contribution to the
buoyancy force from one narrow column will be $\delta B = \rho g h\,\delta A$. This is equal to the
weight of fluid displaced by the volume of the column. The total buoyancy
force on the object will be the sum of forces on all vertical columns:
$$B = \sum_{\text{all columns}} \rho g h\,\delta A$$
This is the weight of fluid displaced by the total volume of the block as stated
in Archimedes’ principle.
6.2.3 Flotation
An object will float if the buoyancy force can support its weight. The maxi-
mum buoyancy force is when the object is completely submerged so, for an
object of volume V and average density $\rho_{object}$ to float in a fluid of density $\rho_{fluid}$:
weight of object < buoyancy force
$$\rho_{object} V g < \rho_{fluid} V g$$
$$\rho_{object} < \rho_{fluid}$$

An object will float if its density is less than the density of the fluid in
which it is placed.
If an object’s density is equal to the density of the fluid in which it is sub-
merged it is said to have “neutral buoyancy.” Divers and submarines use neu-
tral buoyancy to remain at the same depth under water.
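The flotation condition can be expressed as a short Python sketch. The submerged fraction is not stated in the text; it is an added step that follows from Archimedes’ principle, since a floating object displaces its own weight of fluid ($\rho_{object} V g = \rho_{fluid} V_{submerged}\, g$). The function names are hypothetical.

```python
def floats(rho_object, rho_fluid):
    """An object floats if its average density is less than the fluid density."""
    return rho_object < rho_fluid


def submerged_fraction(rho_object, rho_fluid):
    """Fraction of a floating object's volume below the surface.

    Derived from weight = buoyancy for a floating object; an object that
    does not float is fully submerged, so the fraction is capped at 1.
    """
    if rho_object >= rho_fluid:
        return 1.0
    return rho_object / rho_fluid


if __name__ == "__main__":
    RHO_WATER = 1000.0
    for name, rho in (("balsa", 160.0), ("ebony", 1300.0)):
        print(name, floats(rho, RHO_WATER), submerged_fraction(rho, RHO_WATER))
```

Setting the object density equal to the fluid density gives a fraction of exactly 1, which is the neutral buoyancy case.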
6.3 VISCOSITY
6.3.1 The Coefficient of Viscosity
A fluid with high viscosity is very resistant to shear. This means that a relatively
large shear stress is required to move one layer over another. This can be
understood by thinking about the flow of a fluid close to a boundary. Particles
in contact with the boundary are assumed to be at rest because of interactions
with the surface whereas those far from the boundary will be flowing with the
same speed as the body of the fluid. There is a velocity gradient close to and
perpendicular to the boundary, as shown in the diagram.
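[Figure: velocity profile v(z) against height z for fluid flowing past a boundary, showing a stationary layer at the boundary and moving layers above it, with the fluid flow direction marked]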
The greater the viscosity the smaller the velocity gradient dv/dz for the same
shear stress.
The coefficient of viscosity $\eta$ is defined as the ratio of shear stress to velocity
gradient:
$$\eta = \frac{F/A}{dv/dz}$$
The SI unit is therefore N m$^{-2}$ s or pascal-second, Pa s. Another common unit
for viscosity is the poise (P) and 1 P = 0.1 Pa s. The coefficient of viscosity for
water at room temperature is approximately 1 mPa s or 1 cP (centipoise). The
coefficient of viscosity of a liquid such as water decreases with increasing temperature.
6.4 FLUID FLOW

6.4.1 Laminar and Turbulent Flow
When a fluid flows the path followed by a particle in the fluid is called a
“streamline.” At low flow velocities, these streamlines are uniform and paral-
lel to one another: this is called laminar flow. However, above a certain critical
flow velocity laminar flow breaks down, the streamlines begin to form eddies,
and the flow becomes turbulent.
[Figure: streamlines in laminar flow and in turbulent flow]
Two regimes exist because there are two competing effects in the moving
fluid: inertial forces related to the density and speed of motion of the fluid,
and viscous forces related to the viscosity of the fluid and inversely to the
physical size of the channel in which the fluid is flowing (e.g., the diameter
of the pipe). If viscous forces dominate, eddies cannot form and the flow is
laminar. If inertial forces dominate the flow will be turbulent.
The Reynolds number Re is a dimensionless number that represents the ratio
of inertial forces ($\propto \rho v$) to viscous forces ($\propto \eta/L$) in a particular flow situation.
It is defined as:
$$Re = \frac{\rho v L}{\eta}$$
where v is the flow velocity, $\rho$ the fluid density, $\eta$ the coefficient of viscosity
and L is a characteristic length. For flow in a pipe L would be the diameter
of the pipe; for flow between two parallel plates it would be the separation of
the plates.
As a very approximate rule the flow will be turbulent if Re > 1000.
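A short Python sketch shows how the Reynolds number and the approximate rule above might be applied. The function names and the pipe diameter are assumptions made for illustration; the threshold of 1000 is the very approximate rule quoted in the text, not a sharp boundary.

```python
def reynolds_number(rho, v, length, eta):
    """Re = rho * v * L / eta (dimensionless)."""
    return rho * v * length / eta


def likely_turbulent(re, threshold=1000.0):
    """Apply the very approximate rule: turbulent if Re > 1000."""
    return re > threshold


if __name__ == "__main__":
    RHO_WATER = 1000.0   # kg/m^3
    ETA_WATER = 1.0e-3   # Pa s (about 1 mPa s at room temperature)
    DIAMETER = 0.02      # ASSUMED pipe diameter, m
    for v in (0.01, 1.0):
        re = reynolds_number(RHO_WATER, v, DIAMETER, ETA_WATER)
        print(f"v = {v} m/s  Re = {re:.0f}  turbulent: {likely_turbulent(re)}")
```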

6.4.2 The Equation of Continuity

When a fluid flows steadily in a confined channel such as a pipe the mass of fluid
passing each point per unit time will be the same. The diagram below shows a fluid
passing along a pipe which narrows from a cross-sectional area $A_1$ to
cross-sectional area $A_2$. The flow velocity in the wider part of the pipe is $v_1$
and in the narrower part it is $v_2$. In a short time $\delta t$ the mass flow through
area $A_1$ is $\rho_1 A_1 v_1 \delta t$ and the mass flow through area $A_2$ is $\rho_2 A_2 v_2 \delta t$.
These must be equal, so in general:
$$\rho_1 A_1 v_1 = \rho_2 A_2 v_2$$
This is called the equation of continuity.
For an incompressible fluid, the density is constant so:
$$A_1 v_1 = A_2 v_2$$
In this case the flow velocity is inversely proportional to the cross-sectional
area of the pipe:
$$\frac{v_1}{v_2} = \frac{A_2}{A_1}$$
6.4.3 Drag Forces in a Fluid

When an object moves through a fluid it exerts a force on the fluid to make it
flow around the moving object. By Newton’s third law the fluid exerts an equal
but opposite force on the moving object; this is the origin of the drag force.
The drag force depends on the nature of the flow around the object (e.g.,
laminar or turbulent) and is determined by the properties of the fluid, such
as density and viscosity, and on the velocity, size and shape of the moving
object. It may also be affected by nearby boundaries (e.g., if the fluid is inside
a container).
There are two particular situations which give simple expressions for the drag
force.
Viscous forces dominate (Re << 1000): drag force is directly proportional
to velocity.
The drag force arises as a reaction to shearing the layers of the fluid as
they flow around the object. These forces depend on the velocity gradient
and therefore on the velocity of the moving object.
Inertial forces dominate (Re >> 1000): drag force is directly proportional
to velocity-squared.
The inertial force arises as a reaction to the force needed to accelerate the
fluid in front of the moving object up to the velocity of the object. This
is directly proportional to the rate of change of momentum of the fluid
in front of the object which is proportional to the mass of fluid encoun-
tered per second multiplied by the velocity of the object. Since the mass
encountered per second is also proportional to the velocity, the drag force
will be proportional to the velocity-squared.
6.4.4 Stokes’ Law

Stokes’ law gives the viscous drag on a small spherical particle moving through
a fluid when the flow around the particle is laminar. The diagram below shows
streamlines around an object moving through a viscous fluid. Streamlines
show the paths of particles in the fluid as the object passes.
[Figure: fluid streamlines around a moving sphere, with the directions of motion and drag marked]
In these circumstances the drag will be directly proportional to v and we
would expect it to depend on the radius r of the sphere, and the viscosity $\eta$ of
the fluid. Dimensional analysis can then be used to find the form of an
expression for the drag.
$$F \propto \eta^x v^y r^z$$
$$[MLT^{-2}] = [ML^{-1}T^{-1}]^x\,[LT^{-1}]^y\,[L]^z$$
For the dimension of mass: $1 = x$
For the dimension of time: $-2 = -x - y$ so $y = 1$
For the dimension of length: $1 = -x + y + z$ so $z = 1$
The drag is therefore: $F = \text{constant} \times \eta r v$
Stokes showed that the constant is $6\pi$.
Stokes’ law for viscous drag: $F = 6\pi\eta r v$
Conditions under which Stokes’ law can be used:
Re << 1000 (flow is laminar): small object, low speed, high viscosity.
Fluid is homogeneous and of uniform density
The radius of the particle is much smaller than the dimensions of the fluid
container.
Stokes’ law can be used to measure the viscosity $\eta$ of very viscous liquids such
as glycerol. If a small ball bearing of radius r is released from the top of a
column of viscous fluid it will soon reach terminal velocity $v_t$. Once at terminal
velocity the sum of viscous drag (given by Stokes’ law) and buoyancy must
equal the weight of the ball bearing. The viscosity is then given by:
$$\eta = \frac{2gr^2\left(\rho_{bb} - \rho_f\right)}{9v_t}$$
where $\rho_{bb}$ is the density of the ball bearing and $\rho_f$ is the density of the fluid.
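The falling-ball formula can be sketched in Python as follows. The numbers in the example (a 1.0 mm radius steel ball falling at 10 mm s$^{-1}$) are assumed purely for illustration and are not measurements from the text; the function names are hypothetical. The residual function checks that weight, buoyancy, and Stokes drag balance at terminal velocity.

```python
import math

G = 9.81  # gravitational field strength, N/kg


def viscosity_from_terminal_velocity(r, v_t, rho_bb, rho_f):
    """eta = 2 g r^2 (rho_bb - rho_f) / (9 v_t)."""
    return 2 * G * r**2 * (rho_bb - rho_f) / (9 * v_t)


def terminal_velocity(r, eta, rho_bb, rho_f):
    """The same relation rearranged for the terminal velocity v_t."""
    return 2 * G * r**2 * (rho_bb - rho_f) / (9 * eta)


def force_balance_residual(r, v_t, eta, rho_bb, rho_f):
    """weight - buoyancy - Stokes drag; zero at terminal velocity."""
    volume = 4.0 / 3.0 * math.pi * r**3
    weight = rho_bb * volume * G
    buoyancy = rho_f * volume * G
    drag = 6 * math.pi * eta * r * v_t
    return weight - buoyancy - drag


if __name__ == "__main__":
    # ASSUMED illustrative values, not data from the text
    r, v_t, rho_bb, rho_f = 1.0e-3, 0.010, 7800.0, 1260.0
    eta = viscosity_from_terminal_velocity(r, v_t, rho_bb, rho_f)
    print(f"eta = {eta:.2f} Pa s")
    print(f"residual force = {force_balance_residual(r, v_t, eta, rho_bb, rho_f):.1e} N")
```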
6.4.5 Turbulent Drag

We can also construct an equation for the drag on an object moving through
a fluid at speeds such that the fluid motion becomes turbulent (Re >> 1000).
This is useful when considering large objects moving rapidly through fluids
of low viscosity (e.g., air resistance on cars and planes). Here inertial forces
will dominate so the drag is proportional to $v^2$ and we would also expect it to
depend on the cross-sectional area A of the object (in the plane perpendicular
to motion) and the density $\rho$ of the fluid. Dimensional analysis can again be
used to find the form of an expression for the drag.
$$F \propto \rho^x v^y A^z$$
$$[MLT^{-2}] = [ML^{-3}]^x\,[LT^{-1}]^y\,[L^2]^z$$
For the dimension of mass: $1 = x$
For the dimension of time: $-2 = -y$ so $y = 2$
For the dimension of length: $1 = -3x + y + 2z$ so $z = 1$
The drag is therefore: $F = \text{constant} \times \rho A v^2$
This is usually used in the form: $F = \tfrac{1}{2} C_D \rho A v^2$
where $C_D$ is the “drag coefficient,” a dimensionless number that depends on the
shape of the moving object. Streamlined shapes will have lower values of $C_D$.
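A minimal Python sketch of the turbulent drag formula follows. The vehicle parameters are assumed for illustration only and the function name is hypothetical; the point is the $v^2$ scaling, so doubling the speed quadruples the drag.

```python
def drag_force(c_d, rho, area, v):
    """Turbulent drag: F = 1/2 * C_D * rho * A * v^2."""
    return 0.5 * c_d * rho * area * v**2


if __name__ == "__main__":
    # ASSUMED illustrative values: C_D = 0.30, A = 2.2 m^2, air at 1.2 kg/m^3
    for v in (15.0, 30.0):
        print(f"v = {v} m/s  F = {drag_force(0.30, 1.2, 2.2, v):.1f} N")
```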
6.4.6 The Bernoulli Equation

The static pressure in a fluid was discussed in Section 6.1 and is calculated
from terms of the form $\rho g h$. However, when a fluid is flowing there
is an additional dynamic pressure that arises from the forces needed to
stop the flow. The static pressure is related to the potential energy
(per unit volume) of the fluid and the dynamic pressure is related to the
kinetic energy (per unit volume) of the fluid. Dynamic pressure is calculated
from terms of the form $\tfrac{1}{2}\rho v^2$. For an inviscid fluid (viscosity is
negligible) energy is conserved along a streamline so the sum of pressure
terms is constant. This gives the Bernoulli equation:
$$P_1 + \tfrac{1}{2}\rho v_1^2 + \rho g h_1 = P_2 + \tfrac{1}{2}\rho v_2^2 + \rho g h_2$$
where $h_1$ and $h_2$ are heights above some reference level.

The Bernoulli equation only applies when several conditions are met:
The flow is steady.
The flow is laminar and not turbulent.
The fluid is inviscid.
6.4.7 The Bernoulli Effect

The “Bernoulli effect” describes a fall in static pressure when the fluid flow
velocity increases. This is easily explained using the Bernoulli equation. The
diagram below shows a fluid flowing through a horizontal pipe with a central
restriction. Since h does not vary we set h = 0 along the streamline.
Using the Bernoulli equation:
$$P_1 + \tfrac{1}{2}\rho v_1^2 = P_2 + \tfrac{1}{2}\rho v_2^2$$
The equation of continuity shows that $v_2 > v_1$ so $P_2 < P_1$ and the static pressure
falls as the flow velocity increases. The kinetic energy of the fluid has increased
so its potential energy has decreased.
Another way to think about this is by considering Newton’s second law. The
fluid must accelerate as it enters the constriction so there must be a result-
ant force from the wider part of the pipe. This comes from the greater static
pressure.
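The fall in static pressure at a constriction can be estimated by combining the equation of continuity for an incompressible fluid with the horizontal form of the Bernoulli equation. The Python sketch below uses assumed values and a hypothetical function name, and it presumes the ideal conditions listed in Section 6.4.6.

```python
def constriction(p1, v1, area1, area2, rho):
    """Return (v2, p2) in the narrow section of a horizontal pipe.

    Continuity (incompressible):  A1 v1 = A2 v2
    Bernoulli (horizontal):       p1 + rho v1^2 / 2 = p2 + rho v2^2 / 2
    """
    v2 = v1 * area1 / area2
    p2 = p1 + 0.5 * rho * (v1**2 - v2**2)
    return v2, p2


if __name__ == "__main__":
    # ASSUMED example: water entering at 1.0 m/s and 200 kPa, area ratio 4:1
    v2, p2 = constriction(p1=200e3, v1=1.0, area1=4.0e-4, area2=1.0e-4, rho=1000.0)
    print(f"v2 = {v2:.1f} m/s, p2 = {p2 / 1e3:.1f} kPa")
```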
6.4.8 Viscous Flow Through a Horizontal Pipe – The Poiseuille Equation

For an inviscid fluid in laminar flow through a horizontal pipe there is no
pressure difference along the pipe. This is because none of the terms in the
total pressure $P + \tfrac{1}{2}\rho v^2 + \rho g h$ changes. This can be considered as an example
of Newton’s first law of motion: there are no resultant forces acting on the
particles of fluid so they continue at constant velocity. However, if the fluid
is viscous there will be frictional forces opposing the flow and these must be
balanced by a pressure gradient in the tube.
Poiseuille derived a formula for the volume rate of flow of a fluid through a
pipe when a constant pressure difference is maintained across its ends. The
diagram below shows a horizontal pipe of length l and radius a with pressure
difference p across its ends.
The derivation has two parts – (i) we use the equation for viscosity to work out
an expression for the velocity of flow at radius r from the center. Then (ii) we
use this to work out the total rate of volume flow inside the pipe by integrating
over cylindrical shells.
(i) Consider a cylinder of fluid of radius r and length l on the axis of the
pipe. The applied force created by the pressure difference across its ends
must balance the viscous forces along its curved surface:
$$F = \pi r^2 p$$
$$\text{shear stress along surface} = \frac{F}{\text{(area of surface)}} = \frac{\pi r^2 p}{2\pi r l} = \frac{rp}{2l}$$
This must balance viscous forces so:
$$\frac{rp}{2l} = -\eta\frac{dv}{dr}$$
The negative sign is because the velocity decreases from r = 0 to r = a.
Integrating from radius r (where the velocity is v) to the wall at radius a
(where the velocity is 0):
$$\int_v^0 dv = -\frac{p}{2\eta l}\int_r^a r'\,dr'$$
Which gives:
$$v = \frac{p}{4\eta l}\left(a^2 - r^2\right)$$
This shows that the velocity profile in the pipe is parabolic:
[Figure: parabolic velocity profile of the fluid flow across the pipe]
(ii) Since different layers flow at different velocities the total flow can be
found by integrating the volume flow rates for all thin cylindrical shells
inside the pipe.
The contribution dQ to the total flow rate Q (m$^3$ s$^{-1}$) is: $dQ = 2\pi r v\,dr$.
$$Q = \int_0^a 2\pi r v\,dr = \int_0^a \frac{\pi r p}{2\eta l}\left(a^2 - r^2\right)dr$$
$$Q = \frac{\pi p a^4}{8\eta l}$$
This is Poiseuille’s equation. The volume flow rate depends on the fourth
power of the pipe radius a and is directly proportional to the pressure
gradient p/l.
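The integration in part (ii) can be checked numerically by summing the flow through thin cylindrical shells and comparing with the closed form. This is a hedged Python sketch: the tube dimensions, pressure difference, and viscosity are assumed values, and the function names are hypothetical.

```python
import math


def velocity(r, a, p, l, eta):
    """Parabolic profile v(r) = p (a^2 - r^2) / (4 eta l)."""
    return p * (a**2 - r**2) / (4 * eta * l)


def flow_rate_numeric(a, p, l, eta, n=2000):
    """Midpoint-rule sum of dQ = 2 pi r v dr over n thin cylindrical shells."""
    dr = a / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += 2 * math.pi * r * velocity(r, a, p, l, eta) * dr
    return total


def flow_rate_poiseuille(a, p, l, eta):
    """Closed form Q = pi p a^4 / (8 eta l)."""
    return math.pi * p * a**4 / (8 * eta * l)


if __name__ == "__main__":
    # ASSUMED example: water in a tube of radius 0.5 mm and length 0.20 m,
    # with a pressure difference of 1000 Pa across its ends
    args = (0.5e-3, 1000.0, 0.20, 1.0e-3)
    print(f"numeric    Q = {flow_rate_numeric(*args):.4e} m^3/s")
    print(f"Poiseuille Q = {flow_rate_poiseuille(*args):.4e} m^3/s")
```

Doubling the radius in `flow_rate_poiseuille` increases the flow rate sixteen-fold, which is the fourth-power dependence noted above.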
6.4.9 Measuring the Coefficient of Viscosity

The coefficient of viscosity can be measured by maintaining a constant
pressure gradient across a horizontal tube, measuring the volume flow rate and
substituting into Poiseuille’s equation:
$$\eta = \frac{\pi a^4}{8Q}\,\frac{p}{l}$$
Here is a suitable experimental arrangement.

Constant head apparatus: this is continually filled to maintain a constant
depth of fluid. Excess fluid flows away through the drain. The open end
of the horizontal tube is at atmospheric pressure so the pressure gradient
in the tube is $\rho g h/l$.
The horizontal tube must be narrow enough to ensure laminar flow (oth-
erwise the Poiseuille equation is not valid).
The beaker is used to collect fluid over a set time t. The volume of fluid
collected in this time, V, is measured using a measuring cylinder. The vol-
ume flow rate Q = V/t.
The average radius of the capillary tube, a, can be determined by filling
the tube with water and measuring the volume contained: volume of
water = $\pi a^2 l$. As an alternative, the radius can be measured using an
average of three diameters measured using a travelling microscope.
Since the coefficient of viscosity depends on temperature the temperature of
the fluid used should also be recorded.

6.5 MEASURING FLUID FLOW RATES

6.5.1 A Venturi Meter
The change in pressure of a fluid when its flow velocity changes can be used to
measure the flow rate in a pipe. The meter works by measuring the change in
static pressure as the fluid passes through a constriction in the pipe, as shown
below. The derivation assumes that the fluid is ideal and the flow is steady.
At position 1 the manometer is connected to the fluid in the wider section of
pipe (area A) and at position 2 the other side of the manometer is connected
to the fluid in the narrower section of the pipe (area a). The pressure
difference $\Delta p$ is equal to $\rho_m g \Delta h$ where $\rho_m$ is the density of fluid in the manometer.
The pressure difference can be related to fluid flow velocity using Bernoulli’s
equation:
$$\Delta p = \tfrac{1}{2}\rho\left(v_2^2 - v_1^2\right)$$
The equation of continuity allows $v_2$ to be written in terms of $v_1$ and the ratio
of pipe areas:
$$v_2 = \frac{A}{a}\,v_1$$
Combining these two equations the flow velocity $v_1$ is given by:
$$v_1 = \sqrt{\frac{2\Delta p}{\rho\left[\left(\frac{A}{a}\right)^2 - 1\right]}}$$
and the volume flow rate in the tube is $Q = Av_1$:
$$Q = A\sqrt{\frac{2\Delta p}{\rho\left[\left(\frac{A}{a}\right)^2 - 1\right]}}$$
Venturi meters are used in many industrial applications, including water flow.
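The Venturi relations can be combined in a short Python sketch. The example assumes air flowing in the pipe and a water manometer, so that $\Delta p = \rho_m g \Delta h$ is a good approximation; the dimensions, the manometer reading, and the function name are all assumptions made for illustration.

```python
import math

G = 9.81  # gravitational field strength, N/kg


def venturi_flow(area_wide, area_narrow, delta_h, rho_fluid, rho_manometer):
    """Return (v1, Q, delta_p) for an ideal fluid in steady flow.

    delta_p = rho_m g delta_h
    v1      = sqrt(2 delta_p / (rho ((A/a)^2 - 1)))
    Q       = A v1
    """
    delta_p = rho_manometer * G * delta_h
    ratio = area_wide / area_narrow
    v1 = math.sqrt(2 * delta_p / (rho_fluid * (ratio**2 - 1)))
    return v1, area_wide * v1, delta_p


if __name__ == "__main__":
    # ASSUMED example: air (1.2 kg/m^3) in a pipe with A = 8.0e-3 m^2 and
    # a = 2.0e-3 m^2, with a water manometer reading of 20 mm
    v1, q, dp = venturi_flow(8.0e-3, 2.0e-3, 0.020, 1.2, 1000.0)
    print(f"delta_p = {dp:.1f} Pa, v1 = {v1:.2f} m/s, Q = {q:.4f} m^3/s")
```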
6.5.2 A Pitot Tube

A Pitot tube is used to measure air speed. It does this by comparing the total
or “stagnation” pressure with the static pressure.
[Figure: Pitot tube in a fluid flow, sensing the total pressure (static pressure + dynamic pressure) and the static pressure, both connected to a pressure transducer]
The tube points into the direction of air flow and pressure sensors are used
to measure the difference between the total and static pressure. Since total
pressure is equal to the sum of the static and dynamic pressures the difference
between these is just the dynamic pressure $\tfrac{1}{2}\rho v^2$. This value can then be used
to calculate the speed of the fluid relative to the Pitot tube:
$$v = \sqrt{\frac{2\left(p_{tot} - p_{stat}\right)}{\rho}}$$
Pitot tubes attached to the wings of planes are used to measure the aircraft’s
air speed and under boats to measure their speed in the water.
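The Pitot tube relation between the speed and the measured pressure difference can be sketched in Python as follows. The pressures and air density in the example are assumed values, not data from the text, and the function name is hypothetical.

```python
import math


def pitot_speed(p_total, p_static, rho):
    """Speed from the dynamic pressure: v = sqrt(2 (p_tot - p_stat) / rho)."""
    return math.sqrt(2 * (p_total - p_static) / rho)


if __name__ == "__main__":
    # ASSUMED example: total pressure 106 kPa, static pressure 101 kPa,
    # air density 1.2 kg/m^3
    print(f"v = {pitot_speed(106e3, 101e3, 1.2):.1f} m/s")
```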

6.6 EXERCISES
1. Scuba divers estimate that the excess pressure they experience when they
dive increases by an amount equal to the atmospheric pressure for every
additional 10 m of depth.
(a) Show that this is approximately correct. Density of seawater
= 1030 kg m$^{-3}$. Atmospheric pressure at sea level = 101 kPa.
(b) Estimate the total pressure at the bottom of the Mariana Trench in
the Pacific Ocean (about 11 km below sea level).
(c) Seawater is slightly compressible – how will this affect your answer
to (b)? Explain.
2. A vertical dam wall contains water of density $\rho$ to a depth h. The
horizontal length of the dam wall is l.
(a) Write down an expression for the excess pressure at depth x below
the surface of the water.
(b) Show that the total horizontal force acting on the wall of the dam
from the water is given by the expression $F = \tfrac{1}{2}\rho g l h^2$.
(c) By considering equilibrium of moments show that the line of action
of this force is $h/3$ above the base of the dam.
3. The laminar flow of a viscous fluid through a narrow capillary tube
depends only on the radius a, the pressure gradient along the tube (p/l),
and the viscosity of the fluid, $\eta$.
(a) Use the method of dimensions to show that the volume flow rate Q
is given by an expression of the form:
$$Q = \text{constant} \times \frac{a^4}{\eta}\left(\frac{p}{l}\right)$$
(b) Explain why the method of dimensions cannot be used to determine
the value of the constant in the expression above.
(c) Explain why this formula is likely to break down for high flow rates
or larger diameter pipes.
4. (a) Show that the flow of air around a car is likely to be turbulent. You
will need to estimate the relevant quantities.
(b) A formula that is used for turbulent drag forces is:
$$F = \tfrac{1}{2} C_D \rho A v^2$$
where $C_D$ is the “drag coefficient.”
i. Suggest ways in which the drag coefficient might be reduced.
ii. A car has a drag coefficient of 0.60 and a frontal area of 3.4 m$^2$.
Calculate the aerodynamic drag force on this car when it is travelling
at 25 m s$^{-1}$ through the air. Air has a density of 1.2 kg m$^{-3}$.
iii. Suggest, with reasons, how the power required to drive a sports car
at 50 m s$^{-1}$ compares to the power required to drive the same car at
25 m s$^{-1}$.
iv. The car above has a maximum power output of 400 kW. Use this to
estimate its maximum possible speed.
v. Suggest a reason why your answer to (iv) is an over-estimate.
5. (a) Estimate the volume of your own body (HINT: your density is
similar to that of water, about 1000 kg m$^{-3}$).
(b) Estimate your weight.
(c) Estimate the buoyancy force on your body from the atmosphere.
(d) Discuss whether or not bathroom scales display your actual mass.
6. A steel ball bearing has a diameter of 1.2 mm. It is released from just
below the surface of glycerol inside a wide measuring cylinder. The
density of steel is 7700 kg m$^{-3}$ and the density of glycerol is 1260 kg m$^{-3}$.
(a) Calculate the weight of the ball bearing.
(b) Calculate the buoyancy force on the ball bearing when it is sub-
merged in glycerol.
(c) Explain why the ball bearing accelerates when it is released but soon
reaches a terminal velocity. Your answer should include a free-body
diagram for the falling ball bearing.
(d) The terminal velocity of the ball bearing is 3.6 mm s$^{-1}$. Use this to
calculate a value for the viscosity of glycerol and state any assumptions
you use.
(e) In a famous experiment to measure the charge on oil droplets
Millikan used Stokes’ law to estimate the viscous drag on tiny oil droplets
(with radii of the order of 2 μm) falling through air. He discovered
that Stokes’ law over-estimated the force on the droplets: they
actually fell faster than the equation predicted. Suggest a reason for this.
7. A raindrop of radius 0.50 mm is falling through the atmosphere.
(a) Explain why the buoyancy force from the air can be neglected when
using Stokes’ law to calculate the terminal velocity of the raindrop.
(b) The viscosity of air is $2.0 \times 10^{-5}$ Pa s. Use Stokes’ law to calculate the
terminal velocity of the raindrop.
(c) Use your answer to (b) to calculate the Reynolds number for the
falling droplet. Comment on your answer.
8. The diagram below shows two pipes, each of length l, but with diameters
d and d/2 respectively. The volume flow rate into X is Q and the flow is
laminar. The pressure at X is $p_X$ and the pressure at Z is $p_Z$. The viscosity of
the fluid is $\eta$. Find an expression for the pressure at Y assuming the fluid
is incompressible.
9. (a) Explain how it is possible to drink water from a glass through a straw.
(b) Discuss whether there is a limit to the length of straw that can be
used to drink water. The density of water is 1000 kg m$^{-3}$.
10. Show that the SI base units for viscosity are kg m$^{-1}$ s$^{-1}$.
11. The pressure difference measured by a Pitot tube on an aircraft’s wing is
20 kPa. What is the aircraft’s air speed?
12. When an inflated balloon is connected to one side of a water manom-
eter the height difference between the manometer arms is 14.0 cm. The
atmospheric pressure is 102 kPa.
(a) What is the excess pressure inside the balloon?
(b) What is the total pressure inside the balloon?
(c) Suggest one advantage and one disadvantage of using mercury
instead of water in a manometer.
13. In a famous experiment the French physicist Pascal placed one mercury
barometer at the base of a mountain and carried a second one to the top
of the mountain. Both barometers had mercury columns of equal height
when they were together at the base of the mountain. However, Pascal
noticed that the mercury column on the barometer he carried with him
fell gradually as he climbed the mountain. Explain this effect as carefully
as you can.
14. The circulation of blood in the body can be considered as a continuous
circuit. Blood is pumped from the heart into the aorta, splits into the
arteries, splits again into the capillaries and then returns to the heart via
the veins and finally the vena cava. This can be represented in the same
way as an electric circuit consisting of series and parallel resistors:
[Figure: the circulation drawn as a circuit: from heart, aorta, arteries, capillaries, veins, vena cava, to heart]
The table below gives the total area of each type of blood vessel along with
the average flow speed and volume flow rate.

(a) Complete the table.

Area (cm$^2$) Speed (cm s$^{-1}$) Volume flow rate (cm$^3$ s$^{-1}$)
Aorta 3.0 30 90
Arteries 100
Capillaries 900 0.10
Veins 0.45
Vena cava 18
(b) Blood has a viscosity of between 0.003 and 0.004 Pa s. Discuss
whether blood flow in the human circulation is likely to be laminar
or turbulent.

CHAPTER
7
Mechanical Properties
7.1 DENSITY
Density is a property of each material and is independent of the amount of
that material. This is in contrast to mass, which depends on the amount of
material present. Density is defined by the equation:
$$\text{density} = \frac{\text{mass}}{\text{volume}}$$
The SI unit for density is kg m$^{-3}$ but g cm$^{-3}$ is also in common use. The relation
between these is:
1000 kg m$^{-3}$ = 1 g cm$^{-3}$
1 kg m$^{-3}$ = 0.001 g cm$^{-3}$
The densities of some common materials are listed below.
Material                                       Density (kg m$^{-3}$)         Density (g cm$^{-3}$)
Air (sea level, 15°C)                          1.225                         0.001225
Water                                          1000                          1.000
Wood                                           160 (balsa) – 1300 (ebony)    0.16–1.3
Concrete                                       ~ 2400                        ~ 2.4
Aluminum                                       2700                          2.7
Steel                                          7750–8050                     7.75–8.05
Copper                                         8960                          8.96
Lead                                           11340                         11.34
Mercury                                        13560                         13.56
Uranium                                        19100                         19.10
Gold                                           19320                         19.32
Osmium (densest naturally occurring element)   22590                         22.59
The density of a solid is of the same order of magnitude as the density of
an atom because atoms are closely packed together inside a solid material.
However, atoms themselves are mainly empty space and the diameter of an
atomic nucleus is approximately 20 000 times smaller than the diameter of
an atom. This means that the density of nuclear material is around $20\,000^3$
($8 \times 10^{12}$) times greater than the density of ordinary matter. Typical nuclear
densities are greater than $10^{17}$ kg m$^{-3}$.
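The scaling argument can be checked with a few lines of Python. This is a rough order-of-magnitude sketch: the cube of 20 000 is $8 \times 10^{12}$, and taking lead (11340 kg m$^{-3}$) as a representative solid is an assumption made only for illustration. The function name is hypothetical.

```python
DIAMETER_RATIO = 20_000             # atom diameter / nucleus diameter (approximate)
VOLUME_RATIO = DIAMETER_RATIO ** 3  # volumes scale as the cube of the diameter


def nuclear_density_estimate(solid_density):
    """Scale the density of ordinary matter by the atom/nucleus volume ratio."""
    return solid_density * VOLUME_RATIO


if __name__ == "__main__":
    print(f"volume ratio = {VOLUME_RATIO:.0e}")
    # ASSUMED representative solid: lead, 11340 kg/m^3
    print(f"estimated nuclear density = {nuclear_density_estimate(11340):.0e} kg/m^3")
```

The estimate comes out at the order of $10^{17}$ kg m$^{-3}$, in line with the typical nuclear densities quoted above.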
The cores of collapsed stars also have high densities. White dwarf stars have
densities of about $10^{10}$ kg m$^{-3}$ but neutron stars have densities comparable to
that of an atomic nucleus ($10^{17}$ kg m$^{-3}$) since they are effectively closely packed
nucleons (neutrons). On the other hand, “empty space” is not quite empty,
having around 1 hydrogen atom per cubic centimeter and a density of the
order of $10^{-21}$ kg m$^{-3}$.
7.2 INTER-ATOMIC FORCES

A solid material holds itself together because the individual particles from
which it is made (atoms or molecules) form bonds. If the particles are pulled
further apart, they experience an attraction and if they are pushed closer
together, they experience a repulsion. While they will have some thermal
kinetic energy that causes them to vibrate, they maintain, on average, a con-
stant equilibrium separation. The interatomic forces are electrostatic in origin
and vary with distance as shown below (a positive force represents repulsion
and a negative force represents attraction).
The distance $r_0$ is equal to the equilibrium separation of the particles. For
small displacements either side of this position the graph is approximately
linear in many materials, particularly metals. This has two consequences:
When a force is applied to the material its extension (or compression) will
be directly proportional to the force. In other words it will obey Hooke’s
law in this linear region.
When a particle is displaced from its equilibrium position it will experience
a restoring force directly proportional to its displacement so that oscilla-
tions about the equilibrium will be simple harmonic (see Chapter 11).
[Figure: graph of force against separation r, with the repulsion and attraction regions labelled and the equilibrium separation $r_0$ marked]
The work that must be done to separate two particles from their equilibrium
separation is equal to the area between the negative part of the force graph
and the separation axis, from $r_0$ to infinity. This is also equal to the energy
released when the bond between the particles is formed. Since work must be
done to push the particles closer together or to separate them the potential
energy is a minimum when their separation is $r_0$. They are in a bound state
and each particle is in a potential well.
The graph below shows how the potential energy varies with particle
separation.
[Figure: graph of potential energy against separation, with the equilibrium separation $r_0$ marked]

7.3 STRETCHING SPRINGS

An ideal spring obeys Hooke’s law. This states that the extension of the spring
is directly proportional to its tension (or to the load it supports). This is a direct
consequence of the way the bonds between atoms inside the steel behave. If
these bonds obey Hooke’s law then so will the material and so will the spring.
The simple example of a mass on a spring can serve as a model for atomic and
molecular bonds and many other systems in physics and engineering.
Some useful terms:
Tension: the force exerted by a spring when it is stretched. If it supports
a load in equilibrium this will be equal to the weight of the load.
Extension: the difference between the stretched and unstretched lengths
of the spring.
7.3.1 The Spring Constant

When increasing loads are suspended from a steel spring the extension
increases. A graph of force against extension looks like this:
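[Figure: graph of force / N against extension / m for a steel spring, starting from the origin O, with the region of permanent plastic deformation marked.]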
Section OP is (almost) a straight line through the origin so in this region the
extension (x) is directly proportional to the force applied to the spring (or the
tension in the spring) (F).
$F \propto x$ or $F = kx$
k is the “spring constant” with SI units N m$^{-1}$.

The spring constant is a measure of the “stiffness” of the spring.
Point P is the “limit of proportionality” and beyond P the graph becomes
non-linear.
Up to point E the spring will return to its original length when the force is
removed. This is called elastic behavior. Beyond E this is no longer the case
and when the force is removed the spring does not return to its original length
but retains an extension. This is called “plastic” behavior. The dotted line on
the graph shows how a spring contracts when a force beyond E is removed.
Point E is the “elastic limit” for the spring. Beyond E the spring undergoes
permanent plastic deformation.
This graph does not include the fracture point for the spring because once
it unravels the applied forces then stretch a steel wire and this will require a
large increase in force for a relatively small increase in extension.
7.3.2 Springs in Series and in Parallel

Assume the springs in the examples below have negligible weight and obey
Hooke’s law with spring constant k. When any one of these springs is extended
by an amount e, the tension in the spring is F.
Series Combinations
When n springs are connected in series and the system is stretched the ten-
sion in each spring must be the same and equal to F. This means the extension
of each spring will be the same as the extension of an individual spring under
the same load. The total extension of the system of n springs is therefore ne.
Using Hooke’s law for the system of n springs in series gives: $k_{\text{series}} = F/(ne) = k/n$
Using similar reasoning a system of different springs with spring constants
k1, k2, k3 …., etc. connected in series has an overall spring constant given by
the following equation:
$$\frac{1}{k_{\text{series}}} = \frac{1}{k_1} + \frac{1}{k_2} + \frac{1}{k_3} + \dots$$
Connecting springs in series reduces the stiffness of the system.
Parallel Combinations
When n springs are connected in parallel and the system is stretched by a
force F, the tension in each spring must be the same and equal to F/n. The
total extension of each spring (and the system) will be e/n.
Using Hooke’s law for the system of n springs in parallel gives: $k_{\text{parallel}} = F/(e/n) = nk$
The spring constant of a system consisting of several springs in parallel is the
sum of the spring constants of the individual springs. Connecting springs in
parallel increases the stiffness of the system.
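The series and parallel rules above can be checked with a short calculation. The sketch below is illustrative only: the function names and the spring constants (120, 100, and 300 N/m) are assumptions made for the example, not values from the text.

```python
def series_spring_constant(constants):
    """Combined spring constant (N/m) for springs joined end to end."""
    # 1/k_series = 1/k1 + 1/k2 + 1/k3 + ...
    return 1.0 / sum(1.0 / k for k in constants)


def parallel_spring_constant(constants):
    """Combined spring constant (N/m) for springs sharing the load side by side."""
    # k_parallel = k1 + k2 + k3 + ...
    return sum(constants)


# Hypothetical example: three identical springs, each with k = 120 N/m.
k = 120.0
identical = [k, k, k]
k_series = series_spring_constant(identical)      # should agree with k / n
k_parallel = parallel_spring_constant(identical)  # should agree with n * k

# Hypothetical example: two different springs in series.
k_mixed = series_spring_constant([100.0, 300.0])

print(f"three identical springs in series:   {k_series:.1f} N/m")
print(f"three identical springs in parallel: {k_parallel:.1f} N/m")
print(f"100 N/m and 300 N/m in series:       {k_mixed:.1f} N/m")
```

For identical springs the two functions reduce to k/n and nk, as derived above. The series result is always smaller than the smallest individual spring constant.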
7.3.3 Elastic Potential Energy (Strain Energy)

Work must be done to stretch a spring. If there are no energy losses in the
stretching process, then this work transfers energy to elastic potential energy
in the spring that can be released when the spring recompresses. The work
done is equal to the area under a graph of force against extension for the
spring.
[Figure: graph of force against extension; the shaded area under the line up to extension x is the work done.]
If the spring obeys Hooke’s law then the work done to stretch it to an exten-
sion x is equal to the shaded area in the graph above:
Area $= \tfrac{1}{2}Fx$ or, using Hooke’s law: work done $= \tfrac{1}{2}kx^2$
If we assume that no energy is lost as the spring is stretched, the elastic poten-
tial energy in a stretched spring is given by:
$$\text{EPE} = \tfrac{1}{2}kx^2$$
In general, the work done to stretch something is calculated from the integral:
$$W = \int F(x)\,dx$$
If the spring obeys Hooke’s law this is simply
$$W = \int kx\,dx = \tfrac{1}{2}kx^2$$
giving the same result as before.
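As a rough numerical check, the area under a force against extension graph can be estimated with the trapezium rule and compared with ½kx². This is a sketch under stated assumptions: the function name `work_done` and the values k = 200 N/m and x = 0.15 m are invented for illustration.

```python
def work_done(force, x_max, steps=1000):
    """Trapezium-rule estimate of the integral of force(x) dx from 0 to x_max."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x_left = i * dx
        x_right = (i + 1) * dx
        # Area of one thin trapezium under the force-extension graph.
        total += 0.5 * (force(x_left) + force(x_right)) * dx
    return total


k = 200.0   # spring constant in N/m (hypothetical)
x = 0.15    # final extension in m (hypothetical)

numerical = work_done(lambda s: k * s, x)   # Hooke's law: F = kx
analytic = 0.5 * k * x ** 2                 # closed-form result

print(f"numerical estimate: {numerical:.4f} J")
print(f"half k x squared:   {analytic:.4f} J")
```

The same `work_done` function accepts any force function, so it could also be applied to a spring that does not obey Hooke’s law, where no simple formula is available.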
7.4 STRESS AND STRAIN

When we are discussing material properties, stress and strain are often prefer-
able to force and extension because the relationship between stress and strain
is independent of the dimensions of the sample under test. Stress and strain
relate directly to the type of material rather than the particular piece of mate-
rial used in the test.
Here are some definitions:
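Tensile stress $\sigma$ is the force per unit area when the force is applied along the
axis of the sample as shown below.
[Figure: sample of length l and cross-sectional area A with a tensile force F applied at each end.]
$$\text{stress}\ (\mathrm{N\,m^{-2}}) = \frac{\text{force}\ (\mathrm{N})}{\text{cross-sectional area}\ (\mathrm{m^2})}$$
$$\sigma = \frac{F}{A}$$
The SI unit for stress is N m$^{-2}$ or Pa.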

Tensile strain $\varepsilon$ is the extension per unit length under a tensile stress.
$$\text{strain} = \frac{\text{extension}}{\text{original length}}$$
$$\varepsilon = \frac{e}{l_0}$$
This is a ratio of two lengths so it is dimensionless.
7.4.1 The Young’s Modulus

For many materials, particularly metals, tensile strain is directly proportional
to tensile stress up to some limiting stress. When this is the case the Young’s
modulus E is defined as:
$$\text{Young's modulus (Pa)} = \frac{\text{stress (Pa)}}{\text{strain}}$$
The Young’s modulus is a measure of the “stiffness” of a material. A stiff mate-
rial has small strain for large stress.
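The definitions of stress, strain, and the Young’s modulus can be combined in a short calculation. The sketch below assumes a wire of circular cross-section; the function names and the sample values (a 50 N load on a 0.50 mm diameter wire, 1.50 m long, extending by 1.9 mm) are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def tensile_stress(force, diameter):
    """Stress in Pa for a force in N on a wire of circular cross-section."""
    area = math.pi * diameter ** 2 / 4   # cross-sectional area in m^2
    return force / area


def tensile_strain(extension, original_length):
    """Strain (dimensionless): extension divided by original length."""
    return extension / original_length


# Hypothetical sample values, invented for illustration.
force = 50.0            # N
diameter = 0.50e-3      # m
original_length = 1.50  # m
extension = 1.9e-3      # m

stress = tensile_stress(force, diameter)
strain = tensile_strain(extension, original_length)
youngs_modulus = stress / strain   # valid only in the linear region

print(f"stress = {stress:.3e} Pa")
print(f"strain = {strain:.3e}")
print(f"Young's modulus = {youngs_modulus:.3e} Pa")
```

Because strain is dimensionless, the Young’s modulus comes out in the same units as stress.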
Some typical values for the Young’s modulus are:

Material Young’s modulus (GPa)
Rubber 0.01–0.10
Nylon 2–4
Wood 10
Bone 14
Concrete 30
Copper 120
Steel 280
Diamond 1150
7.4.2 Experimental Measurement of Young’s Modulus for a Metal Wire

The Young’s modulus for a metal wire of diameter d and original length l0
can be determined by loading it with masses and plotting a graph of mass m
against extension e:
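$$E = \frac{F/A}{e/l_0} = \frac{F l_0}{eA} = \frac{4 F l_0}{\pi d^2 e} = \frac{4 m g l_0}{\pi d^2 e}$$
Rearranging for m:
$$m = \frac{\pi d^2 E}{4 g l_0}\, e$$
A graph of m on the y-axis against e on the x-axis has a gradient $\dfrac{\pi d^2 E}{4 g l_0}$.
The Young’s modulus is given by:
$$E = \text{gradient} \times \frac{4 g l_0}{\pi d^2}$$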
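The analysis step can be sketched in code: fit a straight line to mass against extension, then convert the gradient into a Young’s modulus. Everything specific here is an assumption for illustration: the readings are invented, g is taken as 9.81 N/kg, and the gradient comes from an ordinary least-squares fit rather than a line drawn by eye.

```python
import math

G = 9.81  # gravitational field strength in N/kg (assumed value)


def least_squares_gradient(xs, ys):
    """Gradient of the best-fit straight line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def youngs_modulus(gradient, diameter, original_length):
    """E = gradient * 4 g l0 / (pi d^2), with the gradient in kg/m."""
    return gradient * 4 * G * original_length / (math.pi * diameter ** 2)


# Hypothetical readings, invented for illustration only.
d = 0.40e-3   # wire diameter in m
l0 = 2.00     # original length between the markers in m
masses = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]                               # kg
extensions = [0.65e-3, 1.30e-3, 1.95e-3, 2.60e-3, 3.25e-3, 3.90e-3]   # m

grad = least_squares_gradient(extensions, masses)   # m on y-axis, e on x-axis
E = youngs_modulus(grad, d, l0)

print(f"gradient = {grad:.0f} kg/m")
print(f"Young's modulus = {E:.2e} Pa")
```

Only points on the straight part of the graph should be passed to the fit. Readings taken beyond the limit of proportionality would lower the gradient and give a misleading value.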
A suitable experimental arrangement to determine the Young’s modulus of a
test wire is shown below.
[Figure: experimental arrangement showing the clamp, the test wire of length l, and the bench.]
The diameter of the wire must be measured using a micrometer screw gauge.
This should be done at least three times in different positions and then an
average should be calculated. The images below show how a reading to 0.01
mm is obtained.
[Figure: micrometer screw gauge. The wire diameter is measured by placing the wire into the gap; the gap is closed onto the wire by turning the ratchet and the diameter is read off. Shaft: mm scale (2.50 mm). Collar: mm/100 scale (0.35 mm).]
The reading is taken from the point where the lines on the shaft and collar
are aligned. The last cleared reading in mm from the shaft is added to the
reading from the collar. In the example above the gap is 2.50 mm + 0.35 mm
= 2.85 mm. Any zero error for the micrometer must be subtracted from this.
(You can check for a zero error by closing the micrometer gap completely and
checking the reading; it should be 0.00 mm.)
The wire must then be clamped securely at one end.
Two light markers are attached to the wire a distance l0 apart. This distance
should be measured with an unkinked, unloaded but taut wire. Using a larger
value of l0 will increase the extensions and give better results.
The wire is then loaded, adding one mass at a time and recording the mass
added and the length of the wire in a suitable results table. This is repeated for
at least 7 different values of mass, and preferably many more. However, care
must be taken not to exceed the elastic limit for the wire. If this does occur
the graph will begin to curve and the Young’s modulus must only be calculated
using the gradient of the straight part of the graph.
Extensions should be calculated for each measured length and added to the
results table:
$$\text{extension}\ e = l - l_0$$
Care must be taken throughout the experiment because there is always a dan-
ger that the wire could snap. Safety glasses should be worn!
Finally, a graph of mass against extension is plotted and the Young’s modulus can
be determined from the gradient of this graph as shown earlier in this section.
7.4.3 Stress Versus Strain Graph for a Ductile Metal

A graph of stress against strain is a useful way to display material properties.
The stress–strain curve for a metal like copper is shown on the next page.
The curve shown assumes a constant cross-sectional area equal to its initial
value. In reality, the area reduces as the sample extends so the stress shown on
the graph is lower than the true stress and a graph showing true stress versus
strain would not have the dip and would fracture at the maximum stress.
Region OP: The material obeys Hooke’s law in the initial linear region of
the graph where stress is directly proportional to strain.
The Young’s modulus of the material is equal to the gradient of line OP.
Up to Y the material behaves elastically. If the stress is removed it will
return to its original dimensions.
Y is the “yield point.” Beyond this level of stress the material will deform
plastically and when the stress is removed there will be a residual defor-
mation.
The ultimate tensile strength is at a stress $\sigma_{\text{UTS}}$.
Some typical ultimate tensile strengths are:

Material UTS (MPa)
Carbon fiber 5650
Spider silk 1400
Stainless steel 860
Copper 220
Nylon 75
The dip in the curve is a result of the change in diameter of the sample. As it
begins to yield its diameter becomes smaller but in this graph the stress has
been calculated using the original diameter of the sample so while the calcu-
lated stress decreases the actual stress does not.
The area under a stress-strain graph represents energy per unit volume.
$$\text{Area under curve} = \int \sigma\, d\varepsilon = \int \frac{F}{A}\,\frac{dx}{l} = \frac{1}{Al}\int F\,dx$$
(Here we have assumed that the cross-sectional area of the sample has
remained constant during strain. This is approximately true for small strains.)
The larger the area up to fracture the more energy the material absorbs before
fracture and the “tougher” the material is said to be.
7.4.4 Rubber Hysteresis

Here is a typical stress-strain graph for rubber showing both loading and
unloading. Rubber consists of long chain hydrocarbon molecules. The bonds
within the chain are stiff covalent bonds but single carbon-carbon bonds allow
for rotation so the long chain can be curled up or stretched out. There are also
many cross links between different chains.
The stress for any particular strain is greater during loading than unloading.
This implies that the work done to stretch the rubber is greater than the work
done by the rubber as it re-contracts. The area in the loop between the two
curves is equal to the work that is transferred to heat internally as the rubber
is stretched. You can easily verify that heat is generated by taking an elastic
band, rapidly stretching it and immediately touching it to your lip. This curve
for loading and unloading is called a “hysteresis curve.” It is interesting to
note that if you heat a stretched piece of rubber it will tend to contract as the
molecules adopt a more random arrangement.
[Figure: stress against strain for rubber, from the origin O, showing the loading and unloading curves. Annotations along the curve: (1) long chain molecules randomly arranged, connected by cross-links; (2) molecules aligning: stress begins to straighten them by causing bond rotation; (3) most molecules aligned: stress acts along chains and is resisted by stiff covalent bonds.]
7.5 MATERIAL TERMINOLOGY

Here is a summary of terms used by materials scientists to describe material
properties:
Strong: has a large ultimate breaking stress.
Stiff: needs a large stress for a small strain (high Young’s Modulus).
Elastic: sample returns to its original dimensions when stresses are
removed.
Plastic: undergoes a permanent deformation when stress is removed.
Yield stress: stress needed to cause the onset of plastic deformation.
Tough: undergoes significant plastic deformation before breaking, resists
crack formation and propagation, and absorbs a lot of energy before
breaking.
Ductile: able to be drawn out into wires (linked to plasticity).
Malleable: able to be beaten into thin sheets (linked to plasticity).

Hard: resists scratching and indentation.
Brittle: breaks catastrophically as cracks travel through the material.
Almost no plastic deformation prior to fracture.
Creep: gradual continued strain under constant stress.
7.6 MATERIAL TYPES

Materials fall into a range of different types which can be identified by their
macroscopic properties and explained by their microscopic structures.
Crystalline materials have a microscopic structure in which the particles are
arranged in regular geometric patterns. These might form single large crystals
(e.g., a diamond) or might consist of a large number of grains (polycrystalline),
each of which has a regular structure but which are themselves arranged ran-
domly (many metals are like this although it is possible to grow single metallic
crystals). The grain size in a polycrystalline material affects its mechanical
properties and this can be changed by heat treatment. For example, in the
process of annealing a metal is heated up above its recrystallization temperature
and then allowed to cool slowly. This allows individual crystals to grow
larger and makes the metal less hard, more ductile and easier to work.
Crystalline structures are rarely perfect and most contain flaws such as miss-
ing atoms (vacancies), atoms of a different material (impurities) and disloca-
tions (where planes of atoms do not join correctly). These have a large effect
on properties such as strength and stiffness.
Polymers consist of long chain hydrocarbon molecules which can be arranged
randomly (amorphous polymer) or in a semi-regular or regular (crystalline)
way. Their mechanical properties depend on how the polymer chains interact
and move past one another. The individual chains form cross-links and these
affect the stiffness and strength of the material. Rubber and polythene are both
examples of polymeric materials. When a stress is applied to rubber the mol-
ecules initially begin to align, explaining rubber’s ability to undergo very large
strains. However, if the stress is removed, the cross-links pull the molecules
back to their original positions and the rubber behaves elastically. At larger
strains most of the molecules have aligned and the rubber becomes very stiff
before it finally breaks. Polythene can also undergo a large strain as its mole-
cules align but in this case the cross links are unable to pull the molecules back
to their original positions so the polythene undergoes a plastic deformation.
Ceramics are solid non-metallic materials such as brick, tile, pottery, and china
that are formed by high temperature firing. They usually consist of tiny ionic
crystals bound together by amorphous glassy regions that formed during fir-
ing. Ceramics are usually stiff, hard, and strong and have high melting points
and good chemical resistance, but they can be brittle.
Glasses are closely related to ceramics but are characterized by being
completely amorphous and result from a rapidly cooled melt. Glasses are
distinguished from ceramics by their microscopic structure. For example, silica
glass and quartz have identical composition consisting of SiO4 units but in the
glass they are arranged randomly while in crystalline quartz they are arranged
regularly.
Composite materials are combinations of two or more different materials
designed to take advantage of the desirable properties of each individual
component. Examples of composites are fiberglass, concrete, steel-reinforced
concrete, etc. Many composites consist of a matrix and a reinforcement.
Concrete is strong in compression, but weak in tension and brittle, whereas
steel is strong in tension and can prevent crack formation in the concrete,
providing a versatile and economical building material.
7.7 EXERCISES
1. A rectangular wooden block has sides of length 5.0 cm, 8.0 cm and 12 cm.
Its mass is 960 g.
(a) Calculate its density in kg m$^{-3}$.
(b) The percentage uncertainty in each measurement above (i.e., of side
lengths and mass) is ±4%. Calculate the absolute and fractional
uncertainty in the density.
2. A spring is extended 0.18 m by a force of 30 N.
(a) Calculate the spring constant of the spring.
(b) Calculate the work done to stretch the spring.
(c) A load of 20 kg is supported by 10 of these springs connected in parallel
with one another. Calculate the extension of the system under this load.
3. The table of data below was collected in an experiment to stretch a steel
spring. The unstretched length of the spring was 10.0 cm.
extension/cm Force/N
0 0
2.8 1.0
6.2 2.0
9.6 3.0
13.2 4.0
16.5 5.0
20.1 6.0
23.2 7.0
26.6 8.0
30.2 9.0
33.0 10
36.3 11
40.2 12
44.4 13
49.4 14
57.1 15
68.5 16
96.5 17
113.2 18
(a) Plot a graph of force (y-axis) against extension (x-axis).

(b) State the limit of proportionality for this spring.
(c) Calculate the spring constant of the spring.
(d) What was the total length of the spring when it was pulled by a force
of 7.0 N?
(e) Calculate the extension of the spring if it supported a load of 850 g.
(f) For forces up to about 10 N the graph is a straight line through the ori-
gin. Describe the relationship between force and extension over this
range. What law is the spring obeying?
(g) If two springs like the one above were connected in series how much
would they extend when stretched by a force of 14 N?
(h) When the force of 18 N was removed the spring did not return to its
original length. Explain this.
(i) Does the spring become stiffer or less stiff beyond the limit of
proportionality? Explain your answer.
4. The steel cable used to moor a ship has a length of 12.0 m and a
cross-sectional area of $8.2 \times 10^{-4}$ m$^2$. The force in the cable is 15 000 N.
(a) Calculate the extension of the cable.
(b) Calculate the strain energy (elastic P.E.) stored in the cable.
(Take the density of steel to be 8050 kg m$^{-3}$ and the Young modulus of steel to be 200 GPa.)
5. When a rubber band is stretched and released rapidly several times in
quick succession it becomes hot. Use a stress-strain graph for the loading
and unloading of rubber to explain this.
6. Here are four stress-strain graphs drawn to the same scale.
[Figure: four stress-strain graphs, labeled A to D, drawn on the same axes of stress against strain.]
Use the correct terminology to describe the mechanical properties of each
material and to compare them.
CHAPTER 8
Thermal Physics
8.0 INTRODUCTION
Thermal energy (often simply referred to as heat energy or heat) is the energy
something has as a result of the random thermal motions of its particles. This
is not the same thing as temperature. For example, the Atlantic Ocean has a
lower temperature (perhaps 15°C) than a freshly made mug of coffee (per-
haps 75°C) but has much more thermal energy because it contains many
more particles. While the formal definition of temperature is quite complex
and involves an understanding of the concept of entropy, a simple way to think
about temperature is to relate it to the mean energy per particle. The mean
energy of the water particles in a mug of hot coffee is greater than the mean
energy of water molecules in the Atlantic Ocean.
8.1 THERMAL EQUILIBRIUM

When two systems are placed in thermal contact heat is able to be trans-
ferred between them. On the microscopic level, this is a dynamic process in
which energy transfers take place continuously in both directions. However,
on the macroscopic level, there might be a net flow in one direction. The
macroscopic direction of net heat transfer is always from the system at a
higher temperature to the system at a lower temperature, and this contin-
ues until the two systems are at the same temperature. This can be stated
very simply:
The direction of heat flow is from the system at higher temperature to
the system at lower temperature (or even more simply: from hot to cold).
When there is no net heat transfer between systems in thermal contact they
are in thermal equilibrium and are at the same temperature.
The “Zeroth law of thermodynamics” states that: If two systems, A and B are
both in thermal equilibrium with a third system, C, then A and B are also in
thermal equilibrium with each other.
This law allows us to use thermometers and to set up formal temperature
scales.
8.2 MEASURING TEMPERATURE

Temperature is measured using a thermometer. Thermometers are based on
a physical property that changes with temperature, for example, volume of a
gas, length of a mercury column, and resistance of a wire. To get a value from
a thermometer it has to be “calibrated.” This means placing a scale on it so
that temperatures can be read off. Since different physical properties vary
in different ways when temperature changes, it is important to calibrate
different thermometers against the same standard: the absolute or
thermodynamic temperature scale. In practice, a constant-volume gas thermometer
is used as the standard to calibrate other types of thermometers. The table
below lists some different kinds of thermometer.
| Thermometer | Physical property | Useful range |
|---|---|---|
| Alcohol | Thermal expansion of a liquid | Depends on the type of alcohol used but can measure down to −100°C and up to 78°C |
| Mercury-in-glass | Thermal expansion of a liquid | −37°C to +356°C |
| Thermocouple | Thermoelectric effect (Seebeck effect) | Depends on the type but can be used from −270°C to 1250°C |
| Constant-volume gas thermometer | Thermal expansion of an ideal gas | Used to calibrate other types of thermometers. −200°C to 1600°C |
| Pyrometer | Spectrum of infra-red radiation emitted by hot object | 700°C to 3500°C |
8.3 TEMPERATURE SCALES

Temperature scales are defined using fixed points that are given fixed values.
The Celsius scale (previously called the Centigrade scale) uses two fixed points:
Lower fixed point: freezing point of water defined to be 0°C.
Upper fixed point: boiling point of water defined to be 100°C.
To calibrate an unmarked mercury thermometer the thermometer is put into

a water/ice mix and the lower position of the mercury column is marked. It
is then put into equilibrium with steam from boiling water and the upper
position of the mercury column is marked. The distance between these two
fixed points is then divided into 100 equal divisions or degrees. Temperatures
below 0°C and above 100°C can also be marked on the scale by using the
same equal divisions.
A similar method can be used to calibrate other types of thermometers, but in
all cases, this assumes that the physical property that is changing (e.g., length
of column or resistance of a wire) varies linearly with temperature in the range
that is being marked. This is usually a reasonably accurate assumption but is
not exactly correct and so thermometers calibrated in this way would need to
be corrected (using a constant volume gas thermometer) if very precise tem-
perature measurements are required.
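The two-point calibration described above amounts to linear interpolation between the fixed points. The sketch below assumes, as the text does, that the column length varies linearly with temperature; the function name and the column lengths (4.0 cm in melting ice, 24.0 cm in steam) are hypothetical.

```python
def celsius_from_length(length, length_ice, length_steam):
    """Temperature in degrees Celsius from a column length.

    Assumes the length varies linearly with temperature between the
    lower fixed point (0 degrees C) and the upper fixed point (100 degrees C).
    """
    return 100.0 * (length - length_ice) / (length_steam - length_ice)


# Hypothetical calibration marks, invented for illustration.
length_ice = 4.0     # cm, column length in a water/ice mix
length_steam = 24.0  # cm, column length in steam from boiling water

reading = celsius_from_length(9.0, length_ice, length_steam)
below_zero = celsius_from_length(2.0, length_ice, length_steam)

print(f"9.0 cm corresponds to {reading:.1f} degrees C")
print(f"2.0 cm corresponds to {below_zero:.1f} degrees C")
```

The second call shows how the same equal divisions extend the scale below 0°C. The same function would apply to a resistance thermometer if resistances were passed in place of lengths, under the same linearity assumption.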
The thermodynamic scale or kelvin scale uses two different fixed points:
Lower fixed point: absolute zero of temperature defined to be 0 K
(−273.15°C).
Upper fixed point: triple point of water defined to be 273.16 K (0.01°C).
The triple point of water is the unique temperature at which ice, water, and
water vapor are in equilibrium. By choosing these values, the intervals on the
thermodynamic scale T are equal to those on the Celsius scale $\theta$ so that the
conversion between the two scales is simply:
$$T = \theta + 273.15 \qquad \theta = T - 273.15$$
| | Celsius scale | Thermodynamic scale |
|---|---|---|
| Absolute zero | −273.15°C | 0 K |
| Freezing point of water | 0°C | 273.15 K |
| Triple point of water | 0.01°C | 273.16 K |
| Typical room temperature | About 20°C | About 293 K |
| Human body core temperature | About 37°C | About 310 K |
| Boiling point of water | 100°C | 373.15 K |
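The conversion between the two scales is a fixed offset, which the following minimal sketch expresses as a pair of functions. The function names are illustrative choices, not part of the text.

```python
OFFSET = 273.15  # difference between the kelvin and Celsius scales


def celsius_to_kelvin(theta):
    """Thermodynamic temperature in K from a Celsius temperature."""
    return theta + OFFSET


def kelvin_to_celsius(temperature):
    """Celsius temperature from a thermodynamic temperature in K."""
    return temperature - OFFSET


room = celsius_to_kelvin(20.0)            # typical room temperature
triple_point = kelvin_to_celsius(273.16)  # triple point of water

print(f"20 degrees C is {room:.2f} K")
print(f"273.16 K is {triple_point:.2f} degrees C")
```

A temperature difference has the same numerical value on both scales, so no offset is needed when converting a change in temperature.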
The best practical way to approximate the thermodynamic scale is to use a

constant-volume gas thermometer. This consists of a bulb containing a fixed
amount of gas connected to a mercury manometer to measure the gas pres-
sure. When the temperature of the gas in the bulb changes the pressure also
changes. The ideal gas equation, pV = nRT shows that pressure p is propor-
tional to temperature T if volume V and amount of gas n are both constant
(which they are). Pressure values can be calibrated to give temperatures.
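One way to turn pressure readings into temperatures is to compare each pressure with the pressure recorded at the triple point of water. The sketch below assumes ideal-gas behavior (pressure proportional to T at fixed V and n); the function name and the pressure values are hypothetical.

```python
T_TRIPLE = 273.16  # K, triple point of water (upper fixed point)


def gas_thermometer_temperature(pressure, pressure_at_triple_point):
    """Temperature in K from a constant-volume gas thermometer.

    Assumes p is proportional to T (ideal gas, fixed V and n), so
    T / T_TRIPLE = p / p_triple.
    """
    return T_TRIPLE * pressure / pressure_at_triple_point


# Hypothetical pressure readings in Pa, invented for illustration.
p_triple = 80.0e3    # bulb held at the triple point of water
p_unknown = 109.3e3  # bulb in the environment being measured

T = gas_thermometer_temperature(p_unknown, p_triple)
print(f"temperature = {T:.1f} K")
```

Only the ratio of the two pressures matters, so any pressure unit can be used as long as it is the same for both readings.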
8.4 HEAT TRANSFER MECHANISMS

Heat transfer occurs because of a temperature difference, and the direction
of transfer is always from the hotter (higher temperature) to the cooler (lower
temperature) system. There are three different mechanisms by which this can
occur: conduction, convection, and radiation.
8.4.1 Conduction
Conduction involves the transfer of energy between particles as a result of
collisions. It is the most important process for heat transfer inside a solid
because particles in a solid remain in fixed positions and so cannot form con-
vection currents. Conduction also occurs in liquids and gases but here it is
often less important than convection since particles in liquids and gases are
able to move in convection currents transferring heat energy as they do so.
The process of conduction is a dynamic one but for macroscopic objects, the
transfer of energy from more energetic particles to less energetic particles has
a higher probability than transfer in the opposite direction so heat flows from
higher to lower temperatures. Conduction cannot occur in a vacuum because
there are no particles to conduct the heat.
The rate of heat flow $\dfrac{dQ}{dt}$ (in watts) through an insulated block of material
depends on the type of material, its cross-sectional area A, and the temperature
gradient between opposite sides of the block.
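[Figure: block of cross-sectional area A with one face at temperature 1 and the opposite face at temperature 2; an arrow shows the direction of heat flow.]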
The temperature gradient in the block is:
$$\frac{d\theta}{dx} = \frac{\theta_2 - \theta_1}{l} \quad \text{where } \theta_1 > \theta_2.$$
The rate of heat transfer is given by:
$$\frac{dQ}{dt} \propto -A\,\frac{d\theta}{dx}$$
The negative sign occurs because of the direction of heat transfer, from higher
to lower temperature. A constant of proportionality k can be introduced. This
depends on the material of the block. It is called the coefficient of thermal
conductivity. The equation for thermal conductivity (Fourier’s equation) is:
$$\frac{dQ}{dt} = -kA\,\frac{d\theta}{dx}$$
The units for k are W m$^{-1}$ K$^{-1}$.
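Fourier’s equation can be applied directly to a uniform slab. The sketch below is an illustration under stated assumptions: steady heat flow through a single pane of glass, with an invented area, thickness, and surface temperatures. The conductivity of 0.96 W/(m K) is the value this chapter quotes for window glass.

```python
def heat_flow_rate(k, area, theta_1, theta_2, length):
    """Rate of heat flow in W through a uniform block.

    theta_1 and theta_2 are the temperatures of the two faces and
    length is the distance between them. The temperature gradient is
    negative when theta_1 > theta_2, so the result is then positive.
    """
    gradient = (theta_2 - theta_1) / length   # d(theta)/dx in K per m
    return -k * area * gradient


# Hypothetical window: 1.5 m^2 pane, 4.0 mm thick, 20 C inside, 5 C outside.
k_glass = 0.96   # W m^-1 K^-1
rate = heat_flow_rate(k_glass, 1.5, 20.0, 5.0, 4.0e-3)

print(f"rate of heat flow = {rate:.0f} W")
```

The calculation treats the glass surfaces as being at the air temperatures on each side. In a real window, thin layers of still air at the surfaces reduce the temperature difference across the glass, so the actual rate would be much lower.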
The table below lists typical thermal conductivities for common materials at
around room temperature.
Material Coefficient of thermal conductivity (W m$^{-1}$ K$^{-1}$)
Air 0.024
Brickwork 0.6–1.0
Window glass 0.96
Ground or soil 0.33 (dry) to 1.4 (very moist)
Insulation materials 0.035–0.16
Expanded polystyrene 0.03
Dry sand 0.15–0.25
Water 0.58
Rock 2–7
Aluminum 205
Diamond 1000
Gold 310
Iron 80
Stainless steel 16
Timber 0.14
The high values for metals are because they contain large numbers of free elec-
trons which rapidly transfer heat. Diamond has an especially high coefficient
of thermal conductivity because its atoms are bonded very tightly together
and this couples their motions, so that when one atom is disturbed it has an
almost immediate effect on its neighbors and so transfers energy quickly.
8.4.2 Convection
Convection can only occur in fluids, that is, liquids or gases. This is because
convection involves the bulk movement of hot matter from one part of the
medium to another. Natural convection currents arise as a result of changes
in density inside the fluid. In most fluids, the material expands as its tempera-
ture increases. This increases its volume and decreases its density so that it is
then displaced by cooler denser material from above it. The overall effect is to
set up a convection current with warmer material rising and cooler material
falling. This is often the dominant mechanism for heat transfer within a fluid.
[Figure: convection current in a fluid above a heat source. A: fluid heats up and expands. Its density is reduced. It is displaced by cooler denser fluid. B: fluid cools down and contracts. Its density rises and it sinks.]
Forced convection is when an external blower or pump is used to force the

fluid to move – for example, when you cool a cup of hot tea by blowing across
its surface you are using forced convection. Convection cannot occur in a
vacuum because it depends on the movement of particles.
8.4.3 Radiation
All bodies emit and absorb a spectrum of electromagnetic radiation that
depends on the nature of their surface and their temperature. For many of
the hot objects we encounter in everyday life the peak of this spectrum is in
the infra-red region so thermal radiation is often referred to as infra-red radia-
tion although the actual spectrum of thermal radiation is continuous. Thermal
radiation is part of the electromagnetic spectrum so it can be transferred in a
vacuum. While the absorption and emission of radiation is a complex subject
it is usually true that matte black surfaces are good absorbers and emitters
whereas light shiny surfaces are poor absorbers and emitters. Thermal radia-
tion, like light, can be reflected from a silvered surface, for example, on the
inside of a thermos flask.
A thermal imaging camera can be used to measure infra-red radiation emitted
from different objects. The images below show the camera itself and thermal
images of a man’s face and hand (thanks to Al). The image is color-coded
(greyscale here) to show different temperatures, and these can be read off
from the scale at the bottom of the screen.
8.5 BLACK BODY RADIATION

An ideal radiator and absorber is called a black body. A black body will absorb
all the radiation that falls onto it, but in order to stay in thermal equilibrium it
must also emit radiation at the same rate.
The spectrum of radiation it emits at a particular temperature has a character-
istic shape that was first measured in the mid-19th century but which could not
be explained until Max Planck’s quantum theory in 1900 (see Section 27.1.1).
As the temperature of a black body increases the total flux of radiation from its
surface increases and the wavelength corresponding to the peak of the spec-
trum becomes shorter. This is apparent when heating a metal. While warm it
emits in the infrared, which we can detect by holding our hand nearby. As its
temperature rises it emits more radiation and begins to glow in the visible part
of the spectrum: it becomes “red hot.” At even higher temperatures it also
emits shorter visible wavelengths and becomes “white hot.”
[Figure: black-body radiation spectrum, plotted as power radiated per unit wavelength against wavelength. Higher temperature: greater intensity and peak at shorter wavelength. Lower temperature: lower intensity and peak at longer wavelength.]
While 19th century physicists were unable to derive the shape of the spec-
trum from first principles, they did identify two important empirical laws that
are very useful when considering thermal radiation. These laws were later
derived from Planck’s theory.
Wien’s displacement law.
This law states that the product of the wavelength corresponding to the peak
of the spectrum $\lambda_p$ and the absolute temperature T (in kelvin) is a constant:
$$\lambda_p T = \text{constant} = 2.9 \times 10^{-3}\ \mathrm{m\,K}$$
Stefan–Boltzmann law.
This law states that the total power P radiated from a black body is
proportional to the fourth power of the absolute temperature (in kelvin) T:
$$P = \varepsilon \sigma A T^4$$
where $\varepsilon$ is the emissivity ($\varepsilon = 1$ for an ideal black-body radiator), $\sigma$ is Stefan’s
constant ($\sigma = 5.6703 \times 10^{-8}$ W m$^{-2}$ K$^{-4}$), and A is the surface area of the radiator.
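Both laws are simple to evaluate numerically. The sketch below uses the constants quoted above; the function names and the example of a 1 m² ideal black-body surface at 5800 K are assumptions made for illustration.

```python
WIEN_CONSTANT = 2.9e-3       # m K, from Wien's displacement law
STEFAN_CONSTANT = 5.6703e-8  # W m^-2 K^-4


def peak_wavelength(temperature):
    """Wavelength in m at the peak of the black-body spectrum."""
    return WIEN_CONSTANT / temperature


def radiated_power(temperature, area, emissivity=1.0):
    """Total power in W radiated by a surface of the given area in m^2."""
    return emissivity * STEFAN_CONSTANT * area * temperature ** 4


# Hypothetical example: 1 m^2 of an ideal black body at 5800 K.
T = 5800.0
wavelength = peak_wavelength(T)
power = radiated_power(T, area=1.0)

print(f"peak wavelength = {wavelength:.2e} m")
print(f"power radiated  = {power:.2e} W")
```

Because the power depends on the fourth power of the temperature, doubling the absolute temperature increases the power radiated by a factor of 16 while halving the peak wavelength.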
8.6 HEAT CAPACITIES

8.6.1 Specific Heat Capacity
The specific heat capacity c of a material is the energy needed to raise the
temperature of 1 kg of that material by 1°C:
$$c = \frac{\Delta E}{m\,\Delta\theta}$$
where $\Delta E$ is the energy supplied, m is the mass of the sample, and $\Delta\theta$ is the
temperature change.
The SI unit for specific heat capacity is J kg$^{-1}$ K$^{-1}$.
The table below gives values for the specific heat capacity of a range of common
substances.
Substance             | Specific heat capacity (J kg⁻¹ K⁻¹)
Dry air at sea level  | 1460
Water (pure at 20°C)  | 4180
Ice (0°C)             | 2093
Copper                | 385
Aluminum              | 897
Iron                  | 449
Mercury               | 140
Concrete              | 880
Water has a particularly high specific heat capacity. This makes it an ideal
coolant because it can absorb a large amount of heat for a relatively small
increase in temperature.
Some Terminology
Heat capacity is the energy needed to increase the temperature of a par-
ticular sample of a substance by 1°C (or 1K).
Specific heat capacity is the energy needed to increase the temperature
of 1 kg of a substance by 1°C (or 1K). That is, this is the heat capacity per
unit mass.
Molar heat capacity is the energy needed to increase the temperature of
1 mole of a substance by 1°C (or 1K). That is, the heat capacity per unit
amount (the SI unit of the amount is the mole).
8.6.2 Molar Heat Capacities of Gases

The energy needed to raise the temperature of a mole of gas depends on the
conditions under which the gas is contained. If the gas is allowed to expand
while it is heated it will do work on its surroundings and so the heat capacity
will be greater than if the gas volume is kept constant. For this reason, there
are two significant heat capacities for a gas:
Heat capacity at constant volume: cV.
Heat capacity at constant pressure: cP.
For the reasons explained above, cP is greater than cV.
8.6.3 Measuring Specific Heat Capacity

When measuring the specific heat capacity of a substance it is important to
ensure that all of the heat supplied is absorbed by the substance and not trans-
ferred to other objects, for example, the heater itself, the container or the sur-
roundings. The simple method shown on the next page should reduce most
of these losses.
The substance is heated electrically at a constant rate P = IV (where I is the
current in the heater circuit and V is the potential difference across the heat-
ing element). This power must be kept constant by adjusting the variable
resistor if necessary. The initial temperature of the substance under test must
be recorded and then regular temperature measurements (e.g., every 30 s) are
taken after the heater is switched on.
[Figure: apparatus. The substance under test is surrounded by insulation and contains an electrical heater and a thermometer; an ammeter (A) and voltmeter (V) are connected in the heater circuit.]
[Figure: graph of temperature / °C against time / s. Initially some of the energy supplied heats the heater so the graph curves upwards. In the linear section almost all of the heat supplied is heating the substance under test. At higher temperatures heat loss through the insulation increases so the graph curves downwards.]
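From the equation for specific heat capacity we have $\Delta E = mc\,\Delta\theta$, so we can
write:
$$P = \frac{\Delta E}{\Delta t} = mc\,\frac{\Delta\theta}{\Delta t}$$
Rearranging this equation gives:
$$c = \frac{P}{m\,(\Delta\theta / \Delta t)}$$
$\Delta\theta / \Delta t$ is equal to the gradient of the linear part of the graph above.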
m can be measured using a top pan balance.
P is calculated from the ammeter and voltmeter readings (P = IV).
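The analysis described above can be sketched in a few lines. This is an illustrative example only: the meter readings, mass, and temperature data are hypothetical, and the gradient is found with a simple least-squares fit rather than by drawing a line on a graph.

```python
def linear_gradient(xs, ys):
    """Least-squares gradient of ys plotted against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def specific_heat_capacity(current_a, voltage_v, mass_kg, times_s, temps_c):
    """c = P / (m * gradient), with P = IV and gradient = temperature rise per second."""
    power_w = current_a * voltage_v
    gradient = linear_gradient(times_s, temps_c)
    return power_w / (mass_kg * gradient)


# Hypothetical readings taken from the linear section of the graph
times_s = [120, 150, 180, 210, 240]
temps_c = [24.0, 25.5, 27.0, 28.5, 30.0]

c = specific_heat_capacity(current_a=4.0, voltage_v=12.0, mass_kg=1.0,
                           times_s=times_s, temps_c=temps_c)
print(f"c = {c:.0f} J/(kg K)")
```

Only points from the linear section should be passed in; including the curved start or end of the graph would bias the gradient.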
8.7 SPECIFIC LATENT HEAT

The specific latent heat L of a substance is equal to the energy that must be
supplied to change the state of 1 kg of the substance at its melting or boiling
point with no increase in temperature. For example, the specific latent heat
of fusion for water is the energy needed to change 1 kg of ice at 0°C to 1 kg of
water at 0°C. There are two significant latent heats:
Latent heat of fusion Lf: energy required to change 1 kg of solid to 1 kg

of liquid at its melting point (with no change of temperature). This is also
equal to the energy released when 1 kg of liquid freezes at its freezing point.
Latent heat of vaporization Lv: energy required to change 1 kg of liquid
to 1 kg of gas at its boiling point (with no change of temperature). This is
also equal to the energy released when 1 kg of gas condenses to a liquid
at its boiling point.
For a change of state: E = mL
where E is the energy supplied or released, m is the mass of the substance
that changes state at its freezing or boiling point and L is the relevant specific
latent heat.
The SI unit for specific latent heat is J kg⁻¹.
The table below gives the values for the specific latent heat of a range of
common substances.
Substance | Latent heat of fusion (J kg⁻¹) | Latent heat of vaporization (J kg⁻¹)
Water     | 3.34 × 10⁵ (0°C)               | 2.26 × 10⁶ (100°C)
Ethanol   | 1.09 × 10⁵ (−109°C)            | 8.38 × 10⁵ (78°C)
Copper    | 2.07 × 10⁵ (1084°C)            |
Lead      | 2.24 × 10⁴ (327.5°C)           |
Tungsten  | 1.93 × 10⁵ (3400°C)            |
Mercury   | 1.13 × 10⁴ (−39°C)             | 2.94 × 10⁵
8.8 EXERCISES
1. A brick hut with a flat wooden roof is heated so that the inside tempera-
ture is 22°C when the outside temperature is 12°C. The dimensions of the
hut’s base are 4.5 m by 3.5 m and the walls are 2.6 m high and 10 cm thick.
There is a glass window of area 1.5 m² in one of the walls. The thickness of
the glass is 6.0 mm and the thickness of the wooden roof is 3.0 cm.
(a) Calculate the minimum power of the heater needed to maintain the
temperature at 22°C when the outside temperature is 12°C.
(b) Explain why, in practice, more powerful heating will be needed.
Thermal conductivities:
Brick: 0.80 W m⁻¹ K⁻¹, Wood: 0.16 W m⁻¹ K⁻¹, Glass: 0.96 W m⁻¹ K⁻¹
2. An 800 g block of copper at 76.0°C is immersed in 2000 cm³ of water
which is initially at 18.0°C. The water is in a thermally insulated container.
Calculate the final equilibrium temperature of the water and the copper
assuming there is no heat loss to the surroundings.
Specific heat capacities: copper: 385 J kg⁻¹ K⁻¹, water: 4180 J kg⁻¹ K⁻¹
3. The table below gives the specific heat capacities and molar masses of
four metals:
Metal    | Specific heat capacity (J kg⁻¹ K⁻¹) | Molar mass (g mol⁻¹)
Copper   | 385                                 | 63.5
Aluminum | 897                                 | 27.0
Iron     | 449                                 | 55.8
Mercury  | 140                                 | 200.6
Calculate the molar heat capacity of each metal. Comment on your results.
4. Calculate the energy needed to change 2.0 liters of water at 20°C into
steam at 120°C:
specific heat capacity of water = 4200 J kg⁻¹ °C⁻¹
specific heat capacity of steam = 2300 J kg⁻¹ °C⁻¹
specific latent heat of vaporization of water = 2.26 MJ kg⁻¹
5. When molten metal is poured into a cast it cools down and solidifies. The
graph below shows a typical cooling curve for the metal.
[Figure: cooling curve, temperature against time, with the pouring temperature and the ambient temperature marked.]
Explain the shape of the graph and the significance of the three regions.
CHAPTER
9
Gases
9.1 THE GAS LAWS

9.1.0 Introduction
The state of a fixed amount of gas is defined by three parameters: its volume,
pressure, and temperature. These are all dependent on one another so that
when a gas is compressed its volume and temperature might change. The
gas laws are a set of macroscopic empirical laws that describe these relation-
ships and which will be explained later when we consider the microscopic
kinetic theory model of a gas (see Section 9.3.2). Given that there are three
parameters there are also three gas laws, each of which holds when one of the
parameters is constant.
Boyle’s law: dependence of pressure on volume (constant temperature).
Charles’s law: dependence of volume on temperature (constant pressure).
Gay-Lussac’s law: dependence of pressure on temperature (constant volume).
While many gases under a wide range of physical conditions will approximately
obey these laws an ideal gas is a theoretical gas that obeys them perfectly.
The three gas laws are expressed most clearly when temperature is measured
using the thermodynamic or kelvin scale and can be combined into a single
equation, the ideal gas equation.
9.1.1 Boyle’s Law

The apparatus below can be used to investigate how the volume of a fixed
mass of gas varies with pressure. The gas is an air column trapped above some
colored water. The experiment must be done slowly so that the gas remains
in thermal equilibrium with its surroundings. Changes that take place at con-
stant temperature, like this one, are called “isothermal” changes.
[Figure: Boyle’s law apparatus. A hand pump increases the pressure; a Bourdon gauge measures the gas pressure; colored water from a fluid reservoir acts as a piston to compress the gas; the length of the air column is directly proportional to the volume of the gas.]
The length of the air column is measured for a range of different pressures.
Since the tube containing the air has a constant cross-sectional area, the vol-
ume of air is directly proportional to the length of the column. The results
of such an experiment show that the pressure p and volume V are inversely
proportional to one another:
This can be expressed algebraically by p ∝ 1 / V or pV = constant. This is called
Boyle’s law:
Boyle’s law: pV = constant for a fixed amount of an ideal gas at a constant
temperature.
[Figure: graphs of pressure against volume and pressure against 1 / volume.]
9.1.2 Charles’s Law

The apparatus below can be used to investigate how the volume of a fixed
amount of a gas varies with temperature when the pressure is kept constant.
An air column is trapped inside a capillary tube by a small bead of concen-
trated sulfuric acid. Concentrated sulfuric acid is used because it absorbs
water vapor, keeping the air dry. The capillary tube is then immersed in a
water bath and the temperature of the water bath is changed. As temperature
increases the gas expands and the bead moves. The capillary tube has a con-
stant cross-sectional area so the length of the column is directly proportional
to the volume of air.
[Figure: Charles’s law apparatus. A capillary tube contains a constant mass of trapped air behind a bead of concentrated sulfuric acid; the length of the air column is measured; the open end keeps the gas at constant atmospheric pressure.]
The length of the air column is measured for a range of different tempera-
tures. The tube is open at one end so the pressure remains constant. The
results of such an experiment show that the volume V increases linearly with
temperature θ when the temperature is measured in Celsius:
[Figure: graph of volume against temperature / °C.]
Similar experiments using different amounts of gas also produce linear results
but with different gradients. However, if the graphs are extrapolated back, the
intercept on the temperature axis is always the same (-273.15°C) for all (ideal)
gases and all amounts of gas. This is clearly a physically significant temperature
and is used as the zero for the thermodynamic or kelvin scale. Using this
temperature scale, volume and temperature are directly proportional.
[Figure: graphs of volume against temperature for more mass and less mass of gas, plotted first against temperature / °C (common intercept at −273.15) and then against temperature / K (0 and 273.15 marked).]
This can be expressed as V ∝ T (temperature in kelvin) or V/T = constant. This
is Charles’s law:
Charles’s law: $\dfrac{V}{T} = \text{constant}$
for a fixed amount of gas at constant pressure (temperature in kelvin).
9.1.3 Gay-Lussac’s Law (The Pressure Law)

The apparatus below can be used to investigate how the pressure of a fixed
amount of gas varies with temperature when volume is kept constant.
The pressure is measured for a range of different temperatures. Once again
the results show that the pressure varies linearly with temperature in Celsius
and that there is a common intercept on the temperature axis of −273.15°C.
Using the kelvin scale, pressure is directly proportional to temperature:
[Figure: pressure law apparatus. Air is held in a heated water bath, with a thermometer and a connection to a pressure sensor.]
This can be expressed algebraically as p ∝ T or p/T = constant. This is
Gay-Lussac’s law (the pressure law):
Gay-Lussac’s law: $\dfrac{p}{T} = \text{constant}$ for a constant volume of a fixed amount
of an ideal gas (temperature measured in kelvin).
[Figure: graph of pressure against temperature / K for more mass and less mass of gas, with 0 and 273.15 marked on the temperature axis.]
9.2 THE IDEAL GAS EQUATION

The three gas laws can be combined into a single equation:
Boyle’s law: $pV = \text{constant}$
Charles’s law: $V/T = \text{constant}$
Gay-Lussac’s law: $p/T = \text{constant}$
Combined: $$\frac{pV}{T} = \text{constant}$$
This is called the ideal gas equation or equation of state for an ideal gas.
Holding any one of the three parameters constant reduces it to one of the
three gas laws, but it also applies if all three parameters change together.
The constant is directly proportional to the amount of gas. The value of the
constant per mole of an ideal gas is R = 8.314 J K⁻¹ mol⁻¹. The ideal gas equation
is often written in the form:
Equation of state for an ideal gas: pV = nRT
where n is the number of moles.
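As a quick numerical illustration of the equation of state, the sketch below finds the volume of one mole of ideal gas and checks that halving the volume at constant temperature doubles the pressure (Boyle’s law). The conditions chosen (273.15 K and 1.013 × 10⁵ Pa) and the function names are assumptions made for this example.

```python
R = 8.314  # molar gas constant, J K^-1 mol^-1


def gas_pressure(n_mol, volume_m3, temperature_k):
    """Pressure (Pa) of an ideal gas from pV = nRT."""
    return n_mol * R * temperature_k / volume_m3


def gas_volume(n_mol, pressure_pa, temperature_k):
    """Volume (m^3) of an ideal gas from pV = nRT."""
    return n_mol * R * temperature_k / pressure_pa


# Assumed conditions: 1.0 mol at 273.15 K and 1.013e5 Pa
volume = gas_volume(1.0, 1.013e5, 273.15)
print(f"Volume of 1 mol: {volume * 1000:.1f} litres")

# Halving the volume at constant temperature doubles the pressure
p_full = gas_pressure(1.0, volume, 273.15)
p_half = gas_pressure(1.0, volume / 2, 273.15)
print(f"Pressure ratio after halving the volume: {p_half / p_full:.1f}")
```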
9.3 THE KINETIC THEORY OF GASES

9.3.1 Assumptions of the Kinetic Theory
The gas laws are based on experiments but can be explained using Newton’s
laws if we make a few assumptions about the microscopic nature of a gas. The
key assumptions of kinetic theory are:
Gases consist of a large number of tiny, massive particles in rapid random
motion.
The particles have no long-range interactions.
All collisions are elastic.
The volume occupied by the particles themselves is negligible.
Kinetic theory was developed by Boltzmann and others in the 19th century
and its great success provided strong support for the idea that matter really is
composed of tiny massive particles (the atomic theory). However, even as late
as 1900, some physicists and philosophers opposed this theory and suggested
that it was simply a convenient model and did not correspond to physical real-
ity. The most convincing evidence that atomic theory was correct came from
Einstein’s analysis of Brownian motion in 1905.
9.3.2 Explaining Gas Pressure

Gas molecules inside a container are continually colliding with the walls of
the container and rebounding. In each collision the wall exerts a force on the
molecule that changes its direction so, by Newton’s third law, the molecule
exerts an equal but opposite force on the wall. The net effect of billions of col-
lisions creates an almost constant outward force on the wall. Pressure is force
per unit area and force is equal to the rate of change of momentum, so the
pressure exerted by a gas is equal to the average rate of change of molecular
momentum per unit area.
We will build up an expression for gas pressure by considering first a single
molecule bouncing backward and forward in a box, and then adding more
molecules and allowing them to have random speeds and directions. Let’s
start with one molecule of mass m moving in the positive x-direction and col-
liding with the wall of a cubic container of side a:
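[Figure: a molecule of mass m moving with velocity v in the x-direction inside the container.]
Change of momentum at wall (momentum is a vector): $\Delta p = -2mv_x$
Time between collisions at wall: $\Delta t = 2a/v_x$
Average rate of change of momentum at wall: $\dfrac{\Delta p}{\Delta t} = -\dfrac{mv_x^2}{a}$
Average force on molecule at wall: $F = -\dfrac{mv_x^2}{a}$
Average force on wall (by Newton’s third law): $F = \dfrac{mv_x^2}{a}$
Now, consider N molecules with x-components of velocity $v_{x1}, v_{x2}, v_{x3}, \ldots, v_{xN}$.
Average force from N molecules:
$$F = \frac{mv_{x1}^2}{a} + \frac{mv_{x2}^2}{a} + \cdots + \frac{mv_{xN}^2}{a} = \frac{m}{a}\sum_{i=1}^{N} v_{xi}^2 = \frac{Nm}{a}\langle v_x^2 \rangle \qquad (1)$$
where $\langle v_x^2 \rangle$ is “the mean-squared x-component of velocity”:
$$\langle v_x^2 \rangle = \frac{v_{x1}^2 + v_{x2}^2 + \cdots + v_{xN}^2}{N}$$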
Now let us consider a three-dimensional gas in which the molecules move in
all directions. The derivation above will still be valid for the x-components of
velocity so we need to relate these to the overall velocity. Any particular
molecule will have velocity components $v_x$, $v_y$, and $v_z$, and the magnitude of the
velocity will be given by:
$$v^2 = v_x^2 + v_y^2 + v_z^2$$
The molecules are moving randomly so the mean-squared values of $v_x$, $v_y$, and
$v_z$ will all be equal:
$$\langle v_x^2 \rangle = \langle v_y^2 \rangle = \langle v_z^2 \rangle$$
Therefore:
$$\langle v^2 \rangle = \langle v_x^2 \rangle + \langle v_y^2 \rangle + \langle v_z^2 \rangle = 3\langle v_x^2 \rangle \qquad (2)$$
Using equations (1) and (2) above we can find an expression for the average
force on the wall of the container from a gas consisting of N particles in rapid
random motion:
$$F = \frac{Nm}{a}\langle v_x^2 \rangle = \frac{Nm}{3a}\langle v^2 \rangle$$
where $\langle v^2 \rangle$ is the “mean-squared speed” of the molecules.
We can now derive an expression for the pressure on a wall of area $A = a^2$, where $V = a^3$ is the volume of the container:
$$p = \frac{F}{A} = \frac{Nm}{3a^3}\langle v^2 \rangle = \frac{Nm\langle v^2 \rangle}{3V}$$
This is the “kinetic theory equation” for gas pressure and it is usually written
like this:
$$pV = \tfrac{1}{3} Nm \langle v^2 \rangle$$
Since density $\rho = Nm/V$ we can also write this equation in the form:
$$p = \tfrac{1}{3} \rho \langle v^2 \rangle$$
Mean-squared and root-mean-squared speeds:
$$\text{Mean-squared speed} = \langle v^2 \rangle = \frac{v_1^2 + v_2^2 + v_3^2 + \cdots + v_N^2}{N}$$
$$\text{Root-mean-squared speed} = v_{rms} = \sqrt{\langle v^2 \rangle}$$
Note that the rms speed is not equal to the mean speed and that the mean-
squared speed is not the same as the mean speed squared.
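The distinction between these averages is easy to check numerically. The sketch below uses three hypothetical molecular speeds (not data from the text) to show that the rms speed exceeds the mean speed whenever the speeds are not all equal.

```python
import math


def mean_speed(speeds):
    """Arithmetic mean of the speeds."""
    return sum(speeds) / len(speeds)


def mean_squared_speed(speeds):
    """Mean of the squared speeds."""
    return sum(v ** 2 for v in speeds) / len(speeds)


def rms_speed(speeds):
    """Square root of the mean-squared speed."""
    return math.sqrt(mean_squared_speed(speeds))


speeds = [300.0, 400.0, 500.0]  # hypothetical speeds in m/s

print(f"mean speed          = {mean_speed(speeds):.1f} m/s")
print(f"rms speed           = {rms_speed(speeds):.1f} m/s")
print(f"mean-squared speed  = {mean_squared_speed(speeds):.0f} m^2/s^2")
print(f"mean speed squared  = {mean_speed(speeds) ** 2:.0f} m^2/s^2")
```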
9.3.3 Molecular Kinetic Energy and Temperature

Compare the kinetic theory equation with the ideal gas equation. Note that
N is the number of molecules and n is the number of moles. It will be helpful
to introduce $N_A$, the Avogadro number (number of particles in 1 mole,
$N_A$ = 6.022 140 857 × 10²³ mol⁻¹) so that N can be replaced by $nN_A$:
Kinetic theory equation: $pV = \tfrac{1}{3} n N_A m \langle v^2 \rangle$
Ideal gas equation: $pV = nRT$
The right-hand sides of these two equations must be equal, so the gas
temperature is directly proportional to molecular kinetic energy:
$$\tfrac{1}{3} n N_A m \langle v^2 \rangle = nRT$$
For one mole of gas:
$$\text{Total KE} = \tfrac{1}{2} N_A m \langle v^2 \rangle = \tfrac{3}{2} RT$$
Dividing by $N_A$ gives:
$$\text{Mean molecular KE} = \tfrac{1}{2} m \langle v^2 \rangle = \frac{3}{2}\frac{R}{N_A} T = \tfrac{3}{2} kT$$
The constant $k = R/N_A$ is called “Boltzmann’s constant”. This is effectively the
gas constant per particle and it is a very important constant in thermodynamics.
k = 1.38064852 × 10⁻²³ J K⁻¹. The mean thermal energy per particle in any
thermodynamic system is of the order of kT; for an ideal gas, it is 3/2 kT.
Notice that, for an ideal gas, the mean kinetic energy depends only on tem-
perature (in kelvin). This implies that molecules of different gases, at the
same temperature, have the same mean kinetic energy but different mean
speeds. For example, air consists of about 20% oxygen and 80% nitrogen.
Oxygen molecules are more massive than nitrogen molecules, so they have
lower mean squared speeds. Hydrogen molecules have much lower mass and
therefore move, on average, at much greater speeds than oxygen or nitrogen
molecules. This results in them escaping from the atmosphere much more
rapidly, so there is very little hydrogen in the Earth’s atmosphere.
The rms speed of a molecule in an ideal gas at temperature T is given by the
equation:
$$v_{rms} = \sqrt{\langle v^2 \rangle} = \sqrt{\frac{3kT}{m}}$$
The table below shows the rms speeds of oxygen, nitrogen, and hydrogen
molecules at 20°C (293 K):
Gas      | Mass of molecule (kg) | rms speed at 20°C (293 K) (m s⁻¹)
Oxygen   | 5.32 × 10⁻²⁶          | 478
Nitrogen | 4.66 × 10⁻²⁶          | 510
Hydrogen | 3.34 × 10⁻²⁷          | 1910
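The values in the table can be reproduced from the rms speed equation. The short sketch below uses the molecular masses from the table and the value of k quoted earlier; the function and variable names are chosen for this example.

```python
import math

K_BOLTZMANN = 1.38064852e-23  # J K^-1, value quoted in the text


def rms_speed(molecule_mass_kg, temperature_k):
    """Root-mean-squared speed (m/s) of molecules in an ideal gas."""
    return math.sqrt(3.0 * K_BOLTZMANN * temperature_k / molecule_mass_kg)


# Molecular masses (kg) taken from the table above
MOLECULE_MASSES = {
    "Oxygen": 5.32e-26,
    "Nitrogen": 4.66e-26,
    "Hydrogen": 3.34e-27,
}

for gas, mass in MOLECULE_MASSES.items():
    print(f"{gas:9s} {rms_speed(mass, 293.0):6.0f} m/s")
```

The hydrogen result agrees with the tabulated 1910 m s⁻¹ once it is rounded to three significant figures.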
9.3.4 Molar Heat Capacities of an Ideal Monatomic Gas

The internal energy U of an ideal gas is equal to its total kinetic energy. For
one mole of an ideal gas:
$$U = \text{Total KE} = \tfrac{1}{2} N_A m \langle v^2 \rangle = \tfrac{3}{2} RT$$
The molar heat capacity at constant volume $c_V$ is the energy required to raise
the temperature of 1 mole of the gas by 1 K. Since all of the heat Q supplied
increases the internal energy:
$$c_V = \frac{Q}{\Delta T} = \frac{\Delta U}{\Delta T} = \frac{3}{2} R$$
Note that this is the molar heat capacity at constant volume. If the gas is
instead heated at constant pressure, it will expand and do external work,
increasing the heat needed to raise its temperature and therefore increasing
its heat capacity.
The first law of thermodynamics (see Section 9.6.2) states that the change
in internal energy of a system is equal to the sum of the heat supplied to the
system and the work done on the system. The heat supplied to the system at
constant pressure is $Q = c_P \Delta T = \Delta U + p\Delta V$, where $c_P$ is the molar heat capacity
at constant pressure and $p\Delta V$ is the work done by the gas as it expands (see
Section 9.6.3). Using the ideal gas equation for one mole we can replace $\Delta V$ with $R\Delta T/p$
(because p is constant) to give:
$$Q = \tfrac{3}{2} R\Delta T + R\Delta T = c_P \Delta T$$
Therefore:
$$c_P = \tfrac{5}{2} R = c_V + R$$
The ratio of heat capacities is called the adiabatic gas constant $\gamma$:
$$\gamma = \frac{c_P}{c_V}$$
For an ideal monatomic gas $\gamma$ = 5/3; for air it is about 1.4.
9.3.5 Equipartition of Energy

Molecules in an ideal gas are treated as particles with no internal degrees of
freedom. This means that the only motion they can have is translation in three
dimensions, so they have three degrees of freedom. The derivation above
shows that motion along the x-, y-, and z-axes contributes equally to the mean
kinetic energy (3/2 kT) of the molecules, so that the mean energy per degree
of freedom is ½ kT. The classical theorem of equipartition of energy states
that when molecules are in thermal equilibrium each independent degree of
freedom has a mean energy equal to ½ kT. This applies to translational and
rotational degrees of freedom as well as potential energies (which are not
present in an ideal gas because there are no long-range forces).
A diatomic molecule is free to rotate about each of the three axes. These
three rotational degrees of freedom add another 3×(½ kT) to the overall mean
kinetic energy per molecule. A diatomic molecule can also vibrate along
the bond between the two atoms. This vibrational mode has both kinetic
energy and potential energy, so these two degrees of freedom add 2×(½ kT)
to the overall mean energy per particle. These additional degrees of freedom
increase the heat capacity of the gas. In this case, if the additional 5
degrees of freedom were all excited, the mean energy per molecule would be
4kT and the molar heat capacity at constant volume would be 4R.
Classically all of these degrees of freedom should be excited by thermal
motion and physicists expected the molar heat capacities of gases to be in
accord with this model. However, this is not the case, especially at low tem-
peratures. Some degrees of freedom (e.g., vibrational modes) do not contrib-
ute to the heat capacity and only “switch on” at higher temperatures, resulting
in lower heat capacities than those calculated using the equipartition theory.
This apparent anomaly was only explained by quantum theory, which predicts
the “quenching” of some degrees of freedom at lower temperatures (see
Section 27.1.2).
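The equipartition counting above can be turned into numbers. The sketch below is illustrative only: it assumes each active degree of freedom contributes ½R to the molar heat capacity at constant volume, and uses the ideal gas result $c_P = c_V + R$ from Section 9.3.4. The parameter name `degrees_of_freedom` is hypothetical, and the choice of 5 active degrees of freedom is an assumption made here (not a figure from the text) that happens to reproduce the value of about 1.4 quoted for air.

```python
R = 8.314  # molar gas constant, J K^-1 mol^-1


def molar_heat_capacities(degrees_of_freedom):
    """Return (c_V, c_P, gamma) predicted by equipartition for an ideal gas."""
    c_v = 0.5 * degrees_of_freedom * R   # (1/2)R per active degree of freedom
    c_p = c_v + R                        # ideal gas: c_P = c_V + R
    return c_v, c_p, c_p / c_v


for f in (3, 5, 8):
    c_v, c_p, gamma = molar_heat_capacities(f)
    print(f"f = {f}: c_V = {c_v:5.2f}, c_P = {c_p:5.2f} J/(mol K), gamma = {gamma:.3f}")
```

With f = 3 this gives the monatomic values derived earlier; with all 8 classical degrees of freedom excited it gives $c_V$ = 4R.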
9.3.6 The Law of Dulong and Petit

The focus of this chapter is on gases but the equipartition theory can explain
an empirical law derived by Dulong and Petit. This law states that the molar
heat capacity of a crystalline solid has a value of 3R and this works pretty well
for a large number of solid metallic elements at room temperature.
The explanation is quite straightforward. Imagine an atom in a cubic crystal-
line lattice with bonds parallel to the x-, y- and z- axes.
[Figure: an atom in a cubic lattice with bonds along the x-, y-, and z-axes.]
This atom is free to vibrate along three independent directions. Since each
vibrational mode has two degrees of freedom, the total mean internal energy
per atom is 3×(2× ½ kT) = 3kT. For one mole of atoms the internal energy is
$U = 3N_A kT = 3RT$, so the molar heat capacity is $c = \Delta U/\Delta T = 3R$.
9.3.7 Graham’s Law of Diffusion

Graham’s law of diffusion states that the rate of diffusion is inversely proportional
to the square root of the gas density:
$$\text{Rate of diffusion} \propto \frac{1}{\sqrt{\rho}}$$
This follows from the kinetic theory equation in the form:
$$p = \tfrac{1}{3} \rho \langle v^2 \rangle$$
Diffusion occurs as a result of particles moving randomly and colliding with
one another so it is reasonable to assume that the rate at which this occurs will
depend on the root-mean-square (rms) speed $v_{rms} = \sqrt{\langle v^2 \rangle}$:
$$\text{Rate} \propto v_{rms} = \sqrt{\frac{3p}{\rho}} \propto \frac{1}{\sqrt{\rho}}$$
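As a rough illustration of Graham’s law, the sketch below compares the diffusion rates of hydrogen and oxygen. It assumes the two gases are at the same temperature and pressure, so that their densities are in the same ratio as their molecular masses (taken from the table in Section 9.3.3); the function name is invented for this example.

```python
import math


def diffusion_rate_ratio(density_a, density_b):
    """Rate of diffusion of gas A divided by that of gas B (Graham's law)."""
    return math.sqrt(density_b / density_a)


# At the same temperature and pressure, density is proportional to molecular mass,
# so the molecular masses (kg) can stand in for the densities.
MASS_HYDROGEN = 3.34e-27
MASS_OXYGEN = 5.32e-26

ratio = diffusion_rate_ratio(MASS_HYDROGEN, MASS_OXYGEN)
print(f"Hydrogen diffuses about {ratio:.1f} times faster than oxygen")
```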
9.3.8 The Speed of Sound in a Gas

Sound consists of longitudinal vibrations in a material medium. In a gas,
the speed at which these disturbances move through the material is directly
related to the speed at which the molecules themselves move so it is not sur-
prising to find that the speed of sound in a gas is directly proportional to the
rms speed of the molecules. In fact, the speed of sound c in an ideal gas is
given by the equation:
$$c = \sqrt{\frac{\gamma p}{\rho}}$$
where $\gamma$ is the adiabatic constant (= 1.67 for a monatomic gas and 1.4 for a
diatomic gas). The adiabatic constant is the ratio of the specific heat of a gas
at constant pressure to the specific heat at constant volume.
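The equation can be checked against everyday experience. The sketch below estimates the speed of sound in air, assuming illustrative values that are not taken from the text: a pressure of 1.013 × 10⁵ Pa, a density of 1.2 kg m⁻³, and $\gamma$ = 1.4. It also compares the result with the rms molecular speed obtained from $p = \tfrac{1}{3}\rho\langle v^2\rangle$.

```python
import math


def speed_of_sound(gamma, pressure_pa, density_kg_m3):
    """Speed of sound (m/s) in an ideal gas: c = sqrt(gamma * p / rho)."""
    return math.sqrt(gamma * pressure_pa / density_kg_m3)


def rms_speed_from_pressure(pressure_pa, density_kg_m3):
    """rms molecular speed (m/s) from p = (1/3) * rho * <v^2>."""
    return math.sqrt(3.0 * pressure_pa / density_kg_m3)


# Assumed values for air near room temperature
GAMMA_AIR = 1.4
PRESSURE = 1.013e5   # Pa
DENSITY = 1.2        # kg m^-3

c_air = speed_of_sound(GAMMA_AIR, PRESSURE, DENSITY)
v_rms = rms_speed_from_pressure(PRESSURE, DENSITY)
print(f"speed of sound = {c_air:.0f} m/s, rms speed = {v_rms:.0f} m/s")
print(f"ratio c / v_rms = {c_air / v_rms:.2f}")
```

The ratio is $\sqrt{\gamma/3}$, which is why the speed of sound is somewhat lower than, but proportional to, the rms speed.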
9.4 THE MAXWELL DISTRIBUTION

While $v_{rms}$ gives a good idea of typical molecular speeds in a gas, the constant
random motion and collisions result in a distribution of speeds from zero
up to very high values. Maxwell derived an expression for the fraction of
molecules $\Delta N/N$ lying within a small range of speeds from v to v + Δv:
The Maxwell distribution: $$\frac{\Delta N}{N} = A v^2 e^{-E/kT}\,\Delta v$$
where A is a constant for a particular gas and E is the kinetic energy $\tfrac{1}{2}mv^2$ of
a molecule moving with speed v.
The derivation of this result is beyond the scope of this book, but we will be
concerned with the qualitative implications of the distribution, which are use-
ful to understand a wide range of thermodynamic phenomena.
The graph below illustrates the main features of the Maxwell speed distribution.
[Figure: fraction of molecules between v and v + Δv against molecular speed, with $v_p$, $v_m$, and $v_{rms}$ marked.]
$v_p$ is the most probable speed. This means that there are more molecules
with speeds between $v_p$ and $v_p + \Delta v$ than in any other similar-sized range.
$v_m$ is the mean speed ($v_m = 1.13\,v_p$).
$v_{rms}$ is the root-mean-squared speed ($v_{rms} = \sqrt{\langle v^2 \rangle} = 1.22\,v_p$).
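The relationships between the three characteristic speeds can be checked numerically. The sketch below relies on standard closed-form results for the Maxwell distribution that the text does not derive: $v_p = \sqrt{2kT/m}$, $v_m = \sqrt{8kT/\pi m}$, and $v_{rms} = \sqrt{3kT/m}$. The nitrogen molecular mass and the temperature of 293 K are illustrative choices.

```python
import math

K_BOLTZMANN = 1.38064852e-23  # J K^-1


def most_probable_speed(mass_kg, temperature_k):
    """Speed at the peak of the Maxwell distribution (standard result)."""
    return math.sqrt(2.0 * K_BOLTZMANN * temperature_k / mass_kg)


def mean_speed(mass_kg, temperature_k):
    """Mean speed of the Maxwell distribution (standard result)."""
    return math.sqrt(8.0 * K_BOLTZMANN * temperature_k / (math.pi * mass_kg))


def rms_speed(mass_kg, temperature_k):
    """Root-mean-squared speed, sqrt(3kT/m)."""
    return math.sqrt(3.0 * K_BOLTZMANN * temperature_k / mass_kg)


MASS_NITROGEN = 4.66e-26  # kg, illustrative choice
T = 293.0                 # K

v_p = most_probable_speed(MASS_NITROGEN, T)
v_m = mean_speed(MASS_NITROGEN, T)
v_rms = rms_speed(MASS_NITROGEN, T)
print(f"v_p = {v_p:.0f} m/s, v_m = {v_m:.0f} m/s, v_rms = {v_rms:.0f} m/s")
print(f"v_m / v_p = {v_m / v_p:.3f}, v_rms / v_p = {v_rms / v_p:.3f}")
```

The ratios do not depend on the gas or the temperature, because each speed scales as $\sqrt{kT/m}$.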
As the temperature of the gas increases the peak moves to the right and falls.
The number of molecules with higher speeds increases:
[Figure: fraction of molecules between v and v + Δv against molecular speed at two temperatures, with the higher-temperature curve labeled.]
The area under each graph is the same because the total number of molecules
has not changed.
9.5 THE BOLTZMANN FACTOR AND ACTIVATION PROCESSES

For particles in equilibrium at temperature T with a Maxwell distribution of
speeds, the probability of a particular particle gaining additional energy ΔE is
proportional to the Boltzmann factor:
$$\text{Boltzmann factor} = f = e^{-\Delta E / kT}$$
This is equal to the ratio of the number of particles in the higher energy (E + ΔE)
state to the number of particles in the lower energy state (E):
$$f = \frac{\text{number of particles with energy } E + \Delta E}{\text{number of particles with energy } E}$$
Many physical and chemical processes depend on a fraction of particles gaining
an “activation energy” $E_A$. Evaporation is a good example. A molecule with
the mean energy does not have enough energy to escape from the surface but
molecules close to the high-speed end of the distribution can. The greater this
fraction the greater the rate of evaporation.
The fraction of molecules with at least the activation energy $E_A$ is given by
the Boltzmann factor $f = e^{-E_A/kT}$. Since the rate of evaporation is directly
proportional to this fraction we have:
$$\text{Rate of evaporation} \propto e^{-E_A/kT}$$
Other activation processes include chemical reactions, conductivity of a
semiconductor, viscous flow, and creep. In all of these cases, we would expect the
rate R to be directly proportional to the Boltzmann factor:
$$\text{Rate of activation process } R \propto e^{-E_A/kT}$$
If this is the case then a graph of ln(R) against 1/T should be a straight line
with a negative gradient (equal to $-E_A/k$) and positive y-axis intercept:
[Figure: graph of ln (R) against 1/T.]
[Figure: fraction of molecules between v and v + Δv against molecular speed. Molecules in the shaded region, above the minimum speed needed to escape, can escape from the liquid surface.]
In chemistry, a graph of ln (reaction rate) against 1/(temperature) can be used
to determine rate constants.
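The straight-line analysis described above can be sketched as follows. This is an illustration under stated assumptions, not an analysis of real data: the rates are synthetic, generated from a hypothetical activation energy and prefactor, and a simple least-squares fit of ln(R) against 1/T is used to recover the activation energy from the gradient, which for a rate proportional to the Boltzmann factor is minus the activation energy divided by k.

```python
import math

K_BOLTZMANN = 1.38064852e-23  # J K^-1


def linear_fit(xs, ys):
    """Least-squares gradient and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


def activation_energy(temperatures_k, rates):
    """Estimate the activation energy (J) from the gradient of ln(R) against 1/T."""
    xs = [1.0 / t for t in temperatures_k]
    ys = [math.log(r) for r in rates]
    gradient, _ = linear_fit(xs, ys)
    return -gradient * K_BOLTZMANN


# Synthetic data generated from a hypothetical activation energy and prefactor
TRUE_EA = 8.0e-20  # J
temperatures = [280.0, 300.0, 320.0, 340.0, 360.0]
rates = [1.0e6 * math.exp(-TRUE_EA / (K_BOLTZMANN * t)) for t in temperatures]

estimated_ea = activation_energy(temperatures, rates)
print(f"Estimated activation energy: {estimated_ea:.3e} J")
```

With real measurements the points would scatter about the line, and the quality of the fit would indicate how well the process follows the Boltzmann factor.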
9.6 THE FIRST LAW OF THERMODYNAMICS

9.6.1 Internal Energy
The internal energy U of a thermodynamic system is the sum of the
random thermal kinetic energies and potential energies of all particles
in the system.
For particles in an ideal gas, there are no long-range interactions, so the
potential energy term is zero and the internal energy is simply equal to the
sum of the kinetic energies:
Internal energy of an ideal gas: $$U = \text{Total KE} = \tfrac{1}{2} n N_A m \langle v^2 \rangle = \tfrac{3}{2} nRT$$
The fact that, for a fixed amount of gas, this depends only on the abso-
lute temperature of the gas has already been noted. Therefore the internal
energy of the gas is constant for all isothermal changes (changes at constant
temperature).
9.6.2 Heating, Working, and the First Law of Thermodynamics

Consider a system with total internal energy U. The internal energy can be
changed if the system is heated or if work is done on the system.
Heating: energy transfer as a result of a temperature difference (e.g., plac-
ing the system in thermal contact with another system at a different tem-
perature).
Working: energy transfer as a result of the movement of an applied force
(e.g., compressing the system).
The first law of thermodynamics is a statement of energy conservation for the
system and its surroundings.
First law of thermodynamics: The change in internal energy of the system
ΔU is the difference between the heat supplied to the system Q and the
work done by the system W:
$$\Delta U = Q - W$$
Note that we have defined W as positive when the system does external work.
Sometimes W is defined as work done on the system, in which case the sign of
W in the equation above changes.
9.6.3 Work Done by an Ideal Gas

Consider an ideal gas expanding against a piston. The gas is at temperature T,
has pressure p, and expands by moving the piston a small distance dx. The
cross-sectional area of the cylinder is A:
[Figure: expanding gas exerts a force F on a piston, which moves outwards through a distance x; work is done by the gas.]
Work done by gas as piston moves distance dx: $dW = F\,dx = pA\,dx = p\,dV$
where dV is the small increase in volume when the piston moves a distance dx.
Work done as piston moves from x = x1 to x = x2: $W = \int_{V_1}^{V_2} p\,dV$
where $V_1$ and $V_2$ are the gas volumes at x1 and x2.
This is also equal to the area under a graph of pressure against volume.
If the pressure is constant the work done is simply: $W = p\Delta V$
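The "area under the graph" idea can be demonstrated by numerical integration. The sketch below estimates the work with the trapezium rule for two cases: a constant pressure, and an isothermal expansion of an ideal gas where p = nRT/V. The amounts, temperature, pressure, and volumes are hypothetical, and the closed-form isothermal result nRT ln(V2/V1), a standard result not derived in the text, is used only as a check.

```python
import math

R = 8.314  # molar gas constant, J K^-1 mol^-1


def work_done(pressure_of_volume, v1, v2, steps=10000):
    """Trapezium-rule estimate of the area under p against V from v1 to v2."""
    dv = (v2 - v1) / steps
    total = 0.0
    for i in range(steps):
        va = v1 + i * dv
        vb = va + dv
        total += 0.5 * (pressure_of_volume(va) + pressure_of_volume(vb)) * dv
    return total


# Hypothetical example: 1.0 mol of ideal gas expanding from 0.010 m^3 to 0.020 m^3
n, T = 1.0, 300.0
V1, V2 = 0.010, 0.020

w_isothermal = work_done(lambda v: n * R * T / v, V1, V2)
w_constant_p = work_done(lambda v: 1.0e5, V1, V2)   # constant 1.0e5 Pa

print(f"isothermal: {w_isothermal:.1f} J (check: {n * R * T * math.log(V2 / V1):.1f} J)")
print(f"constant pressure: {w_constant_p:.1f} J (p * dV = {1.0e5 * (V2 - V1):.1f} J)")
```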
9.6.4 Thermodynamic Changes

Isothermal Changes
An isothermal change is one in which there is no change of temperature.
For an ideal gas, the internal energy is also unchanged so ΔU = 0. Therefore
Q − W = 0 and Q = W: the heat supplied to the system must be equal to the
work done by the system, or if work is done on the system an equal amount of
heat must flow out of the system. An example of this is a slow compression of
a gas that is in thermal contact with its surroundings:
[Figure: a piston is pushed slowly into the gas; work is done on the gas and heat flows to the surroundings.]
Isothermal compression of an ideal gas obeys Boyle’s law: pV = constant.
Adiabatic Changes
In an adiabatic change no heat flows into or out of the system so Q = 0. This
is often because the change takes place very rapidly and there is not enough
time for a significant heat transfer. Using the first law of thermodynamics we
can see that, when Q = 0, ΔU = −W, so the increase of internal energy is equal
to the work done on the system. An example of this would be a rapid compres-
sion of gas inside an insulated cylinder. Since the internal energy increases the
temperature of the system also increases.
Boyle’s law does not apply when an ideal gas is compressed adiabatically
(because the temperature is not constant). However, adiabatic compressions
obey another rule:
$$pV^{\gamma} = \text{constant}$$
where $\gamma$ is the adiabatic gas constant ($\gamma$ = 5/3 for an ideal monatomic gas and about 1.4 for air).
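To see how strongly an adiabatic compression heats a gas, the sketch below applies the adiabatic rule to find the final pressure and then uses the ideal gas relation pV/T = constant to find the final temperature. The starting conditions, the compression ratio of 10, and the value 1.4 for the adiabatic constant are illustrative assumptions.

```python
def adiabatic_final_state(p1, v1, t1, v2, gamma):
    """Final pressure (Pa) and temperature (K) after an adiabatic volume change."""
    p2 = p1 * (v1 / v2) ** gamma          # p * V^gamma = constant
    t2 = t1 * (p2 * v2) / (p1 * v1)       # ideal gas: pV / T = constant
    return p2, t2


# Hypothetical example: air (gamma assumed 1.4) compressed to one tenth of its volume
p2, t2 = adiabatic_final_state(p1=1.0e5, v1=1.0e-3, t1=293.0, v2=1.0e-4, gamma=1.4)
print(f"final pressure = {p2:.3g} Pa, final temperature = {t2:.0f} K")

# For comparison, an isothermal compression (Boyle's law) would give 10 * p1
print(f"isothermal final pressure would be {1.0e5 * 10:.3g} Pa")
```

The adiabatic pressure rise is larger than the isothermal one because the temperature of the gas also rises.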
Isochoric Changes
An isochoric change is one that takes place with no change in the volume of
the system. If the volume is constant there is no work done on or by the system,
so W = 0 and ΔU = Q. The change in internal energy is equal to the heat
supplied to the system. An example of an (approximately) isochoric change is
the explosive combustion of a petrol and air mixture inside the cylinder of a
petrol engine.
Isobaric Changes
An isobaric change is one that takes place at constant pressure. For a fixed
amount of an ideal gas Charles’s law is obeyed: V/T = constant. An example
of an (approximately) isobaric change is the expansion of a diesel air mixture
as fuel is injected and ignited inside the cylinder of a diesel engine. Another
example is the change of state of a liquid to a vapor at constant temperature
and pressure. The vapor occupies a much larger volume than the liquid and
as it expands against external pressure it does work (e.g., pushing back the
atmosphere):
When an ideal gas expands isobarically the work done is given by: $W = p\Delta V$
9.7 HEAT ENGINES AND INDICATOR DIAGRAMS

9.7.1 What Is a Heat Engine?
A heat engine uses thermal energy to do useful work, dumping waste heat
in the surroundings. Petrol and diesel engines are examples, and so are ther-
mal power stations and steam engines. Heat engines work between a high-
temperature source and a low-temperature sink and the work they do is
effectively diverted from the flow of heat between these two reservoirs. This
is shown in the diagram below:
[Figure: energy flow in a heat engine. Heat Q1 is extracted from the source at the higher temperature T1, the heat engine does work W, and heat Q2 is dumped into the sink at the lower temperature T2.]
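The efficiency of a heat engine is given by:
$$\text{efficiency} = \frac{\text{useful work output}}{\text{total heat input}} = \frac{W}{Q_1} = \frac{Q_1 - Q_2}{Q_1} = 1 - \frac{Q_2}{Q_1}$$
This fraction is multiplied by 100% to express the efficiency as a percentage.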
The second law of thermodynamics states that it is impossible to convert heat

to work with 100% efficiency (see Section 10.2.3). The French engineer Sadi
Carnot analyzed the cycle of thermodynamic changes (the Carnot cycle) in an
ideal heat engine and showed that it cannot exceed a maximum theoretical
efficiency determined by the temperatures of the hot and cold reservoirs:
$$\text{efficiency} \leq 1 - \frac{T_2}{T_1}$$
To maximize the theoretical efficiency, we need a high source temperature

and a low sink temperature.
9.7.2 Indicator Diagrams

Heat engines using a working fluid (e.g., steam or burnt gases) take that fluid
around a repeated cycle of changes. These changes might involve heating or
cooling, expansion or compression and work might be done on or by the fluid.
Each stage in the cycle can be plotted on an indicator diagram which is a
graph of pressure against volume. This helps us understand the changes in
state and energy transfers that take place inside the heat engine.
Here is an example of a simplified indicator diagram. Real heat engines have
more complex cycles.
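[Figure: simplified indicator diagram (pressure against volume). The cycle is a rectangle with corners A (V1, P1), B (V2, P1), C (V2, P2), and D (V1, P2), where V2 < V1 and P1 < P2.]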
AB: The fluid is compressed at constant pressure. Work must be done by an

external agent. The work done on the fluid is equal to the area under the line
AB. The work done on the fluid is W1.
BC: The fluid pressure increases rapidly at constant volume. This could be as
a result of sudden heating (e.g., ignition of a fuel–air mixture). The fact that
the volume remains constant means that there is no work done on or by the
fluid. The fluid gains heat Q1.
CD: The fluid expands at constant pressure and so does work against external
forces. The work done is equal to the area under the line CD. The work done
by the fluid is W2 so the net work done by the engine in one cycle of changes
is W = W2 - W1. This is equal to the area contained inside the loop on the
indicator diagram.
DA: The fluid loses pressure at constant volume. This could be the result of a
sudden transfer of heat to the surroundings (e.g., as a result of exhaust). The
fluid loses heat Q2.
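The bookkeeping for the rectangular cycle described above can be sketched in a few lines of Python. The function name and the numerical values are assumptions chosen only for illustration; the point is that the net work W2 - W1 equals the area enclosed by the loop.

```python
def rectangular_cycle(p1, p2, v1, v2):
    """Work terms for the cycle A(V1, P1) -> B(V2, P1) -> C(V2, P2) -> D(V1, P2) -> A.

    Pressures in Pa and volumes in m^3, with p2 > p1 and v1 > v2.
    """
    w_on = p1 * (v1 - v2)              # AB: area under AB, work done on the fluid (W1)
    w_by = p2 * (v1 - v2)              # CD: area under CD, work done by the fluid (W2)
    net = w_by - w_on                  # W = W2 - W1
    loop_area = (p2 - p1) * (v1 - v2)  # area enclosed by the rectangle
    return w_on, w_by, net, loop_area


# Illustrative values (assumed, not taken from the text)
w1, w2, net, area = rectangular_cycle(p1=1.0e5, p2=3.0e5, v1=5.0e-4, v2=1.0e-4)
print(f"W1 = {w1:.0f} J, W2 = {w2:.0f} J, net W = {net:.0f} J, loop area = {area:.0f} J")
```

No work appears for BC or DA because the volume does not change during those stages.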
If the working fluid is an ideal gas, then the gas laws and the first law of ther-
modynamics can be used to analyze each stage in the cycle in order to cal-
culate the efficiency of the heat engine. While real heat engines do not have
simple indicator diagrams there are good approximations for the way in which
certain heat engines work. For example, the Otto cycle is used to model the
behavior of spark ignition petrol engines and the Diesel cycle is used to model
the behavior of a compression ignition engine such as a diesel engine.
9.7.3 The Otto Cycle

This cycle models the behavior of the working fluid inside one of the cylinders
of a typical four-stroke petrol engine.
[Figure: cross-section of one cylinder of a petrol engine, showing the inlet valve, exhaust valve, spark plug, cylinder head, working fluid, piston, cylinder block, connecting rod, and crankshaft.]
Intake stroke: the inlet valve opens and an explosive mixture of petrol and
air is drawn into the cylinder.
Compression stroke: the mixture is compressed.
Power stroke: following ignition the hot burnt gases push the piston back.
Exhaust stroke: the exhaust valve opens and burnt gases leave the cylinder.
After this the exhaust valve closes and the cycle begins again. In reality, the
fluid inside the cylinder does not stay the same from cycle to cycle: it is drawn
in at the start of the cycle and ejected as exhaust at the end of the cycle.
The diagram shows an idealized indicator diagram for the Otto cycle.
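[Figure: idealized indicator diagram for the Otto cycle (pressure against volume). The loop ABCD has A at (V1, pA), B at (V2, pB), C at (V2, pC), and D at (V1, pD), with V2 < V1. The line OA at pressure pA represents the intake and exhaust strokes.]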
AB: Adiabatic compression of gas (work done on gas but no heat transfer).
Compression stroke.
BC: Isochoric heating of gas (heat is supplied as fuel–air mixture explodes
but no work is done).
CD: Adiabatic expansion of gas (work is done by the gas but no heat is
transferred). Power stroke.
DA: Isochoric cooling of gas (heat is lost from the gas but no work is done
by it).
AB corresponds to the compression stroke and CD to the power stroke. The
closed loop OAO represents the intake and exhaust strokes. The area con-
tained by the loop ABCD represents the net work done in one cycle.
Analysis of the Otto Cycle

The theoretical efficiency of a heat engine is given by:
$$\text{efficiency} \leq 1 - \frac{T_2}{T_1}$$
where T2 is the sink temperature and T1 is the source temperature. In the
idealized diagram above T1 = TC and T2 = TD so the theoretical efficiency of
the Otto cycle is:
$$\text{efficiency} \leq 1 - \frac{T_D}{T_C}$$
This can be linked to the “compression ratio”, $V_1/V_2$, by using the equation of
state for an ideal gas and the relation $pV^{\gamma} = \text{constant}$ for adiabatic changes.
We need to find an expression for $T_D/T_C$.
Consider the power stroke CD. This is an adiabatic change so:
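$$p_C V_2^{\gamma} = p_D V_1^{\gamma} \qquad (1)$$
The ideal gas equation also applies:
$$\frac{p_C V_2}{T_C} = \frac{p_D V_1}{T_D} \qquad (2)$$
Combining these equations:
$$\frac{T_D}{T_C} = \left(\frac{p_D}{p_C}\right)\left(\frac{V_1}{V_2}\right) = \left(\frac{V_2}{V_1}\right)^{\gamma}\left(\frac{V_1}{V_2}\right) = \left(\frac{V_2}{V_1}\right)^{\gamma - 1}$$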
The theoretical efficiency of the Otto cycle is therefore:
$$\text{efficiency} \leq 1 - \left(\frac{V_2}{V_1}\right)^{\gamma - 1}$$
Using a typical compression ratio $V_1/V_2 = 10$ and taking γ = 1.4 gives a maximum
theoretical efficiency of $(1 - 0.10^{0.4}) = 0.60$ (or 60%).
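The arithmetic can be checked with a short Python sketch. The function name and the list of compression ratios are assumptions made for illustration; only the ratio of 10 and γ = 1.4 come from the text.

```python
def otto_efficiency(compression_ratio, gamma=1.4):
    """Ideal Otto-cycle limit 1 - (V2/V1)^(gamma - 1),
    where compression_ratio = V1/V2."""
    return 1.0 - compression_ratio ** (1.0 - gamma)


# Ratios other than 10 are included only to show the trend.
for r in (6, 8, 10, 12):
    print(f"V1/V2 = {r:2d}: maximum theoretical efficiency = {otto_efficiency(r):.2f}")
```

The limit rises with compression ratio, which is one reason a higher ratio is attractive in engine design.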
A real internal combustion petrol engine is only about 25–35% efficient at
transferring thermal energy from the fuel into useful mechanical energy.
Here are some of the ways in which energy is wasted:
• The thermodynamic cycle of a real engine differs from the ideal Otto cycle – valves take time to open and close and changes in pressure and temperature cannot take place instantaneously.
• The intake and exhaust strokes do not form a closed loop and work must be done by the engine during these phases.
• There is heat transfer from the engine to the surroundings (in addition to the heat lost during the exhaust stroke).
• There is friction between moving parts which transfers energy to heat.
9.7.4 The Diesel Cycle

The most important practical difference between a petrol engine and a diesel
engine is that the fuel–air mixture in a petrol engine is ignited by a spark
whereas, in a diesel engine, it is ignited by compression. Diesel engines have
a larger compression ratio than petrol engines (typically 15–20 compared to
7–10) and this contributes to their higher thermodynamic efficiency. The
combustion process is also different. Whereas the petrol–air mixture is ignited
explosively and the pressure and temperature of the working fluid rise almost
instantaneously, in the diesel engine the fuel is sprayed into the cylinder and
ignited so that the first part of the power stroke takes place at almost constant
pressure.
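The idealized cycle is shown below.
[Figure: idealized indicator diagram for the Diesel cycle (pressure against volume). A is at (V1, pA), B at (V2, pB), C at (V3, pB), and D at (V1, pD), with V2 < V3 < V1. The line OA at pressure pA represents the intake and exhaust strokes.]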
AB: Adiabatic compression of air (work done on gas but no heat transfer).
Compression stroke. The fuel has not yet been injected so there is no dan-
ger of pre-ignition (which limits the compression ratio for a petrol engine)
and the compression ratio V1/V2 for a diesel engine can be significantly
higher than for a petrol engine.
BC: Fuel is injected and burnt at constant pressure. Work is done by the
engine. This is the first part of the power stroke.
CD: Adiabatic expansion of gas (work is done by the gas but no heat is
transferred). This completes the power stroke.
DA: Isochoric cooling of gas (heat is lost from the gas but no work is done
by it).
A real diesel engine only approximates the idealized cycle shown above and
additional energy losses occur in much the same way as for a petrol engine.

9.8 EXERCISES
1. 4.0 m³ of an ideal gas at a temperature of 15°C and a pressure of 1.10 × 10⁵ Pa
is compressed isothermally (i.e., at constant temperature) to one quarter
of its original volume.
(a) What will be its final pressure and temperature?

(b) What happens to the internal energy of the gas during this compres-
sion? Explain by referring to work done and heat transferred.
(c) Explain how the first law of thermodynamics applies to an isother-
mal compression (you should say what happens to each term in the
equation).
The gas is then allowed to expand at constant pressure until it again occupies
a volume of 4.0 m³.
(d) What is the final temperature of the gas?
(e) Explain, in terms of molecular motions, why the final temperature is
not 15°C.
2. A bicycle tire contains 9.2 × 10⁻⁴ m³ of air at 20°C. The pressure in the
tire is 480 kPa.
(a) Calculate the number of moles of air in the tire.

(b) After the bicycle has been ridden the pressure inside the tire is
higher. Explain why.
(c) Estimate the final temperature of the air if the final pressure is 500
kPa.
(d) Explain why your answer to (c) can only be approximate.
3. A fixed mass of an ideal gas is trapped in a syringe.
[Figure: a syringe containing trapped gas.]
(a) Sketch a graph to show how the pressure inside the syringe varies
with volume as the gas is slowly* compressed (e.g., as if someone
put their finger over the end of the syringe and slowly pushed
the plunger in). *Assume this is slow enough for the temperature to
remain constant.
(b) On the same graph draw a second line to indicate how you would
expect pressure to change with volume if the plunger is pushed in
quickly.
(c) Explain the difference between the lines in (a) and (b) by referring
to work and internal energy.
4. Air at atmospheric pressure and temperature has a pressure of about 10⁵
Pa and a density of about 1.2 kg m⁻³.
(a) Calculate the rms speed of an air molecule.

(b) Air consists mainly of nitrogen and oxygen and the rms speeds of
oxygen and nitrogen molecules in the atmosphere are different.
Explain why.
5. Suggest why the ideal gas equation will cease to apply:
(a) at very high pressures,

(b) at very high temperatures,
(c) at very low temperatures.
6. The speed of sound in the air is given by the equation:
$$v = \sqrt{\frac{\gamma p}{\rho}}$$
where γ is a dimensionless constant, p is the air pressure, and ρ is the air
density. By treating air as an ideal gas show that the speed of sound in the
air is independent of pressure but directly proportional to the square root
of absolute temperature. (Hint: consider one mole of air and use the ideal
gas equation to relate density to pressure and temperature by introducing
the molar mass M.)
7. Eight molecules in a gas are moving parallel to the x-axis and have
x-components of velocity equal to: 510.0 m/s, −550.0 m/s, −495.0 m/s,
548.0 m/s, −498.0 m/s, 502.0 m/s, 518.0 m/s, −535.0 m/s.
Each molecule has a mass of 5.4 × 10⁻²⁶ kg. Calculate:
(a) the average molecular velocity,

(b) the average molecular speed,
(c) the root mean square molecular velocity,
(d) the mean molecular kinetic energy.
8. An isothermal change takes place at constant temperature. Explain why
isothermal compression results in a flow of heat to the surroundings
exactly equal to the work done in compressing the gas.
9. (a) State the first law of thermodynamics and define all the terms in the
equation.
(b) A trapped gas inside a cylinder is heated. It expands isothermally (at
constant temperature). The heat supplied is H.
i. State the change in internal energy of the gas.
ii. State the work done by the gas as it expands.
10. The Boltzmann factor is given by: $e^{-\Delta E / kT}$
(a) Define the terms that appear in the expression and explain the meaning
of the Boltzmann factor.
(b) Sketch graphs to show how the Boltzmann factor varies with (i) T,
(ii) ΔE, and (iii) ΔE/kT.
(c) Explain why chemical reactions proceed more rapidly at higher tem-
peratures.
11. By what factor, roughly, would you expect the rate of a chemical reaction
to change if the temperature increases by 10 K? (Assume the reaction is
taking place at around 350 K and that the activation energy is about 1 eV.)
12. In human blood oxygen molecules bind to hemoglobin molecules. The
bond energy is about 0.30 eV. Calculate the ratio of free oxygen molecules
to bound oxygen molecules in the blood at body temperature (310 K).
13. (a) Explain why the conductance (G = 1/R) of a semiconductor NTC thermistor
increases with temperature. Your answer should refer to the Boltzmann
factor.
(b) A semiconductor NTC thermistor is connected to a source of constant
low voltage V and a current I passes through it. I will vary with
temperature. Explain why a graph of ln I versus 1/T will be a straight
line. What is the gradient of this line?
14. Here is the Otto cycle for an ideal four-stroke petrol engine:
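[Figure: idealized indicator diagram for the Otto cycle (pressure against volume), with points O, A, B, C, and D and volumes V2 and V1 marked, as in Section 9.7.3.]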
Here is some data for the same engine:

mean temperature of gases during combustion stroke    800 °C
mean temperature of exhaust gases                     80 °C
area enclosed by indicator diagram loop               420 J
calorific value of fuel                               45 MJ kg⁻¹
(a) Describe the thermodynamic changes and energy flows taking place
during the stages: AB, BC, CD, and DA on the diagram.
(b) Calculate the thermodynamic efficiency of the engine.
(c) Calculate the rate of burning fuel (kg s⁻¹) when the engine, which has
4 cylinders, is rotating at 1500 rpm.
CHAPTER 10
Statistical Thermodynamics and the Second Law
10.0 INTRODUCTION
The gas laws and the ideal gas equation are all derived empirically, that is from
experimental data. They give a good mathematical description of the behavior
of an ideal gas but they offer no explanation of why the gas behaves in this
way. The reason for this is that they deal with macroscopic properties such as
volume, temperature, and pressure and do not engage with the microscopic
behavior of the molecules that constitute the gas. Kinetic theory (see Section
9.3) provides a more fundamental explanation of the behavior of an ideal gas,
despite being based on the random motion of particles. Statistical thermody-
namics takes this further and provides one of the most important and enig-
matic of all physical laws, the second law of thermodynamics.
10.1 REVERSIBLE AND IRREVERSIBLE PROCESSES

Consider a dark blue ink droplet dropped into a glass of water. For a while,
the dark ink forms recognizable and distinct patterns in the water but as time
goes on these become less and less distinct and after an hour or so the droplet
has disappeared and the water appears uniformly pale blue. The particles of
the ink have spread more or less uniformly throughout the water. And so it
remains.
You could wait a thousand years and the water would continue to appear uni-
formly pale blue. It is as if, once the ink has spread out, nothing further hap-
pens. In one sense this is true. Macroscopically the ink droplet has spread
uniformly throughout the water and that is the end of it. However, on the
microscopic scale, particles of water and ink are continually moving, colliding,
and changing positions. These microscopic collisions are all governed by

Newton’s laws and are completely reversible. This means that if you were to
analyze a particular collision and then run it backward, the reversed colli-
sion would be a perfectly acceptable physical process. In fact, if you were
presented with videos of a single collision running forward and backward you
would be unable to tell which one is the one that goes forwards in time. Both
would obey Newton’s laws and conserve energy, momentum, etc.
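[Figure: a single collision shown twice, with the direction of time marked on each version, one running forward and one running backward.]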
Newton’s laws are reversible. So are the laws of electromagnetism and gravity.
They do not distinguish between the past and the future. However, macro-
scopic processes are almost always irreversible. The pale blue water does not
resolve itself back into separate clear water and a dark drop, a broken glass
remains broken, a scrambled egg remains scrambled and, more personally, we
remember the past but not the future and we all eventually grow old and die.
This raises two very important questions.
1. If the underlying laws of physics are fundamentally reversible then how

do we explain the irreversibility of the macroscopic world?
2. What distinguishes the past from the future?
Another way to pose this question is in terms of an “arrow of time” that points
from the past to the future. Why is there an arrow of time and what deter-
mines its direction?
These questions about irreversible processes first emerged in the context of
thermodynamics when physicists tried to understand how to maximize the
efficiency of heat engines. This led them to the second law of thermodynam-
ics, one of the most important ideas in physics.
10.2 THE SECOND LAW OF THERMODYNAMICS AS A MACROSCOPIC PRINCIPLE
10.2.1 Macroscopic Statements of the Second Law
Heat engines are used to extract work from heat. If it were possible to do this
with 100% efficiency then we could build power stations that extract energy
directly from the oceans. This would not violate the law of conservation of
energy; it would simply lower the temperature of some seawater by a few
degrees and the sun would soon re-warm the water. Our energy problems
would be solved.
[Figure: a hypothetical 100% efficient heat engine. Solar heating keeps the ocean at 17 °C (290 K); the engine takes in seawater at 17 °C (290 K) and returns it at 16 °C (289 K).]
While pilot heat engines have shown that this process is physically viable there
is a fundamental limit to the possible efficiency of any heat engine. This is stated
qualitatively in a macroscopic version of the second law of thermodynamics:
It is impossible to construct a heat engine that, operating in a cycle, can
extract heat from a source and transfer it completely into work.
This implies that any heat engine MUST dump some waste heat into the envi-
ronment. It cannot be 100% efficient.
The work generated by a heat engine comes as a by-product of the transfer of
heat from the hot source to the cooler sink. If heat could flow back from the
sink to the source, we could continue to operate the heat engine and increase
its efficiency. However, heat only flows spontaneously from systems at higher
temperatures to systems at lower temperatures, so this is not possible. This
leads to a second, equivalent, statement of the second law of thermodynamics.
It is impossible for heat to flow from a cooler to a hotter body with no
other change taking place.
This is a matter of common experience. If you leave a cup of hot coffee on
the table it will soon transfer heat to its surroundings until it reaches the same
temperature as the room. However long you wait it does not reheat itself. This
seems similar to the example of the ink droplet in water: it is another example
of an irreversible macroscopic process. Once again the macroscopic one-way
process of approaching thermal equilibrium is actually driven by random
microscopic collisions that continue to take place after a uniform equilibrium
temperature has been reached.
In order to explain these one-way irreversible changes physicists introduced
a new quantity, “entropy.” The second law can then be stated in terms of
entropy:
The entropy of an isolated system tends to a maximum.
But what is entropy?
10.2.2 Heat Transfer and Entropy

In macroscopic terms, the entropy change of a system is related to the heat Q
reversibly supplied to it divided by the temperature at which that heat is sup-
plied. This can be written as an integral:
$$\Delta S = \int_{\text{state A}}^{\text{state B}} \frac{dQ}{T}$$
For heat supplied at a constant temperature, this becomes simply ΔS = Q/T.

The best way to get a feel for this new quantity is to see how it applies to the
examples above. We will start with heat flow from hot to cold. Consider two
systems A and B at temperatures T1 and T2, respectively. If these are placed in
thermal contact then they form an isolated system and the total entropy of the
system must either increase or stay the same. However, molecular collisions at
the boundary between the two systems will result in a flow of heat. How does
the transfer of heat dQ from A to B affect the entropy?
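[Figure: two systems in thermal contact, A at temperature T1 and B at temperature T2, with heat Q flowing from A to B.]
$$dS_A = -\frac{dQ}{T_1}$$
$$dS_B = +\frac{dQ}{T_2}$$
$$dS_{\text{system}} = -\frac{dQ}{T_1} + \frac{dQ}{T_2} \geq 0 \quad \text{(second law of thermodynamics)}$$
$$\text{therefore: } \frac{dQ}{T_2} \geq \frac{dQ}{T_1} \quad \text{so} \quad T_1 \geq T_2$$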
If entropy cannot decrease then either T1 > T2 and heat flows from hot to cold
or T1 = T2 and the systems are in thermal equilibrium with maximum entropy.
This shows that the second law expressed in terms of entropy is in agreement
with the statement about the direction of heat transfer. Heat cannot, in isola-
tion, flow spontaneously from a colder to a hotter body because this would
decrease the entropy of the system.
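A small numerical sketch makes the sign argument concrete. The function name, the temperatures, and the amount of heat are assumptions chosen for illustration, and both bodies are assumed large enough that their temperatures stay effectively constant during the transfer.

```python
def entropy_change(q, t_hot, t_cold):
    """Total entropy change (J/K) when heat q (J) leaves a body at t_hot (K)
    and enters a body at t_cold (K). A negative q represents heat flowing
    the other way, from the colder body to the hotter one."""
    ds_hot = -q / t_hot    # the body that loses heat
    ds_cold = q / t_cold   # the body that gains heat
    return ds_hot + ds_cold


# Illustrative values (assumed): 100 J exchanged between bodies at 350 K and 300 K.
forward = entropy_change(q=100.0, t_hot=350.0, t_cold=300.0)
reverse = entropy_change(q=-100.0, t_hot=350.0, t_cold=300.0)
print(f"hot -> cold: total entropy change = {forward:+.4f} J/K")
print(f"cold -> hot: total entropy change = {reverse:+.4f} J/K")
```

The first result is positive and the second negative, so only the hot-to-cold transfer is consistent with the second law.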
10.2.3 Entropy and Maximum Efficiency of a Heat Engine

The other macroscopic statement of the second law referred to the efficiency
of a heat engine. Let us analyze this using the concept of entropy (and assume
that, for the ideal heat engines under consideration, heat is transferred revers-
ibly between the source and the sink).
Work done has no effect on the entropy of the system, so the entropy changes
of the system are:
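[Figure: a heat engine operating between a source at T1 and a sink at T2. The entropy of the source decreases as heat Q1 is extracted, the engine does work W, and the entropy of the sink increases as heat Q2 is dumped.]
$$dS_{\text{source}} = -\frac{Q_1}{T_1}$$
$$dS_{\text{sink}} = +\frac{Q_2}{T_2}$$
$$dS_{\text{system}} = -\frac{Q_1}{T_1} + \frac{Q_2}{T_2} \geq 0 \quad \text{(second law of thermodynamics)}$$
Rearranging this gives:
$$\frac{Q_2}{Q_1} \geq \frac{T_2}{T_1}$$
The efficiency of the engine is:
$$\text{efficiency} = \frac{W}{Q_1} = \frac{Q_1 - Q_2}{Q_1} = 1 - \frac{Q_2}{Q_1}$$
Therefore:
$$\text{efficiency} \leq 1 - \frac{T_2}{T_1}$$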
This result was first derived by a French engineer called Sadi Carnot based
on the operation of an ideal reversible heat engine. It sets a limit to the theo-
retical efficiency of any heat engine and shows that the theoretical efficiency
increases as the ratio of T1 to T2 increases. This is in agreement with our
previous macroscopic statement of the second law and also explains why our
seawater heat engine was doomed to failure. While a heat engine working
on those principles could be constructed, if it worked between a source tem-
perature of 290 K and a sink temperature of 289 K its maximum theoretical
efficiency would be (1 - 289/290) = 0.0034 or 0.34%, and this is before taking
into account the unavoidable additional losses caused by friction, etc. For a
petrol engine, the source temperature (of the hot gases following ignition) is
about 1000°C (about 1300 K), and the sink temperature at which the gases
are exhausted is about 500°C (about 800 K) giving a maximum theoretical
efficiency of (1 - 800/1300) = 0.38 or 38%.
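The two worked figures above can be reproduced with a minimal Python sketch. The function name and the input check are assumptions added for illustration; the temperatures are the ones quoted in the text.

```python
def carnot_limit(t_source, t_sink):
    """Maximum theoretical efficiency 1 - T2/T1, with temperatures in kelvin."""
    if t_sink <= 0 or t_source < t_sink:
        raise ValueError("need t_source >= t_sink > 0 (kelvin)")
    return 1.0 - t_sink / t_source


print(f"seawater engine (290 K, 289 K):  {carnot_limit(290.0, 289.0):.4f}")
print(f"petrol engine (1300 K, 800 K):   {carnot_limit(1300.0, 800.0):.2f}")
```

Temperatures must be absolute: using Celsius values in the ratio would give a different and meaningless answer.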
10.3 ENTROPY AND NUMBER OF WAYS

10.3.1 Macro-state and Micro-states
To understand what entropy actually represents we will need to consider the
relationship between macroscopic states and the microscopic configurations
that make them up. When we look at a system from the outside (macroscopi-
cally) we do not see its microscopic configuration. The uniformly pale blue
water left after an ink droplet has spread and diffused throughout the volume
might be realized by a very large number of different microscopic arrange-
ments of water and ink particles, all of which appear, from the outside, to be
uniformly spread. The key point is this: the same macro-state may correspond
to a large number of micro-states. This means that a stable and unchanging
macroscopic state does not necessarily imply that the microscopic configura-
tions that make up that state are also stable and unchanging.
Here is a simple model to illustrate what is going on. Imagine a box that
contains a large number N of identical gas particles that are initially held in
one half of the box by a barrier. Then the barrier is suddenly removed and the
particles are free to explore the whole box. For a short while an imbalance will
persist but very soon the particles will spread uniformly throughout the box
and they will not go back onto one side of it. This is an irreversible process
similar to the mixing of ink and water or the transfer of heat from a hotter to
a colder body. There is a clear arrow of time from the asymmetric state (all
particles on one side of the barrier) to the symmetric state (particles spread
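evenly throughout the container).
[Figure: a box with all the particles held in one half by a barrier; after the barrier is removed the particles spread throughout the box.]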
To understand why this is irreversible we need to consider the microstates of

the system. Assume that, after the barrier has been removed, each particle has
a 50% chance of being on either side of the barrier. This seems reasonable
since the particles move randomly. What is the probability that all N particles
are found on the left-hand side of the barrier? It is $(1/2)^N$, which is vanishingly
small for large N. This is because there is just one way in which this can occur –
just one microstate of the system which has all particles on the left (ignoring
rearrangements within the N particles themselves), in the same way that, if
you toss a coin N times, there is just one way that all N tosses can land heads.
Now consider the probability that (N - 1) particles are on the left of the barrier
and just 1 is on the right. There are N possible ways in which this can happen
(each of the particles could be the one that is on the right) so the probability is
N times larger. For a distribution with (N − 2):2 the probability is much larger
again (by a factor of ½(N − 1)). This is because there are $\frac{N(N-1)}{2!}$
ways in which this can occur. In fact, the number of ways continues to increase
up to a distribution with N/2 particles on each side and then falls back again
until there is just one way in which all the particles can be found on the right-
hand side. The “number of ways” we have referred to here is the number of
microstates that represents each macro-state. The final macro-state (a 50:50
distribution) is also the macro-state that can be realized in the largest number
of ways, that is, the macro-state that can be realized in the largest number of
distinct micro-states. The graph below shows how the “number of ways,” W,
changes with the distribution of particles:
[Figure: graph of the number of micro-states W against the macro-state distribution L:R, running from N:0 through N/2:N/2 to 0:N, with a single peak at N/2:N/2.]
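The counting argument can be tabulated directly. The sketch below is an illustration only: the function names and the small value N = 10 are assumptions, and it relies on the same assumption as the text that each particle is equally likely to be on either side.

```python
from math import comb


def number_of_ways(n_total, n_left):
    """Number of micro-states W with n_left of the n_total particles on the left."""
    return comb(n_total, n_left)


def probability(n_total, n_left):
    """Probability of that macro-state: W divided by the 2^N micro-states in total."""
    return number_of_ways(n_total, n_left) / 2 ** n_total


N = 10  # small illustrative value (assumed)
for n_left in range(N, -1, -1):
    w = number_of_ways(N, n_left)
    print(f"{n_left:2d}:{N - n_left:<2d}  W = {w:3d}  P = {probability(N, n_left):.4f}")
```

The table starts at W = 1 for the N:0 distribution, rises through N and N(N − 1)/2, peaks at the 50:50 distribution, and then falls symmetrically back to 1.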
As N increases, the peak becomes incredibly sharp (tall and narrow) so

that virtually all of the micro-states cluster close to the N/2:N/2 equilib-
rium distribution. Most of the micro-states correspond to macro-states that
are indistinguishable from this 50:50 distribution. Since the particles move
randomly, we can assume that, left alone, the system will explore all of the
microstates available to it. The system we considered started off in a macro-
state which corresponds to only 1 micro-state. As time goes on it explores
other adjacent micro-states, most of which lie closer to the peak because the
number of micro-states increases that way. This means the system tends to
evolve toward the peak. Given the fact that the number of micro-states close
to the peak vastly outweighs all of those even a short distance away from it
then, if we leave the system for any length of time, the overwhelming probability
will be that we will find it in a macro-state close to the peak of the
distribution – that is, close to equilibrium. And once in equilibrium, it is highly
unlikely to fluctuate far from it because the number of micro-states drops
rapidly away from equilibrium. For the systems we usually interact with, N is
enormous (e.g., >10²⁰), so the probability of a large fluctuation from
equilibrium is so small that it can be assumed to be zero.
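How quickly the peak sharpens can be explored numerically. The following sketch is only an illustration: the function name, the 5% tolerance band, and the values of N are assumptions, and N is kept far below realistic particle numbers so that the exact sums stay cheap to compute.

```python
from math import comb


def fraction_near_equilibrium(n_total, tolerance=0.05):
    """Fraction of all 2^N micro-states whose left-hand share of particles
    lies within +/- tolerance of one half."""
    low = (0.5 - tolerance) * n_total
    high = (0.5 + tolerance) * n_total
    ways = sum(comb(n_total, k) for k in range(n_total + 1) if low <= k <= high)
    return ways / 2 ** n_total


for n in (10, 100, 1000, 10000):
    frac = fraction_near_equilibrium(n)
    print(f"N = {n:5d}: fraction of micro-states within 5% of 50:50 = {frac:.6f}")
```

Even for these modest values of N the fraction climbs rapidly toward 1, which is the numerical counterpart of the peak becoming tall and narrow.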
[Figure: the same graph of number of micro-states W against macro-state distribution L:R (N:0, N/2:N/2, 0:N), with two annotations. Away from the peak: a system starting here is more likely to move toward equilibrium because there are more adjacent micro-states closer to equilibrium than farther from it. At the peak: a system at or close to equilibrium will stay there because, after any fluctuation away, there will be many more micro-states closer to equilibrium than farther from it.]
We can now give a more fundamental, microscopic, explanation of the irre-

versible evolution toward equilibrium. It is based on three assumptions.
• Macroscopic systems can be composed of a number (usually a very large number) W of microscopic configurations or micro-states (sometimes called “number of ways”).
• All micro-states of a macroscopic system are equally probable.
• An isolated system will explore all accessible micro-states because of the random thermal motions of its particles.
The consequence is that we expect a system that is prepared in a non-equilibrium
state to evolve toward equilibrium and then stay there. Equilibrium is the
macro-state that corresponds to the largest number of accessible micro-states,
that is, the macro-state for which W has a maximum value.
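The drift toward equilibrium can be watched in a toy simulation. This is a sketch under stated assumptions rather than the model used in the text: the hopping rule (one randomly chosen particle changes side at each step), the function name, the particle number, and the seed are all chosen here purely for illustration.

```python
import random


def simulate_box(n_particles, n_steps, seed=0):
    """Start with every particle on the left. At each step one particle,
    chosen at random, hops to the other side. Returns the fraction of
    particles on the left after each step."""
    rng = random.Random(seed)
    on_left = [True] * n_particles
    n_left = n_particles
    history = []
    for _ in range(n_steps):
        i = rng.randrange(n_particles)
        n_left += -1 if on_left[i] else 1  # update the count for the hop
        on_left[i] = not on_left[i]        # move particle i to the other side
        history.append(n_left / n_particles)
    return history


history = simulate_box(n_particles=1000, n_steps=20000)
for step in (1, 100, 1000, 5000, 20000):
    print(f"after {step:5d} steps: fraction on left = {history[step - 1]:.3f}")
```

Every individual hop is as likely to be reversed as any other, yet the fraction on the left moves steadily toward one half and then only fluctuates slightly around it.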
10.3.2 Entropy and Number of Ways

The idea that systems evolve toward a macroscopic state that can be realized
in the maximum number of microscopic configurations or ways corresponds
to the evolution of a system toward maximum entropy. If we are to explain
irreversibility we will need to link entropy, S, and number of ways, W. The
first person to do this was Ludwig Boltzmann and the equation linking the
two quantities is:
S = k ln W
where k is the Boltzmann constant, and W is the number of micro-states.

In this context, a micro-state is a distinct way of arranging energy and particles
in the system.
The SI unit for entropy is J K⁻¹.
It might seem surprising to see the natural logarithm in this equation. However,
consider joining two systems, each having W configurations, together. The
number of possible configurations of the combined system is not 2W but W².
However, if the original entropy of each system is S then the entropy of the
two systems together must be 2S, so the use of a logarithm reduces this
multiplicative property to one of addition: ln(W²) = 2 ln(W). The logarithmic
function makes entropy an extensive quantity like mass so that the total entropy
of a combination of different systems is equal to the sum of their individual
entropies.
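The additive property is easy to confirm numerically. In the sketch below the function name and the choice of W (the number of ways of splitting 100 particles 50:50) are assumptions made for illustration; the value of k is the SI-defined Boltzmann constant.

```python
from math import comb, isclose, log

K_B = 1.380649e-23  # Boltzmann constant in J/K


def entropy(ways):
    """Boltzmann entropy S = k ln W for a macro-state with `ways` micro-states."""
    return K_B * log(ways)


w = comb(100, 50)            # illustrative number of ways for one system (assumed)
s_single = entropy(w)
s_combined = entropy(w * w)  # two such systems together have W^2 configurations

print(f"S for one system:   {s_single:.3e} J/K")
print(f"S for two systems:  {s_combined:.3e} J/K")
print("combined entropy equals 2S:", isclose(s_combined, 2 * s_single))
```

The number of configurations multiplies while the entropy adds, which is the behavior the logarithm is there to provide.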
If entropy is defined in this way, then the second law, that the entropy of
an isolated system tends to a maximum value, follows simply from probability.
The equilibrium macro-state is the most probable macro-state, that is, the
one that can be realized in the largest number of micro-states. This is the
maximum value of W and therefore maximum entropy. The arrow of time is
the direction from low entropy to high entropy and from small W to large W.
10.3.3 Poincaré Recurrence

The assumption that a system will explore all possible microstates implies
that, if left for a long enough time, it will eventually return to its initial state.
In fact, in an infinite time, we would expect it to return to all configurations
an infinite number of times. This means that while the statistical interpreta-
tion of entropy does explain why it is overwhelmingly likely that a system will
evolve toward maximum entropy it is not impossible for entropy to decrease.
Poincaré pointed out that any particular system will have a characteristic
average time between returns to its initial state. This is called the Poincaré
recurrence time. For the type of macroscopic system that is usually encoun-
tered (e.g., a flask of gas) this is enormous even compared to the age of the
universe, so it is safe to assume that the second law will hold and these systems
will evolve toward states of maximum entropy.
10.4 WHAT IS TEMPERATURE?

So far, we have not given a fundamental explanation of temperature. However,
we have identified heat transfer from hot to cold with an increase in entropy.
If we take a closer look at this, we will be able to gain a more fundamental
understanding of the concept of temperature.
When two systems at temperatures T_hot and T_cold are placed in thermal contact,
heat flows from the system at higher temperature to the system at lower
temperature. We showed (see Section 10.2.2) that this is a consequence of
increasing entropy. Removing heat dQ from the hotter system reduces its
entropy by an amount dQ/T_hot, which is smaller than the increase in entropy
of the cold system, dQ/T_cold, resulting in a net increase in entropy of the
combined system and therefore satisfying the second law. What this must mean is
that transfer of heat dQ to or from a hotter system has a smaller effect on its
entropy than transfer of the same amount of heat to or from a colder system.
When two systems are in thermal equilibrium, the entropy change when heat
dQ is transferred between them is zero. Richard Feynman likened it to drying
yourself with a small towel. At first, you are wetter than the towel and water
is transferred from you to the towel. However, once you have used the towel
for a while it becomes so wet that it no longer dries you – as you rub yourself
with the towel as much water transfers from you to the towel as transfers from
the towel to you – both the towel and your skin have reached the same degree
of “wetness.” Continuing to use the towel results in no further change in wet-
ness. This is analogous to the approach to thermal equilibrium.
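The entropy bookkeeping for heat transfer between two bodies can be sketched in a few lines of Python. The temperatures and the 1 J of heat are assumed values chosen for illustration, and the function name is hypothetical.

```python
def net_entropy_change(dq: float, t_source: float, t_sink: float) -> float:
    """Net entropy change in J/K when heat dq (J) leaves a body at t_source (K)
    and enters a body at t_sink (K)."""
    return dq / t_sink - dq / t_source


hot_to_cold = net_entropy_change(1.0, t_source=400.0, t_sink=300.0)
equilibrium = net_entropy_change(1.0, t_source=300.0, t_sink=300.0)
cold_to_hot = net_entropy_change(1.0, t_source=300.0, t_sink=400.0)

print(hot_to_cold)  # positive: allowed by the second law
print(equilibrium)  # zero: no net change at thermal equilibrium
print(cold_to_hot)  # negative: does not happen spontaneously
```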
Temperature is therefore related to the rate at which entropy changes when
heat is transferred to or from a system, and we can define the thermodynamic
temperature using this equation:
$$\frac{1}{T} = \frac{dS}{dQ}$$
How does this relate to the statistical description of entropy in terms of micro-
scopic configurations? This can be understood by considering the effect of
adding 1 quantum of energy to a system that already contains N quanta. While
this will always increase the number of configurations of the system, it has a
greater effect on the entropy of a system with an initially smaller number of
configurations than on one with an initially larger number. Mathematically
this is a consequence of the natural logarithm in Boltzmann’s formula.
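A minimal numerical sketch of this effect is given below. It assumes a specific counting model that the text does not name: identical energy quanta shared among a fixed number of oscillators (often called an Einstein solid), for which the number of configurations is W = (q + M − 1)!/(q!(M − 1)!) with q quanta and M oscillators. The value M = 50 and the function names are hypothetical.

```python
from math import comb, log


def multiplicity(quanta: int, oscillators: int) -> int:
    """Ways to share `quanta` identical quanta among `oscillators` (assumed model)."""
    return comb(quanta + oscillators - 1, quanta)


def entropy_gain_per_quantum(quanta: int, oscillators: int) -> float:
    """Increase in S/k = ln W when one more quantum is added."""
    before = log(multiplicity(quanta, oscillators))
    after = log(multiplicity(quanta + 1, oscillators))
    return after - before


M = 50  # assumed number of oscillators
gains = {q: entropy_gain_per_quantum(q, M) for q in (10, 100, 1000)}
for q, gain in gains.items():
    print(q, gain)
```

The gain in ln W per added quantum falls as the system holds more energy. Since 1/T = dS/dQ, a large entropy gain per quantum corresponds to a low temperature and a small gain to a high temperature.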
10.5 ABSOLUTE ZERO AND ABSOLUTE ENTROPY

10.5.1 Entropy at Absolute Zero
As the temperature of a system is reduced it will approach a state of minimum
energy. In principle, for an ideal crystalline material, this state exists in a single
configuration.
Therefore W = 1 and S = k ln (W) = 0
This is a statement of the “third law of thermodynamics”:
The entropy of a perfect crystal is zero at absolute zero.
In practice, it is impossible to cool anything to absolute zero so the third law
serves as a theoretical starting point for the calculation of absolute entropies.
10.5.2 Calculating Absolute Entropy

While Boltzmann’s formula provides a clear idea of what entropy represents
it is not always straightforward to calculate entropies based on number of
configurations. However, since entropy determines how a system will evolve,
for example, whether a chemical reaction will proceed or not, we need to
know the absolute entropy values for different substances. In order to do this
we must return to the macroscopic definition of entropy change:
$$\delta S = \frac{\delta Q}{T}$$ (for a reversible heat transfer)
The absolute entropy at temperature T of a substance is then defined as:

$$S = \int_0^T \frac{dQ}{T'} = \int_0^T \frac{c(T')\,dT'}{T'}$$

where c(T') is the temperature-dependent heat capacity of the substance and T' is the variable of integration.
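When c(T) is known only as a table or a fitted function, the integral can be evaluated numerically. The sketch below is a minimal illustration, assuming a heat capacity of the form c(T) = aT³ (the low-temperature form for a crystalline solid) with a made-up coefficient, so that the numerical answer can be compared with the exact result aT³/3. The function names and the value of a are hypothetical.

```python
def absolute_entropy(heat_capacity, t_final: float, steps: int = 10_000) -> float:
    """Midpoint-rule estimate of S(T) = integral from 0 to T of c(T')/T' dT'."""
    dt = t_final / steps
    total = 0.0
    for i in range(steps):
        t_mid = (i + 0.5) * dt  # midpoints avoid evaluating c(T')/T' at T' = 0
        total += heat_capacity(t_mid) / t_mid * dt
    return total


A_COEFF = 2.0e-3  # assumed coefficient in J K^-4 for c(T) = a*T**3


def cubic_heat_capacity(t: float) -> float:
    """Assumed low-temperature heat capacity c(T) = a*T**3 in J/K."""
    return A_COEFF * t**3


s_numeric = absolute_entropy(cubic_heat_capacity, 20.0)
s_exact = A_COEFF * 20.0**3 / 3

print(s_numeric, s_exact)
```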
10.5.3 Entropy Changes for an Ideal Gas

When a gas is heated, compressed, or allowed to expand, its entropy
changes. This is because the number of ways that energy can be distributed
amongst the gas particles, and the number of ways the particles themselves
can be distributed in the volume available to them, changes. Formulae
for entropy changes of an ideal gas under different types of change are
derived below.
Isochoric Changes (Constant Volume).
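dU = Q − W but W = 0, therefore dU = Q = c_V dT
For one mole of a monatomic ideal gas, c_V = 3R/2, so:
$$\Delta S = \int_{T_1}^{T_2} \frac{dQ}{T} = \int_{T_1}^{T_2} \frac{c_V\,dT}{T} = \int_{T_1}^{T_2} \frac{3R\,dT}{2T} = \frac{3}{2}R\ln\left(\frac{T_2}{T_1}\right)$$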
Isobaric Changes (Constant Pressure).

δU = Q − W = Q − pδV
$$Q = \frac{3}{2}R\,\delta T + p\,\delta V = \frac{3}{2}R\,\delta T + \frac{RT\,\delta V}{V}$$ (using the ideal gas equation)
$$\Delta S = \int_{T_1}^{T_2} \frac{3R\,dT}{2T} + \int_{V_1}^{V_2} \frac{R\,dV}{V} = \frac{3}{2}R\ln\left(\frac{T_2}{T_1}\right) + R\ln\left(\frac{V_2}{V_1}\right)$$

Isothermal Changes (Constant Temperature):

For a constant temperature change dU = 0 and Q = W = pδV.
$$Q = p\,\delta V = \frac{RT\,\delta V}{V}$$ (using the ideal gas equation)
$$\Delta S = \int_{V_1}^{V_2} \frac{R\,dV}{V} = R\ln\left(\frac{V_2}{V_1}\right)$$
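The three results can be collected into a short Python sketch. It applies to one mole of a monatomic ideal gas, as the factor 3R/2 in the formulae implies. The function names and the example temperatures and volumes are assumptions made for illustration.

```python
from math import log

R = 8.314  # molar gas constant in J mol^-1 K^-1


def delta_s_isochoric(t1: float, t2: float) -> float:
    """Constant volume: (3/2) R ln(T2/T1)."""
    return 1.5 * R * log(t2 / t1)


def delta_s_isobaric(t1: float, t2: float, v1: float, v2: float) -> float:
    """Constant pressure: (3/2) R ln(T2/T1) + R ln(V2/V1)."""
    return 1.5 * R * log(t2 / t1) + R * log(v2 / v1)


def delta_s_isothermal(v1: float, v2: float) -> float:
    """Constant temperature: R ln(V2/V1)."""
    return R * log(v2 / v1)


# Doubling the temperature at constant volume.
print(delta_s_isochoric(300.0, 600.0))
# Doubling the temperature at constant pressure also doubles the volume (V is proportional to T).
print(delta_s_isobaric(300.0, 600.0, 1.0, 2.0))
# Doubling the volume at constant temperature.
print(delta_s_isothermal(1.0, 2.0))
```

Only ratios of temperatures and volumes appear, so any consistent units can be used for V, while T must be in kelvin.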
10.6 REFRIGERATORS AND HEAT PUMPS

10.6.1 Refrigerators
The purpose of a refrigerator is to cool things down. To do this heat must
flow out of a cooler body (the food you place inside the refrigerator) and
into a warmer one (the surrounding air in the room outside the refrigerator).
The second law of thermodynamics states that this cannot occur in isolation
because it would lower the entropy of the Universe. However, it can occur if
work is done.
A refrigerator is actually a reversed heat engine:
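[Figure: energy-flow diagrams for a heat engine and a refrigerator, each operating between a hot reservoir at T_hot and a cold reservoir at T_cold. The heat engine takes in Q_hot from the hot reservoir, does work W, and rejects Q_cold to the cold reservoir. The refrigerator takes in Q_cold from the cold reservoir, has work W done on it, and rejects Q_hot to the hot reservoir.]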

For a heat engine, the heat reservoirs are burnt fuel (hot) and the environment
(cold). For a refrigerator, the heat reservoirs are the foodstuff that must be
cooled down (cold) and the environment (hot). Typically, a domestic refrig-
erator removes heat from food at about 4°C and dumps heat into a room at
about 20°C, although different types of refrigerator can work in different tem-
perature regimes. The reason you need to plug an electric refrigerator into a
mains supply is so that electricity can supply the work that will ultimately be
dumped as additional heat in the environment.
For a refrigerator to operate it must obey the second law of thermodynamics
so the entropy increase caused by the heat dumped in the environment must
be equal to or greater than the entropy decrease caused by extracting heat
from the stuff inside the refrigerator:
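$$\Delta S_{\text{environment}} = \frac{Q_{\text{hot}}}{T_{\text{hot}}}$$
$$\Delta S_{\text{food}} = -\frac{Q_{\text{cold}}}{T_{\text{cold}}}$$
$$\Delta S_{\text{Universe}} = \frac{Q_{\text{hot}}}{T_{\text{hot}}} - \frac{Q_{\text{cold}}}{T_{\text{cold}}} \geq 0$$
$$\frac{Q_{\text{hot}}}{T_{\text{hot}}} \geq \frac{Q_{\text{cold}}}{T_{\text{cold}}}$$
This implies that $\frac{Q_{\text{hot}}}{Q_{\text{cold}}} \geq \frac{T_{\text{hot}}}{T_{\text{cold}}}$ so Q_hot (the heat dumped in the environment)
must be greater than Q_cold (the heat extracted from the food). The refrigerator
can only operate if we supply additional energy. This is the work W = Q_hot − Q_cold.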
As far as the Universe is concerned a refrigerator is a heater.
The coefficient of performance (CoP) is a measure of how effective the
refrigerator is and is the ratio Q_cold/W. The higher this is, the more joules of heat are
removed from objects inside the refrigerator per joule of electrical work done,
so it is like the “efficiency” of the refrigerator. The second law sets a limit to
the theoretical thermodynamic efficiency of a heat engine and it also sets a
limit to the theoretical coefficient of performance for a refrigerator:
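$$\text{CoP}_{\text{refrigerator}} = \frac{Q_{\text{cold}}}{W} = \frac{Q_{\text{cold}}}{Q_{\text{hot}} - Q_{\text{cold}}} = \frac{1}{\frac{Q_{\text{hot}}}{Q_{\text{cold}}} - 1} \leq \frac{1}{\frac{T_{\text{hot}}}{T_{\text{cold}}} - 1}$$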
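As an illustration, the limit can be evaluated for the domestic refrigerator described above, working between about 4°C and 20°C. The Python sketch below is a rough calculation under those stated temperatures. The 1000 J of heat removed and the function names are assumptions.

```python
def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert a Celsius temperature to kelvin."""
    return t_celsius + 273.15


def max_cop_refrigerator(t_hot: float, t_cold: float) -> float:
    """Second-law upper limit on Q_cold / W, with temperatures in kelvin."""
    return 1.0 / (t_hot / t_cold - 1.0)


t_cold = celsius_to_kelvin(4.0)   # food compartment
t_hot = celsius_to_kelvin(20.0)   # room
cop_limit = max_cop_refrigerator(t_hot, t_cold)

q_cold = 1000.0                   # heat removed from the food in J (assumed)
w_min = q_cold / cop_limit        # least work the second law allows
q_hot_min = q_cold + w_min        # heat then dumped into the room

print(cop_limit, w_min, q_hot_min)
```

At this limit Q_hot/T_hot equals Q_cold/T_cold, so the entropy of the Universe is unchanged. Any real refrigerator needs more work than w_min and has a lower coefficient of performance.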

10.6.2 Heat Pumps

Heat pumps are used to extract heat from a cooler environment and dump
it into a warmer environment and so, like a refrigerator, they are really a
reversed heat engine. However, the point of the heat pump is not the amount
of heat extracted from the cold reservoir but the amount of heat delivered to
the hot reservoir. The coefficient of performance for a heat pump is therefore
defined as the ratio Q_hot/W and the second law once again sets a limit on this:
$$\text{CoP}_{\text{heat pump}} = \frac{Q_{\text{hot}}}{W} = \frac{Q_{\text{hot}}}{Q_{\text{hot}} - Q_{\text{cold}}} = \frac{1}{1 - \frac{Q_{\text{cold}}}{Q_{\text{hot}}}} \leq \frac{1}{1 - \frac{T_{\text{cold}}}{T_{\text{hot}}}}$$
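The two limits are closely related: because Q_hot = Q_cold + W, the coefficient of performance of a heat pump is exactly 1 greater than that of a refrigerator working between the same temperatures. The Python sketch below checks this for an assumed example, a house kept at 20°C with outside air at 0°C. The temperatures and function names are hypothetical.

```python
def max_cop_refrigerator(t_hot: float, t_cold: float) -> float:
    """Second-law upper limit on Q_cold / W, with temperatures in kelvin."""
    return 1.0 / (t_hot / t_cold - 1.0)


def max_cop_heat_pump(t_hot: float, t_cold: float) -> float:
    """Second-law upper limit on Q_hot / W, with temperatures in kelvin."""
    return 1.0 / (1.0 - t_cold / t_hot)


T_INSIDE = 20.0 + 273.15   # assumed indoor temperature in K
T_OUTSIDE = 0.0 + 273.15   # assumed outdoor temperature in K

cop_pump = max_cop_heat_pump(T_INSIDE, T_OUTSIDE)
cop_fridge = max_cop_refrigerator(T_INSIDE, T_OUTSIDE)

print(cop_pump, cop_fridge, cop_pump - cop_fridge)
```

The limit grows as the two temperatures approach one another, which is why a heat pump is most effective when the temperature difference it works across is small.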
10.7 IMPLICATIONS OF THE SECOND LAW
10.7.1 The Second Law, the Arrow of Time, and the Universe
When the second law of thermodynamics is applied to the Universe it states:
The entropy of the Universe tends to a maximum value.
This implies that the arrow of time points in the direction of increasing
entropy. The past is a low entropy state and the future is a high entropy state.
[Figure: the arrow of time, pointing from the low-entropy past to the high-entropy future.]
On a microscopic scale, the Universe is evolving from a macro-state that exists
in a low number of ways (small number of configurations) to a macro-state
that exists in a much larger number of ways and ultimately to a state of maxi-
mum entropy in which the number of possible microscopic configurations
reaches its maximum value. While this final state of the universe is very far in
the future, once reached heat engines would no longer be able to do useful
work because everything would be at the same temperature. Energy would
become “unavailable” (for work) because heat engines operate between heat
reservoirs at different temperatures and increase the entropy of the Universe.
When the Universe reaches thermal equilibrium entropy is at a maximum.
This final state is sometimes called the “heat death” of the Universe.

So, have we explained the arrow of time? Not entirely. The microscopic
description of the second law shows that, if a macroscopic system starts off in a
low entropy state it is overwhelmingly likely to move to the high entropy states
that can be realized in a larger number of ways. However, the arrow of time
ceases to exist once the system has reached equilibrium, so one large ques-
tion remains: why did the Universe start in a low entropy state? According to
our analysis, the initial configuration of the Universe must have been one that
can be realized in only a relatively small number of ways (low W). This was a
state of low probability in the sense that if we considered the totality of macro-
states the Universe can have then the actual state in which it began belongs to
a tiny subset of these. This is rather like the first example we considered, the
ink droplet in water – the system starts in a very special low probability, low
entropy state and evolves toward a higher probability, high entropy equilib-
rium. Our Universe is also evolving toward a high entropy equilibrium state.
The assumption that the Universe began in a low entropy state is sometimes
called the “past hypothesis.”
10.7.2 The Second Law and Living Things

On the face of it, living things do not seem to obey the second law of ther-
modynamics. They gather resources and energy and grow from simple begin-
nings into complex functioning beings. They seem to create order from
disorder and lower their own entropy. However, living things are not isolated
systems. They exchange energy and matter with their environments. While it
is true that constructive processes within our cells might lower the entropy
of certain internal structures, it is also true that the metabolism that drives
this generates heat and dumps it into the environment. This increases the
entropy of our surroundings by an amount that is much greater than any local
reduction of entropy inside our bodies so the net effect of a living thing is to
increase the entropy of the Universe. When living things are considered along
with their surroundings they do obey the second law of thermodynamics just
like anything else.
10.7.3 Entropy and Energy Availability

There is a continual search for new sources of energy to satisfy our energy-
hungry civilization. However, it is often said that we are fast approaching
an energy crisis. Given that energy is never created or destroyed this seems
almost contradictory. How can our energy run out? The answer is actually
quite simple. Every time we harness a non-renewable source such as oil,
natural gas, or uranium and use it in a heat engine (e.g., a thermal power
station) to do work there is an increase in the entropy of the Universe. The
primary energy source we used has transferred its energy to a large number
of highly dispersed particles (e.g., in heating the atmosphere) at a much lower
temperature. This makes the original energy much harder to harness and use
in another heat engine to do more work. The original energy has not disap-
peared but has become unavailable. Increasing entropy can be thought of as
the increasing unavailability of energy to do work.
10.8 EXERCISES
1. (a) Explain the following terms:
Macroscopic state of a system
Microscopic state of a system
Number of ways
(b) Explain the link between number of ways and entropy.
(c) Use the terms above to explain why, when a small crystal of potassium
permanganate is placed into a large beaker of water, the system
becomes mixed over time but is never observed to un-mix.
(d) Discuss whether, if left long enough, the system could un-mix.
2. (a) What is meant by “The Arrow of Time”?
(b) Explain why Newton’s laws do not provide such an arrow.
(c) Explain how thermodynamics does provide such an arrow.
(d) Discuss whether cosmology provides another arrow of time.
3. (a) What does the second law of thermodynamics imply about the
entropy of the early universe? Explain.
(b) What does the second law of thermodynamics imply about the
macro-state of the early universe (in terms of probability)?
4. (a) Explain why living things might seem to violate the second law of
thermodynamics.
(b) Explain how in fact living things do not violate it and are in fact
governed by it.
5. (a) Explain how each of the following statements is consistent with the
fact that the entropy of an isolated system cannot decrease:
i. It is impossible to transfer thermal energy to mechanical work
with 100% efficiency.
ii. It is impossible to transfer heat from a colder to a hotter body
with no other changes taking place.
(b) Explain how a refrigerator can transfer heat from a colder to a hotter
body without violating the second law of thermodynamics.
6. Show that the efficiency of an ideal heat engine is limited by the equation:
$$\text{efficiency} \leq 1 - \frac{T_2}{T_1}$$
CHAPTER
11
Oscillations
11.0 OSCILLATIONS
An oscillator undergoes regular periodic motion about a fixed equilibrium
position (or value). The simple model of a mass oscillating on a spring can be
adapted to explain a wide variety of physical phenomena, from lattice vibra-
tions in a crystalline solid to the effects of seismic waves. The mathemati-
cal analysis developed to describe mechanical oscillators can also be used for
electrical and electromagnetic oscillations and is the starting point for under-
standing all kinds of wave motion.
11.1 CAPTURING AND DISPLAYING OSCILLATORY MOTION

A motion sensor can be used to record the position and time of a simple oscilla-
tor consisting of a mass suspended from a light spring that obeys Hooke’s law:
The display shows a graph of position against time. This is sinusoidal with an
amplitude A and time period T. An oscillation which is purely sinusoidal is
called “simple harmonic.”
Amplitude A: maximum displacement from equilibrium.
Time period T: time for one complete cycle of oscillation.
Frequency f: number of oscillations per second. f = 1/T.

[Figure: a mass hanging from a spring attached to a support, oscillating with amplitude A about its equilibrium position above an ultrasonic position sensor connected to a data logger. The display shows displacement / cm (−5 to 5) against time / s (0 to 6).]
The graph above shows a simple harmonic oscillation with amplitude of 4.0
cm, time period of 2.0 s, and frequency of 0.50 Hz. At t = 0, the oscillator is
at its maximum positive amplitude so the displacement x of the oscillation can
be represented by a cosine function:
x = 4.0 cos (πt)
(πt is in radians so when t = 2 s the oscillator completes one cycle of oscillation).

In general, the displacement x of a simple harmonic oscillator can be repre-
sented by an equation of the form:
$$x = A\cos(\omega t + \delta)$$
The term in the bracket is called the phase of oscillation. It is an angle in
radians.
The oscillator must complete one cycle in a time T so ωT = 2π and ω = 2π/T =
2πf. ω is called the angular frequency of the oscillation.
$$x = A\cos(2\pi f t + \delta)$$
δ is a variable phase angle that affects the displacement at t = 0. If δ = 0 the
graph is like the one above. If δ = −π/2 then the graph is a sine curve, starting
at x = 0. Other values of δ simply change the value of displacement at t = 0
without changing the shape of the curve.
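The effect of the phase angle can be explored with a short Python sketch. It uses the amplitude (4.0 cm) and time period (2.0 s) of the oscillation described above. The function name is hypothetical.

```python
from math import cos, pi


def displacement(t: float, amplitude: float, period: float, delta: float = 0.0) -> float:
    """x = A cos(omega*t + delta), with omega = 2*pi/T and delta in radians."""
    omega = 2 * pi / period
    return amplitude * cos(omega * t + delta)


# delta = 0: starts at maximum positive displacement.
print(displacement(0.0, 4.0, 2.0))
print(displacement(1.0, 4.0, 2.0))  # half a cycle later

# delta = -pi/2: a sine curve, starting at x = 0 and peaking a quarter cycle later.
print(displacement(0.0, 4.0, 2.0, delta=-pi / 2))
print(displacement(0.5, 4.0, 2.0, delta=-pi / 2))
```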
11.1.1 Graphs and Equations of Displacement, Velocity, and Acceleration

Most motion data loggers connected to a position sensor can also be used
to display velocity and acceleration. Mathematically, these are related to dis-
placement by differentiation:
$$x = A\cos\omega t$$
$$v = \frac{dx}{dt} = -\omega A\sin\omega t$$
$$a = \frac{dv}{dt} = -\omega^2 A\cos\omega t = -\omega^2 x$$
There is a π/2 phase difference between v and x and between a and v. There
is a π phase difference between a and x.
The three graphs below show how these quantities are related to one
another. Note the maximum values of velocity and acceleration and when
these occur:
$v_{\text{max}} = \omega A$ when displacement is zero
$a_{\text{max}} = \omega^2 A$ when displacement is maximum and velocity is zero
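These relations can be checked numerically. The Python sketch below uses the oscillation described earlier (A = 4.0 cm, T = 2.0 s) and compares the formulae for v and a with finite-difference estimates of the derivatives. The helper names and the step size are assumptions.

```python
from math import cos, pi, sin

A = 4.0              # amplitude in cm
T = 2.0              # time period in s
OMEGA = 2 * pi / T   # angular frequency in rad/s


def x(t: float) -> float:
    return A * cos(OMEGA * t)


def v(t: float) -> float:
    return -OMEGA * A * sin(OMEGA * t)


def a(t: float) -> float:
    return -OMEGA**2 * A * cos(OMEGA * t)


def central_difference(f, t: float, h: float = 1e-5) -> float:
    """Numerical estimate of df/dt at time t."""
    return (f(t + h) - f(t - h)) / (2 * h)


v_max = OMEGA * A       # in cm/s, reached as the oscillator passes through x = 0
a_max = OMEGA**2 * A    # in cm/s^2, reached at x = +A and x = -A

print(v_max, a_max)
print(central_difference(x, 0.3), v(0.3))
print(central_difference(v, 0.3), a(0.3))
```

For this oscillation v_max is about 12.6 cm/s and a_max is about 39.5 cm/s².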

[Figure: three graphs plotted against time / s (0 to 6). Top: displacement / cm (−5 to 5). Middle: velocity / cm s⁻¹ (−15 to 15), v = dx/dt; the maximum velocity v_max = ωA = 2πfA occurs at zero displacement. Bottom: acceleration / cm s⁻² (−50 to 50), a = dv/dt; the maximum acceleration a_max = ω²A = 4π²f²A occurs at maximum displacement.]

11.1.2 Phase and Phase Difference

Oscillations are cyclic repeated motions and the position of an oscillator within
its cycle of oscillations is called its phase. One complete cycle corresponds
to a phase change of 360° or 2π radians. In graphical terms, a phase shift is
equivalent to moving the graph along the time axis by a certain amount. A
shift of time period T corresponds to a phase change of 2π. The graph below
shows two oscillations with a phase difference of π/3.
[Figure: displacement / cm (−5 to 5) against time / s (0 to 6) for two oscillations with a phase difference of π/3.]
11.2 SIMPLE HARMONIC MOTION

11.2.1 Equation of Motion for Simple Harmonic Motion
Simple harmonic motion is defined by two characteristics:
acceleration is directed toward a fixed equilibrium position.
acceleration is directly proportional to displacement from that position.
The equation of motion for simple harmonic motion is:
acceleration = −(constant) × (displacement from equilibrium)
a = −ω²x
This can be written as a second-order differential equation:
$$\frac{d^2x}{dt^2} = -\omega^2 x$$

The constant is written as ω² because ω turns out to be physically significant
(it corresponds to the angular frequency of the oscillation). A general solution
to this equation is:
$$x = A\cos(\omega t + \delta), \quad \text{for which} \quad \frac{d^2x}{dt^2} = -\omega^2 x$$
as can be easily shown by differentiating this twice. It then follows that:
ω = 2πf = 2π / T
Notice that if we had not used a squared term for the constant in the equation
of motion we would have had a square root in this solution.
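For reference, the two differentiations can be written out explicitly. The second part uses a symbol C, introduced here only for illustration, for the constant written without a square.

```latex
\begin{align}
x &= A\cos(\omega t + \delta) \\
\frac{dx}{dt} &= -\omega A\sin(\omega t + \delta) \\
\frac{d^2x}{dt^2} &= -\omega^2 A\cos(\omega t + \delta) = -\omega^2 x
\end{align}

% If the equation of motion had been written as a = -Cx instead:
\begin{align}
\frac{d^2x}{dt^2} = -Cx
\quad\Rightarrow\quad
x = A\cos\left(\sqrt{C}\,t + \delta\right),
\qquad
f = \frac{\sqrt{C}}{2\pi}
\end{align}
```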
11.2.2 Physical Conditions for Simple Harmonic Motion

To show that a mechanical oscillator will exhibit simple harmonic motion,
we must demonstrate that the forces acting on the oscillator will result in an
equation of motion of form a = −ω²x. For an oscillator of constant mass m,
this will be true if F = ma = −mω²x. The conditions are:
resultant force is directed toward a fixed equilibrium position,
resultant force is directly proportional to displacement from that position.
Once we have established that this is the case we can find ω and hence f from
the constant of proportionality between force and displacement:
if F ∝ −x then F = −kx = −mω²x
so $\omega = \sqrt{\frac{k}{m}}$ and $f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$
11.3 THE MASS-SPRING OSCILLATOR

An ideal mass-spring oscillator has a mass m attached to a massless spring with
spring constant k. In equilibrium, the spring has an extension x₀. At this point,
the weight of the mass is supported by the tension in the spring:
kx₀ = mg
When the mass is displaced from equilibrium by an additional amount x, the
magnitude of the resultant force on the mass becomes:
k(x + x₀) − mg = kx

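[Figure: forces on the mass. At equilibrium the spring tension kx₀ balances the weight mg. When the mass is displaced by x from equilibrium the tension is k(x₀ + x) and the weight is still mg.]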
Resultant force is in the opposite direction to the displacement from equilib-
rium so:
F = − kx
This satisfies our conditions for simple harmonic motion so an ideal mass-
spring oscillator is a simple harmonic oscillator. Now we can use angular fre-
quency to find the frequency and time period:
For the simple harmonic motion: F = −mω²x
For the mass-spring oscillator: F = −kx
Therefore: $\omega = \sqrt{\frac{k}{m}}$
Giving: $f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$ and $T = 2\pi\sqrt{\frac{m}{k}}$
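A short Python sketch of these results follows. The spring constant and mass are assumed values chosen for illustration, and the function names are hypothetical. It also uses the equilibrium condition kx₀ = mg to show that the period can be found from the equilibrium extension alone, since m/k = x₀/g.

```python
from math import pi, sqrt


def spring_frequency(k: float, m: float) -> float:
    """f = (1 / 2 pi) * sqrt(k / m) in Hz."""
    return sqrt(k / m) / (2 * pi)


def spring_period(k: float, m: float) -> float:
    """T = 2 pi * sqrt(m / k) in seconds."""
    return 2 * pi * sqrt(m / k)


K_SPRING = 20.0  # spring constant in N/m (assumed)
MASS = 0.50      # mass in kg (assumed)
G = 9.81         # gravitational field strength in N/kg

x0 = MASS * G / K_SPRING                      # equilibrium extension from k*x0 = m*g
period = spring_period(K_SPRING, MASS)
period_from_extension = 2 * pi * sqrt(x0 / G)

print(x0, period, period_from_extension)
```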
11.4 THE SIMPLE PENDULUM

A simple pendulum consists of a point mass suspended from a light inexten-
sible string of length l. When it is displaced by a small angle from its equilib-
rium position it undergoes regular oscillations that are approximately simple
harmonic.
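[Figure: left, a simple pendulum of length l displaced through angle θ, with horizontal displacement x along the line AB. Right, a free-body diagram of the bob showing its weight mg and the direction of the resultant force.]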
The free-body diagram on the right shows that the resultant of the two forces
(tension and weight) acting on the bob will be mg sin θ along the line shown.
If we consider the horizontal motion (along AB) then the resultant force act-
ing horizontally is:
$$F = -mg\sin\theta\cos\theta$$
It is clear from the diagram on the left that sin θ = x/l so:
$$F = -\frac{mgx}{l}\cos\theta$$
This does not satisfy our condition for simple harmonic motion because the
resultant force is proportional to x cos θ and not just to x (cos θ is itself a func-
tion of x). However, for small angles, cos θ is approximately equal to 1 so the
smaller the angular amplitude of the oscillation the closer it will be to an ideal
simple harmonic oscillator:
As θ → 0, cos θ → 1 and $F \to -\frac{mg}{l}x$
Under these conditions, the constant of proportionality between F and x
is (mg/l).
Therefore: $m\omega^2 = \frac{mg}{l}$, so $\omega = \sqrt{\frac{g}{l}}$
Giving: $f = \frac{1}{2\pi}\sqrt{\frac{g}{l}}$ and $T = 2\pi\sqrt{\frac{l}{g}}$
Notice that this is independent of the mass of the bob. This is because the
resultant force is directly proportional to the mass so mass will cancel when
calculating accelerations:
a = F/m = (constant×m)/m.
This is for the same reason that objects of different mass fall with the same
acceleration in the same gravitational field.
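The quality of the small-angle approximation can be judged from the factor cos θ that was set equal to 1. The Python sketch below tabulates this factor for a few angles and evaluates the small-angle period for an assumed 1.0 m pendulum. The value of g and the function names are assumptions.

```python
from math import cos, pi, radians, sqrt

G_EARTH = 9.81  # gravitational field strength near Earth's surface in N/kg (assumed)


def pendulum_period(length: float, g: float = G_EARTH) -> float:
    """Small-angle period T = 2 pi sqrt(l / g) in seconds; the bob's mass does not appear."""
    return 2 * pi * sqrt(length / g)


def restoring_force_ratio(theta_deg: float) -> float:
    """Ratio of the horizontal force (m g x / l) cos(theta) to its small-angle value m g x / l."""
    return cos(radians(theta_deg))


print(pendulum_period(1.0))
for angle in (2, 5, 10, 30):
    print(angle, restoring_force_ratio(angle))
```

At 5° the force is within about 0.4% of the simple harmonic value, while at 30° it is roughly 13% smaller, so large swings are noticeably non-harmonic.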
11.5 ENERGY IN SIMPLE HARMONIC MOTION

11.5.1 Variation of Energy with Time
A simple harmonic oscillator continually transfers energy between kinetic
energy and potential energy. For the simple pendulum the potential energy is
purely gravitational, for the mass-spring oscillator it is a combination of
gravitational and elastic. If frictional forces (damping) can be neglected the total
energy of the oscillator, which is the sum of its kinetic and potential energies,
remains constant. Under these conditions the oscillator is described as a “free
oscillator.”
It is simplest to analyze the energy of the oscillator by starting with its kinetic
energy.
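$$KE = \frac{1}{2}mv^2 = \frac{1}{2}m(-\omega A\sin\omega t)^2 = \frac{1}{2}m\omega^2A^2\sin^2\omega t$$
If we take potential energy to be zero at the equilibrium position then the
maximum kinetic energy is equal to the total energy TE:
$$TE = \frac{1}{2}m\omega^2A^2$$
$$TE = KE + PE \quad\text{so}\quad PE = TE - KE = \frac{1}{2}m\omega^2A^2\left(1 - \sin^2\omega t\right) = \frac{1}{2}m\omega^2A^2\cos^2\omega t$$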
The graphs below show how each type of energy varies during one cycle of
the oscillation:
[Figure: total energy TE, potential energy PE, and kinetic energy KE (energy / J, 0 to 3) against time / s (0 to 0.6), with one cycle of the oscillation marked.]
Notice that both KE and PE reach their maximum values twice per oscillation;
this is because they are related to the square of the sine and cosine.
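The energy exchange can be tabulated with a short Python sketch. The mass, amplitude, and time period are assumed values chosen for illustration, and the function names are hypothetical.

```python
from math import cos, pi, sin

MASS = 0.50        # mass in kg (assumed)
AMPLITUDE = 0.040  # amplitude in m (assumed)
PERIOD = 2.0       # time period in s (assumed)
OMEGA = 2 * pi / PERIOD

TOTAL_ENERGY = 0.5 * MASS * OMEGA**2 * AMPLITUDE**2


def kinetic_energy(t: float) -> float:
    """KE = (1/2) m omega^2 A^2 sin^2(omega t) in J."""
    return 0.5 * MASS * (OMEGA * AMPLITUDE * sin(OMEGA * t)) ** 2


def potential_energy(t: float) -> float:
    """PE = (1/2) m omega^2 A^2 cos^2(omega t) in J."""
    return 0.5 * MASS * OMEGA**2 * AMPLITUDE**2 * cos(OMEGA * t) ** 2


# Sample one full cycle in eighths of a period.
for i in range(9):
    t = i * PERIOD / 8
    print(f"t = {t:4.2f} s  KE = {kinetic_energy(t):.5f} J  PE = {potential_energy(t):.5f} J")
```

KE peaks at t = T/4 and t = 3T/4 (passing through equilibrium in each direction), PE peaks at t = 0 and t = T/2 (the two extremes), and the sum stays equal to the total energy throughout.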
11.5.2 Variation of Energy with Position

The graphs above show how energy varies with time during one oscillation. It
is also interesting to see how the different types of energy vary with position.
To do this we will look at the work done by the resultant force F.
Since F = − kx for simple harmonic motion we can find the total energy of
the oscillator by integrating the work done by the oscillator as it moves from
equilibrium to one amplitude:
$$W = -\int_{x=0}^{x=A} F\,dx = \int_{x=0}^{x=A} kx\,dx = \frac{1}{2}kA^2$$
The oscillator had maximum KE at equilibrium and has zero KE at the ampli-
tude so the maximum KE and therefore total energy of the oscillator is given by:
$$TE = \frac{1}{2}kA^2$$
By the same analysis the work done in moving to a displacement x is ½kx²
therefore:
$$KE = \frac{1}{2}k\left(A^2 - x^2\right)$$
$$PE = \frac{1}{2}kx^2$$

The variation of each type of energy with position is shown below:
[Figure: total energy TE, potential energy PE, and kinetic energy KE (energy / J, 0 to 3) against position / cm (−6 to 6).]
11.5.3 Damping
Real oscillators are subject to frictional forces which oppose their motion.
The oscillator must do mechanical work against these forces so (unless it is
driven by an external energy source) its total energy decreases with time and
so does its amplitude. The oscillator is “damped.” The heavier the damping,
the greater the rate at which the oscillator loses energy and the greater the
rate of decay of its amplitude. In many cases, the amplitude decays approxi-
mately exponentially. This occurs when the oscillator loses the same fraction
of its total energy on each oscillation.
[Figure: displacement / cm against time / s (0 to 0.6) for a damped oscillation whose amplitude decays with time.]
If the oscillation does decay exponentially, its displacement varies according to
an equation of the form:
$$x = Ae^{-\gamma t}\cos\omega t$$
where γ is a damping coefficient.
Light damping has very little effect on the natural frequency of the oscillator
but heavy damping reduces this frequency.
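A useful signature of exponential decay is that the ratio of successive amplitudes is constant. The Python sketch below generates amplitudes from the equation above and checks this. The damping coefficient, period, and initial amplitude are made-up values, and the function name is hypothetical.

```python
from math import exp, log


def successive_ratios(amplitudes: list[float]) -> list[float]:
    """Ratio of each amplitude to the one before it."""
    return [later / earlier for earlier, later in zip(amplitudes, amplitudes[1:])]


GAMMA = 0.5    # damping coefficient in s^-1 (assumed)
PERIOD = 0.2   # time period in s (assumed)
A0 = 4.0       # initial amplitude in cm (assumed)

# Amplitude after n complete oscillations: A0 * exp(-gamma * n * T).
amplitudes = [A0 * exp(-GAMMA * n * PERIOD) for n in range(6)]
ratios = successive_ratios(amplitudes)

# A constant ratio r = exp(-gamma * T) lets gamma be recovered from measurements.
gamma_estimate = -log(ratios[0]) / PERIOD

print(ratios)
print(gamma_estimate)
```

Because the energy is proportional to the square of the amplitude, a constant amplitude ratio r means the energy falls by the constant factor r² on each oscillation, consistent with losing the same fraction of energy each time.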

11.6 FORCED OSCILLATIONS AND RESONANCE

If an oscillator is driven by an external periodic force it is called a forced
(or driven) oscillator. Its response to the driver depends on the relationship
between the frequency of the driver fd and its natural frequency f0 for free
oscillations. It will also depend on the strength of coupling to the driver and
on the amount of damping in the system.
The qualitative effects on a forced oscillator can be demonstrated very sim-
ply. Take a simple pendulum of about 1 m in length and hold the end of the
string in your hand. Move your hand back and forth along a line at a very low
frequency with an amplitude of about 2.0 cm. The string remains more or
less vertical and the bob of the pendulum moves with the same amplitude,
phase, and frequency as your hand. Gradually increase the frequency of hand
movement while keeping the same amplitude of motion. The bob continues
to move with the driving frequency but with an increasing amplitude and at
a certain frequency the amplitude of the bob’s motion increases dramatically
while lagging about π/2 behind in phase. If you continue to increase the fre-
quency beyond this point the amplitude of response gets smaller and when
you shake the string at a high frequency the bob stays almost still at the center
of its motion. Closer inspection shows that it does oscillate with the driver
frequency but with a tiny amplitude and a phase lag of almost π. The strong
response occurs when the driving frequency is equal to the natural frequency
of the oscillator and is called “resonance.” This pattern of response is shown
in the graph below:
[Figure: amplitude of oscillator against driver frequency f_d, showing a strong resonance peak at f_d = f₀, the driver amplitude marked on the amplitude axis, and the effect of increasing damping on the peak.]

Increasing damping has two effects:

the amplitude at resonance is lower
the frequency of the resonance is slightly below the natural frequency f0.
At resonance, the oscillator absorbs energy from the driver and the amplitude
grows until energy losses due to damping occur at the same rate as energy is
supplied from the driver. As damping is reduced, this balance occurs at
ever-increasing amplitudes and, if there were no damping, the amplitude would grow without
limit. This can be destructive and this kind of destructive resonance is used
in ultrasonic devices used for cleaning jewelry or breaking up kidney stones.
In other situations, damping is deliberately increased in order to reduce the
amplitude at resonance and protect the structure, for example in the design
of earthquake-resistant buildings.
The phase relation between the driver and the driven oscillator varies from 0
at very low frequencies to π at high frequencies. At resonance, the driven
oscillator lags the driver by π/2.
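The amplitude and phase behavior described in this section can be reproduced with a simple model. The Python sketch below is an illustration under assumptions that go beyond the text: it takes the standard linear model of a driven oscillator with damping proportional to velocity, x'' + 2γx' + ω₀²x = ω₀²X_d cos(ω_d t), where X_d is the driver amplitude, and uses its well-known steady-state solution. The parameter values and function name are hypothetical.

```python
from math import atan2, pi, sqrt


def driven_response(f_d: float, f_0: float, gamma: float,
                    driver_amplitude: float = 1.0) -> tuple[float, float]:
    """Steady-state (amplitude, phase lag in rad) for the assumed model
    x'' + 2*gamma*x' + w0**2 * x = w0**2 * X_d * cos(w_d * t)."""
    w_d = 2 * pi * f_d
    w_0 = 2 * pi * f_0
    stiffness_term = w_0**2 - w_d**2   # positive below resonance, negative above
    damping_term = 2 * gamma * w_d
    amplitude = w_0**2 * driver_amplitude / sqrt(stiffness_term**2 + damping_term**2)
    phase_lag = atan2(damping_term, stiffness_term)
    return amplitude, phase_lag


F_NATURAL = 0.5  # assumed natural frequency in Hz (roughly that of a 1 m pendulum)

for f in (0.05, 0.5, 5.0):
    amp, lag = driven_response(f, F_NATURAL, gamma=0.2)
    print(f"f_d = {f:4.2f} Hz  amplitude = {amp:6.3f}  phase lag = {lag / pi:4.2f} pi")
```

In this model the response matches the driver in amplitude and phase at low frequency, lags by π/2 with a large amplitude at f_d = f₀, and lags by almost π with a tiny amplitude at high frequency. Increasing gamma lowers the peak.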
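[Figure: phase difference between driver and driven oscillator against driver frequency f_d, rising from 0 at low frequencies through π/2 at resonance toward π at high frequencies.]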
11.7 EXERCISES
1. A particle moves with SHM between points A and C in the diagram below.
Where is it at the instants when:
A B C
(a) It is stationary?
(b) It has the maximum velocity to the right?
(c) It has zero acceleration?
(d) It has the maximum acceleration to the left?
(e) It has the maximum kinetic energy?
(f) It has the maximum potential energy?
2. The diagram below shows a trolley tethered between two similar springs.
When it is displaced from equilibrium and released it undergoes periodic
oscillations about an equilibrium position. Neither spring goes slack dur-
ing these oscillations and both springs obey Hooke’s law in compression
and extension. The trolley has a mass of 0.80 kg.
[Figure: a trolley of mass M tethered between two springs, with points A and B and the equilibrium position marked.]
(a) Explain what is meant by the “equilibrium position.”

(b) Explain why the oscillations will be simple harmonic.
The trolley is displaced 6.0 cm to the right and released. It oscillates
with a period of 2.0 s.
(c) Calculate the frequency of the oscillations.
(d) State the initial amplitude of the oscillations.
(e) Write down an equation for the displacement of the trolley as a func-
tion of time. Take t = 0 to be the moment of release.
(f) Sketch graphs of (i) displacement versus time, (ii) velocity versus
time, and (iii) acceleration versus time for the motion of the trolley
when it is released from an initial displacement of 6.0 cm and then
oscillates with a period of 2.0 s. Draw the graphs one above the other
and label them as completely as you can.
(g) Calculate the maximum velocity and total energy of the oscillations.
(h) What would happen to the period of oscillation if the trolley is
stopped and then released from a displacement of 3.0 cm?
(i) The period of the oscillations will change if the mass is changed or
the stiffness of the springs is changed. Why?
(j) After the trolley is released its oscillations gradually die away. Explain
why this happens.
(k) As the oscillations decay successive amplitudes are:
6.0 cm, 5.4 cm, 4.9 cm, 4.4 cm, 3.9 cm, 3.5 cm, 3.2 cm
A student suggests that the amplitude is decaying exponentially.

Devise and carry out a test on this data to check this suggestion.
(l) Predict the amplitude of the oscillator after 10 oscillations.
3. (a) Show that x = A cos ωt is a solution to the SHM equation of motion
$$\frac{d^2x}{dt^2} = -\omega^2 x$$
(b) A particular oscillation can be represented by the equation:
x 2.5 cos 10 t
(i) State the amplitude of this oscillation.

(ii) Calculate the time period and frequency of the oscillation.
(iii) Calculate the maximum velocity of the oscillator.
(iv) Calculate the maximum acceleration of the oscillator.
(c) (i) Derive an equation for the variation of kinetic energy with time
for this oscillator given that the mass of the oscillator is 0.50 kg.
(ii) Sketch a graph of kinetic energy with time for this oscillator.
(iii) Add a second line to your graph to show the variation of poten-
tial energy with time.
4. During an earthquake the floor of a building oscillates vertically with an

amplitude A and frequency f. Derive a formula for the frequency at which
objects in contact with the floor will just lose contact during the earth-
quake.

5. An atom of mass $5 \times 10^{-26}$ kg is in a cubic lattice with all bonds between
adjacent pairs of atoms having a spring constant of about 100 N m$^{-1}$.
(a) Estimate the natural frequency of oscillation of the atom.
(b) If it is able to absorb radiation at this frequency what part of the EM
spectrum does it absorb?
6. A simple pendulum has a period of 1.6 seconds on Earth.
(a) Calculate its length.
(b) As the temperature rises its length increases by 1%. Calculate the
percentage change in its time period and state whether this is an
increase or a decrease.
(c) Explain why the mass of the pendulum bob does not affect its time
period.
(d) A mass-spring oscillator is set up so that it has the same time period
as the pendulum. Both are then transported to the Moon. How do
their time periods compare on the Moon? Explain.

CHAPTER
12
Rotational Dynamics
12.0 INTRODUCTION
The circle is a simple geometric shape that is used throughout science to
model cyclic processes and is often the starting point for more complex theo-
ries (e.g., of planetary motion or electron orbits). Simple harmonic motion
can be regarded as a projection of circular motion onto a diameter making the
concept of phase very clear. Rotating vectors, or phasors, are powerful ways to
model oscillations and waves.
12.1 ANGLES
12.1.1 Measuring Angles in Radians
The degree is an arbitrary division of the circle into 360 equal parts. This is a
useful measure of angle but in physics, it is often simpler to work in a different
unit, the radian. The reason for this is that the radian is defined directly from
the geometry of the circle.
Definition: The angle θ at the center of a circle subtended by an arc of
length l is equal to the ratio l / r where r is the radius of the circle:

\[ \theta = \frac{l}{r} \]
For a complete circle $l = 2\pi r$, therefore
\[ \theta_{\text{circle}} = 2\pi \text{ radians} \]

To convert between degrees and radians:
$2\pi$ radians = 360°
$\pi$ radians = 180°
$\pi/2$ radians = 90°
12.1.2 Small Angle Approximations

For small angles, there are some useful approximations that can simplify cal-
culations involving trigonometric functions such as sine and cosine. Consider
a right-angled triangle used to define the trigonometric functions
(a = adjacent side, o = opposite side, h = hypotenuse).
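[Figure: right-angled triangle drawn inside a circle; the hypotenuse h is the radius of the circle and l is the arc subtended by the angle θ.]
\[ \sin\theta = \frac{o}{h} \qquad \cos\theta = \frac{a}{h} \]
As $\theta \to 0$, $o \to l$ and $a \to h$ (the radius), therefore:
\[ \sin\theta \approx \frac{l}{h} = \theta \quad \text{(in radians)} \]
\[ \cos\theta \approx 1 \]
\[ \tan\theta = \frac{\sin\theta}{\cos\theta} \approx \theta \]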
The significance of this is that, for small angles, we can replace the sine or
tangent of the angle with the angle itself (in radians). But how small is small?
Like all approximations, this depends on how precise a value is needed. The
table below shows that the approximations work to better than 1% when the
angle is 0.1 radians and better than 2% for 0.2 radians.

Angle in radians   Sine of angle   Cosine of angle   Tangent of angle   Angle in degrees
1.00               0.841           0.540             1.56               57.3
0.50               0.479           0.878             0.546              28.6
0.30               0.296           0.955             0.309              17.2
0.20               0.199           0.980             0.203              11.5
0.10               0.0998          0.995             0.100              5.73
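The percentage errors behind the table can be checked with a short calculation. The following is a minimal Python sketch; the function name `small_angle_errors` is an illustrative choice, and each error is measured relative to the true value of the function.

```python
import math

def small_angle_errors(theta):
    """Percentage errors of sin(theta) ~ theta, cos(theta) ~ 1, tan(theta) ~ theta.

    theta is in radians; each error is relative to the true function value.
    """
    sin_err = 100 * abs(theta - math.sin(theta)) / math.sin(theta)
    cos_err = 100 * abs(1 - math.cos(theta)) / math.cos(theta)
    tan_err = 100 * abs(theta - math.tan(theta)) / math.tan(theta)
    return sin_err, cos_err, tan_err

# Same angles as the table above.
for theta in (1.00, 0.50, 0.30, 0.20, 0.10):
    s, c, t = small_angle_errors(theta)
    print(f"{theta:4.2f} rad ({math.degrees(theta):5.2f} deg): "
          f"sin {s:5.2f}%  cos {c:5.2f}%  tan {t:5.2f}%")
```

The cosine approximation is the weakest of the three at any given angle because its leading error term is of order θ², whereas the relative errors in the sine and tangent approximations are smaller fractions of θ².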
12.2 DESCRIBING UNIFORM CIRCULAR MOTION

An object moving in uniform circular motion has constant speed and turns
through the same angle every second.
12.2.1 Angular Displacement, Angular Velocity, and Angular Acceleration

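Consider an object of mass m moving in uniform circular motion of radius r
at a constant speed v. There is a linear displacement δs and angular
displacement δθ in time δt.
Angular velocity is defined as the rate of change of angle:
\[ \omega = \frac{d\theta}{dt} \quad (\text{rad s}^{-1}) \]
From the diagram, $\delta s = v\,\delta t$ and $\delta s = r\,\delta\theta$, so:
\[ \delta\theta = \frac{v\,\delta t}{r} \]
Therefore:
\[ \frac{\delta\theta}{\delta t} = \frac{v}{r} \]
Taking $\delta t \to 0$ gives:
\[ \omega = \frac{v}{r} \quad \text{or} \quad v = r\omega \]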
This is a particularly useful relation.
For uniform circular motion ω = constant, so the angular displacement is
$\Delta\theta = \omega\,\Delta t$. If the period of rotation is T then:
\[ 2\pi = \omega T \quad \text{and} \quad \omega = \frac{2\pi}{T} = 2\pi f \]
which relates the angular frequency ω (rad s$^{-1}$) to the rotation frequency
f (Hz) or the time period of rotation T (s).

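The angular acceleration α is defined as the rate of change of angular velocity:
\[ \alpha = \frac{d\omega}{dt} = \frac{d^2\theta}{dt^2} \]
For circular motion the radius is constant so:
\[ \alpha = \frac{d\omega}{dt} = \frac{d}{dt}\left(\frac{v}{r}\right) = \frac{1}{r}\frac{dv}{dt} = \frac{a}{r} \]
where a is the tangential acceleration (not to be confused with the centripetal
acceleration).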
12.3 CENTRIPETAL ACCELERATION AND CENTRIPETAL FORCE

In uniform circular motion, angular velocity and speed are both constant.
Velocity, however, is continually changing. This is because velocity is a vector
quantity and its direction is changing. Acceleration is defined as the rate of
change of velocity, so even though speed is unchanging the object is acceler-
ating. The direction of this acceleration (as we shall show later) is toward the
center of the circle: this is called “centripetal acceleration.” From Newton’s
second law, there must be a resultant force acting toward the center of the
circle. This is called a centripetal force and it is this force that is responsible
for circular motion.
12.3.1 Centripetal Acceleration

Consider a mass m moving in a uniform circular motion at constant angular
velocity ω. The vector diagram on the right shows how the velocity changes
during a short time dt.

For small angles δv in the vector triangle becomes equivalent to an arc of a
circle (shown by the dotted line) so we can write:
\[ \delta\theta \approx \frac{\delta v}{v} \quad \text{as } \delta\theta \to 0 \]
We also have:
\[ \delta\theta = \frac{v\,\delta t}{r} \]
Equating the two expressions for δθ:
\[ \frac{v\,\delta t}{r} = \frac{\delta v}{v} \quad \text{and} \quad a = \frac{\delta v}{\delta t} = \frac{v^2}{r} \]
In the limit of δθ → 0:
\[ a = \frac{dv}{dt} = \frac{v^2}{r} \]
This is an expression for the acceleration.
This can also be written in terms of the angular velocity: $a = r\omega^2$.
It is also clear from the vector diagram that in the limit of δθ → 0 the vector
change in velocity δv becomes perpendicular to both v1 and v2 and is directed
inwards toward the center of the circle. It is a “centripetal” (center-seeking)
acceleration.
Centripetal acceleration: $a = \frac{v^2}{r} = r\omega^2$ toward the center of the circle.
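The limiting argument for centripetal acceleration can be checked numerically. The sketch below is only an illustration: the function names and the values of v and r are assumed, not taken from the text. It forms the vector change in velocity over a short time and divides by that time, for an object that starts at the point (r, 0) moving counterclockwise.

```python
import math

def velocity(v, r, t):
    """Velocity vector at time t for counterclockwise uniform circular motion.

    The object is at (r, 0) when t = 0, so its velocity there is (0, v).
    """
    omega = v / r
    return (-v * math.sin(omega * t), v * math.cos(omega * t))

def average_acceleration(v, r, t, dt):
    """Vector change in velocity between t and t + dt, divided by dt."""
    v1 = velocity(v, r, t)
    v2 = velocity(v, r, t + dt)
    return ((v2[0] - v1[0]) / dt, (v2[1] - v1[1]) / dt)

v, r = 3.0, 2.0  # assumed speed (m/s) and radius (m)
for dt in (0.1, 0.01, 0.001):
    ax, ay = average_acceleration(v, r, 0.0, dt)
    print(f"dt = {dt}: |a| = {math.hypot(ax, ay):.4f}, v^2/r = {v**2 / r:.4f}")
```

As dt shrinks, the magnitude approaches v²/r and the vector points along the negative x-direction, which at the starting point (r, 0) is toward the center of the circle.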
12.3.2 Centripetal Force

Newton’s first law of motion states that an object will continue to move in
a straight line at constant velocity unless acted upon by a resultant external
force. This implies that, when an object changes the direction of motion even
if its speed does not change, there must be a resultant external force. Newton’s
second law relates the resultant force to acceleration by the equation F = ma
so the magnitude of the force acting in uniform circular motion is given by
Centripetal force: $F = ma = \frac{mv^2}{r} = mr\omega^2$ toward the center of the circle.
12.3.3 Centripetal Not Centrifugal

It is a matter of common experience that one feels forced to the outside of
the curve when traveling inside a turning vehicle. This is often explained by
saying that a centrifugal (“center fleeing”) force acts outward from the center
of the turning circle. This force does not exist. It is an apparent or “iner-
tial” force that seems to be needed to make sense of motion in a non-inertial
(accelerating) reference frame. To understand this more clearly consider how

a loose object would behave inside a moving vehicle that is initially moving in
a straight line and then begins to turn in an arc of a circle. The object is free
to slide inside the vehicle and is initially traveling at the same speed as the
vehicle. The upper diagrams show the view from outside the vehicle and the
lower diagrams show the view from inside the vehicle.
[Figure, upper diagrams (view from outside the vehicle): (1) Vehicle and object moving at the same constant velocity v. (2) Vehicle begins to turn but there is no resultant force on the object so it continues to move in a straight line. (3) Vehicle side collides with the object, providing an inward force: the object begins to move in circular motion.]
[Figure, lower diagrams (view from inside the vehicle): (1) Object remains at rest inside the vehicle. (2) Relative to the vehicle the object begins to accelerate toward the side. This is explained by assuming there is an outward, centrifugal force. (3) Object collides with the wall and exerts an outward force on it.]
In the first case (upper diagrams) we are describing the physics from an exter-
nal non-accelerating (inertial) frame of reference. Newton’s laws of motion
apply so the object continues to move in a straight line in the absence of a
resultant force. There is no acceleration or resultant force until the side of the
vehicle has moved inwards and collides with the sliding object.

In the lower case, the apparent acceleration is because the observer is inside
the vehicle and does not take into account his own acceleration. In order to
explain the apparent acceleration of the object he introduces an imaginary
outward force, the centrifugal force.
Centrifugal forces are examples of “inertial forces,” introduced in order to
explain observed physics from a non-inertial (accelerating) frame of refer-
ence. Unlike the centripetal forces in an inertial reference frame, inertial
forces have no physical origin and that is why they are referred to as imagi-
nary. They are helpful if we need to solve physical problems inside a rotating
reference frame but we must always bear in mind that they are an artifact of
our reference frame and do not arise from physical causes.
12.3.4 Moving in Uniform Circular Motion

When an object moves in uniform circular motion, velocity and acceleration
are always perpendicular. This means that force is also perpendicular to
velocity, so there is never a displacement in the direction of the resultant
force and the resultant force does no work on the object. The diagram on the
right shows velocity and acceleration vectors at different positions in the circle.
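[Figure: velocity vectors v (tangential) and acceleration vectors a (toward the center) at several positions around the circle.]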
When you swing a stone in a circle on the end of a string the forces acting on
the stone are directed toward the center of the circle. If the string suddenly
breaks the stone has no resultant force acting on it and flies off along a
tangent. It does not accelerate outwards because there is no centrifugal force;
it moves in a straight line at constant velocity. (Here we have ignored the
effects of other external forces such as gravity.)

[Figure: initial path of the stone, along the tangent at A with velocity v, if the string breaks when the stone is at point A.]
Note that centripetal force is not a new kind of force in physics. It is the
name given to the resultant force on a body moving in uniform circular
motion, and it arises from the real physical forces that act on the body.
In some cases this is simple. For example, the Moon’s orbit around the Earth
is approximately circular and the centripetal force is provided by the gravita-
tional attraction toward the Earth.
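[Figure: the Moon in orbit around the Earth, shown now and at a later time, with the gravitational force $F_{\text{grav}}$ on the Moon directed toward the Earth.]
\[ F_{\text{grav}} = mr\omega^2 = \frac{4\pi^2 mr}{T^2} \]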
T is the Moon’s orbital period, r is its orbital radius and m is its mass.
A more complex situation involves an object moving in a vertical circle in a
uniform vertical gravitational field. Examples might be a person on a funfair
ride, a plane looping the loop, or just a stone on a string. If the motion is
uniform (constant angular velocity) the resultant force stays constant in mag-
nitude but the forces that contribute to it change. The example below shows
a person standing in a capsule on a fairground ride and indicates the forces in
four different positions. The capsule is rotating at constant angular velocity.

Only two forces act on the man. His weight mg is the same in all positions but
the contact force R from the floor of the capsule changes with position. The
resultant force at all points is the vector sum of weight and reaction force and
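must be $F = mr\omega^2$ toward the center of the circle.
[Figure: capsule at four positions around a vertical circle (A at the top, C at the bottom, B and D level with the center), showing the weight mg and the contact force R on the man at each position; g acts vertically downward.]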
Position A:
Weight and contact force act in the same direction so
\[ mr\omega^2 = mg + R_A \]
\[ R_A = mr\omega^2 - mg \]
For one particular angular velocity $\omega_0$, the contact force falls to zero, $R_A = 0$.
This is when $mr\omega_0^2 = mg$, so
\[ \omega_0 = \sqrt{\frac{g}{r}} \]
Under these conditions, the contact force between the man and the floor is
zero and he feels “weightless.” This is only apparent weightlessness because he
is in fact free falling at this moment and the capsule is accelerating downwards

at g. For higher values of angular velocity, there must be a contact force from
the floor. For lower values of angular velocity, he will lose contact with the
floor and begin to fall downwards toward the roof of the capsule. In practice,
this sets a minimum value for the practical angular velocity in such a fair-
ground ride.
Position C:
Weight and contact force act in opposite directions so
\[ mr\omega^2 = R_C - mg \]
\[ R_C = mr\omega^2 + mg \]
The contact force is greater than his weight. He experiences this through the
reaction force from the floor pushing up on his feet. This makes him feel
heavy at the bottom, as if his weight has increased. Once again this is only an
apparent increase in weight since the gravitational forces have not changed.
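The results for the top and bottom of the circle can be put into numbers. The sketch below is illustrative only: the rider's mass, the radius of the ride, and the value of g are assumed, and the function names are hypothetical.

```python
import math

G = 9.81  # gravitational field strength in N/kg (assumed value)

def minimum_angular_velocity(radius):
    """Angular velocity at which the contact force at the top falls to zero."""
    return math.sqrt(G / radius)

def contact_force_top(mass, radius, omega):
    """Contact force at the top of the circle: R = m r omega^2 - m g."""
    return mass * radius * omega**2 - mass * G

def contact_force_bottom(mass, radius, omega):
    """Contact force at the bottom of the circle: R = m r omega^2 + m g."""
    return mass * radius * omega**2 + mass * G

mass, radius = 70.0, 5.0  # assumed rider mass (kg) and ride radius (m)
omega0 = minimum_angular_velocity(radius)
print(f"omega_0 = {omega0:.2f} rad/s")
print(f"R at top    = {contact_force_top(mass, radius, omega0):.1f} N")
print(f"R at bottom = {contact_force_bottom(mass, radius, omega0):.1f} N")
```

At the minimum angular velocity the rider feels weightless at the top and experiences a contact force of twice his weight at the bottom.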
Positions B and D:
In both positions the contact force has a vertical component that balances the
weight and a horizontal component that provides centripetal force. Writing R
for $R_B$ or $R_D$, and taking θ as the angle between the contact force and the
horizontal, the resultant is $mr\omega^2$ and:
\[ R\sin\theta = mg \]
\[ R\cos\theta = mr\omega^2 \]
\[ \tan\theta = \frac{g}{r\omega^2} \]
12.4 CIRCULAR MOTION, SIMPLE HARMONIC MOTION, AND PHASORS

Simple harmonic motion is equivalent to the projection of circular motion
onto a diameter. This can be shown using a phasor, a rotating vector. The

length of the phasor, A, defines the radius of the circle and is equal to the
amplitude of the simple harmonic motion. The constant angular velocity ω
of rotation of the phasor is equal to the angular frequency of the simple har-
monic motion. The angle the phasor turns through is the phase of the simple
harmonic motion.
[Figure: a phasor of length A shown at t = 0, T/4, T/2, and 3T/4, alongside the corresponding graph of displacement against time from t = 0 to t = T.]
Phasors are a mathematical tool that is particularly useful when analyzing
wave superposition.
Component of phasor on x-axis:
\[ x = A\cos\theta \]
For rotation at constant angular velocity:
\[ \theta = \omega t \]
Therefore:
\[ x = A\cos\omega t \]
(the equation for the displacement of a simple harmonic oscillator).
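The equivalence between a rotating phasor and simple harmonic motion can be illustrated by turning a vector through equal angles and reading off its x-component. This is a minimal Python sketch; the helper names, amplitude, and period are assumed for illustration.

```python
import math

def rotate(vec, angle):
    """Rotate a 2D vector counterclockwise through `angle` radians."""
    x, y = vec
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def phasor_track(amplitude, period, steps_per_period):
    """Step a phasor around one full turn.

    Returns a list of (t, x_component_of_phasor, A*cos(omega*t)) tuples.
    """
    omega = 2 * math.pi / period
    dt = period / steps_per_period
    phasor = (amplitude, 0.0)  # phase zero at t = 0
    samples = []
    for n in range(steps_per_period + 1):
        t = n * dt
        samples.append((t, phasor[0], amplitude * math.cos(omega * t)))
        phasor = rotate(phasor, omega * dt)
    return samples

# Assumed values: amplitude 0.05 m, period 2.0 s, eight steps per turn.
for t, x_phasor, x_shm in phasor_track(0.05, 2.0, 8):
    print(f"t = {t:4.2f} s  phasor x = {x_phasor:+.4f}  A cos(wt) = {x_shm:+.4f}")
```

The two columns agree at every step: the projection is zero at t = T/4 and equal to −A at t = T/2, as in the displacement graph.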

12.5 ROTATIONAL KINEMATICS

12.5.1 Equations for Uniform Angular Acceleration
The suvat equations are invaluable when dealing with problems of constant
acceleration. There is a corresponding set of equations for rotation under
constant angular acceleration. Since the underlying definitions of linear and
angular motion are mathematically equivalent all the equations have the same
form: we simply replace linear variables with rotational variables.
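Linear definitions                                          Rotational definitions
Displacement: $s$                                           Angular displacement: $\theta$
Velocity: $v = \frac{ds}{dt}$                               Angular velocity: $\omega = \frac{d\theta}{dt}$
Acceleration: $a = \frac{dv}{dt} = \frac{d^2 s}{dt^2}$      Angular acceleration: $\alpha = \frac{d\omega}{dt} = \frac{d^2\theta}{dt^2}$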
This allows us to use the following analogy:

s → θ
u → $\omega_i$
v → $\omega_f$
a → α
t → t
to generate corresponding rotational equations for uniform angular
acceleration:
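\[ v = u + at \;\to\; \omega_f = \omega_i + \alpha t \]
\[ s = \frac{(u + v)\,t}{2} \;\to\; \theta = \frac{(\omega_i + \omega_f)\,t}{2} \]
\[ s = ut + \tfrac{1}{2}at^2 \;\to\; \theta = \omega_i t + \tfrac{1}{2}\alpha t^2 \]
\[ s = vt - \tfrac{1}{2}at^2 \;\to\; \theta = \omega_f t - \tfrac{1}{2}\alpha t^2 \]
\[ v^2 = u^2 + 2as \;\to\; \omega_f^2 = \omega_i^2 + 2\alpha\theta \]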
These are used in exactly the same way as the original suvat equations.
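As a short illustration, the rotational equations can be applied to a wheel that starts from rest and has a constant angular acceleration. The numbers and function names below are assumed for the example.

```python
def final_angular_velocity(omega_i, alpha, t):
    """omega_f = omega_i + alpha * t"""
    return omega_i + alpha * t

def angular_displacement(omega_i, alpha, t):
    """theta = omega_i * t + (1/2) * alpha * t^2"""
    return omega_i * t + 0.5 * alpha * t**2

# Assumed example: a wheel starting from rest with alpha = 2.0 rad/s^2 for 5.0 s.
omega_i, alpha, t = 0.0, 2.0, 5.0
omega_f = final_angular_velocity(omega_i, alpha, t)
theta = angular_displacement(omega_i, alpha, t)
print(f"omega_f = {omega_f} rad/s, theta = {theta} rad")

# Cross-check against omega_f^2 = omega_i^2 + 2 * alpha * theta.
print(omega_f**2, omega_i**2 + 2 * alpha * theta)
```

The wheel reaches 10 rad/s after turning through 25 rad, and both sides of the final cross-check equal 100.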

12.5.2 Rotational Kinetic Energy

When a rigid body rotates about a fixed axis through its center of mass it has
no translational kinetic energy. However, every point mass inside the body
has a velocity and therefore has its own kinetic energy. The sum of all of these
kinetic energies is equal to the rotational kinetic energy of the body. When a
body is rolling it has both translational kinetic energy (because its center of
mass is moving) and rotational kinetic energy (because it is rotating about its
center of mass). The total kinetic energy is equal to the sum of translational
and rotational kinetic energies.
The diagram below shows a rigid body rotating about its center of mass (CM).
It consists of N particles and the ith particle has mass mi and is at a distance
ri from the axis.
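[Figure: rigid body rotating about its center of mass (CM); the ith particle has mass $m_i$, is at distance $r_i$ from the axis, and has velocity $v_i$.]
The kinetic energy of the ith particle is:
\[ ke_i = \tfrac{1}{2} m_i v_i^2 = \tfrac{1}{2} m_i r_i^2 \omega^2 \]
The rotational kinetic energy of the body is:
\[ RKE = \sum_{i=1}^{N} ke_i = \sum_{i=1}^{N} \tfrac{1}{2} m_i r_i^2 \omega^2 \]
The angular velocity is the same for all particles so:
\[ RKE = \tfrac{1}{2} \left( \sum_{i=1}^{N} m_i r_i^2 \right) \omega^2 \]
The term in brackets is called the “moment of inertia” I of the body:
\[ I = \sum_{i=1}^{N} m_i r_i^2 \]
The SI unit for a moment of inertia is kg m$^2$.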

Moment of inertia in rotational dynamics plays an analogous role to mass

(inertia) in linear mechanics. However, unlike mass, the moment of inertia
is not a fixed quantity for a body because it depends on both the mass and its
distribution about the axis of rotation.
Using the moment of inertia we can extend the analogy between linear and
rotational motion:
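Linear kinetic energy = $\tfrac{1}{2}mv^2$ → Rotational kinetic energy = $\tfrac{1}{2}I\omega^2$
[Figure: cylinder rolling down an inclined plane through a vertical height h.]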

Here is an example that involves both translational and rotational kinetic

energy: a cylinder of radius r, mass m, and moment of inertia I rolling without
slipping down an inclined plane so that its center of mass drops through a
vertical height h.
Neglecting energy losses due to friction, the gain in total kinetic energy must
equal the loss of gravitational potential energy. If the cylinder starts from rest
then its final kinetic energy will be (using ω = v/r):
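\[ mgh = \text{translational KE} + \text{rotational KE} = \tfrac{1}{2}mv^2 + \tfrac{1}{2}I\omega^2 = \tfrac{1}{2}mv^2 + \tfrac{1}{2}\frac{Iv^2}{r^2} \]
This can be rearranged to give the linear velocity at the bottom:
\[ v = \sqrt{\frac{2mgh}{m + \dfrac{I}{r^2}}} \]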
The larger the moment of inertia, the lower the final linear velocity. This is
because a larger fraction of the energy goes into rotation. For the same mass
and radius, a solid cylinder has a larger moment of inertia than a solid sphere
because more of its mass is farther from the axis. If a cylinder and a ball are
rolled down the same slope side by side then the ball will reach the bottom first.

If the surface is completely smooth and the object slides without rolling then
there is no rotational kinetic energy and the expression above reduces to:
\[ v = \sqrt{2gh} \]
This shows that a block sliding down a smooth slope will always get to the
bottom faster than any round object rolling down the same slope.
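The comparison between sliding and rolling objects can be made concrete by writing the moment of inertia as I = kmr², where k is a shape factor. Substituting into the expression for v gives v = √(2gh / (1 + k)), independent of mass and radius. The sketch below uses k = 2/5 for a solid sphere and k = 1/2 for a solid cylinder (the results derived in Section 12.6); the height, the value of g, and the function name are assumed.

```python
import math

G = 9.81  # gravitational field strength in N/kg (assumed value)

def speed_at_bottom(height, shape_factor):
    """Final speed after descending `height`, for a body with I = k * m * r^2.

    shape_factor = 0 corresponds to sliding without rolling.
    """
    return math.sqrt(2 * G * height / (1 + shape_factor))

height = 0.50  # assumed vertical drop in metres
for name, k in (("sliding block", 0.0),
                ("solid sphere", 2 / 5),
                ("solid cylinder", 1 / 2)):
    print(f"{name:15s} k = {k:.2f}  v = {speed_at_bottom(height, k):.2f} m/s")
```

The ordering of the three speeds does not depend on the height chosen, since the factor √(2gh) is common to all of them.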
12.5.3 Angular Momentum

The angular momentum of a point mass is defined as the moment of its
momentum about a point. That is the linear momentum multiplied by its per-
pendicular distance from the point considered:
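Magnitude of linear momentum: $p = mv$
Magnitude of angular momentum: $l = mvr$
S.I. unit for angular momentum: kg m$^2$ s$^{-1}$ or J s
[Figure: point mass m moving with velocity v at perpendicular distance r from the point considered.]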
Angular momentum is actually the vector cross product of the displacement

vector r and the linear momentum p. This is written l = r × p. The direction
of angular momentum is related to the axis of rotation. For 2D problems, we
need only consider whether the direction is clockwise or counterclockwise.
The angular momentum of a rigid body can be found by summing the indi-
vidual contributions from each point particle inside the body (in the same way
that we derived a formula for rotational kinetic energy).
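[Figure: rigid body rotating about its center of mass (CM); the ith particle has mass $m_i$, distance $r_i$ from the axis, and velocity $v_i$.]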

The angular momentum of the ith particle is:
\[ l_i = m_i v_i r_i = m_i r_i^2 \omega \]
The angular momentum of the whole body is therefore:
\[ L = \sum_{i=1}^{N} l_i = \sum_{i=1}^{N} m_i r_i^2 \omega \]
The angular velocity is the same for all particles so:
\[ L = \left( \sum_{i=1}^{N} m_i r_i^2 \right) \omega = I\omega \]
Notice that the “moment of inertia” I of the body, $\sum_{i=1}^{N} m_i r_i^2$, once again plays
a role analogous to mass in linear mechanics:
Linear momentum = mv → Angular momentum = Iω
12.5.4 The Second Law of Motion for Rotation

In linear mechanics, resultant force is equal to the rate of change of linear
momentum. An analogous relation exists in rotational motion and depends on
the torque or moment of a force.
Torque (moment of a force)
Torque = force × perpendicular distance from pivot
\[ \Gamma = Fr \]
S.I. unit for torque: N m
[Figure: force F applied at perpendicular distance r from the pivot P.]
(Mathematically torque is the vector cross product of the displacement vector

from the pivot to the point of application of the force and force.)
When a resultant torque is applied to a rigid body each point particle inside
the body experiences a resultant force that causes a linear acceleration. The

sum of torques from these individual forces must equal the total resultant
torque on the body.
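[Figure: rigid body rotating about its center of mass (CM); the ith particle has mass $m_i$, is at distance $r_i$ from the axis, and experiences a force $f_i$.]
For the ith particle: $f_i = m_i a_i = m_i r_i \alpha$ (using $a = r\alpha$)
The resultant torque is therefore:
\[ \Gamma = \sum_{i=1}^{N} f_i r_i = \sum_{i=1}^{N} m_i a_i r_i = \sum_{i=1}^{N} m_i r_i^2 \alpha = I\alpha \]
This has the same form as Newton’s second law in linear mechanics:
Linear case: F = ma → Rotational case: Γ = Iα

We can use $\alpha = \frac{d\omega}{dt}$ to relate this to angular momentum.
Linear case: $F = \frac{d(mv)}{dt}$ → Rotational case: $\Gamma = \frac{d(I\omega)}{dt}$
We can now state Newton’s laws of motion in a form that applies to rotational
motion.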
Newton’s first law for rotational motion.
An object continues to rotate with constant angular momentum until
acted upon by a resultant external torque.
Newton’s second law for rotational motion.
The resultant external torque on a system is equal to the rate of change of
angular momentum of that system.
Newton’s third law for rotation.
When object A exerts a torque on object B, B exerts an equal but opposite
torque on A.

12.5.5 Conservation of Angular Momentum

Newton’s laws of motion for rotation show that angular momentum can only
change as a result of an external resultant torque:
\[ \Gamma = \frac{d(I\omega)}{dt} \quad \text{so} \quad \Delta(I\omega) = \int \Gamma\,dt \]
that is, change of angular momentum = angular impulse
They also show that when systems interact the torques experienced by each
system are equal and opposite so while they interact they exert equal and
opposite angular impulses on one another. Consequently, any change of
angular momentum of system A is equal and opposite to the change of angular
momentum of system B and the total angular momentum of the combined system
does not change. This can be stated as the law of conservation of angular momentum:
The angular momentum of an isolated system (no external resultant
torque) is constant.
An interesting example of this is when an ice skater pirouettes. She balances
on the point of one skate while rotating relatively slowly with arms and one leg
outstretched. Gradually she draws her arms and leg closer to the axis of rota-
tion, thus reducing her moment of inertia. There is no external torque so her
angular momentum (L = Iω) cannot change. If I falls ω must increase, so she
spins faster. Another similar example is the collapse of the core of a massive
star at the end of its life. Its radius reduces by many orders of magnitude so
the rotation rate can be very high – some neutron stars have rotation periods
of less than one millisecond.
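The skater example can be put into numbers using L = Iω with L held constant. The moments of inertia and initial spin rate below are assumed values chosen only to illustrate the calculation; they are not measurements.

```python
def angular_velocity_after(inertia_before, omega_before, inertia_after):
    """New angular velocity when I changes and no external torque acts.

    Uses conservation of angular momentum: I1 * omega1 = I2 * omega2.
    """
    return inertia_before * omega_before / inertia_after

# Assumed values: arms outstretched, then drawn in.
inertia_out, omega_out = 4.0, 2.0   # kg m^2, rad/s
inertia_in = 1.6                    # kg m^2

omega_in = angular_velocity_after(inertia_out, omega_out, inertia_in)
print(f"Angular velocity with arms drawn in: {omega_in:.1f} rad/s")
print(f"Angular momentum before: {inertia_out * omega_out:.1f} kg m^2/s")
print(f"Angular momentum after:  {inertia_in * omega_in:.1f} kg m^2/s")
```

Reducing the moment of inertia by a factor of 2.5 raises the angular velocity by the same factor, while the angular momentum is unchanged.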
12.6 DERIVING EXPRESSIONS FOR MOMENTS OF INERTIA

Moment of inertia is defined by the equation $I = \sum_{i=1}^{N} m_i r_i^2$ and we can use
this to derive useful expressions for the moments of inertia of a range of
standard mass distributions.
12.6.1 Moment of Inertia of One or More Point Masses

For a single point mass m at a distance r from the center of rotation P:
\[ I = mr^2 \]
For several point masses at various distances from the center of rotation:
\[ I = \sum_{i=1}^{N} m_i r_i^2 = m_1 r_1^2 + m_2 r_2^2 + m_3 r_3^2 \]
12.6.2 Moment of Inertia of a Rod

For a continuous mass distribution, we need to convert the sum into an
integral. For a thin rod lying along the x-axis, this can be done by considering
a short section of length δx.
[Figure: thin rod of length l pivoted at P, with a short section of length δx at distance x from P.]
Let the total mass of the rod be m. The mass of the short section of length δx
at distance x from the center of rotation is $\delta m = \frac{m\,\delta x}{l}$ and the moment of
inertia of this section (treated as a point mass) is $\delta I = \frac{m x^2\,\delta x}{l}$.

The moment of inertia of the rod about one end is therefore:
\[ I_{\text{rod, end}} = \sum_{i=1}^{N} m_i r_i^2 = \sum_{x=0}^{x=l} \delta I = \int_{x=0}^{x=l} \frac{m x^2}{l}\,dx = \frac{1}{3} m l^2 \]
A similar approach can be used to determine the moment of inertia about any
point in the rod. For example, the moment of inertia about the center of the
rod would be found by taking x = 0 at the center and then integrating from
− l/2 to + l/2:
\[ I_{\text{rod, CM}} = \sum_{i=1}^{N} m_i r_i^2 = \sum_{x=-l/2}^{x=+l/2} \delta I = \int_{x=-l/2}^{x=+l/2} \frac{m x^2}{l}\,dx = \frac{1}{12} m l^2 \]
This moment of inertia is smaller than the moment of inertia about one end
because more of the mass is now closer to the axis of rotation.
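The step from a sum over point masses to an integral can be illustrated by adding up the contributions of many short slices numerically. The sketch below is a simple midpoint-rule sum; the function name, the mass, and the length are assumed for illustration.

```python
def rod_moment_of_inertia(mass, length, pivot, n=100_000):
    """Sum (m / l) * x^2 * dx over n thin slices of a uniform rod.

    `pivot` is the distance of the rotation axis from one end of the rod;
    x is measured from the pivot to the middle of each slice.
    """
    dx = length / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx - pivot
        total += (mass / length) * x**2 * dx
    return total

mass, length = 1.2, 0.50  # assumed values in kg and m
print("about one end:", rod_moment_of_inertia(mass, length, 0.0),
      "expected", mass * length**2 / 3)
print("about center: ", rod_moment_of_inertia(mass, length, length / 2),
      "expected", mass * length**2 / 12)
```

With enough slices the sums agree with ml²/3 and ml²/12 to many significant figures, and the value about one end is four times the value about the center.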
You can get a qualitative “feel” for a moment of inertia by taking a rod (e.g.,
ruler) holding it in the center, and trying to rotate it back and forth. You will
feel resistance to rotation. Now do the same thing while holding it near one
end. The resistance to rotation has increased significantly – the moment of
inertia is larger.
12.6.3 Moment of Inertia of a Cylindrical Shell and a Uniform Cylinder

A cylindrical shell or ring with a thickness much less than its radius rotating
about its center has all of its mass at a constant distance from the center of
rotation. Its moment of inertia is therefore simply $I = \delta m\,r^2$ where δm is the
mass of the shell. We can find the moment of inertia of a disc or cylinder by
integrating over all of the thin cylindrical shells that make it up.
[Figure: a solid cylinder of radius r is made up of an infinite number of infinitely thin shells.]
The diagram below shows an end view of a solid cylinder
of total mass m and radius r. A thin shell at radius x is shown.

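[Figure: end view of the cylinder showing a thin cylindrical shell of thickness δx and mass δm at radius x.]
The moment of inertia of a cylindrical shell is $\delta I = \delta m\,x^2$ and the mass of the
shell is a fraction of the mass of the cylinder given by
\[ \delta m = \frac{2\pi x\,m\,\delta x}{\pi r^2} = \frac{2 x m\,\delta x}{r^2} \]
This is because the area of the shaded strip above is effectively $2\pi x\,\delta x$ if treated
like a long thin rectangle.
The moment of inertia of the entire cylinder is then:
\[ I_{\text{cylinder}} = \sum_{x=0}^{x=r} \delta I = \int_{x=0}^{x=r} \frac{2 m x^3}{r^2}\,dx = \frac{1}{2} m r^2 \]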
12.6.4 Moment of Inertia of a Uniform Sphere

A uniform sphere of mass m and radius r can be considered to be made up of
an infinite number of thin discs of varying radius. The total moment of inertia
is the sum of moments of inertia of all such discs.
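[Figure: sphere of radius r with a thin disc of thickness δx at a distance x above the center; θ is the angle between the rotation axis and the radius drawn to the rim of the disc.]
A thin disc of thickness δx at a distance $x = r\cos\theta$ above the center of the
sphere has mass δm and radius $r\sin\theta$. Its moment of inertia about the
rotation axis is that of a uniform disc:
\[ \delta I = \tfrac{1}{2} (r\sin\theta)^2\,\delta m \]
The moment of inertia of the sphere is the sum over all such discs:
\[ I_{\text{sphere}} = \int \tfrac{1}{2} (r\sin\theta)^2\,dm \]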

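In order to integrate this we must express δm in terms of δθ. It simplifies
things if we introduce the density ρ of the material of the sphere at this stage:
\[ \delta m = \rho\pi r^2 \sin^2\theta\,\delta x \]
However, x and θ are related by $x = r\cos\theta$, so
\[ \frac{dx}{d\theta} = -r\sin\theta \]
and we can replace δx by $-r\sin\theta\,\delta\theta$. As x runs from −r to +r, θ runs from π
to 0; reversing the limits absorbs the minus sign:
\[ I_{\text{sphere}} = \int_{0}^{\pi} \tfrac{1}{2} \rho\pi r^5 \sin^5\theta\,d\theta \]
This can be integrated using standard techniques to give:
\[ I_{\text{sphere}} = \frac{8\rho\pi r^5}{15} \]
Now replace the density with $\rho = \frac{3m}{4\pi r^3}$ to give:
\[ I_{\text{sphere}} = \frac{2}{5} m r^2 \]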
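The result for the sphere can be checked numerically by summing the integrand ½ρπr⁵ sin⁵θ over θ from 0 to π. The sketch below uses a midpoint-rule sum; the function name and the values of mass and radius are assumed for illustration.

```python
import math

def sphere_moment_of_inertia(mass, radius, n=100_000):
    """Midpoint-rule sum of (1/2) * rho * pi * r^5 * sin^5(theta) over 0..pi."""
    rho = 3 * mass / (4 * math.pi * radius**3)  # density of a uniform sphere
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += 0.5 * rho * math.pi * radius**5 * math.sin(theta)**5 * dtheta
    return total

mass, radius = 2.0, 0.50  # assumed values in kg and m
print("numerical:", sphere_moment_of_inertia(mass, radius))
print("(2/5) m r^2:", 0.4 * mass * radius**2)
```

The sum reproduces (2/5)mr², which confirms that the integral of sin⁵θ from 0 to π equals 16/15.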
12.7 TORQUE, WORK, AND POWER

It should come as no surprise that we can extend the analogy between linear
and rotational motion to include work and power. This can be shown using
the same summation techniques as we used in Sections 12.5.2 and 12.5.3. The
important results are:
Work: W = Fs (constant force) → W = Γθ (constant torque)
\[ W = \int F\,ds \;\to\; W = \int \Gamma\,d\theta \]
Power: P = Fv → P = Γω
12.8 ROTATIONAL OSCILLATIONS, THE COMPOUND PENDULUM

A compound pendulum consists of an extended rigid body pivoted at one
point and undergoing regular periodic oscillations. In order to analyze these

oscillations we need to consider the resultant torque when the pendulum is

displaced through a small angle from its equilibrium position.
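The center of mass CM is a distance h below the pivot P.
[Figure: compound pendulum at equilibrium (left) and displaced through angle θ (right); the weight mg acts at the center of mass CM, a distance h from the pivot P.]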
The right-hand diagram above shows how the weight produces a restoring
torque Γ = − mgh sin θ (negative sign indicates that this is directed toward
equilibrium) when the body is displaced through angle θ.
From Newton’s second law:
\[ \Gamma = I\frac{d\omega}{dt} = I\frac{d^2\theta}{dt^2} \]
The equation of motion for this compound pendulum is therefore:
\[ \frac{d^2\theta}{dt^2} = -\frac{mgh\sin\theta}{I} \]
For small angles sin θ → θ (in radians), so we can write (approximately):
\[ \frac{d^2\theta}{dt^2} = -\frac{mgh}{I}\,\theta \]
This has the same mathematical form as the equation of motion for simple
harmonic motion
\[ \frac{d^2 x}{dt^2} = -\omega^2 x \]
so it will have sinusoidal solutions of the same kind.

The compound pendulum undergoes simple harmonic angular oscillations

(for small angles). Its period and frequency can be found using:
\[ \omega^2 = \frac{mgh}{I} \]
giving:
\[ T = 2\pi\sqrt{\frac{I}{mgh}} \quad \text{and} \quad f = \frac{1}{2\pi}\sqrt{\frac{mgh}{I}} \]
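As an illustration, the period formula can be applied to a uniform rod pivoted at one end, for which I = ml²/3 (Section 12.6.2) and the center of mass is a distance h = l/2 below the pivot. The rod's mass and length, the value of g, and the function name are assumed for this sketch.

```python
import math

G = 9.81  # gravitational field strength in N/kg (assumed value)

def compound_pendulum_period(moment_of_inertia, mass, h):
    """Small-angle period T = 2 * pi * sqrt(I / (m * g * h)).

    `moment_of_inertia` is taken about the pivot; `h` is the distance
    from the pivot to the center of mass.
    """
    return 2 * math.pi * math.sqrt(moment_of_inertia / (mass * G * h))

# Assumed example: uniform rod of mass 0.20 kg and length 1.0 m, pivoted at one end.
mass, length = 0.20, 1.0
inertia_about_end = mass * length**2 / 3
period_rod = compound_pendulum_period(inertia_about_end, mass, length / 2)
print(f"Rod pivoted at one end: T = {period_rod:.3f} s")

# Check: a point mass at distance h (I = m h^2) behaves as a simple pendulum.
period_point = compound_pendulum_period(mass * length**2, mass, length)
print(f"Point mass at 1.0 m:    T = {period_point:.3f} s")
```

The mass cancels in both cases. The rod swings with the same period as a simple pendulum of two-thirds its length, so it is faster than a point mass at the rod's full length.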
12.9 EXERCISES
1. (a) Use Newton’s laws of motion and a suitable diagram to show that for
an object to move in uniform circular motion there must be a result-
ant force acting toward the center of the circle.
(b) Explain why a centripetal force cannot do work on an object moving
in circular motion.
2. A small ball of mass m is released from a height h on a track that leads to
a looping section as shown below.
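[Figure: track descending from height h into a circular loop of radius r, with points A, B, and C marked.]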
(a) Derive an expression in terms of r for the minimum height h from

which the ball can be released if it is to complete the loop without
losing contact with the track. Ignore the rotational motion of the ball
and assume friction is negligible.

(b) Draw free-body diagrams to show the forces acting on the ball at
points A, B, and C.
(c) Discuss whether the resultant force on the ball is toward the center
of the circle at all points as it completes the loop.
(d) Derive an expression for the velocity of the ball at point C.
3. A car tire of radius r is rolling along a flat horizontal surface at constant
speed v. Copy the diagram below and add labeled arrows to show the
velocity and acceleration of a particle fixed to the tire at each of the
points A to D.
[Figure: rolling tire with points A to D marked.]
4. A vinyl record rotates at 33 revolutions per minute. Its diameter is 30 cm
and its mass is 200 g.
(a) Calculate the time period of the rotation.
(b) Calculate the frequency of the rotation.
(c) Calculate the angular velocity of the record.
(d) Calculate the angular momentum of the record.
(e) Calculate the average torque needed to accelerate the record to its
playing speed in 0.50 s.
(f) Calculate the rotational kinetic energy of the record.
5. Derive an expression for the moment of inertia of a uniform rod of length l
rotating about an axis perpendicular to the rod and through a point one-
third of the way along the rod.

6. A space station consists of two spherical accommodation pods of radius
20 m separated by a connecting tunnel of length 400 m. The mass of each
pod is 50,000 kg and the mass of the connecting tunnel is 80,000 kg. The
space station is rotating about its center of mass.
[Figure: two pods, each 40 m in diameter, joined by a 400 m connecting tunnel.]
(a) Estimate the moment of inertia of the space station about its rota-
tion axis by treating the pods as point masses concentrated at their
centers of mass and by treating the connecting tunnel as a rod. The
moment of inertia of a rod of mass m and length l about its CM is
given by $I = \frac{1}{12} m l^2$.
(b) Explain why an astronaut standing on the outer edge of a pod would
experience artificial gravity.
(c) Calculate the angular velocity of rotation that will create an effect of
artificial gravity of strength 9.8 N kg$^{-1}$ at the outer edge of the pods.
(d) Calculate the rotational kinetic energy of the space station when its
angular velocity is 0.20 rad s$^{-1}$.
7. A small flywheel has a moment of inertia of $2.0 \times 10^{-3}$ kg m$^2$ and it is
rotating at 50 revolutions per second. The frictional torque working against
rotation is $2.4 \times 10^{-2}$ N m.
(a) Calculate the angular momentum of the flywheel including appro-
priate units.
(b) Calculate the time taken for the flywheel to come to rest.
8. A group of children is sitting on the outside of a merry-go-round that is
rotating at a constant rate. They all move toward the center of the merry-
go-round and it speeds up. Explain why this occurs and discuss the angu-
lar momentum and energy changes that take place in this system. Assume
that external torques can be ignored.

9. The Earth’s rotation is slowing because of its tidal interaction with the
Moon. This causes the day length to gradually increase. Day length
increases by 1 second roughly every 18 months.
(a) Use this information and the data below to calculate the torque act-
ing on the Earth to slow down its rotation. Assume the Earth is of
uniform density.
Mass of Earth = 6.0 × 10²⁴ kg. Radius of Earth = 6400 km
Number of seconds in a day = 86400
Moment of inertia of a solid sphere of mass m and radius a is
$I = \frac{2}{5} m a^2$
(b) The Earth is not uniform: its density increases toward its center.
State and explain how this would affect your answer to (a).
CHAPTER
13
Waves
13.0 INTRODUCTION
There are many different types of wave but the underlying physics is com-
mon to all of them, whether they are mechanical vibrations in a medium, like
sound, or vibrations of an electromagnetic field, like light. They all transfer
energy, reflect, refract, diffract, and interfere. The wave model is one of the
most important models in physics and in the next four chapters we will inves-
tigate waves in considerable detail.
13.1 DESCRIBING AND REPRESENTING WAVES

13.1.1 Basic Wave Terminology
Imagine starting circular water waves using a dipper oscillating up and down
at the center of a pond. A series of circular, equally spaced crests moves
outwards. The source of these waves is the oscillation of water at the center.
As those water molecules move up and down, they exert forces on the adjacent
molecules, pulling them up and down slightly later. The adjacent molecules
therefore oscillate with the same frequency and amplitude but with a small
phase delay relative to the source. The phase delay increases with distance
from the source and when this delay is 2π radians the molecules are once again
oscillating in phase with the source. Around any circle centered on the source,
all of the molecules oscillate in phase with one another because they are all
equidistant from the source. The visible circles are formed by crests, where all
of the oscillations are simultaneously at their maximum positive displacement.
These positions move outwards at constant velocity.
This is called the phase velocity of the wave. It is the movement of the wave
disturbance but not the outward movement of matter. The diagram below
shows how this appears from above. The circular lines of constant phase are
called wave fronts and the arrows are called rays. Rays and wave fronts are
perpendicular to one another and are alternative ways to represent the wave
pattern. The separation between two adjacent wave fronts is called the wave-
length λ of the wave.
[Figure: circular wave fronts seen from above, with rays drawn perpendicular to them]
The time period T for the formation of one wave is equal to the time period of
the source oscillations so all particles in the wave oscillate with the same fre-
quency f as the source. During one oscillation of the source, the wave moves
forward a distance equal to its wavelength so the phase velocity v of the wave
is given by:

$v = f\lambda = \frac{\lambda}{T}$
We can now define some terms:
Traveling or Progressive wave
A wave where all the particles oscillate with the same (or a decaying)
amplitude but with a progressive phase delay in direct proportion to their
distance from the source.
Wavelength λ
Shortest distance between two particles oscillating in phase in the wave.
Time period T
Time for one complete wave to leave the source or time for a particle to
complete one cycle of oscillation.
Frequency f
Number of waves leaving the source in 1 second: $f = \frac{1}{T}$.
Phase velocity v
Velocity at which a point of constant phase (e.g., a wave crest) travels away
from the source:
$v = f\lambda$
Wave front
Line of constant phase in the wave pattern. Perpendicular to rays.
Ray
Arrow in the direction of energy transfer. Perpendicular to wave fronts.
13.1.2 Transverse and Longitudinal Waves

The source of a wave is a vibration or oscillation. There are two distinct ways
in which the oscillations can be related to the direction of energy transfer
(direction of travel of the wave):
Transverse Waves
Vibration directions are perpendicular to the direction of energy transfer.
[Figure: transverse wave of amplitude A moving with phase velocity v; the direction of oscillations is perpendicular to the direction of energy transfer]

As the wave moves to the right, the disturbance at each position varies verti-
cally with the same amplitude as the source (assuming no energy dissipation).
All electromagnetic waves are transverse, as are seismic S-waves (secondary
or shear waves).
Longitudinal Waves
Vibration directions are parallel to the direction of energy transfer. This
results in regions of compression (shown as dark areas below) and rarefaction
(low density, shown as lighter areas below) in the medium through which the
wave passes.
[Figure: longitudinal wave showing alternating compressions and rarefactions, the oscillation direction, the wavelength, and the phase velocity v]
As the wave moves to the right, the disturbance at each position varies
horizontally, with the same amplitude as the source (assuming no energy
dissipation). Sound and ultrasound are longitudinal waves, as are seismic
P-waves (primary or pressure waves).
Some waves are a combination of longitudinal and transverse waves. This
results in particles undergoing elliptical motions as the wave passes. Surface
water waves are like this.
[Figure: surface water wave, showing the direction of motion of the water wave and the longitudinal and transverse components of the particle motion]
13.1.3 Graphs of Wave Motion

For a one-dimensional wave, for example, a wave traveling along the x-axis,
we can plot a graph of how the wave disturbance (y) varies with position (x)
at any fixed moment of time or of how the disturbance varies with time at any
particular position.
[Figure: two graphs of wave disturbance (e.g. displacement): one versus position x at one time, and one versus time t at one position, with the amplitude A and the time period T marked]
These graphs are similar to one another but represent different things. The
repeat distance in space is the wavelength and the repeat distance in time is
the time period of the wave.
13.1.4 Equation for a One-Dimensional Traveling Wave

The displacement of a one-dimensional wave varies with both time and posi-
tion, so the equation to represent the whole wave will be a function of these
two variables. For a traveling wave, all points oscillate with the same ampli-
tude but with a progressive phase delay with respect to the source. If the
source oscillates with simple harmonic motion, then the displacement at the
source (x = 0) can be written as:
$y = A\cos(\omega t)$
where ω = 2πf.
The oscillation at any other point on the positive x-axis will be given by an
equation of the form:
$y = A\cos(\omega t - \delta)$
where δ is a phase delay that depends on x (distance from the source). When
x = λ the phase delay is 2π (two particles separated by one wavelength
oscillate in phase). Therefore $\delta = \frac{2\pi x}{\lambda}$ and so:
$y = A\cos\left(\omega t - \frac{2\pi x}{\lambda}\right)$
The term $\frac{2\pi}{\lambda}$ is called the wavenumber k so we can write the
equation for a 1D traveling wave of amplitude A moving in the positive
x-direction in the simple form:
$y = A\cos(\omega t - kx)$
If the wave is moving in the negative x-direction the sign in the equation
changes:
$y = A\cos(\omega t + kx)$
Derivation of Phase Velocity

The phase velocity is the velocity at which a point of constant phase, for exam-
ple, a wave crest, moves along the axis. The phase of the wave is given by:
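$\phi = \omega t - kx$
$\frac{d\phi}{dt} = \omega - k\frac{dx}{dt} = \omega - kv = 0$ (for the position of constant phase).
$v = \frac{\omega}{k} = \frac{2\pi f}{2\pi/\lambda} = f\lambda$
$v = f\lambda$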
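The traveling-wave equation and the phase velocity result can be checked numerically. The Python sketch below evaluates y = A cos(ωt − kx) and confirms that a crest leaving the source at t = 0 is found a distance vt away after a time t, where v = ω/k = fλ. The function names and the values chosen for A, f, and λ are illustrative assumptions, not values from the text.

```python
import math

def travelling_wave(x, t, amplitude, frequency, wavelength):
    """Displacement y = A cos(omega*t - k*x) for a wave moving in the +x direction."""
    omega = 2 * math.pi * frequency   # angular frequency in rad/s
    k = 2 * math.pi / wavelength      # wavenumber in rad/m
    return amplitude * math.cos(omega * t - k * x)

def phase_velocity(frequency, wavelength):
    """Speed of a point of constant phase: v = omega / k, which equals f * lambda."""
    omega = 2 * math.pi * frequency
    k = 2 * math.pi / wavelength
    return omega / k

# Illustrative values (assumed): amplitude in m, frequency in Hz, wavelength in m
A, f, lam = 0.10, 2.0, 1.5
v = phase_velocity(f, lam)

# Follow the crest that leaves the source (x = 0) at t = 0.
t = 0.3
y_at_crest = travelling_wave(v * t, t, A, f, lam)
print(f"v = {v:.2f} m/s, displacement at x = v*t: {y_at_crest:.3f} m")
```

The displacement at x = vt equals the amplitude, which is what we expect if the crest has moved with the phase velocity.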
13.1.5 Amplitude and Intensity

The intensity of a wave is the power delivered per unit area (W m⁻²). This will
be proportional to the energy of each oscillator in the wave. For a harmonic
wave, the disturbance at each point is simple harmonic, so the total energy of
each oscillator is given by an expression of the form:
$E = \frac{1}{2} m \omega^2 A^2$
It follows that the intensity of a harmonic wave is directly proportional to the

square of its amplitude. This is a general and very useful relation:
$I \propto A^2$
Doubling amplitude increases intensity by a factor of four, etc.
13.2 REFLECTION
When a wave strikes a boundary, it can be wholly or partially reflected. The
law of reflection states that:
The incident and reflected rays make equal angles to the normal to the
surface at the point of incidence and both rays and the normal lie in the
same plane.
[Figure: incident and reflected rays making equal angles i and r with the normal]
If two plane reflectors are placed at 90° to one another then a ray striking
either one of them will, after reflecting from both, return parallel to its
original path. This is used in car reflectors and was used by the Apollo
astronauts, who left an array of corner reflectors on the Moon’s surface so that
lasers sent from Earth would reflect back and allow the distance between the
Earth and Moon to be measured precisely.

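[Figure: ray reflecting from two perpendicular mirrors, with the angles i1, r1, i2, and r2 marked]
$i_1 = r_1$
$i_2 = r_2 = 90^\circ - i_1$
The original incident ray has turned through a total angle of:
$(180^\circ - (i_1 + r_1)) + (180^\circ - (i_2 + r_2)) = 360^\circ - 2i_1 - 2i_2 = 180^\circ$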
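The 180° turn can also be checked with vectors. The sketch below reflects a two-dimensional ray direction from two perpendicular mirrors in turn and shows that the outgoing direction is the reverse of the incoming one. The helper names and the incoming direction are illustrative assumptions, and the mirrors are taken to lie along the x- and y-axes.

```python
def reflect(direction, normal):
    """Reflect a 2D direction vector from a mirror with unit normal `normal`."""
    dx, dy = direction
    nx, ny = normal
    dot = dx * nx + dy * ny
    # Law of reflection in vector form: d' = d - 2 (d . n) n
    return (dx - 2 * dot * nx, dy - 2 * dot * ny)

def corner_reflect(direction):
    """Reflect from a mirror along the x-axis, then from one along the y-axis."""
    after_first = reflect(direction, (0.0, 1.0))   # mirror along the x-axis
    return reflect(after_first, (1.0, 0.0))        # mirror along the y-axis

incoming = (-0.6, -0.8)   # assumed direction of the incident ray
outgoing = corner_reflect(incoming)
print("incoming:", incoming, "outgoing:", outgoing)
```

Each reflection reverses one component of the direction, so after both reflections the ray travels back parallel to its original path whatever the angle of incidence.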
13.3 REFRACTION
13.3.1 Refraction at a Boundary Between Two Different Media
Wave velocity depends on the medium through which the wave is traveling.
For example, light slows down when it travels from air into glass, and surface
water waves slow down when they travel from deeper to shallower water. If
the wave strikes the boundary between two different media at an angle to the
normal then the wave direction changes. This is called refraction.
[Figure: refraction at the boundary between medium 1 (wave speed v1) and medium 2 (wave speed v2), shown in two diagrams]
No waves are created or destroyed so the frequency must be the same in both
media. Therefore:
$v_1 = f\lambda_1$ and $v_2 = f\lambda_2$
$\frac{v_2}{v_1} = \frac{\lambda_2}{\lambda_1}$
13.3.2 Snell’s Law of Refraction

A simple experiment can be used to find the relationship between the inci-
dent and refracted angles at a boundary between two media, in this case air
and glass.
[Figure: ray box directing a ray at a rectangular glass block placed on a sheet of A3 white paper fixed to a drawing board, with the angles θ1 and θ2 marked]
The position of the edge of the block and the normal to this line are marked
onto the white paper and the ray box is used to direct a single fine ray at the
point where these two lines intersect. The ray is traced using optical pins
(shown by X on the diagram) pushed into the board. The block can then be
removed and the path of the ray outside and inside the block can be drawn
using a pencil and ruler. The incident angle θ1 and the refracted angle θ2 are
then measured using a protractor. If this is done for incident angles in the
range 0 to 90° then a graph of sin θ1 against sin θ2 is a straight line.
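[Figure: straight-line graph of sin θ1 against sin θ2]
This shows that:
$\frac{\sin\theta_1}{\sin\theta_2} = \frac{\sin(\text{incident angle})}{\sin(\text{refracted angle})} = \text{constant}$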

This is called Snell’s law and the constant is called the relative refractive
index ₁n₂ for a ray passing from medium 1 into medium 2.
Snell’s law:
$\frac{\sin\theta_1}{\sin\theta_2} = {}_1n_2$
If medium 1 is the vacuum, then this constant is the absolute refractive index
for medium 2.
In practice, since the speed of light in air is almost the same as the speed of
light in a vacuum, the absolute refractive index is usually used when medium 1
is air.
Refraction occurs because the wave velocity changes at the boundary. This
implies that refractive index must be related to this change in velocity.
The diagram below can be used to derive this relationship.
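[Figure: wave fronts AB and DC at the boundary AC between medium 1 and medium 2, with the angles θ1 and θ2 and the wavelengths λ1 and λ2 marked]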
AB and DC are adjacent wave fronts so they are separated by one wavelength:
$BC = \lambda_1 = \frac{v_1}{f}$
$AD = \lambda_2 = \frac{v_2}{f}$
AC is the common hypotenuse to the two right-angled triangles containing
angles θ1 and θ2 so we can write expressions for the sines of each angle:
$\sin\theta_1 = \frac{BC}{AC} = \frac{v_1/f}{AC}$
$\sin\theta_2 = \frac{AD}{AC} = \frac{v_2/f}{AC}$
Dividing the first equation by the second gives us Snell’s law in terms of the
wave velocities in each medium:
$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2} = {}_1n_2$
The index of refraction is the ratio of wave speeds in the two media on either
side of the boundary. It is obvious from this equation that if the ray direction
is reversed it will retrace its path. In other words, the refractive index when
going from medium 2 to medium 1 is the reciprocal of the refractive index
when going from medium 1 to medium 2:
${}_2n_1 = \frac{v_2}{v_1} = \frac{1}{{}_1n_2}$
It is also clear that with a greater ratio of velocities, more refraction will occur
and that rays of light will bend toward the normal as they enter a medium with
a lower speed of light and away from the normal when they enter a medium
with a higher speed of light.

[Figure: rays of light refracting at boundaries between air and glass]


13.3.3 Absolute and Relative Refractive Indices

The absolute refractive index of a medium is the value of ₁n₂ when medium 1
is the vacuum. The speed of light in a vacuum is c and in a medium is v (< c).
Absolute refractive index $n = \frac{c}{v}$, therefore $v = \frac{c}{n}$
The speed of light v in a medium of absolute refractive index n is equal to the
speed of light in a vacuum c divided by the absolute refractive index of the
medium. For example, the absolute refractive index of glass is about 1.5 and
the speed of light in a vacuum is 3.0 × 10⁸ m s⁻¹, so the speed of light in glass
is about 2.0 × 10⁸ m s⁻¹. The greater the refractive index the slower the speed
of light in that medium. The speed of light depends on the electron density
inside the material so, in general, the denser the medium the slower the speed
of light and the larger the refractive index.
We can also express the relative refractive index at a boundary in terms of
the two absolute refractive indices of the media on either side of that
boundary. Let n1 and n2 be the absolute refractive indices and let ₁n₂ be the
relative refractive index for a ray of light passing from medium 1 into medium 2:
${}_1n_2 = \frac{v_1}{v_2} = \frac{c/n_1}{c/n_2} = \frac{n_2}{n_1}$
This gives us the most useful form of Snell’s law:
$\frac{n_2}{n_1} = \frac{\sin\theta_1}{\sin\theta_2}$
$n_1 \sin\theta_1 = n_2 \sin\theta_2$
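As a worked illustration of this form of Snell’s law, the Python sketch below finds the refracted angle from n1 sin θ1 = n2 sin θ2 and the speed of light in a medium from v = c/n. The function names and the 30° incident angle are assumptions made for the example; the indices for air (1.00) and glass (1.5) are the approximate values used in this chapter.

```python
import math

def refraction_angle(n1, n2, theta1_deg):
    """Refracted angle in degrees from n1 sin(theta1) = n2 sin(theta2).

    Returns None when sin(theta2) would exceed 1, i.e. there is no refracted ray.
    """
    s = n1 * math.sin(math.radians(theta1_deg)) / n2
    if s > 1.0:
        return None
    return math.degrees(math.asin(s))

def speed_in_medium(n, c=3.0e8):
    """Speed of light in a medium of absolute refractive index n (v = c / n)."""
    return c / n

theta2 = refraction_angle(1.00, 1.5, 30.0)   # air into glass
print(f"refracted angle in glass: {theta2:.1f} degrees")
print(f"speed of light in glass: {speed_in_medium(1.5):.2e} m/s")

# Reversing the ray (glass into air at theta2) recovers the original angle.
print(f"reversed ray emerges at: {refraction_angle(1.5, 1.00, theta2):.1f} degrees")
```

The ray bends toward the normal on entering the glass, and reversing its direction retraces the path, as the reciprocal relation between the two relative refractive indices requires.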
13.3.4 Total Internal Reflection

When a ray of light moves into a medium with lower refractive index and
therefore higher wave speed it bends away from the normal. However, the
maximum angle of refraction is 90° and this occurs when the incident angle
inside the first medium is less than 90°. The incident angle at which this
occurs is called the “critical angle” θc and for larger incident angles there is
no refracted ray. All of the incident light is reflected from the boundary. This
is called “total internal reflection” (TIR). This can be demonstrated by directing
a ray of light at a semi-circular glass prism. If the ray is aimed along a radius
it meets the curved surface along the normal, so it is not deflected on entering
the prism.
[Figure: ray in a semi-circular prism for three cases: θ1 < θc (refracted ray with partial reflection); θ1 = θc (critical condition, θ2 = 90°); θ1 > θc (no refracted ray, TIR)]
The condition for the critical angle is that the refracted angle is equal to 90°:
$n_1 \sin\theta_1 = n_2 \sin\theta_2$
$n_1 \sin\theta_c = n_2 \sin 90^\circ = n_2$
$\sin\theta_c = \frac{n_2}{n_1}$
If medium 2 is the vacuum (or air) then:
$\sin\theta_c = \frac{1}{n}$
where n is the absolute refractive index of the transparent block.
Total internal reflection can only occur when light travels from a material of
higher refractive index to one of lower refractive index.
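The critical-angle condition is easy to evaluate numerically. The sketch below applies sin θc = n2/n1 and refuses to return an angle when the light is not traveling from the higher to the lower refractive index. The function name and the second pair of indices (1.48 and 1.46) are illustrative assumptions; the value n = 1.5 for glass is the approximate figure used earlier in the chapter.

```python
import math

def critical_angle(n1, n2=1.0):
    """Critical angle in degrees for light traveling from index n1 into index n2."""
    if n1 <= n2:
        # Total internal reflection needs light going from higher to lower index.
        raise ValueError("no critical angle: n1 must be greater than n2")
    return math.degrees(math.asin(n2 / n1))

glass_air = critical_angle(1.5)            # glass block in air
close_pair = critical_angle(1.48, 1.46)    # assumed pair of nearly equal indices
print(f"glass to air: {glass_air:.1f} degrees")
print(f"n1 = 1.48 to n2 = 1.46: {close_pair:.1f} degrees")
```

The closer the two indices are, the closer the critical angle is to 90°, so only rays that strike the boundary at a glancing angle are totally internally reflected.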
13.3.5 Optical Fibers

Total internal reflection is a particularly efficient form of reflection and is used
inside optical fibers for data transmission. The basic principle of an optical
fiber is to have a transparent core surrounded by a transparent cladding mate-
rial of lower refractive index so that total internal reflection can occur at the
core-cladding boundary.

Each time the ray reaches the boundary its incident angle is greater than the
critical angle so it is repeatedly totally internally reflected. Light has a very
high frequency so it can be modulated to carry a great deal of information.
There are two main types of optical fiber: mono-mode fibers, which effec-
tively only allow a single path for the light (by having an extremely narrow
core of about 10 µm), and multi-mode fibers which are much thicker and
allow multiple light paths. The disadvantage of multi-mode fibers is that, over
a long distance, different parts of the signal have traveled significantly differ-
ent distances and develop time delays. If these become comparable to the
time between ones and zeroes in the digital signal then the information is lost.
The longer the fiber the lower the maximum data transfer rate, so they tend
to be used over shorter distances, for example, within a single building. The
advantage of multi-mode fibers is that they are cheaper and can carry light
of multiple wavelengths so they are often used with LEDs rather than lasers.
Mono-mode fibers use laser sources working at a single wavelength and can
transmit high data rates over great distances (up to thousands of kilometers).
13.3.6 Dispersion
The amount of refraction at a boundary depends on the refractive index at that
boundary. However, absolute refractive indices depend on the wavelength of
light (because the speed of light in a medium depends on wavelength). This
means that if a polychromatic ray (i.e., one with a range of wavelengths pre-
sent, such as white light) refracts at a boundary, different wavelengths will
refract in different amounts. This is called “dispersion” and is familiar from
the way a triangular glass prism can disperse white light into a “spectrum” of
colors.
For glass the higher frequency, shorter wavelength end (blue/violet end) of
the spectrum travels more slowly than the lower frequency, longer wavelength
end (red/orange) of the spectrum, and so has a higher refractive index. For a
certain type of crown glass $n_\text{red} = 1.509$ while $n_\text{violet} = 1.521$, a small difference
but one that is clearly demonstrable.
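To see how large the effect is, the sketch below applies Snell’s law to red and violet light entering this crown glass from air. The two refractive indices are the ones quoted above; the 45° angle of incidence and the function name are assumptions chosen for illustration.

```python
import math

N_RED = 1.509      # crown glass, red light (value quoted in the text)
N_VIOLET = 1.521   # crown glass, violet light (value quoted in the text)

def refraction_angle_deg(n, incident_deg, n_incident=1.0):
    """Refracted angle for light entering a medium of index n from index n_incident."""
    s = n_incident * math.sin(math.radians(incident_deg)) / n
    return math.degrees(math.asin(s))

incident = 45.0   # assumed angle of incidence in air, in degrees
red = refraction_angle_deg(N_RED, incident)
violet = refraction_angle_deg(N_VIOLET, incident)
print(f"red: {red:.2f} deg, violet: {violet:.2f} deg, spread: {red - violet:.2f} deg")
```

Violet light is refracted through the smaller angle, so it is bent further from its original direction than red light. The spread at a single surface is only a fraction of a degree.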
13.4 POLARIZATION
13.4.1 What Is Polarization?
Longitudinal waves vibrate parallel to the direction in which they transfer
energy, so there is a unique vibration direction. Transverse waves, however,
vibrate at 90° to the direction in which they transfer energy so they can vibrate
in any direction perpendicular to the direction in which the wave is traveling.
If a transverse wave is confined to oscillate only in one plane it is said to be
plane-polarized. Transverse waves can be polarized but longitudinal waves
cannot.
For example, if the only vibration direction is vertical then the wave is verti-
cally plane polarized.
[Figure: possible vibration directions of a transverse wave, all perpendicular to the direction of energy transfer]
13.4.2 Polarizing Filters

An ideal polarizing filter will transmit one direction of polarization and absorb
the perpendicular direction of polarization. If the direction of polarization of
the incident wave is at an angle to the polarizing direction of the filter, then
only a component of the wave’s amplitude is transmitted.

[Figure: vertical vibrations and the direction of energy transfer]
In the diagram below we are looking in the direction of wave travel with the
wave moving away from us. The diagram shows a polarizing filter that will
transmit vertically plane-polarized light. The light incident on the filter is
plane polarized at an angle θ to the vertical.
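[Figure: incident amplitude A at an angle θ to the vertical, resolved into a transmitted component A cos θ and an absorbed component A sin θ]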
The transmitted amplitude is: $A_\text{trans} = A\cos\theta$

The transmitted intensity is: $I_\text{trans} = I_0\cos^2\theta$ (using $I \propto A^2$)
The transmitted polarization direction is vertical (i.e., the same as the
filter).
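These two results can be tabulated with a few lines of Python. The sketch below evaluates the transmitted amplitude A cos θ and the transmitted intensity I0 cos²θ for several angles between the incident polarization and the filter direction. The function names and the chosen angles are illustrative assumptions.

```python
import math

def transmitted_amplitude(amplitude, theta_deg):
    """Amplitude passed by an ideal polarizing filter: A cos(theta)."""
    return amplitude * math.cos(math.radians(theta_deg))

def transmitted_intensity(intensity, theta_deg):
    """Intensity passed by an ideal polarizing filter: I0 cos^2(theta)."""
    return intensity * math.cos(math.radians(theta_deg)) ** 2

for angle in (0, 30, 45, 60, 90):
    a = transmitted_amplitude(1.0, angle)
    i = transmitted_intensity(1.0, angle)
    print(f"theta = {angle:2d} deg: amplitude fraction {a:.3f}, intensity fraction {i:.3f}")
```

At 60° half of the amplitude is transmitted but only a quarter of the intensity, because the intensity depends on the square of the amplitude.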
Rotation of Polarizing Filter

If vertically plane polarized light is incident on a vertical polarizing filter that
is slowly rotated around the axis of the beam, the intensity variation maps out
the cos²θ function:
[Figure: graph of fraction of maximum intensity against angle of polarising filter from 0° to 360°, with the mean value of 0.5 marked]
Unpolarized Light Incident on a Polarizing Filter

Unpolarized light contains all possible vibration directions perpendicular to
the direction of travel of the wave. When this falls onto an ideal vertical polar-
izing filter, the intensity is reduced to 50% of its original value. This follows
from the equation for intensity above. The intensity at any angle is
proportional to cos²θ so the intensity of the transmitted light will be the
incident intensity multiplied by the average value of cos²θ over one rotation
(θ from 0 to 2π). If we use the identity cos²θ = ½ cos 2θ + ½ we can see that the
average value of the first term is zero (since cosine is equally positive and
negative over one cycle) so the average of cos²θ must be ½. Therefore, the
transmitted intensity is $I_\text{trans} = 0.5 I_0$. This remains the case if the
polarizing filter is rotated.
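The average of cos²θ can also be confirmed numerically. The sketch below samples cos²θ at equally spaced angles over one full rotation and takes the mean. The function name and the number of sample points are arbitrary choices made for the illustration.

```python
import math

def mean_cos_squared(steps=3600):
    """Average of cos^2(theta) over equally spaced angles from 0 to 2*pi."""
    total = 0.0
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        total += math.cos(theta) ** 2
    return total / steps

average = mean_cos_squared()
print(f"mean value of cos^2(theta) over one rotation: {average:.6f}")
```

The mean comes out at one half to within rounding error, in agreement with the argument based on the identity for cos²θ.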
Crossed Polarizing Filters

If light falls on two polarizing filters, one placed behind the other, and the
second filter is rotated with respect to the first, there will be maxima of
intensity when the angle between their polarizing directions is 0°, 180°, or
360°, and minima at 90° and 270° (as in the graph above).
13.4.3 Rotation of the Plane of Polarization

Some transparent media can rotate the plane of polarization around the
direction of travel of the wave. Sugar solutions do this because the molecules
are themselves asymmetric, and the amount of rotation can be used to measure
the concentration of the solution (higher concentrations produce a larger
rotation angle per unit distance). A polarimeter consists of two polarizing
filters mounted on either side of the sample to be tested. One of the filters can
be rotated and the angle between the two filters can then be read off from a
fixed scale. With no sample, the maximum intensity will be when the filters
are aligned. If the sample rotates the plane of polarization the second filter
can be rotated until the intensity is once again a maximum and the angle of
rotation can be measured.
[Figure: polarimeter with an unpolarized light source; the sample rotates the polarization direction by an angle and the second filter must be rotated by the same angle to find the maximum]
13.4.4 Polarization by Reflection and Scattering

When unpolarized light is incident on a non-metallic transparent medium
some of the light is refracted into the medium and some is reflected from
the boundary. The interaction between light and matter causes dipoles in
the material surface to absorb energy, vibrate, and then re-radiate the light.
However, light is a transverse wave, so these vibrations determine the possible
polarization of the refracted and reflected light rays. At a particular angle of
incidence, called the Brewster angle, the reflected and refracted rays are
at 90° to one another and the reflected ray is plane polarized parallel to the
surface of the material. The refracted ray is also partially plane polarized
along a line parallel to the direction of the reflected ray.
[Figure: unpolarized light incident at the Brewster angle θB, giving a polarized reflected ray at θB and a partially polarized refracted ray at θ2]
We can derive an expression for the Brewster angle θB by using Snell’s law and
the law of reflection. Let the refractive index of the first and second media be
n1 and n2, respectively.
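$n_1 \sin\theta_B = n_2 \sin\theta_2$
but $\theta_2 = 180^\circ - (90^\circ + \theta_B) = 90^\circ - \theta_B$
$n_1 \sin\theta_B = n_2 \sin(90^\circ - \theta_B) = n_2 \cos\theta_B$
$\tan\theta_B = \frac{n_2}{n_1}$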
This is “Brewster’s law.”
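Brewster’s law can be checked against the geometry used to derive it. The sketch below calculates the Brewster angle from tan θB = n2/n1, finds the refracted angle from Snell’s law, and confirms that the two add up to 90°, so the reflected and refracted rays are perpendicular. The function names are assumptions, and the indices are the approximate values for air and glass used in this chapter.

```python
import math

def brewster_angle(n1, n2):
    """Brewster angle in degrees for light traveling from index n1 toward index n2."""
    return math.degrees(math.atan(n2 / n1))

def refracted_angle(n1, n2, theta1_deg):
    """Refracted angle in degrees from n1 sin(theta1) = n2 sin(theta2)."""
    return math.degrees(math.asin(n1 * math.sin(math.radians(theta1_deg)) / n2))

n_air, n_glass = 1.00, 1.5
theta_b = brewster_angle(n_air, n_glass)
theta_2 = refracted_angle(n_air, n_glass, theta_b)
print(f"Brewster angle: {theta_b:.1f} deg, refracted angle: {theta_2:.1f} deg")
print(f"sum of the two angles: {theta_b + theta_2:.1f} deg")
```

Because the angle of reflection equals θB, a sum of 90° for θB and θ2 means the angle between the reflected and refracted rays is also 90°.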
Photographers use polarizing filters to enhance contrast, for example, to
darken the sky compared to the clouds, or to reduce glare from reflective sur-
faces (e.g., water or windows). Polarizing sunglasses are also used to reduce
reflected glare by absorbing one component of polarization.
13.5 EXERCISES
1. A traveling wave is described by the equation below (distances measured

in meters):
y 0.25 cos 30 t 2 x
(a) State the amplitude, frequency, and wavelength of this wave.

(b) Calculate the speed of the wave.
(c) State the direction in which the wave is traveling.
2. A swimmer is treading water in the sea. Waves traveling at 1.2 m s⁻¹ cause
her to bob up and down six times a minute. The distance between her
highest and lowest positions as the wave passes is 2.4 m.
(a) Sketch a graph to show how her displacement varies with time over
a period of 20 s.
(b) Calculate the frequency and wavelength of the waves.
(c) Her friend is also treading water but is 15 m further out to sea. What
is the phase difference between the oscillations of the two swim-
mers? Assume that the waves are traveling directly toward the shore.

3. The diagram below shows a pin placed symmetrically between two plane
mirrors that are perpendicular to one another.
[Figure: a pin placed symmetrically between two perpendicular plane mirrors]
An observer looking into the mirrors can see three images of the pin.
Draw a careful ray diagram to locate the positions of these images.
4. Here are some absolute refractive indices for different media.
Vacuum n = 1 (exactly)
Air n = 1.000293 (at s.t.p.) .... usually taken to be 1.00
Diamond n = 2.42
Glass n = 1.50
Water n = 1.33
Speed of light in a vacuum c = 3.0 × 10⁸ m s⁻¹
The table below refers to a ray of light traveling from medium 1 to
medium 2 as shown in the diagram.
[Figure: ray of light passing from medium 1 into medium 2, with the angles θ1 and θ2 marked]
(a) Complete the table (you will need to use values from the list above):
Medium 1   Medium 2   v1   v2   θ1   θ2
Glass   3.0 × 10⁸ m s⁻¹   40°
Glass   Air   40°
Water   Diamond   20°
2.0 × 10⁸ m s⁻¹   55°   70°
(b) Explain what is meant by total internal reflection and state the con-
ditions under which it occurs.
(c) Calculate the critical angle for an interface between:
(i) water and air,
(ii) glass and air,
(iii) diamond and air.
(d) Suggest why a real diamond sparkles more than a fake glass
“diamond.”
5. The refractive index n of glass can be determined by measuring the mini-
mum angle of deviation D when light passes through a triangular prism
of apex angle A:
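[Figure: triangular prism of apex angle A, showing the angle of deviation D of a ray passing through it]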
The minimum deviation occurs when the light passes symmetrically

through the prism (the ray inside the prism is parallel to the base). The
formula used to find the refractive index is:
$n = \frac{\sin\left(\frac{A + D}{2}\right)}{\sin\left(\frac{A}{2}\right)}$
Derive this formula.

6. Two polarizing filters are placed at 90° to each other. A third polarizing
filter is placed between them and slowly rotated. Unpolarized light is
directed into the system of three filters.
(a) Explain why the intensity of light passing through the system of
three filters reaches a maximum when the third filter is at 45° to the
direction of the first polarizing filter.
(b) Calculate the fraction of the incident intensity that passes through
the system when the middle polarizing filter is at 30° to the first
polarizing filter.

CHAPTER
14
Light
14.1 LIGHT AS AN ELECTROMAGNETIC WAVE

14.1.1 Waves or Particles?
Properties such as reflection and refraction of light have been known since
antiquity but the nature of light remained a mystery. Newton thought that
light must consist of streams of tiny corpuscles (particles) emitted from bright
objects but Huygens (a Dutch physicist) thought that light was a form of wave
motion like ripples on the surface of a pond. It turns out that neither the wave
model nor the particle model is a complete model of the nature of light (or
any other electromagnetic wave). However, aspects of both the wave model
and the particle model can be used to explain how light behaves, but we have
to be careful where we use each model. We will look at wave-particle duality
in much more detail later (see Section 27.3) but in this chapter, we will con-
centrate on the wave model.
14.1.2 Electromagnetism
In the 19th century, Michael Faraday and James Clerk Maxwell tried to make
sense of all the different electromagnetic phenomena that had been discov-
ered at that time. This included:
how charges exert forces on one another (see Section 17.3.1)
how electric currents create magnetic fields (see Section 20.3.1)
how magnetic fields exert forces on moving charges and electric currents
(see Section 20.2.2)
how changing magnetic fields can induce voltages (see Section 21.1.1).

In order to do this Faraday introduced the idea of an “electromagnetic field.”

Electromagnetic fields are created by electric charges and exert forces on
electric charges. For example, a positive charge at one point in space creates
an electric field through all of space, and another charge some distance from
the first experiences a force from the field. The beauty of this model is that the
field itself has properties, and charges do not interact by “instantaneous
action-at-a-distance” but by interacting with the local electromagnetic field.
Maxwell discovered a set of four equations, the Maxwell equations, that
describe how fields are created by, and interact with, charges and how the
field in one place affects the field nearby. The equations showed that if the
field is disturbed in one place, for example, because a charge is vibrating, that
disturbance spreads outwards from the source at a constant speed. The speed
could be calculated from the equations and was equal to the measured speed
of light. This suggested very strongly that light must be an electromagnetic
disturbance. It also showed that visible light was just part of a much wider
spectrum of electromagnetic waves, all of which travel at the same constant
speed in a vacuum.
14.1.3 Electromagnetic Waves

Electromagnetic waves travel through a vacuum so they cannot be mechani-
cal waves - there are no material particles to vibrate. So, what does vibrate?
The electric and magnetic fields at each point in space. An electromagnetic
wave is a transverse wave (we know this because light can be polarized) and
consists of vibrations of the electric and magnetic field at each point in space.
The direction of the electric field vibration is at 90° to the direction of vibra-
tion of the magnetic field. The diagram below shows the relationship between
the electric and magnetic fields in a plane-polarized electromagnetic wave.
[Figure: plane-polarized electromagnetic wave, showing the direction of electric field vibrations and the direction of energy transfer]
The polarization direction is, by convention, the direction in which the electric
field vibrates. In the example above, the polarization is vertical.
It is important to realize that this diagram represents the variation of the two
fields at points on the red line. It does not represent physical motions of
anything off the line. As the wave passes a point the electric and magnetic
fields at that point increase and decrease in strength periodically. If a wave
like the one above strikes a material medium the two fields will cause charges
in the surface to vibrate at the frequency of the wave. This is how a radio
antenna detects radio waves – the vibrating charges in the antenna set up a
weak a.c. signal that can be amplified.
Maxwell’s equations show that electromagnetic waves are emitted when
charges accelerate (e.g., while oscillating) and cause charges to accelerate
when they are absorbed.
Visible light is part of the much larger electromagnetic spectrum.
Our eyes respond to wavelengths from about 390 nm (violet) to 700 nm (red)
which corresponds to a frequency range of 4.3 × 10¹⁴ Hz to 7.7 × 10¹⁴ Hz. The
main regions of the electromagnetic spectrum are shown below.
[Figure: the electromagnetic spectrum on frequency / Hz and wavelength / m scales, showing gamma rays, X-rays, UV, visible, IR, microwave, and radio waves]
You might still be wondering what it is that actually vibrates when an electro-
magnetic wave passes through a vacuum – calling it the electric and magnetic
field does not really answer the question. 19th-century physicists were also
puzzled about this and assumed that there must be some invisible medium
filling all of space so that electric and magnetic fields are distortions of this
medium and electromagnetic waves involve vibrations in the medium. They
called the medium the “luminiferous aether” but no attempt to demonstrate
its existence was ever successful and Einstein’s special theory of relativity
abandoned it (see Section 24.1.2).

14.1.4 Measuring the Speed of Light

The first successful measurement of the speed of light was an astronomical
one. Jupiter has several moons and each of these completes one orbit in a
constant time. However, when viewed from Earth the timing of the orbits seems
to vary by about ± 8 minutes over the course of a year. The astronomer Ole Rømer
realized that as the Earth and Jupiter orbit at different rates, the distance
between the two varies by a distance equal to the diameter of the Earth's
orbit. This means that light takes different times to reach the Earth when the
relative positions of Earth and Jupiter change. Rømer concluded that the speed of
light must be equal to the diameter of the Earth's orbit ($3.0 \times 10^{11}$ m) divided
by the total variation of 16 minutes (960 seconds). This is about $3 \times 10^{8}\ \mathrm{m\,s^{-1}}$.
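The arithmetic behind this estimate can be sketched in a few lines of Python. The figures are round values assumed for illustration (an Earth-Sun distance of $1.5 \times 10^{11}$ m, so an orbit diameter of $3.0 \times 10^{11}$ m, and a swing of 8 minutes either side of the mean); they are not Rømer's own data.

```python
# Rough estimate of the speed of light from the timing of Jupiter's moons.
# Assumed round figures for illustration only.
ORBIT_RADIUS_M = 1.5e11    # mean Earth-Sun distance in meters
TIMING_SWING_S = 8 * 60    # moons appear up to 8 minutes early or late


def romer_speed_of_light(orbit_radius_m: float, timing_swing_s: float) -> float:
    """Extra path (orbit diameter) divided by extra travel time (twice the swing)."""
    extra_distance_m = 2 * orbit_radius_m
    extra_time_s = 2 * timing_swing_s
    return extra_distance_m / extra_time_s


c_estimate = romer_speed_of_light(ORBIT_RADIUS_M, TIMING_SWING_S)
print(f"c is roughly {c_estimate:.1e} m/s")
```

The result is within a few percent of the modern value, which is as much as these rounded inputs can support.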
[Figure: the Earth (E) in its orbit and Jupiter (J) shown at maximum and minimum separation.]
The first terrestrial measurement of the speed of light was made by Armand
Fizeau in 1849. He employed an ingenious method using a rapidly spinning
toothed wheel to chop light into short pulses. The pulses then hit a distant mir-
ror and reflected back to the wheel. As the speed of the wheel increased the
returning light hit the next tooth and was blocked. This meant that the time
taken for the light to travel to the mirror and back was equal to the time taken
for the wheel to rotate by the angle between a tooth and the next gap. He meas-
ured the rate of rotation at which this occurred and used this to find the time
of flight t of the light pulse. The speed of light was then calculated from the
distance to the mirror, d, and back divided by the time of flight: c = 2d/t.
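As a rough numerical sketch of this method, the time of flight can be taken as the time for the wheel to turn from a gap to the next tooth, which is 1/(2N) of a revolution for a wheel with N teeth if teeth and gaps are equally wide. The numbers used below (720 teeth, 8633 m to the mirror, 12.6 revolutions per second) are figures often quoted for Fizeau's setup and are assumptions here, not values taken from the text.

```python
def fizeau_speed_of_light(distance_m: float, teeth: int, rev_per_s: float) -> float:
    """c = 2d / t, where t is the time for the wheel to turn from a gap to the next tooth."""
    # One tooth plus one gap spans 1/teeth of a turn, so gap-to-tooth is
    # 1/(2 * teeth) of a turn (assuming teeth and gaps have equal width).
    time_of_flight_s = 1.0 / (2 * teeth * rev_per_s)
    return 2 * distance_m / time_of_flight_s


# Assumed illustrative values (not from the text).
c_estimate = fizeau_speed_of_light(distance_m=8633.0, teeth=720, rev_per_s=12.6)
print(f"c is roughly {c_estimate:.3g} m/s")
```

With these inputs the estimate comes out a few percent above the accepted value, which shows how sensitive the method is to the measured rotation rate.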
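[Figure: Fizeau's apparatus, showing a light source, the spinning toothed wheel, the observer, and a mirror at distance d.]
[Figure: modern pulsed-light apparatus, with the emitter and detector each connected to an oscilloscope (to record the emitted and detected pulses) and a mirror at distance d.]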
This method was modified and improved by Jean Leon Foucault and then by
Albert Michelson. They replaced the toothed wheel with a rotating mirror but
the principle of the method was similar. Foucault even measured the speed of
light in water by placing a tube of water in the light path between the rotating
mirror and the fixed distant mirror. His results showed that light travels more
slowly in water than in air, a result that reinforced the wave model of light
because it was consistent with the wave explanation of refraction at an air–
water boundary (the particle model predicted that the particles of light would
speed up as they entered the denser medium). Michelson spent most of his
life refining methods for measuring the speed of light. He also carried out
the famous Michelson–Morley experiment, which seemed to show that the
speed of light is independent of the motion of the observer, an effect that was
eventually explained by Einstein’s special theory of relativity (see Chapter 24).
What makes it difficult to measure the speed of light in a laboratory is the fact
that it is so fast. This means that we have to measure extremely short time
intervals. Nowadays, high-speed electronic devices can do this and the meas-
urement can be carried out successfully over a distance of just a few meters.
The principle is very similar to that used by Fizeau, Foucault, and Michelson:
short pulses of light are generated by an LED or laser and a fast oscilloscope is
used to record the emission of each pulse and the detection of its reflection
from a mirror placed a few meters away.
One slight complication with this method is that the time it takes for the elec-
tronic processing of the signals is comparable to the time of flight of the light
pulses, so the time t1 measured on the oscilloscope must be corrected for this.
One way to do this is to place the emitter and detector so that they are facing
each other and the light path is virtually zero. There will still be two separate
peaks on the screen and the time delay t2 is now entirely due to the equip-
ment. The actual time of flight of the light pulse is therefore (t1 - t2) and the
speed of light is 2d/(t1 - t2). Typical oscilloscope traces are shown below.
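The correction described above can be sketched as a short calculation. The function name and the readings (a mirror 3.0 m away, with delays of 45 ns and 25 ns) are hypothetical values chosen for illustration.

```python
def speed_from_pulse_timing(mirror_distance_m: float, t1_s: float, t2_s: float) -> float:
    """Return 2d / (t1 - t2): round-trip distance over the corrected time of flight."""
    time_of_flight_s = t1_s - t2_s  # remove the delay introduced by the electronics
    if time_of_flight_s <= 0:
        raise ValueError("t1 must be larger than the equipment delay t2")
    return 2 * mirror_distance_m / time_of_flight_s


# Hypothetical oscilloscope readings.
c_measured = speed_from_pulse_timing(mirror_distance_m=3.0, t1_s=45e-9, t2_s=25e-9)
print(f"measured speed of light: {c_measured:.3g} m/s")
```

Because t1 and t2 are of similar size, a small error in either has a large effect on the difference, so both delays need to be read as carefully as possible.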
[Figure: oscilloscope traces. Left: delay t1 between the emitted pulse and the detected (reflected) pulse. Right: delay t2 between the emitted pulse and the detected (direct) pulse.]

14.1.5 Maxwell’s Equations and the Speed of Light

Maxwell’s equations themselves are beyond the scope of this book. However,
the equations lead to an expression for the speed of all electromagnetic waves:
$$c = \frac{1}{\sqrt{\varepsilon_0 \mu_0}}$$
where $\varepsilon_0$ is the permittivity of free space (effectively a measure of the ability of
a vacuum to support an electric field) and $\mu_0$ is the permeability of free space
(effectively a measure of the ability of a vacuum to support a magnetic field).
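As a quick numerical check of this expression, the sketch below evaluates it in Python. The values of the two constants are standard reference values supplied here as assumptions; they are not quoted in the text.

```python
import math

# Assumed reference values for the two constants (SI units).
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, F/m
MU_0 = 1.25663706212e-6        # permeability of free space, H/m

c = 1 / math.sqrt(EPSILON_0 * MU_0)
print(f"c = {c:.6e} m/s")
```

The result agrees with the defined value of the speed of light to well within the precision of the input constants.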
When electromagnetic waves penetrate matter (e.g., light going through
glass) the equation becomes:
$$v = \frac{1}{\sqrt{\varepsilon_0 \varepsilon_r \mu_0 \mu_r}}$$
where $\varepsilon_r$ is the relative permittivity of the medium and $\mu_r$ is the relative
permeability of the medium. Whereas the speed of all electromagnetic waves is the
same in a vacuum, the values for the relative permittivity and permeability in
a medium vary with frequency, so different parts of the electromagnetic spectrum
travel at different speeds inside media, causing dispersion. This is responsible
for the way white light can be spread into a spectrum by a triangular prism.
We have already met the equation $v = c/n$ where $n$ is the absolute refractive
index of a medium. It follows that the absolute refractive index is related to
permittivity and permeability by:
$$n = \sqrt{\varepsilon_r \mu_r}$$
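The link between refractive index and wave speed can be illustrated with a minimal sketch. The relative permittivity of 2.25 is an assumed, glass-like value chosen so that the index comes out at 1.5, and the relative permeability is taken as 1, which is a reasonable assumption for most transparent materials.

```python
import math

C = 299_792_458.0  # speed of light in a vacuum, m/s


def refractive_index(eps_r: float, mu_r: float = 1.0) -> float:
    """n = sqrt(eps_r * mu_r)."""
    return math.sqrt(eps_r * mu_r)


def speed_in_medium(eps_r: float, mu_r: float = 1.0) -> float:
    """v = c / n."""
    return C / refractive_index(eps_r, mu_r)


# Assumed illustrative value: relative permittivity 2.25 at optical frequencies.
n_glass = refractive_index(2.25)
v_glass = speed_in_medium(2.25)
print(f"n = {n_glass:.2f}, v = {v_glass:.3e} m/s")
```

Since the relative permittivity depends on frequency, running the same calculation with slightly different values for red and blue light gives slightly different speeds, which is the origin of dispersion.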
14.1.6 Defining Speed, Time, and Distance

The speed of light in a vacuum is one of the fundamental physical constants
and it has been measured ever more precisely. Its value is now fixed by definition
to be 299 792 458 $\mathrm{m\,s^{-1}}$ and this will not change even if measurement
techniques improve in the future. The second is also a defined quantity,
equal to 9 192 631 770 periods of the radiation emitted in the transition between
two energy levels in the cesium-133 atom. This implies that the length of the
meter cannot be defined independently, since distance, speed and time are all
related. The meter is a derived quantity equal to the distance traveled by light
in a vacuum in a time of 1/(299 792 458) of a second.
The fact that the speed of light can be derived from Maxwell’s equations raises
an interesting question about its meaning. What is the speed of light meas-
ured against, or more specifically, in what reference frame does light travel
at c? In the 19th century most physicists assumed that light was a vibration
in some as yet undiscovered and all-pervading medium which they called the
luminiferous aether so that light traveled at speed c relative to this aether. The
implication was that the speed measured by an observer who was also moving
relative to the aether at speed v would be a relative speed. For example, if both
the light and the observer moved in the same direction through the aether at
speeds c and v then the speed of light relative to the observer should be c - v.
But this is not the case. The speed of light is the same for all uniformly mov-
ing observers and is independent of the speed of the observer or the source.
This is explained by Einstein’s special theory of relativity, which also led to the
abandonment of the idea of the luminiferous aether.
14.2 RAY OPTICS

The behavior of optical instruments can be explained by following the paths of
rays. Rays are assumed to travel in straight lines (rectilinear propagation) from
a source, only changing direction when they pass through a lens or prism, or
when they are reflected.
14.2.1 Thin Lenses

Convex or converging lenses are shaped in such a way that light traveling
parallel to their principal axis is refracted to a real focal point beyond the lens.
The distance from the optical center (of the lens) to the focal point F is called
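the focal length f of the lens. The reciprocal of this value is called the power
of the lens P and is measured in diopters (1 diopter is equivalent to 1 $\mathrm{m^{-1}}$):
$$P = \frac{1}{f}$$
[Figure: rays parallel to the principal axis of a convex lens converging to the focal point F.]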
Concave, or diverging, lenses are shaped in such a way that light traveling par-
allel to their principal axis diverges from a virtual focal point before the lens.
[Figure: rays parallel to the principal axis of a concave lens diverging as if from the virtual focal point F.]
In a real lens, the light will refract as it enters the lens and refract again as
it leaves on the opposite side. A thin lens is one where both refractions can
be assumed to take place at the same point, on a line perpendicular to the
principal axis and passing through the optical center of the lens. Ray diagrams
involving thin lenses are often drawn with the lens itself shown as a single
vertical line passing through the optical center and a small symbol is used to
show the type of lens.
[Figure: refraction through a thick lens compared with the simplified thin-lens representation.]
14.2.2 Predictable Rays for Thin Lenses

Ray diagrams can be used to find the positions of objects and images in optical
instruments. To draw them we make use of predictable rays.
Convex lens – predictable ray 1: a ray parallel to the principal axis will pass
through the focal point on the far side of the lens.
[Figure: convex thin lens, predictable ray 1.]
In ray optics, the direction of the ray is reversible so a ray passing through the
focal point before it reaches the lens will then travel parallel to the principal
axis on the far side of the lens.
Convex lens – predictable ray 2: a ray passing through the optical center of
the lens does not change direction.
[Figure: convex thin lens, predictable ray 2.]
Concave lens – predictable ray 1: a ray parallel to the principal axis diverges
from the principal axis on the other side of the lens along a line that traces
back to the focal point.
[Figure: concave thin lens, predictable ray 1.]
The direction of the ray is reversible, so a ray heading toward the focal point
on the far side of the lens will then travel parallel to the principal axis after it
passes through the lens.
Concave lens – predictable ray 2: a ray passing through the optical center
of the lens does not change direction.
[Figure: concave thin lens, predictable ray 2.]
14.2.3 Images
An image is a point-by-point representation of an object. A “real image” is
formed when the rays creating the image pass through the image points,
for example, the image formed on the retina of the eye or the image pro-
jected onto a cinema screen. A virtual image is formed when the rays cre-
ating the image diverge from points on the image but do not pass through
those points, for example, the image in a plane mirror or the image of
the bottom of a swimming pool seen through the surface. Since the rays
responsible for the image do not pass through it, a “virtual image” cannot
be formed on a screen.
The size and position of an image can be determined by drawing two predict-
able rays from a single point on an object and tracing their paths. The place
where they intersect is the position where the image point is formed. Objects
are often shown as vertical arrows and the two rays to be traced are from the
tip of the arrow.
14.2.4 Image Formation with a Convex Lens

The distance from the object to the optical center of the lens is called the
object distance, u.
The distance from the optical center of the lens to the image is called the
image distance, v.
[Figure: ray diagram for a convex thin lens showing the object (height ho) at distance u, the focal points F, and the image (height hi) at distance v.]
A real, inverted, magnified image has been formed.

The linear magnification m is given by the equation:
$$m = \frac{h_i}{h_o}$$
Real images are formed if the object distance is greater than the focal length
of the lens (u > f). However, if the object is placed in the focal plane (so that
u = f) then the rays are parallel after passing through the lens. They are said to
form an image at infinity (v = ∞).
[Figure: object in the focal plane of a convex thin lens (u = f); the emerging rays are parallel and form an image at infinity.]
Large magnification can be achieved by placing the object so that u is just
greater than f. A magnification of 1 occurs when u = 2f, and for larger object
distances the image is diminished (smaller than the object, m < 1).
For object distances less than the focal length the emerging rays diverge.
However, these can be traced backwards to a virtual image on the same side
of the lens as the object.
This is an erect (same way up as object), magnified, and virtual image.
[Figure: object inside the focal length of a convex thin lens (u < f); the diverging rays trace back to an erect, magnified, virtual image on the same side as the object.]
This ray diagram can be used to explain how a magnifying glass works. Your
eye would be positioned on the optical axis to the right of the lens and would
look through the lens to see the magnified image beyond the object.
Summary of the behavior of a convex lens

Object distance u    Image position                  Image nature      Magnification
u > 2f               Between f and 2f                Real, inverted    < 1
u = 2f               2f                              Real, inverted    1
f < u < 2f           > 2f                            Real, inverted    > 1
u = f                Infinity                        Undetermined      Infinite
u < f                Same side as object, |v| > u    Virtual, erect    > 1
14.2.5 Image Formation with a Concave Lens

A similar approach can be used to find the image position when a concave
lens is used.
[Figure: ray diagram for a concave thin lens; rays from the object trace back to an image between the object and the lens.]
The image is erect (the same way up as the object), diminished, and virtual.
14.2.6 Object at Infinity

If an object is at a great distance from a convex lens (u >> f) rays diverging
from a point on the object will be almost parallel when they reach the lens.
The object is at optical infinity and when the rays pass through the lens they
converge to form an image point in the focal plane of the lens. This is the situ-
ation when we look at a distant object and an inverted image of that object is
formed on the retina of our eye.
[Figure: parallel rays from a single point on a distant object converging to an image point in the focal plane of a convex thin lens.]
For an extended object at a great distance, for example, the Sun, there will be
an angle between the rays from a point on one side of the Sun and the rays
from a point on the opposite side of the Sun. They will be focused in the focal
plane above and below the focal point of the lens and an image of the Sun will
be formed between them.
[Figure: rays from single points at the top and bottom of the Sun passing through a thin lens and focusing in the focal plane to form an image of the Sun.]
This creates an image of the Sun that can be projected onto a screen – for
example, to look for sunspots.
The power of a lens depends on its shape and on the material from which it is
made. The higher the refractive index the more the light is refracted and the
shorter the focal length of the lens. However, the refractive index depends on
wavelength, so objects which have a range of colors are not focused sharply
at any one point. This is called chromatic aberration. Achromatic lenses are
composite lenses that combine elements made from different types of glass to
compensate for chromatic aberration.
14.2.7 The Lens Equation

The equation relating u, v, and f is called the lens equation. It can be derived
from the ray diagram for a convex lens but, with a suitable sign convention,
it also applies to concave lenses. Look at the two shaded triangles in the ray
diagram below.
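[Figure: convex-lens ray diagram with two shaded triangles formed by the ray through the optical center: one with base u and height ho, the other with base v and height hi.]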
These are similar triangles so:
$$m = \frac{h_i}{h_o} = \frac{v}{u}$$
Now look at these two shaded triangles which are also similar:
[Figure: the same ray diagram with two shaded triangles on the image side of the lens: one with base f and height ho, the other with base (v - f) and height hi.]
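Therefore:
$$\frac{h_i}{h_o} = \frac{v - f}{f} = \frac{v}{u}$$
Dividing both sides of the second equality by $v$ gives $\frac{1}{f} - \frac{1}{v} = \frac{1}{u}$,
which can be rearranged to give the lens equation:
$$\frac{1}{f} = \frac{1}{u} + \frac{1}{v}$$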
This can be used for convex and concave lenses using the REAL IS POSITIVE
sign convention:
f is positive for a convex lens (real focus) and negative for a concave lens
(virtual focus)
u is positive for a real object and negative for a virtual object
v is positive for a real image and negative for a virtual image
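The lens equation and the sign convention can be explored numerically. The sketch below solves the equation for v and reports the linear magnification as |v/u|; the function names and the focal length of 10 cm are assumptions chosen for illustration.

```python
import math


def image_distance(u: float, f: float) -> float:
    """Solve 1/f = 1/u + 1/v for v, using the real-is-positive convention."""
    inverse_v = 1.0 / f - 1.0 / u
    if inverse_v == 0.0:
        return math.inf  # object in the focal plane: image at infinity
    return 1.0 / inverse_v


def linear_magnification(u: float, v: float) -> float:
    """Size of the image relative to the object, |v / u|."""
    return abs(v / u)


F_CONVEX = 10.0  # assumed focal length in cm (positive for a convex lens)

for u in (30.0, 20.0, 15.0, 10.0, 5.0):
    v = image_distance(u, F_CONVEX)
    if math.isinf(v):
        nature = "at infinity"
    else:
        nature = "real" if v > 0 else "virtual"
    print(f"u = {u:5.1f} cm  v = {v:7.2f} cm  m = {linear_magnification(u, v):5.2f}  {nature}")

# A concave lens has a negative focal length and gives a virtual, diminished image.
v_concave = image_distance(30.0, -10.0)
print(f"concave lens: v = {v_concave:.2f} cm, m = {linear_magnification(30.0, v_concave):.2f}")
```

Working through the object distances in turn reproduces the pattern for a convex lens: diminished real images beyond 2f, magnified real images between f and 2f, and a magnified virtual image inside f.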
14.2.8 Virtual Image Formed by a Plane Mirror

When an object is placed in front of a plane mirror the reflected rays diverge
from an image point behind the mirror.
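[Figure: reflection at a plane mirror, showing the object at distance u in front of the mirror, the image at distance v behind it, and the normals at the points of reflection.]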
The two shaded triangles are similar so u = v and m = 1. The object and the
image lie on the same normal to the mirror.
Curved mirrors can be used like lenses to focus and project images. A suitably
shaped concave mirror has a real focus and a convex mirror has a virtual focus.
[Figure: parallel rays reflected to a real focus F by a concave mirror, and reflected as if from a virtual focus F by a convex mirror.]
Concave mirrors are used to focus microwaves and radio waves so that a detec-
tor placed at or near the focal point receives a strong signal (e.g., for a radio
telescope or a satellite TV dish). The way to think about curved mirrors is to
realize that the ray diagrams for a concave mirror are like those for a convex
lens, but with a reflection, and the ray diagrams for a convex mirror are like
those for a concave lens with a reflection.
If a concave mirror is actually part of a sphere, rays nearer the edge of the
mirror focus at a different position than rays near its center. This results in an
unfocused image and is an effect called spherical aberration. To avoid this the
correct shape for a concave mirror is a paraboloid.
14.2.9 Real and Apparent Depth

When you look down into a swimming pool the floor of the pool appears closer
to the surface than it actually is. This reduces the apparent depth of the water.
The reason for this is that when light refracts at the water surface it creates a
virtual image of the floor of the pool that is closer to the surface. If you look
directly down into the pool the ratio of the real depth to the apparent depth is
equal to the refractive index of water.
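For a ray that makes an angle α with the normal in air and an angle β with the
normal in water, the law of refraction gives:
$$\frac{\sin\alpha}{\sin\beta} = n_{water}$$
From the diagram, $\tan\alpha = x / h_{app}$ and $\tan\beta = x / h_{real}$. When looking almost
directly down both angles are small, so $\sin\alpha \approx \tan\alpha$ and $\sin\beta \approx \tan\beta$,
therefore:
$$n_{water} = \frac{h_{real}}{h_{app}}$$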
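[Figure: a ray from the object point at depth hreal refracting at the air-water surface a horizontal distance x from the vertical through the object; the refracted ray traces back to the image point at depth happ. Angles α (air) and β (water) are marked.]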
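The relationship between real and apparent depth can be checked numerically. The sketch below compares the simple ratio with the depth found by tracing a single refracted ray back to the vertical through the object, which is the construction used in the diagram. The refractive index of 1.33 for water and the 2.0 m depth are assumed values.

```python
import math


def apparent_depth_near_normal(real_depth: float, n: float) -> float:
    """Apparent depth when looking straight down: h_app = h_real / n."""
    return real_depth / n


def apparent_depth_traced(real_depth: float, n: float, alpha_deg: float) -> float:
    """Depth at which the refracted ray, traced back, crosses the vertical through the object.

    alpha_deg is the angle of the ray in air, measured from the normal.
    """
    alpha = math.radians(alpha_deg)
    beta = math.asin(math.sin(alpha) / n)   # law of refraction
    x = real_depth * math.tan(beta)         # horizontal offset at the surface
    return x / math.tan(alpha)


N_WATER = 1.33      # assumed refractive index of water
REAL_DEPTH = 2.0    # assumed pool depth in meters

print(f"near-normal estimate: {apparent_depth_near_normal(REAL_DEPTH, N_WATER):.3f} m")
for angle in (1.0, 20.0, 40.0):
    print(f"ray at {angle:4.1f} deg: {apparent_depth_traced(REAL_DEPTH, N_WATER, angle):.3f} m")
```

For rays close to the normal the two results agree closely, while at larger angles the traced value is noticeably smaller, which is why the simple ratio applies only when looking directly down.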
14.3 OPTICAL INSTRUMENTS

The behavior of an optical instrument (e.g., its magnifying power and the
nature and position of the image it produces) can be determined by ray tracing.
14.3.1 An Astronomical Refracting Telescope

Telescopes are used to observe distant objects, so the rays of light arriving at
the objective of the telescope from a particular point on the object are paral-
lel rays. The telescope is used to increase the visual angle, that is the angle
subtended by the image at the eye. It does this using two convex lenses. The
first, the objective lens, is a low-power convex lens that creates a real, inverted
image of the distant object on the objective focal plane inside the barrel of
the telescope. The second, the eyepiece lens, is a higher-power lens used as
a magnifying glass to magnify the intermediate image. In normal adjustment,
the telescope accepts parallel rays and parallel rays leave the eyepiece, there-
fore the distance from the first image to the eyepiece lens must equal the
eyepiece focal length. This makes the total distance between the two lenses
equal to the sum of the focal lengths of the lenses.
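[Figure: astronomical refracting telescope in normal adjustment. Rays from a point at the top of a distant object enter the objective lens at angle α, form an intermediate image at the common focal point F (a distance fo from the objective and fe from the eyepiece), and leave the eyepiece lens as parallel rays at angle β to the eye.]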
In the ray diagram above, the rays from the top of the object enter the objec-
tive at an angle α to the principal axis. Rays from the bottom of the object are
assumed to enter along the principal axis but are not shown in the diagram.
The angle α is therefore equal to the angle subtended by the object at the
objective and β is the angle subtended by the image at the eye:
α is the angle subtended by the distant object at the objective lens.
β is the visual angle: the angle subtended at the eye by the rays forming the
image.
The angular magnification M of the telescope is the ratio β/α.
Using similar triangles and alternate angles it is easy to see that:
$$\frac{\tan\beta}{\tan\alpha} = \frac{f_o}{f_e}$$
However, the angles involved are small so we can use the approximation
$\tan\theta \approx \theta$ (in radians) to obtain an expression for angular magnification:
$$M = \frac{\beta}{\alpha} = \frac{f_o}{f_e}$$
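A minimal sketch of these two results for a telescope in normal adjustment follows. The focal lengths (900 mm objective, 25 mm eyepiece) are assumed values typical of a small instrument, not figures from the text.

```python
def angular_magnification(f_objective: float, f_eyepiece: float) -> float:
    """M = fo / fe for a telescope in normal adjustment."""
    return f_objective / f_eyepiece


def lens_separation(f_objective: float, f_eyepiece: float) -> float:
    """Distance between the lenses in normal adjustment: fo + fe."""
    return f_objective + f_eyepiece


# Assumed focal lengths in millimeters.
F_OBJECTIVE_MM = 900.0
F_EYEPIECE_MM = 25.0

M = angular_magnification(F_OBJECTIVE_MM, F_EYEPIECE_MM)
length_mm = lens_separation(F_OBJECTIVE_MM, F_EYEPIECE_MM)
print(f"angular magnification = {M:.0f}, lens separation = {length_mm:.0f} mm")
```

Swapping in a shorter focal length eyepiece raises the magnification and shortens the telescope slightly, since only the eyepiece term in the separation changes.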
The image is, as can be seen in the diagram, inverted. For astronomical work,
this is not an issue. However, a terrestrial refracting telescope has a third con-
vex lens in between the intermediate image and the eyepiece and this lens
simply inverts the intermediate image so that the final image is erect.
14.3.2 An Astronomical Reflecting Telescope (Newtonian Telescope)

The amount of light captured by the objective of a telescope is directly
proportional to its area so the larger the diameter of an objective the greater
the sensitivity of the instrument. Astronomers need to observe very faint
objects so astronomical telescopes need large-diameter objectives. Very
large-diameter lenses are heavy, difficult to manufacture, and have the
additional problem of chromatic aberration, so large optical telescopes use
concave mirrors as their objective rather than convex lenses. This design was
first used by Isaac Newton and is often referred to as a Newtonian telescope.
The angular magnification is equal to fo/fe as for the refractor. The secondary
mirror does reduce the amount of light received by the primary but this is
a small problem compared to the difficulties of building a large achromatic
lens. The largest ground-based optical telescope is in La Palma on the Canary
Islands. It is the Gran Telescopio Canarias and the diameter of its objective
mirror is 10.4 m. However, the Giant Magellan Telescope is currently under
construction in Chile and when completed this will have a diameter of 24.5
m. The reflecting surface of these large telescopes is made up of several
individual reflectors rather than a single dish. The largest space telescope is the
James Webb Space Telescope, launched on Christmas Day 2021 with the first images
received in the summer of 2022.
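Because the light collected is proportional to the area of the objective, the gain from a larger mirror scales with the square of the diameter ratio. The sketch below compares the two diameters quoted above, treating each mirror as a complete disc, which is a simplifying assumption for segmented mirrors with gaps and a central obstruction.

```python
def light_gathering_ratio(diameter_large: float, diameter_small: float) -> float:
    """Ratio of collecting areas for two circular objectives: (D1 / D2) squared."""
    return (diameter_large / diameter_small) ** 2


# Diameters in meters, as quoted in the text.
ratio = light_gathering_ratio(24.5, 10.4)
print(f"the larger mirror collects about {ratio:.1f} times as much light")
```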
[Figure: Newtonian reflecting telescope, showing the objective (a concave mirror of focal length fo), the secondary mirror, and the eyepiece (a convex lens of focal length fe).]
Radio telescopes are also reflecting telescopes, using a large concave dish to
reflect radio waves to a detector. Radio waves have much longer wavelengths
than light so radio telescopes must have much larger diameters than optical
telescopes to achieve useful resolution (the resolution of a telescope depends on
the ratio of wavelength to diameter; see Section 15.3.4). The world's largest
single-dish radio telescope is the Tianyan (Heavenly Eye) telescope in Guizhou
Province, China. Its objective reflector has a diameter of 500 m and is built into
a natural depression in the landscape. The signal from a radio telescope is
detected by placing a receiver close to the focal point of the objective mirror.
14.3.3 A Compound Microscope

A compound microscope uses two convex lenses. The objective lens creates a
real magnified inverted intermediate image and the eyepiece lens is used like
a magnifying glass to make this even larger. It creates a final magnified virtual
image.
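[Figure: compound microscope. The object lies just beyond the focal point of the objective lens (focal length fo), which forms a real intermediate image inside the focal length fe of the eyepiece lens; the eye sees an inverted, magnified, virtual final image.]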
14.4 THE DOPPLER EFFECT

When a source moves relative to an observer or vice versa, the wavelength and
frequency of the waves received by the observer change. This is called the
Doppler effect and it occurs for any type of wave. An everyday example of this
is the change in frequency of a siren as an emergency vehicle approaches and
then moves past us at speed. As it approaches we hear a higher frequency and
when it moves away we hear a lower frequency than the frequency we would
hear if the vehicle were at rest.
[Figure: frequency heard by a stationary observer plotted against time. The frequency is above the at-rest frequency f0 while the siren is approaching and below it while the siren is receding.]
Here we will restrict ourselves to electromagnetic waves. For relative velocities
significantly lower than the speed of light the Doppler effect for electromagnetic
waves can be analyzed very simply. This is because while relative
velocity affects wavelength and frequency it has no effect on the velocity of
the waves, since this is independent of the velocity of the source or observer.
When relative speeds are comparable to the speed of light there
is an additional relativistic effect to take into account: the frequency of the
source, as measured by the observer, changes as a result of time dilation.
14.4.1 The Doppler Effect for Electromagnetic Waves

Consider a source that emits waves of frequency $f_0$ and wavelength $\lambda_0$ as
measured by an observer at rest with respect to the source. If the source
approaches an observer with velocity $v$ it will move a distance $v/f_0$ toward
the observer in the time taken to emit one complete wave. This means that the
next wave front is emitted from a point $\lambda_0 - v/f_0$ behind the first, so the
wavelength has been reduced by $v/f_0$. However, $f_0 = c/\lambda_0$, so $v/f_0 = (v/c)\lambda_0$;
the new wavelength is $\lambda' = (1 - v/c)\lambda_0$ and the change in wavelength is
$\Delta\lambda = -(v/c)\lambda_0$. It is clear from this that when the source recedes at
velocity $v$ the change in wavelength is of the same magnitude but opposite sign,
that is, an increase in wavelength.
This change in wavelength is often referred to as a Doppler shift.
Doppler shifts:
source and observer approaching with relative velocity $v$: $\Delta\lambda = -\frac{v}{c}\lambda_0$
source and observer receding with relative velocity $v$: $\Delta\lambda = +\frac{v}{c}\lambda_0$
These shifts in wavelength are accompanied by changes in frequency, with
the frequency increasing for approach and decreasing for recession. The new
frequency $f'$ can be calculated by using the fact that the velocity of light is a
constant so that $f' = c/\lambda'$.
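The low-speed Doppler relations can be sketched in code as follows. The sign convention (positive speed for approach) is a choice made for this example, and the rest wavelength of 656.3 nm and the speed of $3.0 \times 10^{5}\ \mathrm{m\,s^{-1}}$ are assumed illustrative values.

```python
C = 299_792_458.0  # speed of light in m/s


def doppler_shift(lambda0_m: float, v_approach_m_s: float) -> float:
    """Change in wavelength; v_approach is positive for approach, negative for recession."""
    return -(v_approach_m_s / C) * lambda0_m


def observed_frequency(lambda0_m: float, v_approach_m_s: float) -> float:
    """f' = c / lambda', where lambda' = lambda0 + shift."""
    return C / (lambda0_m + doppler_shift(lambda0_m, v_approach_m_s))


LAMBDA0 = 656.3e-9   # assumed rest wavelength in meters (a red hydrogen line)
SPEED = 3.0e5        # assumed relative speed in m/s, about 0.1% of c

for label, v in (("approaching", SPEED), ("receding", -SPEED)):
    shift_nm = doppler_shift(LAMBDA0, v) * 1e9
    print(f"{label}: shift = {shift_nm:+.3f} nm, f' = {observed_frequency(LAMBDA0, v):.4e} Hz")
```

The two shifts have equal size and opposite sign, and the observed frequency moves the opposite way to the wavelength. The approximation is only reliable when the speed is a small fraction of c.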
Doppler shift on reflection

When waves reflect from a moving object the Doppler shift is doubled. The
arriving waves are Doppler shifted and then these Doppler shifted waves are
effectively emitted from a moving source and so are Doppler shifted for a
second time. For an object approaching at velocity $v$:
$$\Delta\lambda_{reflection} = -2\frac{v}{c}\lambda_0$$
The Doppler effect is often used to measure the velocity of a remote object.
Doppler radar can be used to work out the velocity and rotation rate of an
asteroid. The Doppler effect is used by the police in radar speed guns to
detect speeding motorists. Astronomers can use the Doppler shifts of radio
waves in the spiral arms of the Milky Way to work out the rotation rate of our
own galaxy. Doctors use Doppler ultrasound to measure the rate of flow of
blood inside capillaries.
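As a sketch of how a speed is recovered from a reflected signal, the doubled Doppler shift can be inverted to give the speed of the reflecting object. The fractional shift of $2.0 \times 10^{-7}$ is a hypothetical measurement chosen for illustration.

```python
C = 299_792_458.0  # speed of light in m/s


def speed_from_reflected_shift(fractional_shift: float) -> float:
    """Speed of a reflector from |change in wavelength| / wavelength of the reflected wave.

    Reflection doubles the Doppler shift, so v = (fractional shift) * c / 2.
    """
    return fractional_shift * C / 2


# Hypothetical measurement: the reflected wavelength differs by 2 parts in 10 million.
target_speed = speed_from_reflected_shift(2.0e-7)
print(f"target speed = {target_speed:.1f} m/s")
```

The shifts involved at everyday speeds are tiny, so in practice they are measured by comparing the reflected wave with the transmitted one rather than by measuring either wavelength directly.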
14.4.2 “Red Shift” and “Blue Shift”

When visible light is Doppler shifted the change in wavelength causes a change
in color. If the source is moving away from us then the received wavelength
increases and the observed color moves toward the red (longer wavelength) end
of the spectrum. This is called a red-shift. However, we must be careful – if the
shift takes the wavelength beyond the red end of the spectrum then it actually
moves away from red into the infra-red. A red-shift is always a move toward
longer wavelengths but is not always a move toward the red end of the spec-
trum, nor does it have to involve red light.
The red-shift z is defined as:
$$z = \frac{\lambda' - \lambda_0}{\lambda_0} = \frac{v}{c}$$
where $\lambda'$ is the longer wavelength received by the observer, $\lambda_0$ is the
wavelength of light that would be observed by an observer at rest with respect to
the source and $v$ is the speed of recession.
In a similar way, when a source moves toward us, the light we receive is
said to have been "blue-shifted" (moved to shorter wavelengths).
In the 1920s, the astronomers Edwin Hubble and Vesto Slipher discovered
that the spectral lines in light received from distant galaxies are all red-shifted.
The implication is that distant galaxies are all moving away from us and from
each other. In addition to this Hubble showed that the red-shift is directly
proportional to the distance of the galaxy, so that very distant galaxies are mov-
ing away from us more rapidly than closer galaxies. Since red-shift is directly
proportional to recession velocity this discovery showed that recession veloc-
ity v is directly proportional to distance d. This is called Hubble’s law:
$$v = H_0 d$$
where $H_0$ is the Hubble constant.

This discovery led to the idea of the expanding universe (see section 28.3).
This was eventually explained by Einstein’s general theory of relativity, which
reinterpreted the galactic red-shifts. According to Einstein’s model, the rea-
son for the red shifts is not that galaxies are flying apart in pre-existing space
but that space itself is expanding so that the separation between the galaxies
increases. Either way, we can use the red-shift to calculate the speed at which
any particular galaxy is receding from us.
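The chain from red-shift to recession speed to distance can be sketched as follows. The red-shift of 0.010 is an assumed example value, the Hubble constant is the value quoted in this chapter's exercises, and the conversion factor for a light year is a standard figure supplied here. The relation z = v/c is only reliable when z is much less than 1.

```python
C = 299_792_458.0               # speed of light in m/s
H0 = 2.2e-18                    # Hubble constant in s^-1 (value used in the exercises)
METERS_PER_LIGHT_YEAR = 9.461e15


def recession_speed(z: float) -> float:
    """v = z * c, valid for small red-shifts."""
    return z * C


def hubble_distance(z: float, h0: float = H0) -> float:
    """d = v / H0, from Hubble's law."""
    return recession_speed(z) / h0


z_example = 0.010  # assumed red-shift
v = recession_speed(z_example)
d = hubble_distance(z_example)
print(f"v = {v:.3e} m/s, d = {d:.2e} m ({d / METERS_PER_LIGHT_YEAR:.2e} light years)")
```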
In 1964, the American radio astronomers Arno Penzias and Robert Wilson
discovered that the Universe is bathed in low intensity microwaves with an
almost perfect black body radiation spectrum. These are believed to be an
echo of the Big Bang. According to the Big Bang theory the early Universe
was filled with high-energy short wavelength gamma-radiation, but in the past
13.7 billion years of expansion this has been hugely red-shifted and is now
present as the cosmic microwave background radiation.
14.5 EXERCISES
1. The distance to the Moon has been measured by reflecting light from
arrays of mirrors left on the Moon’s surface by Apollo astronauts. The mir-
rors are arranged as corner reflectors that reflect incoming beams back on
themselves. However, the spreading of the beam means that only about
1 in $10^{17}$ photons leaving Earth are detected in the reflection! The
uncertainty in the measurements of distance is of the order of 1 cm.
The speed of light is: 299 792 458 $\mathrm{m\,s^{-1}}$
The distance to the Moon is (on average) 385 000.6 km.
(a) Calculate the time taken for light to make the return trip to the Moon.
(b) How precisely must this time be measured in order to achieve a distance accuracy of ± 1 cm?
(c) Why is it impossible to prevent the beam from spreading?
2. A magnifying glass of focal length 4.0 cm is used to form a magnified
image of a small insect placed 3.0 cm from the lens.
(a) Draw a ray diagram to show how the magnified image is formed.
(b) Calculate the linear magnification of the image.
(c) Describe and explain what happens to the image as the magnifying glass is slowly moved away from the insect.
3. Complete the table below, which refers to a convex lens of focal length
20 cm.
Object distance Image distance Linear magnification Nature of image (real/virtual)
10 cm
20 cm
30 cm
40 cm
infinity
4. Complete the table below, which refers to a concave lens of focal length
20 cm.
Object distance Image distance Linear magnification Nature of image (real/virtual)
10 cm
20 cm
30 cm
infinity
5. An astronomical refracting telescope has an objective lens of diameter
5.0 cm and focal length 120 cm. It is in normal adjustment with an
eyepiece lens of focal length 4.0 cm.
(a) What is the magnifying power of the telescope?
(b) The eyepiece is changed for one of focal length 6.0 cm and readjusted so that it is in normal adjustment with the new lens. How does this affect the magnifying power and the length of the telescope?
6. (a) A policeman with a radar gun measures a Doppler shift of 83 parts in a billion for radio waves reflected from an approaching car. How fast is the car moving?
(b) An astronomer measures a red-shift for a distant galaxy of 0.025. How far away is the galaxy? The Hubble constant $H_0 = 2.2 \times 10^{-18}\ \mathrm{s^{-1}}$.
7. Suggest an experimental method, based on the Doppler effect, that could
be used to measure the rotation period of the Sun.
CHAPTER
15
Superposition Effects
15.0 SUPERPOSITION EFFECTS

Superposition is what happens when two or more waves are present at
the same point. This might occur because they originate from different
sources or because of reflection. If the waves are of the same type the
resultant disturbance at that point is the vector sum of the disturbances
from each individual wave. This is called the “principle of superposition.”
We can determine the effects of superposition graphically, by adding pha-
sors or by calculation. Many important phenomena are linked to superposi-
tion including interference and diffraction and the formation of standing
(stationary) waves.
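The phasor approach mentioned above can be sketched with complex numbers, where each wave is represented by a phasor of length equal to its amplitude and angle equal to its phase. This is a minimal illustration for waves of the same frequency; the function name and the example amplitudes are assumptions.

```python
import cmath
import math


def resultant_amplitude(amplitudes, phases_rad):
    """Add phasors A * exp(i * phase) and return the amplitude of the resultant."""
    total = sum(a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases_rad))
    return abs(total)


# Two waves of equal amplitude with different phase differences between them.
for label, phase in (("in phase", 0.0), ("quarter cycle apart", math.pi / 2), ("antiphase", math.pi)):
    amplitude = resultant_amplitude([1.0, 1.0], [0.0, phase])
    print(f"{label}: resultant amplitude = {amplitude:.3f}")
```

Waves in phase give twice the individual amplitude, waves in antiphase cancel, and intermediate phase differences give intermediate results.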
15.1 TWO-SOURCE INTERFERENCE

If waves of the same type with equal wavelength, frequency, and amplitude
are emitted from two sources placed a short distance apart (comparable to a
few wavelengths) then a regular interference pattern is formed. The pattern
consists of regions where the waves reinforce to produce maximum intensity
(constructive interference) and regions where they cancel to produce mini-
mum intensity (destructive interference).
The famous double slit experiment carried out by Thomas Young in 1801
provided strong evidence for the wave nature of light and enabled Young to
calculate its wavelength. A similar setup with sound can be used to
demonstrate superposition patterns and, in a modified form, to create
noise-canceling headphones.
[Figure: waves from Source 1 and Source 2 superpose to give alternating maxima and minima on a screen or detection plane.]
In order for stable clear interference effects to be created the two sources
must be coherent.
Coherent sources maintain a constant phase relationship.
This means that they must be the same type of wave, and have the same wave-
length and frequency. The sources do not have to be in phase but the phase
difference between them must be constant. They must also have comparable
amplitudes; if one wave has a much greater amplitude than the other then
variations in intensity will be hard to detect.
15.1.1 Demonstrating Superposition Effects with Sound

If two loudspeakers are connected to the same signal generator they will
emit sound waves with the same amplitude, wavelength, and frequency, so an
interference pattern will be formed where they superpose. The pattern can be
explored by moving a microphone connected to an oscilloscope through the
region of superposition. If this is done on a large enough scale it is possible to
walk through the superposition region and actually hear the variation in sound
intensity.
In the experiment above the microphone is moved along the line AB and
it detects a sequence of maxima and minima. When the microphone is in
any position the contributions from each speaker can be displayed separately
by switching each one off in turn. The resultant disturbance depends on the
phase difference between the waves as they reach the microphone. This
depends on the path difference Δx = S2M − S1M. A path difference of one
whole wavelength corresponds to a phase difference of 2π radians, so the
relationship between path difference and phase difference is:
$$\Delta\phi = \frac{2\pi\,\Delta x}{\lambda}$$
[Figure: two loudspeakers connected to a signal generator; microphone M connected to an oscilloscope.]
For a maximum the waves must reach the microphone in phase; this occurs
when the path difference is a whole number of wavelengths. For a minimum,
they must arrive in antiphase (an odd multiple of π phase difference); this
happens when the path difference is an odd number of half wavelengths.
There will be a maximum at the center of the pattern because S2M = S1M
here so the path difference and phase difference are both zero. Minima will
occur when Δx = λ/2, 3λ/2, 5λ/2, etc. and these minima will be positioned
symmetrically about the central maximum. In between the minima there will be
further maxima when Δx = λ, 2λ, 3λ, etc. The intensity between maxima will
vary continuously from maximum to zero and then back to a maximum. The
path difference, phase difference, and effects at maxima and minima are
tabulated below.
Path difference    Phase difference (rad)    Resultant intensity
0                  0                         Central maximum
λ/2                π                         Minimum
λ                  2π                        Maximum
3λ/2               3π                        Minimum
2λ                 4π                        Maximum
5λ/2               5π                        Minimum
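The relationship between path difference and phase difference can be checked with a short calculation. The following is a minimal sketch in Python, assuming two sources that emit in phase; the function names and the 0.20 m wavelength are illustrative choices, not values from the text.

```python
import math

def phase_difference(path_difference, wavelength):
    """Phase difference in radians for a given path difference."""
    return 2 * math.pi * path_difference / wavelength

def classify(path_difference, wavelength, tol=1e-9):
    """Return 'maximum', 'minimum', or 'intermediate' for two in-phase sources."""
    cycles = path_difference / wavelength      # path difference in wavelengths
    frac = cycles - math.floor(cycles)         # fractional part, 0 <= frac < 1
    if min(frac, 1 - frac) < tol:              # whole number of wavelengths
        return "maximum"
    if abs(frac - 0.5) < tol:                  # odd number of half wavelengths
        return "minimum"
    return "intermediate"

wavelength = 0.20  # metres (assumed example value)
for multiple in (0, 0.5, 1, 1.5, 2, 2.5):
    dx = multiple * wavelength
    phase_in_pi = phase_difference(dx, wavelength) / math.pi
    print(f"{multiple} wavelengths: {phase_in_pi:.1f} pi rad, {classify(dx, wavelength)}")
```

The loop steps through the same path differences as the table, so its output can be compared with the table row by row.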
15.1.2 Demonstrating Superposition Effects with Light

Light from a filament lamp is polychromatic and incoherent, producing short
wave trains from different atoms in the filament so these sources are not well
suited to producing interference effects. When incoherent light superposes
the interference effects average out so that we simply add intensities and do
not get a pattern of maxima and minima. An example of this is the pool of light
where a car’s two headlamp beams overlap.
Lasers are ideal because they are intense monochromatic sources. However,
the way in which laser light is produced means that the wave trains only remain
coherent for a short time. If we wish to demonstrate interference it is best to
derive the two light sources from a single beam. This can be done by diffract-
ing the beam with a single slit and then letting the light fall onto a double slit.
[Figure: a laser beam passes through a single slit and then a double slit; the light from the two slits superposes to form an interference pattern on a screen.]
The diagram above has exaggerated the angle of spread from each slit. In
practice the angles are small and this allows us to use the small angle approxi-
mation when deriving the relationship between the wavelength of the light
and the structure of the pattern. It is also important to realize that the slit
separation is very small compared to the distance between the double slits
and the screen.
Here is an image of an actual double slit pattern formed using a green laser pen.
The small bright vertical patches are interference maxima (sometimes called
interference fringes). Notice that the pattern itself fades in and out. This is a
secondary effect caused by the diffraction pattern from individual slits rather
than the interference pattern between different slits.
15.1.3 Using the Double Slit Experiment to Find the Wavelength of Light
Young’s double slit experiment was used to measure the wavelength of visible
light. We can derive an equation that relates the wavelength of light to three
parameters: the separation of the double slits s, the distance from the double
slits to the screen d, and the separation of adjacent maxima in the interference
pattern y (fringe separation).
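[Figure: double slits S1 and S2, separation s, at distance d from the screen or detection plane. O is midway between the slits, P is the first maximum a distance y above the central maximum, and T lies on the line from S2 to P.]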
P is the position of the first maximum above the central maximum. A line
drawn from O, half-way between the two slits, to P, makes an angle θ with the
central axis. If a line is now drawn from S1 perpendicular to OP, meeting S2P
at T, then distances S1P and TP are very nearly equal. The distance S2T is
therefore equal to the path difference and this must be equal to one wavelength:
$$\sin\theta = \frac{y}{d} = \frac{S_2T}{s} = \frac{\lambda}{s}$$
Rearranging:
$$\lambda = \frac{sy}{d}$$
The wavelength of light can be determined by measuring the slit separation,

the fringe separation, and the distance to the screen.
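As a rough illustration of how the three measurements combine, here is a minimal Python sketch of the double slit relation under the small angle approximation. The function names and the measured values are assumptions chosen for illustration; they are not taken from the text.

```python
def wavelength_from_fringes(slit_separation, fringe_separation, screen_distance):
    """lambda = s * y / d (small angle approximation); all lengths in metres."""
    return slit_separation * fringe_separation / screen_distance

def fringe_separation_from_wavelength(wavelength, slit_separation, screen_distance):
    """Inverse relation: y = lambda * d / s."""
    return wavelength * screen_distance / slit_separation

s = 0.25e-3   # slit separation in metres (assumed)
d = 1.50      # distance from double slits to screen in metres (assumed)
y = 3.2e-3    # measured fringe separation in metres (assumed)

lam = wavelength_from_fringes(s, y, d)
print(f"wavelength = {lam * 1e9:.0f} nm")
```

With these assumed readings the result is a little over 530 nm, in the green part of the spectrum. In practice y is best found by measuring across several fringes and dividing by the number of spacings.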
15.1.4 Superposition of Harmonic Waves

When two harmonic waves superpose the resultant effect is the sum of two
sinusoidal oscillations. If the two waves have the same amplitude this sum can
be represented by adding two sine or cosine functions, y1 and y2 to give the
resulting disturbance Y:
$$y_1 = A\sin(\omega t)$$
$$y_2 = A\sin(\omega t + \phi)$$
$$Y = y_1 + y_2 = A\{\sin(\omega t) + \sin(\omega t + \phi)\}$$

where φ is the phase difference between the two oscillations. This will depend
on the path difference x between waves from the two sources.
This can be simplified using a well-known trigonometric identity:
$$\sin R + \sin T = 2\sin\left(\frac{R+T}{2}\right)\cos\left(\frac{R-T}{2}\right)$$
to give:
$$Y = 2A\sin\left(\omega t + \frac{\phi}{2}\right)\cos\left(\frac{\phi}{2}\right)$$
This is most easily interpreted by grouping it differently so that there is a
phase-dependent amplitude multiplied by a sinusoidal oscillation:
$$Y = \left[2A\cos\left(\frac{\phi}{2}\right)\right]\sin\left(\omega t + \frac{\phi}{2}\right)$$
The term in square brackets represents the amplitude. When φ = 0, 2π, 4π,
etc., the cosine term will have magnitude one, representing a maximum in the
interference pattern, and the amplitude will be 2A. When φ = π, 3π, 5π, etc.,
it will be zero, representing a minimum.
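The amplitude factor 2A cos(φ/2) can be checked numerically. The sketch below, written in Python with illustrative function names, compares the formula with the largest displacement found by sampling the summed oscillation over one cycle. The amplitude of 2 units is chosen to match the graphs discussed next.

```python
import math

def resultant_amplitude(A, phi):
    """Amplitude of A sin(wt) + A sin(wt + phi), from |2A cos(phi/2)|."""
    return abs(2 * A * math.cos(phi / 2))

def numerical_amplitude(A, phi, samples=10000):
    """Largest |Y| found by sampling one full cycle of the summed oscillation."""
    peak = 0.0
    for i in range(samples):
        wt = 2 * math.pi * i / samples
        Y = A * math.sin(wt) + A * math.sin(wt + phi)
        peak = max(peak, abs(Y))
    return peak

for phi in (0, math.pi / 2, math.pi):
    formula = resultant_amplitude(2, phi)
    sampled = numerical_amplitude(2, phi)
    print(f"phi = {phi:.3f} rad: formula {formula:.3f}, sampled {sampled:.3f}")
```

The two columns should agree to within the sampling resolution for any phase difference, not just the three shown.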
The graphs below show the resultant oscillations when two waves, each of
amplitude 2 units, combine with various phase differences. The two broken
lines represent oscillations from waves 1 and 2 while the solid line is the sum of these.
[Figure: three graphs of displacement Y against time. Phase difference = 0: the lines for waves 1 and 2 are on top of each other and the resultant amplitude is 2 + 2 = 4. Phase difference = π: the lines for waves 1 and 2 are in antiphase and the resultant amplitude is 0. Phase difference = π/2: the lines for waves 1 and 2 are π/2 out of phase and the resultant amplitude is 2 × 2 cos(π/4) = 2√2.]
Another way to determine the resultant of two superposed waves is to add
phasors representing each wave. This approach is illustrated below for the
final case where the amplitude of each wave is 2 units and the phase
difference is π/2.
Phasors are particularly helpful for visualizing how two or more waves will
superpose.
[Figure: phasor diagram. The phasors for wave 1 and wave 2, each of length 2, are at an angle of π/2 to each other; the resultant phasor has length 2√2 and lies at π/4 to each.]
15.2 DIFFRACTION GRATINGS

A transmission grating consists of a large number of parallel equally spaced
narrow slits. When light passes through the grating it is diffracted by each
slit and light from a large number of sources overlaps to create a pattern of
interference (often referred to as a diffraction pattern). Diffraction gratings
produce intense well-separated and sharply defined maxima and are useful
for analyzing the spectrum of light sources. This is called spectroscopy. The
typical arrangement of apparatus used to measure wavelengths of light is
shown below.
[Figure: a laser beam incident on a diffraction grating produces a central (zeroth-order) maximum with first-order and second-order maxima on either side.]
Here is an image of an actual experimental set-up using a green laser pen as
the source.
[Figure: photograph showing the laser pen, the diffraction grating, and the diffraction maxima (orders of diffraction) on the screen.]
15.2.1 The Diffraction Grating Formula

In practice, the slit separation on a diffraction grating is very small compared
to the distance from the grating to the screen, so that rays leaving adjacent
slits and superposing at a point on the screen effectively travel parallel to one
another. This simplifies the analysis because we can assume the rays are par-
allel as they leave the grating. In the analysis below we will consider parallel
rays travelling at an angle θ to the normal to the grating and superposing at a
point P on the screen. The first diagram shows the large-scale situation and
the second diagram is a highly magnified picture of some of the rays leaving
slits in the grating.
[Figure: a laser beam incident on a diffraction grating with N lines per meter and slit separation d; O marks the central (zeroth-order) maximum on a screen at distance L.]
Often the grating is described by the number of lines per meter (or per mil-
limeter) N, rather than the slit separation d. These are related by d = 1/N.
The path difference between the ray from S1 and the ray from S2 is S2Q = d
sin θ.
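[Figure: magnified view of adjacent slits S1, S2, and S3, each separated by d. Parallel rays leave at angle θ and will superpose at point P on the screen; Q and R are marked on the rays from S2 and S3.]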
It is clear from the diagram that the path difference between the ray from
S1 and the ray from S3 is just double this, and that the path difference to a
slit m times further away is m times greater. It follows that if the rays leaving
adjacent slits are in phase, then ALL rays across the entire grating will be in
phase at that angle and an intense maximum will be created. These maxima
are called orders of diffraction.
For rays from adjacent slits to be in phase the path difference between them
must be an integer number of wavelengths. The condition for a maximum is
therefore:
$$\sin\theta = \frac{n\lambda}{d} \quad\text{or}\quad n\lambda = d\sin\theta$$
where n is the order of diffraction (equal to the number of wavelengths path
difference between rays from adjacent slits).
This is the diffraction grating equation. If we know d and can measure θ we
can calculate λ.
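The grating equation can also be used the other way round, to predict where the orders will appear for a known wavelength. The following Python sketch assumes a grating specified in lines per millimeter and a 532 nm source; both values and the function name are illustrative assumptions. It lists every order for which sin θ does not exceed 1.

```python
import math

def grating_angles(lines_per_mm, wavelength):
    """Return {n: angle in degrees} for every order with n*lambda/d <= 1."""
    d = 1e-3 / lines_per_mm          # slit separation in metres, from d = 1/N
    angles = {}
    n = 0
    while n * wavelength / d <= 1:   # sin(theta) cannot exceed 1
        angles[n] = math.degrees(math.asin(n * wavelength / d))
        n += 1
    return angles

angles = grating_angles(300, 532e-9)   # assumed: 300 lines/mm, 532 nm
for n, theta in angles.items():
    print(f"order {n}: {theta:.1f} degrees")
```

For these assumed values d/λ is a little over 6, so the loop stops after the sixth order.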
Intensity of Maxima
Intensity I is proportional to amplitude-squared (I ∝ A²) so in the double slit
experiment the maxima will have double the amplitude and four times the
intensity of the light from a single slit. When we use a diffraction grating
with N illuminated slits, all N contribute to each maximum so the amplitude is
N times greater and the intensity is N² times greater. This makes the maxima
much more intense than in the double-slit experiment.
Sharpness of Maxima
In the double slit experiment, the intensity varies gradually between one
maximum and the next minimum so that the maxima are not sharp. This
is because the path difference between adjacent slits varies slowly as we
move away from a maximum position. This is also true for adjacent slits in
the diffraction grating, but the resultant effect on the screen is the sum of
waves from slits all the way across the grating, and even a small displace-
ment from the maximum position results in a large change in path differ-
ence from a distant slit. The result is that the sum of rays across the grating
falls to zero very rapidly as we move away from each maximum. This is
best illustrated using phasors. The diagrams below compare the effect of
moving the same short distance from a maximum for both the double slit
and for a grating with just eight slits (in practice gratings have many more
than this).
As the angle moves away from a diffraction grating maximum the phasors
from slits across the width of the grating curl up more and more so that the
maximum itself is sharp and has a series of closely packed secondary maxima
on either side of it. The more slits the sharper the maximum. This increases
the precision with which wavelengths can be measured.
[Figure: phasor diagrams. Double slit: two phasors have a small phase difference so the resultant is large. Grating: eight phasors “curl up” and the resultant is zero.]
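The phasor argument can be reproduced by adding complex numbers of unit length. This is a minimal Python sketch; the function name and the choice of a phase step of 2π/8 between adjacent slits are assumptions made so that eight phasors close into a complete polygon.

```python
import cmath
import math

def resultant(num_slits, phase_step):
    """Magnitude of the sum of unit phasors, each advanced by phase_step."""
    total = sum(cmath.exp(1j * k * phase_step) for k in range(num_slits))
    return abs(total)

step = 2 * math.pi / 8   # phase difference between adjacent slits (assumed)

print(f"double slit: {resultant(2, step):.3f} out of a maximum of 2")
print(f"eight slits: {resultant(8, step):.3f} out of a maximum of 8")
```

For the same small phase step the two-slit resultant is still more than 90% of its maximum, while the eight-slit resultant has already fallen to zero (to within rounding error).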
Number of Orders
The maximum possible path difference between rays leaving adjacent slits
occurs when the rays leave parallel to the grating surface (θ = 90°) and is
then equal to the slit separation d. The maximum value of n is therefore d/λ.
However, n can only be an integer so the maximum number of orders must be
the largest integer less than this; for example, if d/λ = 5.7 then 5 orders of dif-
fraction could be formed on either side of the central maximum. In practice,
these might not all be visible. The intensity of light diffracted at large angles
might be too low for them to be seen or they might correspond to missing
orders where there is a minimum of the single slit diffraction pattern (see
Section 15.3.1).
15.2.2 Spectroscopy
Spectroscopy is the analysis of electromagnetic radiation to identify the
wavelengths present in a source. This is one of the most important experi-
mental techniques in science and is used in a vast range of different areas of
research, from cosmology to atomic physics.
[Figure: a continuous spectrum, a line emission spectrum, and a line absorption spectrum, each shown with wavelength increasing from violet to red.]
When atoms or molecules are excited to higher energy states the electrons
can then return to lower energy states by making quantum jumps and emit-
ting photons. The wavelength of the emitted photon is related to the energy
change by the equation λ = hc/E (see Section 27.2.1) so the greater the energy
jump the shorter the wavelength. For isolated atoms, the energy levels are
very distinct so there is a discrete set of emitted wavelengths forming a “line
emission spectrum.” The spectrum of each atom is unique so an analysis of
its spectrum can be used to identify the types of atom present in the source
(e.g., a distant star). Atoms in molecules interact producing a more complex
spectrum that includes “bands” corresponding to small allowed ranges of
energy. The atoms in solid materials are packed closely together and electrons
occupy wide energy bands so the characteristic emission from a hot solid is a
“continuous spectrum.” When electromagnetic radiation is absorbed by atoms
or molecules in lower energy states photons whose energies correspond to

allowed quantum jumps can be absorbed and electrons are excited to higher
energy states. This removes certain wavelengths from the radiation so that the
transmitted radiation will contain dark lines or bands called an “absorption
spectrum.” The wavelengths missing in the absorption spectrum are the same
ones that would be present in the emission spectrum from the same element
or compound.
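The link between the size of a quantum jump and the wavelength of the emitted photon, λ = hc/E, can be illustrated with a short calculation. The Python sketch below uses rounded values of the constants, and the 1.89 eV energy change (roughly that of the red line of hydrogen) is an assumed example, not a value from this section.

```python
H = 6.626e-34    # Planck constant in J s (rounded)
C = 2.998e8      # speed of light in m/s (rounded)
EV = 1.602e-19   # one electronvolt in joules (rounded)

def photon_wavelength(energy_ev):
    """Wavelength in metres of a photon emitted in a jump of energy_ev electronvolts."""
    return H * C / (energy_ev * EV)

for energy in (1.89, 2.55, 3.0):   # assumed example energy changes in eV
    print(f"{energy} eV -> {photon_wavelength(energy) * 1e9:.0f} nm")
```

The larger the energy change, the shorter the wavelength printed, as the text describes.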
A typical experimental set-up to analyze light from a source (e.g., a laser or a
discharge lamp) is shown below. The diagram shows the measurements that
would need to be taken to calculate one wavelength present in the spectrum.
If there are several then each wavelength would produce its own set of orders
and would need to be measured separately.
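[Figure: light from the source passes through a slit and then a diffraction grating with slit separation d. On a screen at distance L, the first maximum above center and the first maximum below center each lie a distance x from the central maximum, at angle θ1.]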
The angle θ1 can be found by measuring x and L with a ruler and then using
tan θ1 = x/L.
In practice, it is best to use an average value for θ1 by measuring the first
order on both sides of the central maximum. This helps to reduce errors due
to alignment. The wavelength is then calculated using:
$$\lambda = d\sin\theta_1$$
If several orders are visible each one can be used and the wavelength can be
calculated from:
$$\lambda = \frac{d\sin\theta_n}{n}$$
Another approach is to plot sin θn against n. The gradient of this graph is λ/d.
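The measurement procedure can be sketched in a few lines of Python. The readings below (screen distance, first-order positions on each side, and a 300 lines per millimeter grating) are assumed example values, and the function name is illustrative.

```python
import math

def wavelength_from_angle(theta, lines_per_mm, order=1):
    """lambda = d sin(theta_n) / n, with d = 1/N converted to metres."""
    d = 1e-3 / lines_per_mm
    return d * math.sin(theta) / order

# Assumed example readings (not from the text).
L = 1.000                          # grating-to-screen distance in metres
x_above, x_below = 0.161, 0.163    # first-order positions in metres

# Average the two first-order angles to reduce alignment error.
theta_1 = (math.atan(x_above / L) + math.atan(x_below / L)) / 2
lam = wavelength_from_angle(theta_1, lines_per_mm=300)
print(f"wavelength = {lam * 1e9:.0f} nm")
```

Note that the angle is found from tan θ = x/L but the grating equation uses sin θ; the two only coincide for small angles, so the sketch keeps them separate.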
15.2.3 Spectrometers
For more precise spectroscopy a specialized instrument called a spectrometer
must be used. A traditional spectrometer consists of a collimating tube, which
ensures that rays from the source arrive along a normal to the grating, and a
telescope that is used to detect the orders of diffraction. These are all mounted
on a rotating base so that the angle between the normal and each order can be
measured from a Vernier scale engraved on the base.
[Figure: spectrometer, showing the source, adjustable width slit, collimator, diffraction grating, and base with Vernier scale for angle.]
The eyepiece of the telescope has a built-in cross-hair so that the position of
the image can be found precisely. The observer sees a fine vertical line in each
wavelength at each order of diffraction. For the greatest precision, the col-
limator slit must be adjusted so that it is very narrow.
Setting up a spectrometer requires great care to ensure that the rays leaving
the collimator are parallel and the diffraction grating is in the plane perpen-
dicular to these rays.
Digital Spectrometers
Digital spectrometers are used to display spectra and to measure wavelengths
but are rarely able to provide the level of precision achievable with a tradi-
tional set-up. Most simple digital spectrometers use an optical fiber to direct
light into the device where it then falls onto a reflection grating. This consists
of a large number of parallel reflecting lines (like on the surface of a CD)
and produces orders of diffraction by reflection. The light is detected by a
CCD detector and the position on the detector corresponds to wavelength so
this can then be recorded digitally and used to generate a graph of intensity
against wavelength.
15.3 DIFFRACTION BY SLITS AND HOLES

When waves pass through a small gap or past the edge of an object they
spread out or diffract. This results in superposition effects that create inter-
ference patterns. For slits and holes, the amount of diffraction depends on
the ratio of the wavelength λ of the wave to the size d of the aperture (λ/d).
The larger this ratio the more significant the effect. This is illustrated below
(simplified), showing plane waves passing through gaps of different size. As
the ratio approaches 1, the waves spread out in all directions.
The intensity of the diffracted waves varies with angle and usually has maxima
and minima.
15.3.1 Diffraction by a Narrow Slit

When waves pass through a narrow slit a characteristic diffraction pattern is
formed with a broad central maximum and narrower dimmer secondary max-
ima spread out symmetrically on either side. A simple experimental arrange-
ment is shown below:
[Figure: a laser beam passes through a single slit of width w, producing a broad central maximum with narrower secondary maxima on either side.]
Here is a graph showing the intensity variation on the screen.
[Figure: graph of intensity against position on screen.]
The pattern for a thin wire of diameter d is identical to that for a slit of width
w when w = d.
15.3.2 Analysis of the Single Slit Diffraction Pattern

A complete mathematical analysis of this pattern is beyond the scope of this
book, but we can derive a formula for the positions of the minima in the pat-
tern by considering the angles at which light emitted from across the width
of the slit interferes destructively. This is based on an idea from Christiaan
Huygens who realized that we can consider every point on a wave front as
if it is an independent source of circular (or spherical) waves. We then need
to add up the contributions from all such points. This leads to an integral that
gives the resultant intensity at any point on the screen. To find the angles at
which minima occur we simply need to find the condition under which the
waves from all points add to zero.
The diagram below shows the region close to the slit and has divided the slit
up into a large number of point sources. The angle shown corresponds to the
first minimum from the center of the diffraction pattern. Rays from the edges
and center of the slit are shown. Rays from intermediate points have not been
shown.
Rays from all points across the slit reach the same point on the screen and add
to zero. Consider rays from A and B. If these two rays are π out of phase the
pair will add to zero. Now consider the two rays a small distance further down
the slit from A and a small distance further down the slit from B. These will
have the same small phase difference from A and from B but will also be π
out of phase with each other and so will also add to zero. In fact, continuing
this argument, all the rays will cancel in pairs across the slit and the resultant
at the screen will be zero. The condition when this first occurs is when the
ray at B is π out of phase with the ray at A and the path difference BE is λ/2.
[Figure: rays leaving a slit of width w at the angle of the first minimum, with points A, B, D, and E marked.]
Consider triangle BCE when this occurs:
[Figure: triangle BCE.]
$$\frac{BE}{BC} = \sin\theta$$
$$BE = \frac{\lambda}{2} \quad\text{and}\quad BC = \frac{w}{2}$$
so the first minimum occurs when:
$$\sin\theta = \frac{\lambda}{w}$$
The condition for the nth minimum is when rays can cancel in pairs n times
across the width of the slit. The path difference between the rays at each edge
of the slit must therefore be nλ so that:
$$\sin\theta_n = \frac{n\lambda}{w}$$
This has a similar form to the equation for the maxima of a diffraction grating,
but remember that here it gives the positions of the minima.
If the pattern subtends a small angle at the slit then the minima are equally
spaced and the central maximum, which extends between the first minima on
either side, is double the width of the secondary maxima. This contrasts with
the double slit pattern where the maxima have equal widths.
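The claim that the central maximum is twice as wide as the secondary maxima can be checked by calculating where the minima fall on a screen. This Python sketch assumes a 0.10 mm slit, 532 nm light, and a screen 2.0 m away; these values and the function name are illustrative assumptions.

```python
import math

def minima_positions(slit_width, wavelength, screen_distance, orders=3):
    """Distances from the center of the pattern to the first few minima (metres)."""
    positions = []
    for n in range(1, orders + 1):
        sin_theta = n * wavelength / slit_width   # sin(theta_n) = n*lambda/w
        if sin_theta > 1:
            break                                 # no further minima exist
        theta = math.asin(sin_theta)
        positions.append(screen_distance * math.tan(theta))
    return positions

pos = minima_positions(slit_width=0.10e-3, wavelength=532e-9, screen_distance=2.0)
central_width = 2 * pos[0]           # from first minimum on one side to the other
secondary_width = pos[1] - pos[0]    # between first and second minima
print(f"central maximum width:   {central_width * 1e3:.2f} mm")
print(f"secondary maximum width: {secondary_width * 1e3:.2f} mm")
```

With these assumed values the angles are small, so the ratio of the two widths comes out very close to 2.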
15.3.3 Diffraction Through a Circular Hole

The diffraction pattern through a circular hole or by a circular object is quali-
tatively similar to that for a slit, but the pattern consists of concentric rings
rather than lines and the angular positions of the minima are not the same.
The first minimum for a circular diffraction pattern occurs when:
$$\sin\theta = \frac{1.22\lambda}{D}$$
where D is the diameter of the hole or object.
[Figure: a laser beam passes through a circular hole of diameter D, producing a broad central maximum surrounded by narrower secondary maxima.]
It is quite easy to demonstrate this pattern in the laboratory. Lycopodium
powder consists of fine spores that are all roughly spherical and of
about the same diameter. If a microscope slide is dipped into the powder
and then held in front of a green laser pen the diffraction pattern can be
projected onto a screen. However, for this to be convincing you will need a
completely blacked-out room. The pattern itself is the superposition of many
patterns formed by the individual particles, but because the particles are
very close together they create a single pattern on the screen.
Diffraction patterns from slits, wires, holes, and circular objects can be used
to measure the size of the diffracting object. There is an inverse relationship
between the size of the object and the size of the pattern, so the smaller the
object the larger the angle at which the first minimum occurs. In practice, the
wavelength of the source needs to be known and the experiment consists of
measuring the angle at which the first minimum occurs.
15.3.4 Resolving Power and the Rayleigh Criterion

When light enters the objective of an optical instrument it diffracts. This lim-
its how sharply the instrument can focus images. The amount of diffraction
depends on the ratio λ/D, where D is the diameter of the objective, so instru-
ments with large diameter objectives working at short wavelengths have the
greatest ability to resolve detail in the images they produce.
The ability of an optical instrument to resolve detail is called its “resolving power”
and the smallest angular separation between object points that can be distin-
guished as separate image points is called the “limit of resolution.” When instru-
ments are compared, we use the Rayleigh criterion to give a value for the limit of
resolution. Rayleigh suggested that the diffraction limit occurs when the central
maximum of the diffraction pattern from one object falls onto the first mini-
mum of the diffraction pattern from the other object. This would occur when
the angular separation θ of the two objects at the objective is equal to 1.22λ/D.
[Figure: object 1 and object 2 subtend an angle θ at an objective of diameter D.]
Rayleigh criterion:
θ < 1.22λ/D    not resolved
θ = 1.22λ/D    limit of resolution
θ > 1.22λ/D    resolved
In practice, diffraction is only one factor that limits resolution so these

diffraction limits are rarely achieved. Other factors are aberration, atmos-
pheric effects, sensitivity of the sensors, etc.
It is interesting to compare the limit of resolution of the human eye with that
of a large optical and radio telescope. We will take 500 nm as a representative
visible wavelength and 20 cm as a representative radio wavelength.
Human Eye
Pupil diameter 5.0 mm, wavelength 500 nm, limit of resolution: 1.2 × 10⁻⁴ rad
Large Optical Telescope
Objective diameter 10 m, wavelength 500 nm, limit of resolution: 6.1 × 10⁻⁸ rad
Large Radio Telescope
Objective diameter 500 m, wavelength 20 cm, limit of resolution: 4.9 × 10⁻⁴ rad
The resolving power of the large optical telescope is, as expected, much
greater than that of the unaided human eye but it might be surprising to real-
ize that the world’s largest radio telescope “sees” the Universe in less detail
than our eyes, albeit seeing different features.
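The three limits of resolution quoted above follow directly from θ = 1.22λ/D. A minimal Python sketch, using the diameters and wavelengths given in the text (the function name is illustrative):

```python
def rayleigh_limit(wavelength, diameter):
    """Limit of resolution in radians from the Rayleigh criterion."""
    return 1.22 * wavelength / diameter

instruments = {
    "human eye": (500e-9, 5.0e-3),               # wavelength, pupil diameter (m)
    "large optical telescope": (500e-9, 10.0),   # wavelength, objective diameter (m)
    "large radio telescope": (0.20, 500.0),      # wavelength, dish diameter (m)
}

for name, (wavelength, diameter) in instruments.items():
    print(f"{name}: {rayleigh_limit(wavelength, diameter):.2e} rad")
```

A smaller angle means finer detail can be resolved, so the radio telescope, with the largest angle of the three, resolves the least detail.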
15.4 STANDING (STATIONARY) WAVES

Standing waves are formed when two similar waves traveling in opposite
directions superpose. This can occur because of reflection or as a particular
example of two-source interference. The resultant disturbance has positions called
antinodes, where the waves always combine in phase to produce a large ampli-
tude, and regions called nodes where the waves always combine in antiphase
and cancel out. Standing waves are responsible for the sound of musical
instruments and the existence of energy levels in atoms.
15.4.1 Standing Waves on a String (Melde’s Experiment)

One end of a string is attached to a vibrator and the other end is connected
over a pulley to some hanging masses. At certain vibration frequencies stand-
ing waves are set up on the string.
[Figure: a signal generator connected to a vibrator, which drives the string.]
The ends of the string are effectively fixed. These are the boundary conditions
so there must be nodes at each end. Standing waves can then be formed at
frequencies that “fit” the length of the string – in other words, patterns that
have nodes at the ends. The lowest frequency, longest wavelength wave that
sets up a standing wave is called the fundamental.
The standing waves in this sequence are called harmonics. Since the speed v of
transverse waves on a string is determined only by the tension and mass per
unit length of the string it is the same for all harmonics and the frequencies
form a simple sequence. The diagrams below show the first few
harmonics for the string above.
It is clear that the harmonic frequencies are all integer multiples of the funda-
mental frequency f1. The frequency of the nth harmonic will be nf1. A similar
series is obtained if the boundary conditions have an antinode at both ends
(e.g., standing sound waves in a tube open at both ends).
If the boundary conditions are different at each end, so that there is a node
at one end and an antinode at the other, then the sequence is f1, 3f1, 5f1, etc.,
all the odd multiples of the fundamental frequency (e.g., standing waves in a
tube open at one end and closed at the other).
[Figure: the first three harmonics on the string, each with a node at both ends.]
Fundamental (1st harmonic): one antinode, L = λ1/2, f1 = v/λ1 = v/2L
2nd harmonic: two antinodes, L = λ2, f2 = v/λ2 = v/L = 2f1
3rd harmonic: three antinodes, L = 3λ3/2, f3 = v/λ3 = 3v/2L = 3f1
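The speed of transverse waves on a string is given by the equation:
$$v = \sqrt{\frac{T}{\mu}}$$
where T is the tension in the string and μ is its mass per unit length.
The frequencies of harmonics on the string above are therefore:
$$f_n = \frac{n}{2L}\sqrt{\frac{T}{\mu}}$$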
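To see how the harmonic frequencies depend on the string, here is a short Python sketch of the two formulas above. The string length, tension, and mass per unit length are assumed example values, and the function names are illustrative.

```python
import math

def wave_speed(tension, mass_per_length):
    """Speed of transverse waves on a string: v = sqrt(T / mu)."""
    return math.sqrt(tension / mass_per_length)

def harmonic_frequency(n, length, tension, mass_per_length):
    """Frequency of the nth harmonic for a string with a node at each end."""
    return n * wave_speed(tension, mass_per_length) / (2 * length)

# Assumed example values (not from the text).
length = 1.20   # metres
tension = 4.9   # newtons, roughly the weight of a 0.5 kg hanging mass
mu = 2.0e-3     # kilograms per metre

for n in (1, 2, 3):
    print(f"harmonic {n}: {harmonic_frequency(n, length, tension, mu):.1f} Hz")
```

Because the frequency depends on the square root of the tension, the tension must be quadrupled to double every harmonic frequency.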
Here are some of the key features of standing waves:

Nodes are regions where the waves always interfere destructively, giving
zero amplitude.
Antinodes are regions where the waves always interfere constructively,
giving maximum amplitude.
The distance between two adjacent nodes, or two adjacent antinodes, is
equal to λ/2.
All points between two adjacent nodes oscillate in phase.
Points on either side of a node oscillate with a phase difference of π radians.
Standing waves on a stretched string produce sounds in stringed musical
instruments. The player excites the string by plucking, strumming, hitting, or
bowing it and this generates waves with a wide range of frequencies that travel
along the string and reflect from the fixed ends. Only wavelengths that satisfy
the boundary conditions (nodes at each end) create standing waves, so these
dominate the emitted sound, which will usually consist of a range of harmon-
ics. The nature and quality of the sound we hear will depend on the amount of
each harmonic present and the attack and decay rate of the sound.
15.4.2 The Mathematics of Standing Waves

Standing waves are formed by the superposition of two similar waves
traveling in opposite directions, so we can find an equation for a standing wave
by adding the wave equations for these waves together.
$$y_1 = A\cos(\omega t - kx) \quad\text{(wave traveling in the } +x \text{ direction)}$$
$$y_2 = A\cos(\omega t + kx) \quad\text{(wave traveling in the } -x \text{ direction)}$$
$$Y = y_1 + y_2 = A\cos(\omega t - kx) + A\cos(\omega t + kx) \quad\text{(equation of a standing wave)}$$
where ω = 2πf and k = 2π/λ.

This can be rewritten using the trigonometric identity:
$$\cos R + \cos T = 2\cos\left(\frac{R+T}{2}\right)\cos\left(\frac{R-T}{2}\right)$$
To give:
$$Y = 2A\cos kx\,\cos\omega t$$
where we have reversed the order of the cosines.
The first part of this, 2A cos kx, can be regarded as a position dependent
amplitude. This will be zero when cos kx = 0 and this occurs when kx = π/2,
3π/2, 5π/2, etc., that is, kx = (n + ½)π. These are the nodes and the separation
of two adjacent nodes is given by kΔx = π:
$$k = \frac{2\pi}{\lambda} \quad\text{so}\quad \frac{2\pi\,\Delta x}{\lambda} = \pi \quad\text{giving}\quad \Delta x = \frac{\lambda}{2} \quad\text{(separation of nodes)}$$
The magnitude of this term is greatest when cos kx = ±1. The amplitude is then 2A.
This occurs at the antinodes, half way between nodes.
The second term, cos ωt, is a simple harmonic oscillation at all points in the
standing wave.
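The standing wave equation can be tested numerically by adding the two traveling waves directly and comparing the sum with 2A cos kx cos ωt. The Python sketch below uses assumed values for the amplitude, wavelength, and frequency; the function names are illustrative. Nodes are expected wherever cos kx = 0, for example at x = λ/4 and x = 3λ/4.

```python
import math

def standing_wave(x, t, A, wavelength, frequency):
    """Sum of two equal waves traveling in the +x and -x directions."""
    k = 2 * math.pi / wavelength
    w = 2 * math.pi * frequency
    return A * math.cos(w * t - k * x) + A * math.cos(w * t + k * x)

def product_form(x, t, A, wavelength, frequency):
    """The same disturbance written as 2A cos(kx) cos(wt)."""
    k = 2 * math.pi / wavelength
    w = 2 * math.pi * frequency
    return 2 * A * math.cos(k * x) * math.cos(w * t)

A, wavelength, frequency = 0.5, 1.0, 2.0   # assumed example values

for x in (0.0, wavelength / 4, wavelength / 2, 3 * wavelength / 4):
    # Largest displacement at this position over a set of sample times.
    peak = max(abs(standing_wave(x, i / 1000, A, wavelength, frequency))
               for i in range(1000))
    print(f"x = {x:.2f} m: largest displacement {peak:.3f}")
```

Positions a quarter and three quarters of a wavelength from the origin never move, while the positions between them reach the full amplitude 2A.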
15.5 EXERCISES
1. The diagram shows a two-source interference experiment using sound.
[Figure: speakers A and B and microphone positions C and D, with labeled dimensions 20 cm, 26 cm, and 4.0 m.]
Speakers A and B are 26 cm apart along a line parallel to CD. They both
emit sound waves of a single frequency and the same amplitude α and
they are in phase with one another. D is equidistant from A and B. When a
microphone is placed at D it records a sound of maximum intensity. When
it is moved along the line DC the sound gradually fades to become a minimum
at C; immediately beyond C the intensity increases.
(a) State the phase difference between the waves arriving at D.

(b) State the phase difference between the waves arriving at C.
(c) State the path difference (in wavelengths) between waves reaching C
from sources A and B.
(d) State the amplitude and intensity of the resultant sound at D (assume
I = kα² where k is a constant).
(e) State the amplitude and intensity of the resultant sound at C (assume
I = kα² where k is a constant).
(f) What would happen to the amplitude and intensity of the sound at D
if one of the speakers was switched off?
(g) Use the dimensions on the figure to work out the wavelength and
frequency of the sound. The speed of sound is 340 m s⁻¹.
(h) How would CD change if the wavelength of the sound was doubled?
(i) How would the distance CD change if the separation of the speakers
was doubled?
(j) Explain why the intensity of sound increases beyond C.
2. Monochromatic light of wavelength 589 nm is passed through a single

slit which diffracts the light onto double slits of separation 0.12 mm. The
double slits act as coherent sources and an interference pattern is formed
on a screen 2.0 m beyond the double slits.
(a) Explain why it is important to use monochromatic coherent sources.

(b) Calculate the separation of maxima on the screen.
(c) Describe and explain the effect on the interference pattern if each of
the following changes is made (independently):
i. The entire apparatus is immersed in water (refractive index 1.33),
ii. One of the two double slits is wider than the other.
iii. The separation between the double slits and the single slit is
increased.
iv. The screen is moved further away from the double slits.
v. Light of longer wavelength is used.
vi. Polarizing filters are placed in front of each slit and one of them
is slowly rotated through 360°, starting off with both polarizing
directions parallel.
(d) Light from a car’s headlamps overlaps. Explain why this does not form
an interference pattern.

3. Explain in detail how and why the interference pattern from three slits
compares with one from a double slit if they both have the same slit width
and slit separation and if they are both illuminated with monochromatic
light of the same frequency. Illustrate your answer with a graph showing
how the intensity of the light varies with position on a screen.
4. A diffraction grating has 300 lines per mm and it is illuminated by light
containing two strong emission lines at 480 nm and 520 nm. The width of
each slit is 1.11 μm.
(a) What are the angular positions of the first order maxima for each of
these lines?
(b) What is the angular separation of lines in the second-order spectrum?
(c) How many orders of diffraction are there?
(d) Explain why the third order of diffraction will not be observable.
5. Describe an experiment using a diffraction grating to measure the wave-
length of light from a laser pen. Your description should include:
a labeled diagram of the apparatus.
what you will measure and how the measurements will be made.
how you will calculate the wavelength from the data you collect.
how you will maximize accuracy and precision.
6. Red light of wavelength 525 nm passes along a normal to a narrow vertical
slit of width 0.20 mm.
(a) Calculate the angle to the normal at which the first minimum of the
resultant diffraction pattern occurs.
(b) Sketch a graph of intensity against position (in mm) for the diffraction
pattern formed on a screen 1.2 m from the slit.
(c) Explain how the minima in the pattern are formed.
(d) How would the pattern change if the red light source was replaced
with a blue light source of wavelength 450 nm?
7. Describe an experiment that could be carried out to measure the thick-
ness of a human hair using a laser of a known wavelength. Your descrip-
tion should include:
a labeled diagram of the apparatus.
what you will measure and how the measurements will be made.
how you will calculate the thickness of the hair from the data you collect.
how you will maximize accuracy and precision.
8. The pupil of the human eye is about 5.00 mm in diameter.
(a) Use the Rayleigh criterion to calculate the theoretical minimum angular
limit of resolution.
(b) If the eye could reach this diffraction limit calculate the maximum
distance at which it could resolve two point sources separated by a
distance of 1.00 mm.
(c) A pair of stars form a binary system 20 light years from Earth. It is just
possible to resolve them into separate images using the naked eye.
Estimate a lower limit for the separation of the stars.
(d) Explain why your answer to (c) is a lower limit.
(e) Explain why a small telescope, with aperture diameter of 15 cm can
easily resolve these two stars.
9. (a) Estimate the maximum distance at which you could resolve a car’s
headlamps into separate sources.
(b) It has been claimed that a spy satellite orbiting at a height of 600 km
could resolve the letters in a car number plate on the surface of the
Earth. Is this a realistic claim?
10. The speed of transverse waves on a string is given by $v = \sqrt{T/\mu}$ where T is
the tension in the string (N) and μ is the mass per unit length of the string
(kg m⁻¹).
A steel wire of diameter 0.12 mm is held at a tension of 26 N between
two fixed points 0.75 m apart. The density of steel is 7800 kg m⁻³.
(a) Calculate the mass per unit length of the steel string.
(b) Calculate the speed of transverse waves in the steel string.
(c) What is the frequency of the fundamental?
(d) List the frequencies of the first three harmonics.
(e) The temperature of the wire increases. State and explain how this will
affect the frequencies of sound emitted by the vibrating wire.

CHAPTER
16
Sound
16.1 THE NATURE AND SPEED OF SOUND

Sound waves are longitudinal mechanical vibrations. Audible sound lies in the
range of 20 Hz to 20 kHz and travels at about 340 m s⁻¹ through the air. Sound
waves with frequencies greater than 20 kHz are called ultrasound, while those
with frequencies below 20 Hz are called infrasound. The wave itself consists
of compressions and rarefactions of the medium through which it passes so
the speed of sound depends on the mechanical properties of the medium
such as its density and compressibility.
[Figure: a sound source (e.g., a speaker) and a sound detector (e.g., a microphone). The speaker cone oscillates and alternately compresses and rarefies the air in front of it, producing compressions and rarefactions of wavelength λ that travel at the speed of sound v along the direction of oscillation. Pressure variations in the air exert forces on the microphone, which converts these into electrical signals.]

The sound wave itself can be described either by the varying particle
displacements at each point or by the varying pressure at each point.
The speed of sound in an ideal gas is given by:
$$v = \sqrt{\frac{\gamma RT}{M}}$$
where M is the molar mass of the gas, R is the molar gas constant, T is the
absolute temperature, and γ is the adiabatic index (the ratio of the specific
heat capacities of the gas).
The speed of sound in a solid material is approximately given by:
$$v = \sqrt{\frac{E}{\rho}}$$
where E is the Young modulus and ρ is the density of the solid.
The speed of sound in a liquid is given by a similar equation but the Young
modulus is replaced by the bulk modulus of the liquid, a quantity that measures
how strongly the liquid resists compression.
Typical values for the speed of sound are:
Air at atmospheric pressure and 20°C: v = 343 m s⁻¹
(this falls to about 300 m s⁻¹ at the altitude of a commercial jet).
Water: v = 1482 m s⁻¹
Steel: v = 6000 m s⁻¹ (varies with type of steel).
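As a rough check on these values, the two formulas can be evaluated in a few lines of Python. This is only a sketch: the adiabatic index and molar mass of air and the Young modulus of steel are assumed typical values, not figures from the text, and the density of steel is the one quoted in the exercises of the previous chapter.

```python
import math

R = 8.314  # molar gas constant, J mol^-1 K^-1

def speed_of_sound_gas(gamma, molar_mass, temperature):
    """v = sqrt(gamma * R * T / M) for an ideal gas (SI units)."""
    return math.sqrt(gamma * R * temperature / molar_mass)

def speed_of_sound_solid(youngs_modulus, density):
    """v = sqrt(E / rho), the approximate speed in a solid (SI units)."""
    return math.sqrt(youngs_modulus / density)

# Assumed typical values: gamma = 1.40 and M = 0.0290 kg/mol for air at 20 C.
v_air = speed_of_sound_gas(gamma=1.40, molar_mass=0.0290, temperature=293.15)

# Assumed Young modulus of 2.0e11 Pa for steel; density 7800 kg/m^3.
v_steel = speed_of_sound_solid(youngs_modulus=2.0e11, density=7800.0)

print(f"air:   v = {v_air:.0f} m/s")
print(f"steel: v = {v_steel:.0f} m/s")
```

With these inputs the gas formula gives a speed close to the 343 m s⁻¹ quoted for air, and the solid formula gives a speed of several thousand meters per second for steel, the same order of magnitude as the tabulated value.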
16.2 THE DECIBEL SCALE

The human ear has a logarithmic response to sound. This allows us to detect
sounds over a very wide range of intensities (from about 1 pW m⁻² to about
1 W m⁻²) but it also means that doubling the intensity of the sound does not
double the apparent loudness of the sound. In fact, for sound to seem twice as
loud, we have to increase its intensity by a factor of about 10. The decibel scale takes
this into account. Intensity levels are measured in bels (B) and 1 bel is equal to
10 decibels (dB). The intensity level is measured relative to the threshold of human
hearing at I0 = 1 pW m⁻². To calculate the intensity level corresponding
to an actual intensity I in W m⁻² we use the equations:
$$\text{intensity level (B)} = \log_{10}\left(\frac{I}{I_0}\right) \qquad \text{intensity level (dB)} = 10\log_{10}\left(\frac{I}{I_0}\right)$$

Here are some typical intensity levels:

Threshold of hearing: 0 dB
Breathing: 10 dB
Radio/TV: 70 dB
Live rock music: 110 dB
Jet take off (300 m away): 130 dB
Jet take off (25 m away): 150 dB (eardrum rupture)
Note that the decibel is certainly NOT an SI unit and values of intensity level
are comparative, not absolute.
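The conversion between intensity and intensity level is easy to sketch in Python. The function names below are illustrative choices; the only physical input is the threshold intensity I0 = 1 pW m⁻² given above.

```python
import math

I0 = 1e-12  # threshold of hearing, W m^-2

def intensity_level_db(intensity):
    """Intensity level in dB for an intensity in W m^-2."""
    return 10 * math.log10(intensity / I0)

def intensity_from_db(level_db):
    """Intensity in W m^-2 for an intensity level in dB."""
    return I0 * 10 ** (level_db / 10)

threshold_level = intensity_level_db(1e-12)   # threshold of hearing
upper_level = intensity_level_db(1.0)         # 1 W m^-2
rock_music_intensity = intensity_from_db(110) # live rock music, 110 dB

print(f"threshold: {threshold_level:.0f} dB")
print(f"1 W m^-2:  {upper_level:.0f} dB")
print(f"110 dB corresponds to {rock_music_intensity:.2f} W m^-2")
```

Every tenfold increase in intensity adds 10 dB to the intensity level, which is why the scale spans only about 120 dB between the quietest and loudest sounds the ear can handle.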
16.3 STANDING WAVES IN AIR COLUMNS

Many musical instruments (wind instruments) use standing waves in air col-
umns to generate particular sounds. The length of the air column and the
boundary conditions at its ends determine what harmonics will be present in
the sound. A simple demonstration of these standing waves is shown on the left.
[Figure: a signal generator supplies sound to a vertical cylinder of length L containing air. Particles at the open end can vibrate with a large amplitude; particles at the closed end cannot vibrate.]
As the frequency of the signal generator is gradually increased the sound

intensity increases sharply at certain frequencies. These correspond to the
standing waves formed when waves travelling down the column reflect from
the base and the incident and reflected waves superpose.

The air column shown is closed at the bottom and open at the top. Particles
near the bottom cannot undergo longitudinal oscillations so this must be a
node of displacement. The top of the column is open to the atmosphere, so
particles are free to oscillate. This allows an antinode to form but it actually
occurs a small distance e above the top of the column. e is called the end cor-
rection and it is about half the radius of the tube. The effective length of the
resonating column is therefore L + e.
The diagrams below (shown horizontally) show the standing waves that can
be formed in a tube open at one end and closed at the other. Note that, while
the diagrams look like transverse waves, they simply represent the amplitude
of longitudinal vibrations at each position in the tube.
[Figure: standing-wave patterns in a tube of effective length L + e, closed at one end.
Fundamental (1st harmonic): L + e = λ/4
2nd harmonic: L + e = 3λ/4
3rd harmonic: L + e = 5λ/4]
This harmonic series consists of odd multiples of the fundamental frequency:
$$L + e = \frac{(2n+1)\lambda_n}{4} = \frac{(2n+1)v}{4f_n}$$
$$f_n = (2n+1)f_1$$
where n = 0, 1, 2, … and f1 = v/[4(L + e)] is the fundamental frequency.
If the air column is open at both ends the boundary conditions will be
displacement antinodes at both ends (i.e., a column of effective length L + 2e). If the
column is closed at both ends the boundary conditions will be displacement
nodes at both ends. In both cases, the harmonic series will be integer multiples
of the fundamental frequency.
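The two harmonic series can be compared with a short Python sketch. The tube length and end correction used in the example are assumed values chosen to give round numbers, and the open-tube formula follows from fitting a whole number of half wavelengths into the effective length L + 2e.

```python
def closed_tube_frequencies(length, end_correction, v=340.0, count=3):
    """Tube closed at one end: f = (2n + 1) v / (4 (L + e)), n = 0, 1, 2, ..."""
    effective = length + end_correction
    return [(2 * n + 1) * v / (4 * effective) for n in range(count)]

def open_tube_frequencies(length, end_correction, v=340.0, count=3):
    """Tube open at both ends: f = n v / (2 (L + 2e)), n = 1, 2, 3, ..."""
    effective = length + 2 * end_correction
    return [n * v / (2 * effective) for n in range(1, count + 1)]

# Assumed example: L = 0.80 m, e = 0.05 m, speed of sound 340 m/s.
closed = closed_tube_frequencies(0.80, 0.05)
opened = open_tube_frequencies(0.80, 0.05)

print("closed at one end:", [round(f, 1) for f in closed], "Hz")
print("open at both ends:", [round(f, 1) for f in opened], "Hz")
```

For the tube closed at one end the frequencies are in the ratio 1 : 3 : 5, while for the tube open at both ends they are in the ratio 1 : 2 : 3.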
It was mentioned previously that we can describe the sound wave in terms
of particle displacements or variations of pressure. However, displacement
nodes are actually pressure antinodes and vice versa. This can be understood
by considering the particle motions on either side of a displacement node.
[Figure: particles on either side of the position of a displacement node vibrate π out of phase with one another.]
The particles move toward the displacement node and increase the pressure
and then move away from the displacement node and decrease the pressure.
The variation of pressure has a large amplitude here so it is a pressure antinode.
16.4 MEASURING THE SPEED OF SOUND

Here are two simple ways to measure the speed of sound. The first one uses
travelling waves and the second method uses standing waves.
Method 1
[Figure: a signal generator (sig. gen.) and two microphones, mic. 1 and mic. 2, each connected to the oscilloscope.]

The signal generator is set to a single frequency (5–10 kHz is suitable) and
the two microphones are connected to a dual-beam oscilloscope. The oscilloscope
is triggered from the first microphone. The second microphone is then
positioned close to the first one and the two traces, which will be sinusoidal,
are compared. The position x of the second microphone is then changed until
the signals are in phase. This position
x = x0 is recorded. Now the second microphone is moved away from the first
until the phase has changed by 2nπ (i.e., has gone out of and back into phase n
times). The new position x = xn is recorded and the value of the wavelength
is calculated from λ = (xn − x0)/n. The frequency of the sound can also be
measured from the oscilloscope (using f = 1/T) and the speed of sound is
calculated from v = fλ.
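The arithmetic for Method 1 can be sketched in Python as follows. The readings used here (period, microphone positions, and number of cycles) are hypothetical values invented for illustration, not measurements from the text.

```python
def frequency_from_period(period):
    """f = 1/T, with the period T read from the oscilloscope trace."""
    return 1.0 / period

def wavelength_from_positions(x0, xn, n):
    """lambda = (xn - x0) / n, where the phase has changed by 2*n*pi."""
    return (xn - x0) / n

def speed_of_sound(frequency, wavelength):
    """v = f * lambda."""
    return frequency * wavelength

# Hypothetical readings (assumptions for illustration only).
period = 2.0e-4   # s, read from the oscilloscope time base
x0 = 0.120        # m, first in-phase position of microphone 2
xn = 0.464        # m, position after n further in-phase positions
n = 5

f = frequency_from_period(period)
wavelength = wavelength_from_positions(x0, xn, n)
v = speed_of_sound(f, wavelength)

print(f"f = {f:.0f} Hz, wavelength = {wavelength * 100:.2f} cm, v = {v:.0f} m/s")
```

Counting several in-phase positions rather than one spreads the uncertainty in the position readings over n wavelengths, which is the reason for moving the microphone through more than one cycle.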
Method 2
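[Figure: standing wave pattern between the signal generator (sig. gen.) and a solid reflector, with alternating antinodes (A) and nodes (N); the microphone is connected to the oscilloscope.]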
A standing wave is formed where the reflected sound superposes with incident
sound. The microphone is moved along a line perpendicular to the reflector
and the signal on the oscilloscope screen has periodic maxima every time the
microphone passes through an antinode. The separation of adjacent maxima is
λ/2 so an average value for the separation of adjacent nodes can be found and
the wavelength can be calculated. The oscilloscope can also be used to find the
frequency of the sound so that the speed can again be calculated from v = fλ.
16.5 ULTRASOUND
Ultrasound has frequencies greater than 20 kHz, that is, above the highest
sound frequency audible to humans. Ultrasound scanning is used for medical
imaging (e.g., in pre-natal scans). Ultrasound pulses from a transmitter on
the surface of the patient’s skin partially reflect at each boundary inside the
patient. The times of the returning pulses can be used to determine the depth
of the boundary and to map out structure. This is done automatically so that
the ultrasound scanner connects to a computer that displays a digital image of
the organ or fetus being examined.
The fraction of the incident ultrasound intensity that is reflected is determined
by the nature of the tissues on either side of the boundary. The key
parameters are the density and speed of sound in the medium. These are combined
in the acoustic impedance Z of the medium:
Acoustic impedance Z = (speed of sound) × (density)
The SI unit for the acoustic impedance is kg m⁻² s⁻¹.
The ratio of the reflected intensity Ir to the incident intensity I0 is given by the
expression:
$$\frac{I_r}{I_0} = \left(\frac{Z_2 - Z_1}{Z_2 + Z_1}\right)^2$$
where Z1 and Z2 are the acoustic impedances of the media on either side of the
boundary. The greater the difference in acoustic impedances between the two media the
greater the ratio. Here are some values of acoustic impedance:
Medium    Speed of sound (m s⁻¹)    Density (kg m⁻³)    Z (kg m⁻² s⁻¹)
Air       340                       1.2                 410
Water     1480                      1000                1.5 × 10⁶
Muscle    1590                      1070                1.7 × 10⁶
Blood     1550                      1060                1.6 × 10⁶
Bone      4000                      1500                6.0 × 10⁶
The low value of Z for air compared to body tissues means that if there is an
air gap between the ultrasound transmitter and the patient then most of the
ultrasound reflects from the surface and does not enter the body. To solve
this problem, a gel is spread on the skin under the transmitter. The gel has
an acoustic impedance similar to that of water or body tissues so most of the
ultrasound is transmitted rather than reflected.
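The effect of the impedance mismatch can be illustrated with a short Python sketch that uses the speeds and densities from the table above. The function names are illustrative choices.

```python
def acoustic_impedance(speed, density):
    """Z = (speed of sound) x (density), in kg m^-2 s^-1."""
    return speed * density

def reflected_fraction(z1, z2):
    """Ir / I0 = ((Z2 - Z1) / (Z2 + Z1))^2 at a boundary between two media."""
    return ((z2 - z1) / (z2 + z1)) ** 2

# Speeds (m/s) and densities (kg/m^3) taken from the table above.
z_air = acoustic_impedance(340, 1.2)
z_muscle = acoustic_impedance(1590, 1070)
z_bone = acoustic_impedance(4000, 1500)

air_to_muscle = reflected_fraction(z_air, z_muscle)
muscle_to_bone = reflected_fraction(z_muscle, z_bone)

print(f"air/muscle boundary:  {air_to_muscle:.4f} of the intensity is reflected")
print(f"muscle/bone boundary: {muscle_to_bone:.2f} of the intensity is reflected")
```

At an air and muscle boundary almost all of the intensity is reflected, which is why the coupling gel is needed, whereas at a muscle and bone boundary roughly a third is reflected and the rest is transmitted.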
16.6 ANALYSIS AND SYNTHESIS OF SOUND

Most sounds that we hear contain a wide range of different frequencies and
it is possible to analyze the spectrum of sound in much the same way that we
might use a spectrometer to analyze the spectrum of light. This is usually done
using a microphone connected to a computer that is running an app that acts
as a sound spectrum analyzer. This will then display a graph of sound intensity
(usually in decibels) against frequency. Complex sounds can be analyzed into
a sum or sequence of different sinusoidal components. This is called Fourier
analysis.
In the same way, we can add a sequence of harmonic sounds (sinusoidal func-
tions) to synthesize a complex sound. This is called Fourier synthesis. The
mathematics of Fourier analysis and synthesis are beyond the scope of this
book but the idea that complex sounds can be broken down into a sum of
simple sinusoidal components is an important one, both theoretically and
practically.
[Figure: graph of excess pressure against time.]
The graph above shows the result of adding three sinusoidal sound waves of
the same amplitude but with frequencies f, 2f, and 4f. This can also be represented
by the equation:
$$y = A\sin(\omega t) + A\sin(2\omega t) + A\sin(4\omega t)$$
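A waveform of this kind can be synthesized numerically by adding the three sinusoids sample by sample. The sketch below uses only the Python standard library; the fundamental frequency of 100 Hz and the sampling of 200 points per period are arbitrary assumptions.

```python
import math

def synthesize(t, amplitude, f, multiples=(1, 2, 4)):
    """y = A sin(wt) + A sin(2wt) + A sin(4wt), with w = 2*pi*f."""
    omega = 2 * math.pi * f
    return sum(amplitude * math.sin(m * omega * t) for m in multiples)

# Assumed values for illustration: A = 1, f = 100 Hz, 200 samples per period.
f = 100.0
period = 1.0 / f
samples = [synthesize(i * period / 200, 1.0, f) for i in range(201)]

peak = max(abs(y) for y in samples)
print(f"first sample = {samples[0]:.3f}, peak magnitude = {peak:.3f}")
```

Because every component frequency is a whole-number multiple of f, the combined waveform repeats with the period of the lowest frequency, 1/f.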
16.7 EXERCISES
1. Describe an experiment to measure the speed of sound. Your description

should include:
A labeled diagram of the apparatus.
An explanation of what you will measure and how the measurements
will be made.
An explanation of how you will calculate the wavelength from the data
you collect.
An explanation of how you will maximize accuracy and precision.
2. A particular sound has an intensity of 1 μW m⁻² at a listener’s ear.
(a) Calculate the intensity level of this sound in decibels.
(b) The intensity increases by a factor of 1000 to 1 mW m⁻². How many
times louder does the sound seem to be?
3. A student sets up the apparatus below and records the amplitude of sound
emitted by the air column as the tube is gradually filled with water. The
end correction for the tube is 5.0 cm and the frequency of the sound used
is 1200 Hz. The speed of sound is 340 m s⁻¹.
[Figure: a signal generator above a vertical cylinder containing air (labeled 40 cm), with water of depth h fed from a water supply.]
(a) Calculate the wavelength of the sound.
(b) Draw diagrams to show how standing waves of sound can be formed
in the air column.
(c) Calculate the values of h at which the amplitude will be a maximum.
4. When a medical ultrasound scan is carried out the doctor spreads a layer
of gel onto the patient’s skin before pressing the ultrasound transmitter/
receiver against it.
(a) Explain why this is necessary.
(b) Calculate the fraction of the incident ultrasound intensity that is
reflected when an ultrasound pulse reaches a boundary between muscle
and bone. The acoustic impedance for muscle is 1.7 × 10⁶ kg m⁻² s⁻¹
and the acoustic impedance for bone is 6.0 × 10⁶ kg m⁻² s⁻¹.
CHAPTER
17
Electric Charge and
Electric Fields
17.1 ELECTRIC CHARGE
Electric charge Q is a fundamental property carried by some fundamental
particles. There are two types of charge, positive (e.g., the charge of the pro-
ton) and negative (e.g., the charge of the electron). Like charges repel one
another and unlike charges attract.
Charge is quantized. This means that the total charge on any object is always
a multiple of a fundamental amount equal to the magnitude of the charge on
an electron or a proton. The SI unit for charge is the coulomb (C) and the
charge on an electron is:
e = −1.60217662 × 10⁻¹⁹ C
The charge on the proton has the same magnitude but opposite sign. Atoms
contain equal numbers of protons and electrons and are neutral. If an atom
loses an electron it becomes a positive ion and if it gains an electron it
becomes a negative ion. You might have read that quarks, the fundamental particles
inside protons and neutrons, have fractional charges. However, quarks
are never found as individual particles; they are always combined in pairs
(mesons) or triplets (baryons) and they always combine in a way that makes
the total charge of the composite particle an integer multiple of e.
When charge flows it is called an electric current. Electric current I is defined
as the rate of flow of electric charge:
$$I = \frac{dQ}{dt}$$
The SI unit of electric current is the ampere (A) and 1 A = 1 C s⁻¹.

If the current is constant the charge transferred in time t is given by:
Q = It
Some materials, for example, metals, allow charge to flow through them.
These are called conductors. Others, such as plastics and rubber, do not allow
charge to flow and are called insulators. A third class of materials, with intermediate
properties, is called semiconductors.
17.2 ELECTROSTATICS
Electrostatics deals with situations where objects become charged, remain
charged, or lose charge. Most of these situations can be explained in terms
of the transfer of electrons. The reason for this is that the electrons are on
the outside of the atom so when atoms interact (e.g., if two materials are
rubbed together) the electrons can move from one place to another. Protons
are locked inside the atomic nucleus so proton transfer does not occur. When
a neutral object loses electrons, it becomes positively charged and when it
gains electrons it becomes negatively charged.
An object is said to be “earthed,” when it is connected to the Earth by a good
conductor (e.g., an electrical wire). From the point of view of electrostatics,
the Earth itself is a huge conducting sphere that can gain or lose any number
of electrons while remaining neutral. Another way of looking at this
is to say that the Earth is always at zero potential so that any object
connected to the earth is also at zero potential. The symbol to show
an earth connection is shown on the right.
17.2.1 Charging by Friction

When a polythene rod is rubbed against a cloth it becomes negatively charged.
Electrons have been transferred from the cloth to the rod. Since polythene
is a good electrical insulator, it can remain charged for some time and can be
used to demonstrate electrostatic effects.
This process is called “charging by friction.” It is the same process that allows
us to build up a charge when walking on a carpet. If the charge becomes large
enough we can experience a sharp electrical shock when we touch an earthed
object. The shock is caused by the brief electric current that flows between
our body and earth in order to neutralize us.
[Figure: a cloth is rubbed back and forth along the surface of a polythene rod. Electron transfer leaves a negative surface charge on the rod and a positive surface charge on the cloth.]
An acetate rod can be charged positively in the same way. However, in this
case, electrons jump from the rod to the cloth and the rod retains a positive
surface charge. This is because atoms on the surface of the rod are no longer
electrically neutral; some of them have fewer electrons than protons and so
have a net positive charge. However, this positive charge has, once again, been
brought about by the movement of the negatively charged electrons.
17.2.2 The Gold Leaf Electroscope

The gold leaf electroscope is a useful device for detecting the presence of
charge and for demonstrating simple electrostatic effects. Here is a diagram
showing the structure of a gold leaf electroscope:
[Figure: gold leaf electroscope, showing the steel cap, insulator, steel rod, gold leaf, and earthed box.]

The cap, rod, and leaf are all metallic conductors but they are isolated from the
earth by an insulator. If a positively or negatively charged rod is held close to
the cap but not touching it, the leaf will rise. This is because the charge on the
charged rod exerts electrostatic forces on the free electrons in the electroscope,
making them move. The steel rod and the leaf gain charge of the same sign and repel
one another. The leaf is very thin
so it responds to this by rising. In the diagrams below only the cap, leaf, and rod
are shown.
The process by which the cap gains a charge opposite to the charge on the
charged rod is called electrostatic induction. In the examples above there is no net
charge on the cap, leaf, and rod; the electrons have just been redistributed.
However, it is possible to use induction to give the electroscope a net positive
or negative charge. This is called charging by induction and is explained in
the sequence of steps below (resulting in a net positive charge).
1. Hold a negatively charged rod close to the cap. The leaf rises.
2. Keeping the negatively charged rod in place, momentarily earth the cap
and then disconnect the earth. Electrons flow to earth and the leaf falls.
3. Remove the charged rod. The electroscope has lost electrons, so when the
charged rod is removed the remaining electrons spread out and there is a net
positive charge on the cap, steel rod, and leaf. The leaf rises.
A similar process, starting with a positively charged rod, can be followed to
charge the electroscope negatively. Note that the final charge is always opposite
to the charge on the rod.
17.2.3 Using a Coulomb Meter

A coulomb meter is an electronic meter that can be used to measure charge.
The charge to be measured must be transferred to the coulomb meter and
then the amount can be read off from the display. Here is an image of a coulomb
meter.
[Figure: a coulomb meter. One terminal should be connected to the earth; the charge to be measured is transferred to the cap; pressing the button connects the cap to earth and zeroes the meter; charge is measured in nano-coulombs.]
It is important that the meter is connected to the earth before use.

The charge on an isolated conductor can be measured using a coulomb meter.
For example, a conducting sphere could be isolated by suspending it from an
insulating thread. It can then be charged by touching it momentarily with a
lead connected to the high voltage terminal of an HT supply (high voltage
supply). To measure the charge, it can then be connected momentarily to
the cap of the coulomb meter (after the HT lead has been disconnected!).
Charge flows from the sphere to the cap and the amount of charge trans-
ferred can be read from the display.
17.3 ELECTROSTATIC FORCES

Charges obey a very simple qualitative force law: like charges repel and
unlike charges attract. The magnitude of the force between two point charges
depends on their size and separation and is given by Coulomb’s law, an inverse-
square law similar in form to the law of gravitation.
17.3.1 Coulomb’s Law

The electrostatic force between two point charges Q1 and Q2 separated by a
distance r in a vacuum is directly proportional to the product of the charges
and inversely proportional to the square of their separation:
$$F = \frac{Q_1 Q_2}{4\pi\varepsilon_0 r^2}$$
where ε0 is the “permittivity of free space,” a constant representing the ability
of the vacuum to support an electric field. The SI unit of ε0 is F m⁻¹ (F is
the farad, a unit of capacitance equivalent to a coulomb per volt, C V⁻¹). If the
charges are embedded in a different medium the permittivity of free space
is replaced by the permittivity of the medium ε which is usually written as
ε = εrε0 where εr is the relative permittivity of the medium, a dimensionless
number. Here are some values of relative permittivity:
vacuum: 1 (by definition), air: 1.0006 (at STP) so this is usually taken to
be 1; polythene: 2.25; paper: 3.85; mica: 3–6; water: 80.1 (at 20°C); titanium
dioxide: 86–173; calcium copper titanate: > 250 000.
Capacitors (see Chapter 19) consist of two conductors separated by an insulat-
ing (or dielectric) material and their capacitance is directly proportional to the
permittivity of the insulator.
While Coulomb’s law applies to point charges it can be shown that if charge
is distributed uniformly over the surface of a sphere or uniformly throughout
the volume of a sphere it acts like a point charge of the same total charge
located at the center of the sphere.
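Coulomb's law is straightforward to evaluate numerically. In the Python sketch below the charges and separation are assumed example values, and the relative permittivity of water is the one listed above. A positive result means repulsion (like charges) and a negative result means attraction.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, F m^-1

def coulomb_force(q1, q2, r, relative_permittivity=1.0):
    """F = Q1 Q2 / (4 pi eps_r eps_0 r^2); positive means repulsion."""
    return q1 * q2 / (4 * math.pi * relative_permittivity * EPSILON_0 * r ** 2)

# Assumed example: two 1.0 microcoulomb point charges 0.10 m apart.
f_vacuum = coulomb_force(1.0e-6, 1.0e-6, 0.10)
f_water = coulomb_force(1.0e-6, 1.0e-6, 0.10, relative_permittivity=80.1)
f_double_r = coulomb_force(1.0e-6, 1.0e-6, 0.20)

print(f"in vacuum:         F = {f_vacuum:.3f} N")
print(f"in water:          F = {f_water:.4f} N")
print(f"at twice the gap:  F = {f_double_r:.3f} N")
```

Doubling the separation reduces the force to a quarter, as expected for an inverse-square law, and embedding the charges in water reduces the force by the relative permittivity.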
17.3.2 Investigating Electrostatic Forces

It is possible to investigate Coulomb’s law using simple apparatus but care has
to be taken to avoid effects due to induction, etc., with surrounding objects.
Two small, isolated conducting spheres are charged from the same HT supply
and then suspended by insulating strings.

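[Figure: two charged spheres, each carrying charge Q, hang from insulating strings. Each sphere is acted on by the string tension T, its weight mg, and the electrostatic repulsion F.]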
The charges repel one another so the system hangs in equilibrium with the strings
making an angle θ to the vertical. θ and r can be determined using a digital camera
to capture an image of the apparatus with a suitable scale placed behind. The
force F can then be found using the condition for equilibrium of forces:
$$T\cos\theta = mg$$
$$T\sin\theta = F$$
$$\text{so}\quad F = mg\tan\theta$$
Q can be found by discharging the sphere to a coulomb meter.
If Q is kept constant (charging from the same high voltage) and the mass of the
spheres is varied then F and r will change, so the inverse-square law can be tested.
Q can be varied by charging the spheres from a different voltage but this will
also change r unless a compensating change in mass is made.
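The analysis of one set of readings can be sketched in Python. The mass, angle, and separation below are hypothetical values, and the charge estimate assumes that both spheres carry the same charge Q, so that F = Q²/(4πε0 r²).

```python
import math

G = 9.81               # gravitational field strength, N kg^-1
EPSILON_0 = 8.854e-12  # permittivity of free space, F m^-1

def force_from_angle(mass, theta_degrees):
    """F = m g tan(theta), from the equilibrium of each sphere."""
    return mass * G * math.tan(math.radians(theta_degrees))

def charge_from_force(force, r):
    """Q from F = Q^2 / (4 pi eps_0 r^2), assuming equal charges on both spheres."""
    return r * math.sqrt(4 * math.pi * EPSILON_0 * force)

# Hypothetical readings (assumptions for illustration only).
mass = 1.0e-3   # kg
theta = 10.0    # degrees from the vertical
r = 0.050       # m, separation of the sphere centers

F = force_from_angle(mass, theta)
Q = charge_from_force(F, r)

print(f"F = {F * 1000:.2f} mN, Q = {Q * 1e9:.0f} nC")
```

The value of Q estimated this way can be compared with the reading from the coulomb meter as a check on the measurements.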

17.4 THE ELECTRIC FIELD

Michael Faraday introduced the idea of an electric field to explain how distant
charges can affect one another, even if there is a vacuum between them. Each
charge sets up a field in the surrounding space and responds to fields from
other charges. This makes electrostatic forces local effects (the field acts on
the charged particle where the charged particle is) rather than an action-
at-a-distance across space. It also implies that the electric field has physical
properties at each point in space: it has strength and direction. It is a vector
field – there is an electric field vector at each point.
17.4.1 Electric Field Strength

Electric field strength E is defined as the force per unit positive charge at a
point in space. If you were to place a small charge q at that point, it would
experience a force F. The magnitude of the electric field strength at that point
is given by:
$$E = \frac{F}{q}$$
The SI unit for electric field strength is N C⁻¹, which is equivalent to V m⁻¹. The
electric field is a “vector field”: the electric field points in the direction of the
force on a positive charge.
This can be represented by drawing field lines. The direction of the field lines
is the direction of the electric field and the separation of the field lines represents
the strength of the field (the closer together the lines, the stronger the field).
The two diagrams below show the shape of the
electric field close to a point positive and a point negative charge.
Electric field lines:
start on positive charges and end on negative charges,
cannot cross,
point in the direction of the force that would act on a positive charge.
Electric fields obey a principle of superposition. The resultant field at any
point in space is the vector sum of all the electric fields at that point. In the
example shown below the resultant field at point C is the vector sum of the
fields at that point due to charges A and B: EC = EA + EB
[Figure: positive charges A and B and a point C, with the field vectors EA and EB at C and their resultant EC.]
The diagram below shows the electric field close to a dipole, two charges of equal
magnitude and opposite sign separated by a small distance. Many molecules have electric dipoles.

17.4.2 Electric Field Strength of a Point Charge

Imagine placing a small charge q a distance r from a point charge Q. The force
exerted on q would be, from Coulomb’s law:
$$F = \frac{Qq}{4\pi\varepsilon_0 r^2}$$
The electric field strength at that point is:
$$E = \frac{F}{q} = \frac{Q}{4\pi\varepsilon_0 r^2}$$
The electric field of a point charge obeys an inverse-square law.

This also makes sense from a geometrical point of view. Imagine the field
lines from a point positive charge spreading uniformly out in all directions
in three-dimensional space. They would spread over an area 4πr² at distance
r from the charge, so the density of field lines would fall off as an inverse-
square law.
This formula can be used to find the resultant electric field from a number or
distribution of charges.
The diagram below shows an electric dipole consisting of two charges, +Q
and −Q, separated by a distance 2a. Consider the electric field at a point P
distance x from the center of a dipole along its axis:
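[Figure: charges +Q and −Q separated by a distance 2a; point P lies on the axis a distance x from the center of the dipole, where the fields E+ and E− act.]
$$E_P = \frac{+Q}{4\pi\varepsilon_0 (x+a)^2} + \frac{-Q}{4\pi\varepsilon_0 (x-a)^2} = \frac{-axQ}{\pi\varepsilon_0 (x^2-a^2)^2}$$
The negative sign indicates a resultant field to the left.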

E+
P
x E

+Q Q
2a
Here is another example at another point in the dipole field:

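Point P is equidistant from the two charges so the magnitudes of the fields E+ and E− are equal. The vertical components cancel. The resultant is therefore a horizontal field of strength:
$$E_P = \frac{2Q\cos\theta}{4\pi\varepsilon_0 (x^2 + a^2)} = \frac{2Qa}{4\pi\varepsilon_0 (x^2 + a^2)^{3/2}}$$
where θ is the angle between each field and the horizontal, so that cos θ = a/(x² + a²)^{1/2}.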
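The two dipole results above can be checked numerically by adding the inverse-square fields of the individual charges as vectors. The following is a minimal sketch in Python and is not part of the original text. The function name `field_at`, the coordinate layout (+Q at x = −a, −Q at x = +a) and the values of Q, a and x are assumptions chosen purely for illustration.

```python
import math

EPSILON_0 = 8.854e-12                    # permittivity of free space, F/m
K = 1.0 / (4.0 * math.pi * EPSILON_0)    # Coulomb constant

def field_at(point, charges):
    """Vector sum of point-charge fields at `point`.

    `charges` is a list of (Q, (x, y)) tuples; returns (Ex, Ey) in N/C.
    """
    ex = ey = 0.0
    for q, (cx, cy) in charges:
        dx, dy = point[0] - cx, point[1] - cy
        r = math.hypot(dx, dy)
        e = K * q / r**2      # signed strength from the inverse-square law
        ex += e * dx / r      # resolve along the line from charge to point
        ey += e * dy / r
    return ex, ey

# Illustrative (hypothetical) values
Q, a, x = 1.0e-9, 0.01, 0.05
dipole = [(+Q, (-a, 0.0)), (-Q, (+a, 0.0))]

# Point on the dipole axis, distance x from the center
ex_axis, ey_axis = field_at((x, 0.0), dipole)
axis_formula = -a * x * Q / (math.pi * EPSILON_0 * (x**2 - a**2) ** 2)

# Point on the perpendicular bisector, distance x from the center
ex_perp, ey_perp = field_at((0.0, x), dipole)
perp_formula = 2 * Q * a / (4 * math.pi * EPSILON_0 * (x**2 + a**2) ** 1.5)

print(ex_axis, axis_formula)
print(ex_perp, perp_formula, ey_perp)
```

The same function accepts any list of charges, so it can also be used to confirm that the field vanishes at a neutral point.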
Sometimes the electric fields of two or more charged particles sum to zero
at a point. This is called a neutral point. For example, half-way between two
charges of the same magnitude and sign:
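[Figure: neutral point P half-way between two equal positive charges, a distance a from each]
$$E_P = \frac{Q}{4\pi\varepsilon_0 a^2} - \frac{Q}{4\pi\varepsilon_0 a^2} = 0$$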
17.4.3 Gauss’s Law

Consider the electric field that passes through a closed spherical surface of
radius r centered on a point positive charge. Since lines of electric field can-
not start or end in empty space the flux of field lines through the surface is the
same whatever the radius of the sphere.
The electric flux ΦE is the product of the normal component of the electric field E and the area A of the surface, so for the sphere:
$$\Phi_E = EA = E \times 4\pi r^2$$
[Figure: spherical surface of area A = 4πr² centered on a point positive charge]
When this is rearranged and compared with the expression for the field
strength of a point charge it is clear that the flux through the surface is directly
related to the charge within that surface.
$$\frac{\Phi_E}{4\pi r^2} = E = \frac{Q}{4\pi\varepsilon_0 r^2}$$
so that:
$$\Phi_E = \frac{Q}{\varepsilon_0}$$
This is an example of a general result known as Gauss’s theorem. This states
that:
The total flux through any closed surface is equal to the total charge con-
tained within that surface divided by the permittivity of free space (or of
the medium if the charge is not in a vacuum).
This is a powerful theorem that can be used to understand how the electric
field behaves in a range of important situations. Gauss’s theorem can be stated
more precisely using an integral:
$$\Phi_E = \oint_{\text{surface}} \mathbf{E}\cdot d\mathbf{S} = \frac{\sum_{i=1}^{N} Q_i}{\varepsilon_0}$$
where dS is an infinitesimal element of the surface area at some point and E.dS is the product of the perpendicular component of E at that point and the area dS. The integral sums these contributions to find the total flux (in the previous example the field strength was constant and perpendicular to the surface so we simply multiplied the values together). $\sum_{i=1}^{N} Q_i$ is the sum of charges contained inside the closed surface. The next section shows how Gauss's theorem can be used to derive some important results.
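Gauss's theorem can be explored numerically by summing E.dS over small patches of a closed surface. The sketch below is an illustration rather than anything given in the original text. It uses a simple midpoint rule on a sphere centered on the origin; the function name `flux_through_sphere`, the grid sizes and the charge positions are all assumptions. The estimated flux should be close to Q/ε0 wherever the charge sits inside the sphere, and close to zero when the charge is outside.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, F/m

def flux_through_sphere(q, charge_pos, radius, n_theta=100, n_phi=200):
    """Midpoint-rule estimate of the flux of a point charge's field
    through a sphere of the given radius centered on the origin."""
    k = q / (4.0 * math.pi * EPSILON_0)
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the surface patch
            nx = math.sin(theta) * math.cos(phi)
            ny = math.sin(theta) * math.sin(phi)
            nz = math.cos(theta)
            # Vector from the charge to the patch
            rx = radius * nx - charge_pos[0]
            ry = radius * ny - charge_pos[1]
            rz = radius * nz - charge_pos[2]
            r_cubed = (rx * rx + ry * ry + rz * rz) ** 1.5
            e_dot_n = k * (rx * nx + ry * ny + rz * nz) / r_cubed
            patch_area = radius**2 * math.sin(theta) * d_theta * d_phi
            total += e_dot_n * patch_area
    return total

q = 1.0e-9  # hypothetical charge, C
print("Q / epsilon_0      :", q / EPSILON_0)
print("charge at center   :", flux_through_sphere(q, (0.0, 0.0, 0.0), 1.0))
print("charge off center  :", flux_through_sphere(q, (0.3, 0.1, -0.2), 1.0))
print("charge outside     :", flux_through_sphere(q, (2.0, 0.0, 0.0), 1.0))
```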

17.4.4 Using Gauss’s Theorem

(i) The electric field near the surface of a charged conductor
The electric field strength inside a perfect conductor must be zero. If it were not zero, charges would move until it became zero. This means that when a conductor is charged the charge stays on the surface. The diagram below shows a small section of the surface of a charged conductor that has a surface charge density σ. The dotted lines indicate a Gaussian surface in the shape of a cylinder with the top surface of the cylinder (of area A) just above the surface of the conductor and the lower surface just below it.
[Figure: positively charged conductor surface with a cylindrical Gaussian surface straddling it]
The electric field through the lower surface must be zero because it is
inside the conductor.
The electric flux through the sides of the cylinder must also be zero because there can be no component of electric field parallel to the surface of the conductor.
All of the electric flux must pass through the upper surface.
Using Gauss's theorem:
Flux through upper surface = EA = charge contained inside Gaussian surface divided by ε0 = σA/ε0.
Conclusion:
The electric field strength close to a conducting surface is E = σ/ε0 and this is perpendicular to the surface.
(ii) Electric field strength inside a hollow conductor (Faraday cage)

Consider a closed hollow box made of conducting material and containing no
free charges as shown below. The dotted line represents a closed Gaussian
surface entirely within the conductor.
The electric field strength is zero inside the conductor so the flux through the
Gaussian surface is also zero. Using Gauss’s theorem the total charge con-
tained within the Gaussian surface and therefore inside the hollow box must
also be zero. This remains the case even if other charged objects are brought
close to the box or if electromagnetic waves are incident on the box. The field
strength inside the box remains zero.
[Figure: hollow conducting box with a Gaussian surface lying inside the conducting material]
This is an example of a Faraday cage – a conducting box used to shield its con-
tents from external electric fields and electromagnetic waves. Metal cars and
aircraft act as Faraday cages protecting their occupants from lightning strikes.
A room with conducting walls, floor, and ceiling can also be used to provide
security against unwanted communications – for example, mobile phone sig-
nals (which cannot enter or leave the room). The effect can be demonstrated
by placing a mobile phone inside a metal cookie tin. When the tin is closed it
is impossible to ring the mobile phone.
17.5 ELECTRIC POTENTIAL ENERGY AND ELECTRIC POTENTIAL

Electric fields exert forces on electric charges, so when a charge moves from
one point to another in an electric field its electric potential energy changes.
If the field does work on the charge then the electric potential energy falls
and if work is done on the charge by an external agent then the electric poten-
tial energy increases. This is shown in the diagram below where two positive
charges are moved in a uniform electric field.
[Figure: two positive charges, A and B, in a uniform electric field. The electric field exerts a force downwards on A but A moves upwards, so work is done on A by an external agent and the electric potential energy increases. The electric field exerts a force downwards on B and B moves downwards, so work is done by the electric field and the electric potential energy decreases.]

17.5.1 Electric Potential and Potential Difference

The electric potential V at a point in space is equal to the electrical potential
energy (EPE) per unit charge (Q) if a small positive charge were placed at
that point. This can be written as:
$$V = \frac{\text{EPE}}{Q}$$
and the SI unit of potential is the joule per coulomb (J C⁻¹), which is the volt, V.
In other words, 1 V = 1 J C⁻¹. Electric potential, like energy, is a scalar quantity,
so the potential at any point in space is the sum of potentials due to all fields
at that point.
The electrical potential difference ΔV between two points is equal to the work
that must be done per unit charge in moving the charge between the two
points concerned.
ΔV = W/Q
For example, if there is a potential difference of 3.0 V across an electrical component then 3.0 J of energy is transferred from electrical potential energy to other forms when 1.0 C of charge passes through the component.
(While we have distinguished the absolute potential V from the potential dif-
ference ΔV here it is often the case that V is used for both.)
17.5.2 Electric Potential Gradient and Electric Field Strength

To derive the relationship between electric field strength and electric potential
we must investigate the work done on free charges placed in an electric field.
[Figure: a charge +Q in a uniform electric field E, acted on by a force F = EQ and moving a distance δx]
If a free point charge of value +Q is moved a distance δx by an electric field of strength E then the force on the charge is given by:
$$F = QE$$
and the work done on it is:
$$\delta W = F\,\delta x = QE\,\delta x$$
This work has been done by the electric field so the electric potential energy of
the system has fallen (in the same way that gravitational potential energy falls
when an object is dropped):
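$$\delta \text{EPE} = -QE\,\delta x$$
And the change in electric potential is:
$$\delta V = \frac{\delta \text{EPE}}{Q} = -E\,\delta x$$
$$\frac{\delta V}{\delta x} = -E$$
In the limit that δx→0 the relationship becomes:
$$E = -\frac{dV}{dx}$$
electric field strength = negative potential gradient.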
This can also be written as an integral to find the potential difference between
two points A and B in an electric field:
$$\Delta V = \int_{V_A}^{V_B} dV = -\int_{x_A}^{x_B} E\,dx$$
Three immediate consequences of these equations are:

There is no change in potential when a charge is moved in a direction perpendicular to the electric field lines. The component of E in this direction is zero, so dV/dx = 0 too and the potential is constant.
The potential changes more slowly with distance when the field is weak (dV/dx is lower because E is lower).
The direction of the electric field is toward lower potential (this is the significance of the minus sign).
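The relationship E = −dV/dx can be illustrated numerically by estimating the potential gradient from two nearby values of the potential. The sketch below is not from the original text. It assumes the point-charge potential V = Q/(4πε0r) that is derived in Section 17.5.5, and the function names, the step size `dr` and the sample values are illustrative choices.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, F/m

def potential(q, r):
    """Potential of a point charge q at distance r (zero at infinity)."""
    return q / (4.0 * math.pi * EPSILON_0 * r)

def field_from_potential(q, r, dr=1.0e-6):
    """Central-difference estimate of E = -dV/dr at distance r."""
    return -(potential(q, r + dr) - potential(q, r - dr)) / (2.0 * dr)

def field_exact(q, r):
    """Inverse-square field of a point charge, for comparison."""
    return q / (4.0 * math.pi * EPSILON_0 * r**2)

q = 1.0e-9  # hypothetical charge, C
for r in (0.05, 0.10, 0.20):
    print(r, field_from_potential(q, r), field_exact(q, r))
```

For a positive charge the potential falls as r increases, so the estimated field is positive (directed outward, toward lower potential), and it is weaker where the potential changes more slowly.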
Equipotential surfaces can be drawn perpendicular to the field lines. The diagram below shows the equipotential surfaces (spherical surfaces) surrounding a point charge. The steps in potential between adjacent equipotentials are constant, so their separation increases farther from the central charge. This is because the field is weaker farther out, so the potential is changing more slowly.
[Figure: spherical equipotential surfaces surrounding a point positive charge]
When a potential difference V is set up between two parallel conducting plates separated by a distance d the electric field between them is approximately uniform so that the potential gradient dV/dx = V/d. The equation E = −dV/dx can then be used to find the strength of the electric field.
$$E = -\frac{dV}{dx} = -\frac{V}{d}$$
17.5.3 Accelerating Charged Particles in an Electric Field

Charged particles are often accelerated by an electric field. When a potential
difference V is set up between two electrodes a free particle of charge Q loses
electric potential energy and gains kinetic energy as it moves between them.
If there are no other energy transfers then:
$$\frac{1}{2}mv^2 = QV$$
The velocity is then:
$$v = \sqrt{\frac{2QV}{m}}$$
where m is the particle mass.
However, we must be careful when using this equation. If the velocity is a
significant fraction of the speed of light (e.g., v > 0.05 c) then we should use
relativistic equations. The Newtonian equations above are useful for lower
velocities but only give an approximate value for the final velocity.
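To see where the Newtonian result starts to fail, the speed from the equation above can be compared with a relativistic value. The sketch below is an illustration only. The relativistic expression is not given in this section: it assumes the standard relation that the kinetic energy gained, QV, equals (γ − 1)mc². The function names, the rounded constants and the chosen accelerating voltages are also assumptions.

```python
import math

E_CHARGE = 1.602e-19    # magnitude of the electronic charge, C
M_ELECTRON = 9.109e-31  # electron mass, kg
C = 2.998e8             # speed of light, m/s

def newtonian_speed(q, voltage, m):
    """Speed from (1/2) m v^2 = Q V."""
    return math.sqrt(2.0 * q * voltage / m)

def relativistic_speed(q, voltage, m):
    """Speed assuming Q V = (gamma - 1) m c^2 (not derived in this section)."""
    gamma = 1.0 + q * voltage / (m * C**2)
    return C * math.sqrt(1.0 - 1.0 / gamma**2)

# Electrons accelerated through three hypothetical voltages
for volts in (100.0, 5000.0, 100000.0):
    vn = newtonian_speed(E_CHARGE, volts, M_ELECTRON)
    vr = relativistic_speed(E_CHARGE, volts, M_ELECTRON)
    print(f"{volts:>9.0f} V: Newtonian {vn / C:.4f} c, relativistic {vr / C:.4f} c")
```

The two values agree closely at low voltages and separate as the speed becomes a larger fraction of c, with the Newtonian equation overestimating the speed.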
Many electron tubes use an electron gun to accelerate electrons and form a
beam. Here is a simplified diagram showing how the electron gun works.
[Figure: electron gun in a vacuum, showing the low-voltage heater supply, hot cathode, anode, accelerating voltage V, and the electron beam]

The cathode (negative terminal) is heated by passing a small current through it. This increases the kinetic energy of free electrons inside the cathode and allows them to leave the surface of the metal. The accelerating voltage V creates an electric field between the cathode and anode that accelerates the electrons. The electrodes are placed inside a vacuum tube so that the electrons are not scattered. The loss of electric potential energy eV is transferred to the kinetic energy of the electrons as they reach the anode. Some of the electrons pass through a small gap in the anode to create a narrow beam. The electron velocity is:
$$v = \sqrt{\frac{2eV}{m}}$$
17.5.4 Deflecting Charged Particles in an Electric Field

When a charged particle is projected into an electric field the electrical force
on the particle acts parallel to the field. If a uniform electric field is perpen-
dicular to the initial velocity of the charged particle, then the force is constant
and deflects the charged particle into a parabolic path. This is analogous to
the parabolic path followed by a massive particle projected horizontally in the
Earth’s gravitational field.
An electron deflection tube uses an electron gun to fire a horizontal beam of
electrons into a region of uniform vertical electric field. The field is created by
two parallel plates connected to a separate voltage supply.
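The resultant force on the electron is vertically upwards:
$$F = eE = \frac{eV}{d}$$
[Figure: electron beam entering the region between two parallel plates a distance d apart, with the upper plate at +V; the initial velocity v is along the x-axis and the y-axis is vertical]
There will be a vertical acceleration:
$$a = \frac{eV}{md}$$
and the vertical deflection in the field is (using the constant-acceleration "suvat" equations):
$$y = \frac{eVt^2}{2md}$$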
There is no horizontal force so the horizontal component of velocity is constant and equal to v. The horizontal distance traveled in the field is therefore:
$$x = vt$$
Eliminating t from the equations for x and y gives:
$$y = \left(\frac{eV}{2mdv^2}\right)x^2$$
This is the equation of a parabola.
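The elimination of t can be checked by computing the deflection in two ways: step by step through the time spent in the field, and directly from the parabola. This is a minimal sketch, not part of the original text. The function names and the plate voltage, plate separation, electron speed and field length are hypothetical values chosen for illustration.

```python
E_CHARGE = 1.602e-19    # magnitude of the electronic charge, C
M_ELECTRON = 9.109e-31  # electron mass, kg

def deflection_from_time(x, plate_voltage, d, speed):
    """Deflection using t = x / v, a = eV / (m d) and y = (1/2) a t^2."""
    t = x / speed
    a = E_CHARGE * plate_voltage / (M_ELECTRON * d)
    return 0.5 * a * t**2

def deflection_from_parabola(x, plate_voltage, d, speed):
    """Deflection using y = (e V / (2 m d v^2)) x^2."""
    return E_CHARGE * plate_voltage * x**2 / (2.0 * M_ELECTRON * d * speed**2)

# Hypothetical set-up: 200 V across plates 2.0 cm apart, electrons at 2.0e7 m/s
plate_voltage, d, speed = 200.0, 0.02, 2.0e7
for x in (0.01, 0.02, 0.04):
    print(x,
          deflection_from_time(x, plate_voltage, d, speed),
          deflection_from_parabola(x, plate_voltage, d, speed))
```

Doubling the horizontal distance x quadruples the deflection y, as expected for a parabola.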
17.5.5 The Absolute Electric Potential of a Point Charge

The equation for the potential difference in terms of field strength:
$$\Delta V = -\int_{x_A}^{x_B} E\,dx$$

can be used to find the absolute potential at point B if we know the potential
at point A. In order to find the potential at any point in the Universe we must
define the position where the potential is zero. This is an arbitrary decision
because forces and fields only depend on differences in potential and not
on their absolute value. However, it makes sense to choose the position for
the zero of potential in such a way that it is easy to use in calculations. The
zero of potential in the electric field is taken to be at infinity. In other words,
if all charges were separated so that they were at infinite distance from one
another, then the total electrical potential energy would be zero.
Here are two qualitative examples of how this works:
Consider two point positive charges a distance r from one another. These
will repel one another and if no other forces act on them, they will move
toward infinity. While they are moving electrical forces are doing work
on them transferring electrical potential energy to kinetic energy, so the
electrical potential energy is decreasing all the time. However, we know
that their electrical potential energy will be zero at infinity so the initial
electrical potential energy of two positive charges must have been positive. The same argument shows that the initial electrical potential energy of two negative charges a distance r apart must also be positive.
Now consider separating a positive and a negative charge that are initially a
distance r apart. These two particles attract one another so we would have
to apply an external force to each of them to move them out to infinity.
Work must be done by this external agent to separate them so the electric
potential energy increases all the time and eventually becomes zero. The
initial electric potential energy must have been negative. This is the case
for the electron inside a hydrogen atom. It is attracted to the proton in
the nucleus of the atom and the system has a negative electric potential
energy. The energy that must be put in to separate the two particles is the
ionization energy for the hydrogen atom.
We can now derive an expression for the electric potential at a point in the electric field of a point charge. This is done by deriving an expression for the potential difference between that point and infinity and using the fact that the potential is zero at infinity.
[Figure: point P at distance r from a point charge +Q, with the x-axis running from the charge out to infinity]
The electric field strength at a point distance r from a point positive charge Q is:
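$$E = \frac{Q}{4\pi\varepsilon_0 r^2}$$
so the potential difference between P (at distance r from Q) and infinity is:
$$\Delta V = V_\infty - V_P = -\int_{x=r}^{x=\infty} E\,dx = -\int_{x=r}^{x=\infty} \frac{Q\,dx}{4\pi\varepsilon_0 x^2} = -\left[\frac{-Q}{4\pi\varepsilon_0 x}\right]_{x=r}^{x=\infty} = \frac{-Q}{4\pi\varepsilon_0 r}$$
Since the potential at infinity is zero:
$$V_P = V_\infty + \frac{Q}{4\pi\varepsilon_0 r} = \frac{Q}{4\pi\varepsilon_0 r}$$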
The potential varies as 1/r.
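The integration can also be carried out numerically as a check on the 1/r result. The sketch below is illustrative and not part of the original text. It approximates "infinity" by a distance `far_factor` times r and uses a midpoint rule with steps that grow in proportion to distance; both choices, and the function names, are assumptions.

```python
import math

EPSILON_0 = 8.854e-12  # permittivity of free space, F/m

def field(q, x):
    """Field strength of a point charge q at distance x."""
    return q / (4.0 * math.pi * EPSILON_0 * x**2)

def potential_by_integration(q, r, far_factor=1.0e6, steps=20000):
    """Estimate V_P as the integral of E dx from r out to far_factor * r.

    The substitution x = r * exp(s) is used, so dx = x ds and the step
    length grows with distance from the charge.
    """
    ds = math.log(far_factor) / steps
    total = 0.0
    for i in range(steps):
        x = r * math.exp((i + 0.5) * ds)
        total += field(q, x) * x * ds
    return total

def potential_exact(q, r):
    return q / (4.0 * math.pi * EPSILON_0 * r)

q = 1.0e-9  # hypothetical charge, C
for r in (0.05, 0.10, 0.20):
    print(r, potential_by_integration(q, r), potential_exact(q, r))
```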

Summary of equations for the electric field of a point charge:
Electric field (magnitude of vector) at distance r from a charge Q:
$$E = \frac{Q}{4\pi\varepsilon_0 r^2}$$
Electric potential (scalar) at distance r from a charge Q:
$$V = \frac{Q}{4\pi\varepsilon_0 r}$$
17.6 EXERCISES
1. (a) A small electric cell can supply a continuous current of 2.0 A for 2
hours. How much charge passes through the cell in this time?
(b) A large capacitor (component for storing charge) is charged using a
steady current of 4.0 mA. It takes 0.75 s to reach full charge. How
much charge does it store?
(c) An electrostatic generator stores 8.0 C on a large metal dome. This
charge leaks away through the air in 40 s. What is the average electric
current during this time?
2. Write an instruction sheet explaining how to charge an electroscope posi-
tively using just a polythene rod and a cloth.

3. The diagram below is a model of an electric dipole.
D 2.0 C A + 2.0 C B
x
4.0 m
The dipole is located on the x-axis with its center at the origin.
(a) Calculate the electric field strength (and direction) and the electric
potential at points A to E:
A (0, 0)
B (+4.0 μm, 0)
C (0, +4.0 μm)
D (−4.0 μm, 0)
E (0, −4.0 μm)
4. A hydrogen atom consists of a proton and an electron orbiting at a distance of 0.053 nm.
(a) Calculate the force between the electron and the proton.
(b) Assuming this force provides the centripetal force for circular motion
of the electron work out the electron’s orbital speed and kinetic energy.
(c) Calculate the electrostatic potential energy of the electron.

(d) Calculate the total energy of the electron and explain the significance
of its sign.
(e) State the ionization energy of the hydrogen atom.
5. (a) Sketch the field lines and equipotentials round an isolated charged
conducting sphere of radius 2.0 cm at a potential of +5000 V (include
equipotentials at 1000V, 2000V, 3000V, and 4000V).
(b) Calculate the electric field strength and potential at distances of 4.0
cm and 11 cm from the center of the positively charged sphere.
6. Copy the diagram below and sketch the electric field lines and equipo-
tentials between the charged sphere and the earthed conducting plane
shown below.
7. Use Gauss’s theorem to show:
(a) that the electric field strength inside a charged conducting sphere is
zero,
(b) that the electric field strength immediately above a charged conductor
is perpendicular to the surface and has a magnitude E = σ/ε0 where σ
is the charge density on the surface,
(c) that the flux of electric field entering a volume of empty space is equal
to the flux of electric field leaving that volume of space.
8. (a) Prove that the uniform electric field between two parallel conducting
plates separated by distance d and connected to a potential difference
V is given by: E = V/d.
(b) Sketch the electric field lines and equipotentials between two parallel
metal plates 2.0 cm apart with a potential difference of 5000 V (include
equipotentials at 1000V, 2000V, 3000V, and 4000V).
(c) Calculate the electric field strength between the plates.
(d) Calculate the force on an alpha particle (charge +2e) half way between
the plates.
(e) How does this force vary if the alpha particle moves close to the posi-
tive or negative plate (ignore induction effects)?
(f) An air molecule between the plates loses two electrons and becomes
ionized. What is the ratio of the acceleration of the electron to the
acceleration of this positive ion in the electric field? Assume that the
mass of the ion is 60 000 times greater than that of the electron.
9. An electron in a vacuum tube is accelerated horizontally in an electron gun
through a potential difference of 1500 V. It then enters a region of uniform
vertical electric field of strength 2.0×104 Vm−1 that extends 5.0 cm horizon-
tally. It is deflected by the field and emerges into a field-free region.
(a) Calculate the velocity of the electron as it enters the region of vertical
electric field.
(b) Describe the shape of the electron’s path in the field. Assume the
direction of the field is vertically upwards.
(c) Calculate the time spent by the electron in the vertical field.
(d) Calculate the vertical component of velocity gained by the electron in
moving through the vertical field.
(e) Calculate the angular deflection of the beam.
(f) Calculate the work done by the vertical electric field on the electron.

CHAPTER
18
DC Electric Circuits
18.0 DIRECT CURRENT (DC) CIRCUITS AND CONVENTIONAL CURRENT

Electric current is the rate of flow of electric charge and DC refers to circuits
in which the current flows in one direction around the circuit. Since current
can be carried by positive or negative charges the direction of electric current
needs a convention:
Conventional current direction: the direction in which a free positive
charge would flow.
There are three different microscopic ways in which current can flow.
Negative charge carriers (e.g., electrons in a metal or an n-type semiconduc-
tor). The electrons move in the opposite direction to the conventional current.
[Figure: negative charge carriers moving in the opposite direction to the conventional current I]
Positive charge carriers (e.g., positive ions in an ion beam or holes in a p-type
semiconductor). The positive charge carriers move in the same direction as
the conventional current.


Positive and negative charge carriers (e.g., when current flows through an
electrolyte). The positive charges move in the direction of conventional cur-
rent and the negative charges move in the opposite direction.
[Figure: positive and negative charge carriers moving in opposite directions]
In all the examples above, the velocities of the charge carriers represent their average drift velocities. The particles will also have random thermal motion, which might involve speeds that are several orders of magnitude greater than the drift velocities.
18.1 CHARGE AND CURRENT

Electric charge Q is measured in coulombs (C). Electric current I is the rate
of flow of electric charge:
$$I = \frac{dQ}{dt}$$
The SI unit of electric current is the ampère or amp (A) and 1 A = 1 C s⁻¹.
If the current is constant the equation above becomes I = Q/t or Q = It
where Q is the charge passing a point in t seconds.
18.1.1 Charge Carriers and Charge Carrier Density

When current flows through a material charge carriers move inside the mate-
rial and we can derive an expression to link the microscopic movement of
these charge carriers to the macroscopic current. In the diagram below we
will assume that the charge carriers are positive but the form of the relation-
ship is not affected by this and it can be used for positive or negative charge
carriers.
[Figure: section of a conductor of cross-sectional area A carrying conventional current I; the charge carriers move with velocity v, and those within a distance vδt of the end P leave during the time δt]
There are n charge carriers per unit volume in the conductor and each charge
carrier has a charge q.
Consider the charge leaving the right-hand end of the wire at P during a short
time δt.
All charge carriers within a distance vδt of P, where v is their drift velocity, will leave in this time. The volume within this distance of point P is Avδt so the number of charge carriers is N = nAvδt and the total charge δQ passing P in time δt will be:
$$\delta Q = qnAv\,\delta t$$
so the current is:
$$I = \frac{\delta Q}{\delta t} = nAvq$$
Very often the charge carriers have a charge equal to the electronic charge, e,
so this equation becomes:
I = nAve
Typical values of the charge carrier density n for two different metals and two types of semiconductors are shown below. The values are for a temperature of 300 K.
Material      n / m⁻³
Copper        8.5 × 10²⁸
Silver        1.1 × 10²⁸
Silicon       1.5 × 10¹³
Germanium     2.4 × 10¹⁶
Note that the carrier density in the pure semiconductors is very much
smaller than for the metals so the drift velocity in a semiconductor carrying
the same current as a metal of the same dimensions will be much greater
than in the metal. The carrier density is highly temperature dependent
in semiconductors and increases rapidly with temperature. Pure, or intrinsic,
semiconductors contain both negative charge carriers (electrons) and positive
charge carriers (holes) in equal numbers but doped, or extrinsic, semiconductors have small quantities of other elements added to increase their carrier density by a factor of 10⁶ or more.
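The equation I = nAve can be rearranged to estimate a drift velocity, v = I/(nAe). The sketch below is an illustration and not part of the original text. It uses the copper carrier density quoted above together with a hypothetical wire of diameter 1.0 mm carrying 1.0 A; the function names are assumptions. It also shows that, for the same current and dimensions, the ratio of drift velocities in two materials is the inverse ratio of their carrier densities.

```python
import math

E_CHARGE = 1.602e-19  # magnitude of the electronic charge, C
N_COPPER = 8.5e28     # carrier density of copper from the table above, m^-3

def drift_velocity(current, n, area, q=E_CHARGE):
    """Drift velocity from I = n A v q, rearranged as v = I / (n A q)."""
    return current / (n * area * q)

def drift_velocity_ratio(n_a, n_b):
    """Ratio v_a / v_b for the same current, area and carrier charge."""
    return n_b / n_a

# Hypothetical copper wire: diameter 1.0 mm, current 1.0 A
area = math.pi * (0.5e-3) ** 2
v_copper = drift_velocity(1.0, N_COPPER, area)
print(f"drift velocity in copper: {v_copper:.2e} m/s")

# A material with a carrier density one million times smaller than copper
print("ratio of drift velocities:", drift_velocity_ratio(N_COPPER / 1.0e6, N_COPPER))
```

With these values the drift velocity in copper comes out below a millimetre per second, far smaller than the random thermal speeds of the electrons.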
18.1.2 Measuring Current

Electric current is measured using an ammeter.
[Figure: circuit symbol for an ammeter]
The ammeter must be connected in series at the point where the current is
to be measured.
An ideal ammeter has zero resistance so it has no effect on the current it is
measuring. In practice ammeters do have a small internal resistance, so the
current measured by the ammeter is slightly less than the current would be
if the ammeter was not in the circuit. This difference is usually small enough
to be neglected but can be important, especially if the circuit resistance is
particularly low.
Small currents are often measured in milli-amps (mA) or micro-amps (µA).
1 mA = 0.001 A = 10⁻³ A
1 µA = 0.001 mA = 0.000001 A = 10⁻⁶ A
18.1.3 Currents in Circuits – Kirchhoff’s First Law

An electric circuit provides a complete conducting path for charge. Charge is
conserved so the amount of charge in the circuit cannot change. This leads to:
Kirchhoff’s first law – at a junction in an electric circuit the sum of currents
entering the junction is equal to the sum of currents leaving the junction.
[Figure: junction with current I1 entering and currents I2, I3 and I4 leaving, each measured by an ammeter]
I1 = I2 + I3 + I4
If there are no junctions in the circuit, then the current is the same every-
where – that is, at all points around a series circuit.
The current that flows in each branch of a parallel circuit is inversely propor-
tional to the resistance of the branch.

18.2 MEASURING POTENTIAL DIFFERENCE

Potential difference is literally the difference in electric potential between
two points in a circuit. This is measured using a voltmeter. The voltmeter
must be connected in parallel between the two points. The diagram below
shows how a voltmeter can be connected to measure the potential difference
across a resistor.
An ideal voltmeter has infinite resistance so that it does not draw any current
from the circuit. In practice, a real voltmeter will have a large but not infinite
resistance and this can affect readings if there are also very large resistors in
the circuit.
18.2.1 EMF, Potential Difference, and Voltage

The potential difference across a source of electrical energy such as a battery
or generator is called an emf (electromotive force) whereas the potential dif-
ference across a component that transfers electrical energy to other forms is
called a potential difference or simply a “voltage.”
Small voltages are often measured in milli-volts (mV) or micro-volts (µV).
Batteries are made up of cells so the symbols for a battery and a single cell
are different:
[Figure: circuit symbols for a battery consisting of several cells and for a single cell]
The longer line represents the positive end of the cell or battery. The emf of a battery consisting of cells in series is equal to the sum of the emfs of the individual cells. An ideal cell provides a constant emf and has no internal resistance. A real cell does have an internal resistance so must be treated as an ideal cell of emf E in series with a resistor of resistance r and is usually represented as shown below:
[Figure: circuit symbol for a real cell, an ideal cell of emf E in series with an internal resistance r]
The behavior of real cells is analyzed in Section 18.5.
18.2.2 Kirchhoff’s Second Law

When a charge carrier completes a closed loop around an electric circuit and
returns to its initial position it has exactly the same amount of energy at the
end of the loop as it had at the beginning. If this was not the case then energy
would either have been created or destroyed, and this is impossible. This
means that each charge must gain as much energy as it loses in completing a
closed loop in the circuit. This leads to:
Kirchhoff’s second law – the sum of emfs is equal to the sum of potential
differences around any closed loop in an electric circuit.
$$\sum_{\text{closed loop}} \text{emfs} = \sum_{\text{closed loop}} \text{p.d.s}$$
If the potential differences are all across resistive components this can be written:
$$\sum_{\text{closed loop}} \text{emfs} = \sum_{i=1}^{N} I_i R_i$$
where there are N resistors in the loop and the ith resistor has a resistance Ri and current Ii.
The example below shows how Kirchhoff’s second law can be applied to two
different loops in an electric circuit:
[Figure: circuit containing several emfs and potential differences, with two loops marked, each starting and ending at point P]
Each loop starts and ends at P, works in a clockwise direction and provides a
different equation:
Loop 1: E1 − E3 − E2 = V1 + V5
Loop 2: E1 − E2 = V1 + V2 + V3 + V4
Care must be taken over the signs of the emfs and pds.
While two loops have been shown in the diagram there is also a third loop in
the right-hand section of the circuit. This has not been included because it
is not independent of the other two. The fact that all parts of this third loop
were included in parts of the first two loops means that it would not provide
additional information.
Kirchhoff’s first and second laws generate a series of simultaneous equations
that can be used to solve complex circuit problems (see Section 18.6.1).
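As an illustration of how the two laws combine, consider a hypothetical circuit that is not the one in the diagram above: a cell of emf E1 in series with resistor R1, and a cell of emf E2 in series with resistor R2, both connected across a shared resistor R3. Kirchhoff's first law gives I3 = I1 + I2, and the second law applied to the two loops gives E1 = I1R1 + I3R3 and E2 = I2R2 + I3R3. The sketch below solves these simultaneous equations; the function name and component values are assumptions made for the example.

```python
def solve_two_loop(e1, e2, r1, r2, r3):
    """Solve for the branch currents in the hypothetical two-cell circuit.

    Substituting I3 = I1 + I2 into the two loop equations gives
        (r1 + r3) * I1 + r3 * I2 = e1
        r3 * I1 + (r2 + r3) * I2 = e2
    which is solved here by Cramer's rule.
    """
    det = (r1 + r3) * (r2 + r3) - r3 * r3
    i1 = (e1 * (r2 + r3) - e2 * r3) / det
    i2 = (e2 * (r1 + r3) - e1 * r3) / det
    return i1, i2, i1 + i2

# Hypothetical values: emfs in volts, resistances in ohms
i1, i2, i3 = solve_two_loop(6.0, 3.0, 2.0, 1.0, 4.0)
print(f"I1 = {i1:.3f} A, I2 = {i2:.3f} A, I3 = {i3:.3f} A")
```

With these values I2 comes out negative, which means that the current in that branch flows in the opposite direction to the one assumed when the equations were written. This is why care must be taken over signs.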
18.3 RESISTANCE
Resistance R is defined as the ratio of potential difference V across a compo-
nent to current I through the component.
$$R = \frac{V}{I}$$

The SI unit for resistance is the ohm (Ω) and 1 Ω = 1 V A⁻¹. Large resistances are measured in kilo-ohms (kΩ) and mega-ohms (MΩ).
1 kΩ = 1000 Ω = 10³ Ω
1 MΩ = 1 000 000 Ω = 10⁶ Ω
18.3.1 Measuring Resistance

Resistance can be measured using an ammeter and voltmeter in the circuit
shown below.
[Figure: circuit with an ammeter in series with the component and a voltmeter in parallel with it; resistance = voltmeter reading divided by ammeter reading]
While dedicated ammeters and voltmeters can be used, multimeters can be used instead. A multimeter can be used as a voltmeter or an ammeter depending on its settings. When using a multimeter it is important to select appropriate settings before connecting the circuit to the power supply, otherwise there is a danger of damaging the meter or the circuit. For example, if the multimeter was set as an ammeter but connected in parallel like a voltmeter it would short out the component and draw a large current.
Multimeters can also be used as an ohm-meter to measure resistance directly.
When it is used like this the component must first be isolated from the circuit,
and then the meter must be connected directly across the component.
The image below shows a typical multimeter. The values shown on each range indicate the maximum value that can be measured on that range and show the units in which it will be displayed. For example, a setting of 2 V DC would measure voltages from 0.00 V to 2.00 V whereas a setting of 200 mV would measure from 0 mV to 200 mV. When choosing a suitable scale it is best to choose the most sensitive scale that has a maximum value greater than the value to be measured.
[Figure: a typical multimeter, with labels for the rotating switch used to select the range; the resistance ranges; the D.C. current and D.C. voltage ranges; the A.C. current and A.C. voltage ranges; the 10A terminal (the second lead must be connected to this terminal for the meter to work as an ammeter); the V terminal (the second lead must be connected to this terminal for the meter to work as a voltmeter or ohm-meter); and the COM terminal ('common' connection for all uses of the meter; one lead must be connected to this terminal)]

The resistance of a component can also be found from its current–voltage characteristic. For example, if a particular component produced the characteristic shown below its resistance at point P would be R = VP/IP.
[Figure: current–voltage characteristic (current / A against potential difference / V) with a point P at (VP, IP)]
18.3.2 Current–Voltage Characteristics

The current–voltage characteristic of an electrical component can be deter-
mined using the circuit below. The part of the circuit in the grey box is called
a potentiometer. As the moving contact goes from left to right the potential
difference applied to the circuit containing the component R under test varies
from 0 V to the maximum voltage of the battery.
[Figure: potentiometer circuit supplying the component R under test, with an ammeter A in series with the component and a voltmeter V in parallel with it]
To reverse the current through the component the battery or the component itself can be turned around. This allows both positive and negative values for I and V. The current–voltage characteristics for three components are shown below.
A carbon resistor or a metal at constant temperature.
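[Figure: current–voltage characteristic (current / A against voltage / V), a straight line through the origin]
Current I is directly proportional to the potential difference V across the component:
$$I \propto V$$
$$\frac{V}{I} = \text{constant} = R$$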
The resistance is constant.
Components that behave like this are described as “ohmic conductors” and
are said to obey Ohm’s law. Georg Ohm investigated the electrical behavior of
metals kept at constant temperature and discovered that the current passing
through them and the potential difference across them were directly propor-
tional so Ohm’s law really only applies to metals at a constant temperature.
Nowadays Ohm’s law is invoked whenever current and voltage are directly
proportional. The equation V = IR is also sometimes referred to as Ohm’s law,
but this is not really correct – the equation defines resistance and only cor-
responds to Ohm’s law when the resistance is constant.
A metal filament (e.g., in a lamp).

Metal wires heat up when electric currents pass through them and their
resistance changes. They are non-ohmic conductors.
[Graph: current / A against voltage / V for a metal filament; a curve rather than a straight line.]
The fact that the graph is not a straight line through the origin shows that this
is a non-ohmic conductor.
The ratio of V to I is increasing so the resistance has increased as the current
has increased.
The reason this happens is that the charge carriers passing through the metal
transfer energy to the metal ions making them vibrate more rapidly and this
in turn increases the scattering of charge carriers. More work has to be done
to maintain the current (re-accelerate the charge carriers after scattering) so
a higher voltage is needed for the same current. This increases the ratio V/I
and increases the resistance.
Components whose resistance changes markedly with temperature are called
thermistors. The resistance of a metal increases with temperature, so a metal
has a positive temperature coefficient (PTC). Platinum is commonly used in
this way as a resistance thermometer.
Semiconductors have a resistance that falls as temperature increases, so they
are used as negative temperature coefficient (NTC) thermistors. The circuit
symbol for a thermistor is shown below:

Some semiconductors are light-dependent. Their resistance falls when they
are illuminated. These are called light-dependent resistors (LDRs) and are
useful both to measure light intensity and in sensing circuits that respond to
changes in light intensity. The circuit symbol for an LDR is shown below:
A semiconductor diode.
The circuit symbol for a semiconductor diode is shown below:
A diode conducts with very low resistance in one direction (forward bias)
and acts almost as an insulator (very high resistance) in the other (reverse
bias), up to its breakdown voltage, at which point the diode resistance
suddenly drops and it is likely to be destroyed by the current surge.
A small voltage is required in the forward direction before the diode begins
to conduct. This switch-on voltage depends on the material from which the
diode is made. For silicon diodes, it is about 0.6 V and for germanium diodes,
it is about 0.2 V. Once the forward voltage exceeds this value the diode begins
to conduct with very low resistance and care must be taken not to allow the
current to grow so large that it melts the diode (e.g., by having a fixed resistor
in series with it).
Diodes are important components in rectifier circuits, used to convert alter-
nating current to direct current.
[Graph: current / A against voltage / V for a diode, showing the forward-bias region, the switch-on voltage, and the reverse breakdown voltage.]

Some diodes emit light when they conduct. These are called light-emitting
diodes or LEDs. The circuit symbol for an LED is shown on the right:
18.3.3 Resistors in Series and in Parallel

Resistors in series
To replace several resistors in series with a single resistor RS we would need a
resistor that has the same ratio of V to I as the set of series resistors. For the
resistances to be equal, the total voltage across all the series resistors must
equal the voltage across the single resistor. The same current I flows through
all of the resistors so:
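[Figure: resistors R1, R2, and R3 in series carrying the same current I, with voltages V1, V2, and V3 across them, and the equivalent single resistor RS carrying current I with voltage VS across it.]
\[ V_S = V_1 + V_2 + V_3 \]
\[ V_S = IR_S = IR_1 + IR_2 + IR_3 \]
\[ R_S = R_1 + R_2 + R_3 \]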
The total resistance of several resistors connected in series is the sum of the
individual resistances. The general relationship for N resistors in series is:
\[ R_{series} = \sum_{i=1}^{N} R_i \]
Resistors in parallel
To replace several resistors in parallel with a single resistor RP we would need
a resistor that has the same ratio of V to I as the set of parallel resistors. We
start by using the fact that the same potential difference is across each arm of
a parallel circuit:
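[Figure: resistors R1, R2, and R3 in parallel carrying currents i1, i2, and i3, with the same potential difference V across each and total current I; the equivalent single resistor RP carries current I with potential difference V across it.]
Using Kirchhoff’s first law:
\[ I = i_1 + i_2 + i_3 \]
\[ \frac{V}{R_P} = \frac{V}{R_1} + \frac{V}{R_2} + \frac{V}{R_3} \]
\[ \frac{1}{R_P} = \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} \]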
In general, the reciprocal of the equivalent resistance is equal to the sum of
the reciprocals of all the resistances of the resistors connected in parallel. For
N resistors in parallel, this can be written more formally as:
\[ \frac{1}{R_P} = \sum_{i=1}^{N} \frac{1}{R_i} \]
For the simple but very common situation where there are just two resistors
in parallel, the equation can be rearranged to:
\[ R_P = \frac{R_1 R_2}{R_1 + R_2} \]
This is easily remembered as “product over sum”; however, it only works for
two resistors in parallel.
The formula for parallel resistors shows that the total resistance of several
resistors connected in parallel is always less than the smallest resistance in the
network.
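The two combination rules can be checked numerically. The following is a minimal Python sketch; the function names and the resistor values are illustrative assumptions, not taken from the text.

```python
def series_resistance(resistances):
    """Equivalent resistance of resistors in series: the sum of the values."""
    return sum(resistances)


def parallel_resistance(resistances):
    """Equivalent resistance of resistors in parallel.

    The reciprocal of the result is the sum of the reciprocals.
    """
    return 1.0 / sum(1.0 / r for r in resistances)


# Illustrative values in ohms (assumed for this example).
values = [100.0, 220.0, 470.0]
r_series = series_resistance(values)
r_parallel = parallel_resistance(values)

# "Product over sum" should agree with the general formula for two resistors.
r1, r2 = 100.0, 220.0
product_over_sum = r1 * r2 / (r1 + r2)

print(f"series:   {r_series:.1f} ohm")
print(f"parallel: {r_parallel:.1f} ohm (smallest resistor is {min(values):.1f} ohm)")
print(f"two in parallel: {product_over_sum:.2f} ohm")
```

The parallel result comes out below the smallest individual resistance, as the formula requires.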

18.3.4 Resistivity
The resistance of a component depends not only on the type of material from
which it is made but also on the dimensions of the component. Resistivity
is a property of the material alone and does not depend on its dimensions.
For example, the resistances of copper wires of different lengths and cross-
sectional areas differ but the resistivity of the copper from which they are
made is the same (at the same temperature).
For a cylindrical wire of length l, cross-sectional area A and resistance R the
resistivity is given by:
\[ \rho = \frac{RA}{l} \]
The SI unit of resistivity is Ωm.
Typical resistivities (at 20°C) for different conductors are shown below:
Metal         Resistivity ρ / Ωm
Silver        1.6 × 10⁻⁸
Copper        1.7 × 10⁻⁸
Aluminum      2.8 × 10⁻⁸
Tungsten      5.8 × 10⁻⁸
Platinum      1.1 × 10⁻⁷
Constantan    4.9 × 10⁻⁷
Steel         7.2 × 10⁻⁷ (varies depending on type of steel)
Nichrome      1.3 × 10⁻⁶
The resistivity of a length of a resistance wire can be measured using the cir-
cuit below:
[Circuit diagram: a length l of resistance wire with a sliding contact.]

There is a constant current I in the resistance wire. This is measured using the ammeter.
The moving contact is used to measure the potential difference V for a range of lengths l.
The diameter d of the wire is measured using a micrometer screw gauge. It is best to measure the diameter at several different points along the wire to obtain an average value. The cross-sectional area of the wire is A = ¼πd².
The length of wire l can be measured using a meter rule.
The resistivity is then found from a graph of V against l:
\[ V = IR = \frac{4\rho I}{\pi d^2}\, l \]
The gradient of the graph of V against l is equal to \( \frac{4\rho I}{\pi d^2} \), so:
\[ \rho = \text{gradient} \times \frac{\pi d^2}{4I} \]
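The analysis above can be sketched in Python. The readings below are invented for illustration, chosen to resemble a constantan wire, and the helper names are assumptions. A least-squares gradient stands in for the gradient read from a hand-drawn graph.

```python
import math


def best_fit_gradient(xs, ys):
    """Least-squares gradient of the best-fit straight line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def resistivity_from_gradient(gradient, diameter, current):
    """rho = gradient * pi * d^2 / (4 I), from V = (4 rho I / (pi d^2)) * l."""
    return gradient * math.pi * diameter ** 2 / (4.0 * current)


# Hypothetical readings (assumed for this example).
current = 0.50                                 # A, held constant
diameter = 0.40e-3                             # m, mean micrometer reading
lengths = [0.20, 0.40, 0.60, 0.80, 1.00]       # m
voltages = [0.39, 0.78, 1.17, 1.56, 1.95]      # V

gradient = best_fit_gradient(lengths, voltages)
rho = resistivity_from_gradient(gradient, diameter, current)
print(f"gradient = {gradient:.3f} V/m, resistivity = {rho:.2e} ohm m")
```

With these assumed readings the result is close to the tabulated value for constantan.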
18.4 ELECTRICAL ENERGY AND POWER

When a charge Q moves between points at different potential, energy is trans-
ferred to the charge from the electric field or vice versa. The energy transfer
per unit charge is equal to the potential difference V, so the total energy
transfer E is:
E = QV
The rate of transfer of energy is the electrical power P:

\[ P = \frac{dE}{dt} \]
The SI unit for power is the watt (W).
For charge moving between points with a constant potential difference:
\[ P = V\frac{dQ}{dt} \quad \text{or} \quad P = VI \]
Resistors transfer electrical energy to thermal energy. The power transfer in
a resistor is given in various forms by using the equations P = VI and V = IR:

\[ P = VI = I^2 R = \frac{V^2}{R} \]
If the current in the resistor is constant then the energy transfer E in time t
is given by:
E = VIt
18.4.1 EMF and Internal Resistance of a Real Cell

While an ideal cell is simply a source of constant emf, a real cell has an inter-
nal resistance. This arises because work must be done to move charge carri-
ers through the material of the cell itself. A real cell is modeled as a constant
source of emf E in series with a fixed internal resistance r.
[Figure: representation of a real cell as an emf E in series with an internal resistance r.]
When the cell is connected to an external circuit, current flows through it
and there is a voltage drop across the internal resistance. This results in the
potential difference at the cell terminals, the terminal pd V, falling below the
emf of the cell.
[Figure: real cell supplying current I to an external circuit, with the emf E, the voltage drop Ir across the internal resistance, and the terminal pd V marked.]
The potential difference across the internal resistance is equal to the work
that must be done per coulomb of charge to move charge carriers through the
cell. The terminal pd is therefore equal to the difference between the cell emf
and these “lost volts”:
\[ V = E - Ir \]

As the current drawn from the supply increases the terminal pd falls.
The internal resistance also limits the maximum current that can be drawn
from the cell. This occurs when the cell is short-circuited by connecting its
terminals together with a conductor of negligible resistance. The only resist-
ance in the circuit is the internal resistance so the short-circuit (maximum)
current from the cell is:
\[ I_{SC} = \frac{E}{r} \]
The internal resistance of a typical 1.5 V alkaline AA cell is about 0.15 Ω so the
maximum current that could be drawn from it is 1.5/0.15 = 10 A. However,
shorting the cell would drain it very quickly so the working current for a
device operated by AA cells must be much less than this. If a cell or battery is
required to provide a very large current it must have a very low internal resist-
ance. Car batteries have an emf of 12 V but need to provide over 100 A when
the ignition is switched on and the battery turns the starter motor. Typical car
batteries have an internal resistance of less than 0.01 Ω.
18.4.2 Measuring the Internal Resistance and emf of a Cell

The emf and internal resistance of a cell or battery can be determined using
the circuit below.
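[Circuit diagram: cell of emf E and internal resistance r connected to a variable load resistor, with a voltmeter V across the cell terminals.]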
As the value of the load resistor is changed, the current I in the circuit and the
terminal pd V change too. I and V are related by the equation:
\[ V = E - Ir \]
Comparing this to \( y = mx + c \), a graph of V against I is a straight line with
a negative gradient and a positive intercept:
[Graph: terminal pd / V against current / A; a straight line with gradient = −r and intercept on the y-axis = emf.]
The terminal pd is equal to the cell emf when no current is drawn so a simple
way to measure the emf is to connect a voltmeter with very high resistance
across the cell terminals. The open-circuit terminal voltage is equal to the emf
of the cell.
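The graphical method can be sketched in Python by fitting a straight line to readings of V against I. The readings below are invented for illustration and the function name is an assumption. The intercept gives the emf and the gradient gives −r.

```python
def fit_line(xs, ys):
    """Return (gradient, intercept) of the least-squares line of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    gradient = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


# Hypothetical readings (assumed): current in A, terminal pd in V.
currents = [0.20, 0.40, 0.60, 0.80, 1.00]
terminal_pds = [1.40, 1.30, 1.20, 1.10, 1.00]

gradient, intercept = fit_line(currents, terminal_pds)
emf = intercept                  # V = E - Ir, so the intercept is E
internal_resistance = -gradient  # and the gradient is -r

print(f"emf = {emf:.2f} V, internal resistance = {internal_resistance:.2f} ohm")
```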
18.4.3 Power Transfer from a Real Cell to a Load Resistor

The power transferred to the load resistor in the circuit below is P = I²R, but
I depends on the load resistance R. How does P depend on R?
[Circuit diagram: cell of emf E and internal resistance r connected to a load resistor R.]
\[ I = \frac{E}{R + r} \]
\[ P = I^2 R = \frac{E^2 R}{(R + r)^2} \]
It is clear that P is zero when R = 0.
P also becomes asymptotic to zero as R → ∞ since the denominator then
dominates the expression.
The expression is positive for all values of R so there must be a maximum
value for P at some point. This occurs when R = r, that is, the load resistor has
a resistance equal to the internal resistance.
[Graph: power transfer P against load resistance R, rising from zero to a maximum Pmax at R = r and then falling toward zero.]
(The fact that maximum power transfer occurs when R = r can be demonstrated
using calculus. You need to find the condition for dP/dR = 0; this must
be a stationary value of the function and, in this case, it is a maximum.)
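A sketch of that calculation, differentiating \( P = E^2 R/(R + r)^2 \) with respect to R using the product rule:

\[ \frac{dP}{dR} = E^2\left[\frac{1}{(R + r)^2} - \frac{2R}{(R + r)^3}\right] = \frac{E^2 (r - R)}{(R + r)^3} \]

This is zero only when R = r. It is positive for R < r and negative for R > r, so the stationary value is a maximum. Substituting R = r gives:

\[ P_{max} = \frac{E^2 r}{(2r)^2} = \frac{E^2}{4r} \]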
It might be tempting to think that maximum power transfer also corresponds
to maximum efficiency for the system, but this is not the case.
\[ \text{efficiency} = \frac{\text{power transferred to } R}{\text{power transferred from the emf}} \times 100\% \]
Power transferred to R is:
\[ P_{out} = I^2 R = \frac{E^2 R}{(R + r)^2} \]
Power transferred from the emf is:
\[ P_{in} = EI = \frac{E^2}{R + r} \]
Efficiency is given by:
\[ \text{efficiency} = \frac{P_{out}}{P_{in}} \times 100\% = \frac{R}{R + r} \times 100\% \]
Surprisingly the maximum efficiency would be when R → ∞, but this would
also correspond to zero power transfer. At maximum power transfer, the effi-
ciency is 50%.
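The trade-off between power and efficiency can be tabulated with a short Python sketch. The emf and internal resistance are assumed values chosen for illustration, and the function names are hypothetical.

```python
def load_power(emf, internal_r, load_r):
    """Power delivered to the load: P = E^2 R / (R + r)^2."""
    current = emf / (load_r + internal_r)
    return current ** 2 * load_r


def efficiency_percent(internal_r, load_r):
    """Efficiency of power transfer: R / (R + r) x 100%."""
    return 100.0 * load_r / (load_r + internal_r)


# Assumed cell parameters for this example.
emf = 1.5          # V
internal_r = 0.5   # ohm

for load_r in [0.1, 0.25, 0.5, 1.0, 5.0, 50.0]:
    p = load_power(emf, internal_r, load_r)
    eff = efficiency_percent(internal_r, load_r)
    print(f"R = {load_r:5.2f} ohm  P = {p:.3f} W  efficiency = {eff:5.1f}%")
```

In the printed table the power peaks at R = r, where the efficiency is 50%, while the efficiency keeps rising toward 100% as the power falls away.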
18.5 RESISTANCE NETWORKS

18.5.1 Potential Dividers
A potential divider is used to provide an output voltage Vout that is a fraction
of the supply voltage Vin.
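[Circuit diagram: supply Vin across R2 and R1 in series, carrying current I; Vout is taken across R1.]
The current I in the circuit is the same through both resistors.
Using Kirchhoff’s second law:
\[ V_{in} = IR_1 + IR_2 \]
\[ V_{out} = IR_1 \]
\[ \frac{V_{out}}{V_{in}} = \frac{R_1}{R_1 + R_2} \]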
If Vout is connected across a load resistor RL then the fraction changes because
the lower part of the potential divider now has two resistors in parallel (R1
and RL) so that:
\[ \frac{V_{out}}{V_{in}} = \frac{R_P}{R_P + R_2} \]
where
\[ R_P = \frac{R_1 R_L}{R_1 + R_L} \]
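The effect of a load on the output can be seen in a minimal Python sketch. The function name and component values are assumptions made for illustration.

```python
def divider_output(v_in, r1, r2, r_load=None):
    """Output voltage across R1 (lower resistor), with R2 as the upper resistor.

    If r_load is given, it is treated as connected in parallel with R1.
    """
    if r_load is None:
        lower = r1
    else:
        lower = r1 * r_load / (r1 + r_load)  # product over sum
    return v_in * lower / (lower + r2)


# Assumed values: 12 V supply, two 1.0 kilohm resistors.
v_unloaded = divider_output(12.0, 1000.0, 1000.0)
v_loaded = divider_output(12.0, 1000.0, 1000.0, r_load=1000.0)
v_light_load = divider_output(12.0, 1000.0, 1000.0, r_load=1.0e6)

print(f"unloaded: {v_unloaded:.2f} V")
print(f"1.0 kilohm load: {v_loaded:.2f} V")
print(f"1.0 megohm load: {v_light_load:.2f} V")
```

A load that is large compared with R1 barely changes the output, while a load comparable to R1 pulls it down noticeably.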
Potential divider circuits can also be used as sensing circuits, producing an
output voltage that depends on external conditions such as temperature or
light intensity.
[Circuit diagrams: two potential dividers, each with a fixed resistor R2 in series with a sensing component (RTHERM or RLDR) across the supply Vin, and Vout taken across the sensing component.]
Temperature sensor: as temperature increases RTHERM falls so Vout falls. Swapping the positions of R2 and RTHERM makes Vout rise with temperature.
Light intensity sensor: as light intensity increases RLDR falls so Vout falls. Swapping the positions of R2 and RLDR makes Vout rise with light intensity.
18.5.2 Using Kirchhoff’s Laws to Solve Resistance Networks

Kirchhoff’s two circuit laws can be used to produce a set of simultaneous
equations for currents and pd’s that can be used to find all the currents and
pd’s in a network. Here is an example:
[Circuit diagram: network containing emfs E1 and E2 and resistors R1, R2, R3, and R4, with currents i1, i2, and i3 marked.]
Applying Kirchhoff’s first law: i1 = i2 + i3

Applying Kirchhoff’s second law to the loop passing through resistors R1, R2,
and R3:
\[ E_1 = i_1 R_1 + i_2 R_2 + i_2 R_3 \]
Applying Kirchhoff’s second law to the loop passing through resistors R1, R4,
and E2:
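\[ E_1 = i_1 R_1 + i_3 R_4 \pm E_2 \]
(the sign of E2 depends on the polarity of the second cell in the circuit diagram)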
There are three independent equations so it is possible to solve for three
unknowns. For example, if we know the two emfs and the values of all four
resistors we can find the current at any point in the circuit.
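As a rough illustration, the three equations can be solved in Python by substituting the junction equation into the two loop equations. The component values are invented. The sketch also assumes that E2 opposes E1 around the second loop, so that loop equation reads E1 − E2 = i1R1 + i3R4. If the second cell has the other polarity, the sign of `e2` should be reversed.

```python
def solve_network(e1, e2, r1, r2, r3, r4):
    """Return (i1, i2, i3) for the assumed equations:

        i1 = i2 + i3
        e1 = i1*r1 + i2*(r2 + r3)
        e1 - e2 = i1*r1 + i3*r4      (assumes E2 opposes E1 in this loop)
    """
    # Substituting i1 = i2 + i3 leaves two equations in i2 and i3:
    #   (r1 + r2 + r3)*i2 + r1*i3        = e1
    #   r1*i2             + (r1 + r4)*i3 = e1 - e2
    a11, a12, b1 = r1 + r2 + r3, r1, e1
    a21, a22, b2 = r1, r1 + r4, e1 - e2
    det = a11 * a22 - a12 * a21
    i2 = (b1 * a22 - a12 * b2) / det   # Cramer's rule
    i3 = (a11 * b2 - a21 * b1) / det
    return i2 + i3, i2, i3


# Assumed values: emfs in volts, resistances in ohms.
e1, e2 = 6.0, 3.0
r1, r2, r3, r4 = 10.0, 20.0, 30.0, 40.0
i1, i2, i3 = solve_network(e1, e2, r1, r2, r3, r4)
print(f"i1 = {i1:.4f} A, i2 = {i2:.4f} A, i3 = {i3:.4f} A")
```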
18.6 SEMICONDUCTORS AND SUPERCONDUCTORS

18.6.1 Semiconductors
Semiconductors have a resistivity that lies between that of a conductor, such as
copper, and an insulator such as polythene. At room temperature, the charge
carrier density for a pure (intrinsic) semiconductor is typically 10¹³–10¹⁶ m⁻³
compared to 10²⁷–10²⁸ m⁻³ for a metal. However, the charge carrier density in a
semiconductor is highly temperature dependent and can be changed by the
addition of impurity atoms (this is called doping).

[Figure: intrinsic semiconductor, showing electrons (e) and holes (h+) both contributing to the current I.]
At absolute zero semiconductors are pure insulators with no free charge
carriers. However, as temperature increases, thermal agitation frees a small
proportion of the valence electrons and allows them to become free or con-
duction electrons. Whenever an electron is freed it leaves behind a hole which
behaves very much like a particle carrying a positive charge +e. This means
that a pure semiconductor contains equal numbers of electrons and holes and
that both contribute to current flow. While the density of holes and electrons
is equal they do not contribute equally to the current because their mobility
(ability to move through the material) differs.
The electrical properties of intrinsic semiconductors can be changed by intro-
ducing impurity atoms. Semiconductors such as silicon and germanium have
four electrons in the outer shell, so adding atoms of an element such as boron,
aluminum, or gallium, with three outer electrons, has the result of donating
one additional hole per doping atom. Semiconductors doped in this way have
an excess of holes (positive charge carriers) and are called “p-type semicon-
ductors.” If phosphorus, antimony, or arsenic, with five electrons in the outer
shell, is added this will donate one electron per doping atom. Semiconductors
doped in this way have an excess of electrons (negative charge carriers) and
are called n-type semiconductors. Semiconductor devices such as diodes,
transistors, and integrated circuits depend on doped semiconductors and are
often constructed from junctions between n-type and p-type materials.
18.6.2 Variation of Resistance of a Metal with Temperature

When an electric current passes through a metal electrons drift in the oppo-
site direction and collide with ions in the metal structure. This scatters the
electrons and dissipates energy, creating electrical resistance. The higher
the temperature the more random thermal energy in the metal and the
greater the amplitude of ionic vibrations. This increases the electron scatter-
ing and the resistivity of the metal. The rate of increase is roughly linear so the
variation can be modeled by the equation:
\[ \rho = \rho_0 \left[ 1 + \alpha (T - T_0) \right] \]
ρ = resistivity at temperature T
ρ0 = resistivity at reference temperature T0
Values for the temperature coefficient of resistance α for different metals, based
on T0 = 293 K, are shown below:
Metal       Temperature coefficient of resistance α / K⁻¹
Aluminum    4.3 × 10⁻³
Copper      4.0 × 10⁻³
Silver      4.0 × 10⁻³
Tungsten    4.5 × 10⁻³
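A minimal Python sketch of the linear model follows. It uses the copper values quoted in this chapter (ρ0 = 1.7 × 10⁻⁸ Ωm at 293 K and α = 4.0 × 10⁻³ K⁻¹). The function name is an assumption, and the model is only roughly valid over a limited temperature range.

```python
def resistivity_at(temperature, rho_0, alpha, t_0=293.0):
    """Linear model: rho = rho_0 * (1 + alpha * (T - T_0)), temperatures in K."""
    return rho_0 * (1.0 + alpha * (temperature - t_0))


# Copper, using the values tabulated in this chapter.
rho_copper_293 = 1.7e-8    # ohm m
alpha_copper = 4.0e-3      # per kelvin

rho_copper_373 = resistivity_at(373.0, rho_copper_293, alpha_copper)
increase_percent = 100.0 * (rho_copper_373 / rho_copper_293 - 1.0)
print(f"rho at 373 K = {rho_copper_373:.3e} ohm m ({increase_percent:.0f}% higher)")
```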

It might be expected that this linear dependence of resistivity on temperature
would continue down to absolute zero and that the resistivity would approach
zero at that value. In 1911, the Dutch physicist Heike Kamerlingh Onnes
investigated how the resistivity of mercury varied as he lowered the temperature
toward absolute zero. At first, the resistivity dropped approximately linearly but
at 4.2 K it suddenly dropped to zero. The mercury conducted electric currents with
zero resistance: it had become a superconductor. Other metals exhibit similar
properties, having a characteristic transition temperature below which they
superconduct. Here are the transition temperatures for several different
metals:
Metal Transition temperature

Aluminum 1.2 K
Lead 7.2 K
Mercury 4.2 K
Tin 3.7 K
Zinc 0.9 K
These are called low-temperature superconductors. In the 1980s, several
high-temperature superconductors were discovered. These tend to be
ceramic materials with complex crystalline structures, such as yttrium barium
copper oxide, and they become superconducting at temperatures above the
boiling point of liquid nitrogen (77 K).
Superconductors are essential for the construction of extremely strong electro-
magnets (e.g., in particle accelerators such as the LHC at CERN) because huge
currents can circulate in superconducting coils without generating any heat.
Physicists hope one day to discover or manufacture a room temperature
superconductor. Among other applications, this would allow the transmission
of electrical energy with no energy losses from ohmic heating.
18.7 EXERCISES
1. Explain in terms of the free-electron model of a metal:

(a) how metals conduct electricity
(b) why the resistance of a metal increases with temperature

2. A copper wire has a radius of 0.20 mm and carries a DC current of 120 mA.
Calculate the drift velocity of the electrons in the wire. The charge carrier
density in copper is n = 8.5 × 10²⁸ m⁻³.
3. The resistivity of a sample of germanium is 0.46 Ωm. A disc of germanium
1.0 mm long and 5.0 mm in radius has a pd of 2.0 V applied across its
circular faces.
(a) What current passes through the specimen?
(b) What is the current density (I/A) in the sample?
(c) What is the conductivity (1/ρ) of germanium?
(d) What is the conductance (1/R) of this sample?
(e) The density of charge carriers in germanium is about 10¹⁹ m⁻³.
Calculate their average drift velocity (assume the charge on each
charge carrier is 1.6 × 10⁻¹⁹ C).
(f) Name the majority charge carriers in each of the following:
(i) a metal;
(ii) a p-type semiconductor;
(iii) an n-type semiconductor.
4. A is a 240 V 100 W filament lamp. B is a 12 V 6 W filament lamp.

(a) What happens when they are connected in parallel across a 12 V DC
supply?
(b) What happens when they are connected in series across a 240 V AC
supply?
5. A cell of emf E and internal resistance r is connected across a load resistor
R and the pd across R is measured using a high resistance voltmeter.
(a) Draw a circuit diagram of this arrangement.
(b) Derive an expression for V in terms of E, R, and r.
(c) A set of values for V are obtained for a wide range of values of R.
Rearrange the equation derived in (b) to show how a straight-line
graph can be drawn from this data and used to determine E and r.
(d) Sketch the graph, indicating its important features.

6. The table below gives the resistivities of three metals at two different
temperatures.
Metal       Resistivity at 273 K    Resistivity at 373 K
Aluminum    2.45 × 10⁻⁸ Ωm          3.55 × 10⁻⁸ Ωm
Copper      1.55 × 10⁻⁸ Ωm          2.38 × 10⁻⁸ Ωm
Iron        8.70 × 10⁻⁸ Ωm          16.61 × 10⁻⁸ Ωm
(a) Which is the best conductor?

(b) An insulated cable contains a single metal core of diameter 0.50 mm.
What is the resistance of 5.0 m of this cable at 273K if it is made of:
(i) Aluminum (ii) Copper (iii) Iron
(c) By what percentage does the resistance of each cable increase when
its temperature rises from 273K to 373K?
(d) Estimate the resistance of the copper cable at room temperature
(about 293K) and state any assumptions you had to make.
(e) A 5.0 m composite wire is made by joining 1.0 m of copper wire of
diameter 0.50 mm to 4.0 m of iron wire of diameter 0.60 mm. What
current would flow through this wire at 273 K if a cell of EMF 6.0 V
was connected across its ends?
7. (a) Calculate the readings on the ammeter and voltmeter in the circuit
below assuming that the internal resistance of the battery is negligible.
[Circuit diagram: 12.0 V battery, three 200 Ω resistors, an ammeter A, and a voltmeter V.]
(b) In fact, the internal resistance is 10 Ω. How does this affect your
answers to part (a)?

8. You are provided with three identical resistors each of resistance R.

Calculate and list the values of all the resistances you can make using
one or more of these resistors stating which arrangement corresponds to
which value.
9. A resistance network is constructed using 12 identical 1 Ω resistors, with
each resistor placed so that it forms one edge of a cube. Calculate the
resistance between two opposite corners of the cubic array.
10. The circuit below is used to detect changes in light intensity. The resist-
ance of the LDR is 50 kΩ in the dark and 100 Ω in bright light.
[Circuit diagram: 6 V supply across a 1.0 kΩ fixed resistor in series with the LDR; Vout is taken across the LDR.]
(a) Calculate the output voltage in the dark and in bright light.
(b) How does the value of the top fixed resistor affect the range of the
output voltages?
11. Use Kirchhoff’s laws to determine the readings on the three ammeters and
the voltmeter in the circuit below. Assume that the cell’s internal resist-
ance can be neglected.
[Circuit diagram: 6.0 V and 3.0 V cells, resistors of 500 Ω, 30 Ω, and 10 Ω, ammeters A1, A2, and A3, and a voltmeter V.]
CHAPTER
19
Capacitance
19.1 WHAT IS A CAPACITOR?

A capacitor consists of two conductors
separated by an insulator. This might be
two parallel metal plates with air between
them or it could be a person standing on
an insulator that separates them from the
earth. When there is a potential differ-
ence between the plates, opposite charges
gather on each plate and the capacitor is
said to be charged. Capacitors are com-
mon components that are present in most
electrical circuits.
The image shows a range of capacitors.
The cylindrical capacitors are electrolytic
capacitors that must be connected in a par-
ticular direction in the circuit (otherwise
they might explode) while the others are
ceramic or paper capacitors. The insulator or dielectric material between the
plates affects the amount of charge that can be stored on the plates.
19.1.1 Capacitors and Charge

The circuit below can be used to investigate what happens when a capacitor
is charged.

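[Circuit diagram: supply VS, switch S, resistor R, and capacitor C in series, with an ammeter A on each side of the capacitor and a connection to an oscilloscope across R.]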
When switch S is closed both ammeters jump to a positive reading that gradually
falls back to zero. This shows that charge has moved around the circuit.
However, charge cannot cross the gap between the two plates, so what has happened
is that electrons have left one plate (giving it a net positive charge) and electrons
have moved onto the other plate (giving it a net negative charge). The current
in the circuit is monitored by connecting an oscilloscope across the resistor in
series with the capacitor.
[Graph: charging current against time, decaying toward zero.]
The oscilloscope trace actually displays the voltage across the resistor but this
is directly proportional to the current because the resistance is constant. The
current decays exponentially to zero as the capacitor charges up. Once fully
charged there is a charge of +Q on one plate and −Q on the other plate. While
it is true that the net charge on the capacitor is actually zero, we refer to the
charge on the positive plate and say that the capacitor is “charged” or “stores
a charge Q.”
19.1.2 Capacitance
The charge stored on a capacitor is directly proportional to the potential dif-
ference between the two conductors. This can be verified using the arrange-
ment shown below:
[Figure: aluminium plates separated by a few mm, connected to a variable H.T. supply (0–5 kV), with a flying lead and a coulombmeter.]
Connect the flying lead momentarily to the positive terminal of the HT supply.
Zero the coulombmeter.
Move the flying lead to the terminal of the coulombmeter and measure the charge transfer.
Increase the voltage and repeat.
Typical results look like this:
[Graph: charge stored / C against potential difference / V; a straight line through the origin.]
The charge stored on the capacitor is directly proportional to the pd between its
plates:
Q∝V
Q = CV
where C is the capacitance.

The SI unit of capacitance is the farad (F), 1 F = 1 C V⁻¹. A 1 F capacitor would
store 1 C of charge per volt of potential difference between its plates. This
would be a very large capacitance; typical values used in electronic circuits are
usually measured in nF or µF.
19.1.3 Energy Stored on a Charged Capacitor

When a capacitor is charged the power supply forces charges onto the plates
against the electrostatic repulsion of the charges that are already there. This
increases the electric potential energy in an analogous way to how the com-
pression of a spring stores strain energy. The charged capacitor is therefore
a store of electrical energy that can be released if it is allowed to discharge
through a suitable circuit.
When a capacitor with initial charge Q is discharged through a fixed resistor of
resistance R the work done by the capacitor in moving a charge δq around the
circuit is equal to Vδq, where V is the potential difference across the capacitor
as the charge moves around the circuit. The total work done will be the area
under a graph of V against Q or the integral of Vδq.
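[Figure: graph of pd / V against charge stored / C, a straight line through the origin reaching pd V at charge Q, alongside a circuit in which capacitor C discharges through resistor R via switch S.]
Energy stored = area under the graph up to charge Q: E = ½QV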

(It might seem strange that the graph shows the charge increasing to the
right when the capacitor is discharging – remember that the x-axis does not
represent time!)
Using calculus:
\[ E = \int_{q=0}^{q=Q} V\,dq = \int_{q=0}^{q=Q} \frac{q\,dq}{C} = \frac{Q^2}{2C} \]
The two expressions for E are equivalent since C = Q/V. There is also a third
form of this expression in terms of just V and C (found by using Q = CV to
eliminate Q): E = ½ CV2.
Here are the three equations for energy stored on a capacitor:
\[ E = \frac{Q^2}{2C} = \frac{CV^2}{2} = \frac{QV}{2} \]
One of the key uses for a capacitor is as a temporary store of electrical energy
that can then be used later when the capacitor is discharged – for example,
in a camera flashlight. One of the advantages of using capacitors as energy
storage devices is that the energy can be accumulated slowly and released
quickly so that the power delivered can be much greater than the power used
to charge the capacitor.
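A short Python sketch shows the three forms of the energy equation agreeing, and how a quick discharge gives a large power. The capacitance, voltage, and charge and discharge times are assumed values loosely modeled on a camera flash, not data from the text.

```python
# Assumed values for this example.
capacitance = 100e-6   # F
voltage = 300.0        # V
charge = capacitance * voltage   # Q = CV

energy_from_q_and_c = charge ** 2 / (2.0 * capacitance)
energy_from_c_and_v = 0.5 * capacitance * voltage ** 2
energy_from_q_and_v = 0.5 * charge * voltage

# Mean power if the energy is accumulated slowly and released quickly.
charging_time = 5.0        # s (assumed)
discharge_time = 1.0e-3    # s (assumed)
mean_charging_power = energy_from_c_and_v / charging_time
mean_discharge_power = energy_from_c_and_v / discharge_time

print(f"energy stored = {energy_from_c_and_v:.2f} J")
print(f"mean power in = {mean_charging_power:.2f} W, out = {mean_discharge_power:.0f} W")
```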
19.1.4 Efficiency of Charging a Capacitor

When a capacitor is charged from a supply the work done by the supply is
always greater than the energy stored on the capacitor. This is because some
of the work done by the supply is transferred to heat by resistance in the
circuit.
[Circuit diagram: supply charging capacitor C through resistor R. Charge Q passes through the supply; charge Q leaves the positive plate and charge Q arrives at the negative plate.]
Work done by supply: W = QV

Energy stored on capacitor: E = ½ QV
Energy dissipated in resistor: E = ½ QV
The charging process has an efficiency of 50% regardless of the resistance.
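The claim that the result does not depend on R can be checked numerically. The sketch below assumes the exponential charging current I = (VS/R)e^(−t/RC), which is derived later in this chapter, and integrates I²R over time with a simple midpoint rule. The component values and the function name are illustrative assumptions.

```python
import math


def energy_dissipated_in_resistor(supply_voltage, resistance, capacitance,
                                  steps=20_000):
    """Midpoint-rule integral of I^2 R dt over 20 time constants of charging."""
    tau = resistance * capacitance
    dt = 20.0 * tau / steps
    total = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        current = (supply_voltage / resistance) * math.exp(-t_mid / tau)
        total += current ** 2 * resistance * dt
    return total


# Assumed values: 9 V supply, 470 microfarad capacitor.
supply_voltage = 9.0
capacitance = 470e-6
energy_stored = 0.5 * capacitance * supply_voltage ** 2

for resistance in [100.0, 10_000.0]:
    heat = energy_dissipated_in_resistor(supply_voltage, resistance, capacitance)
    print(f"R = {resistance:7.0f} ohm: heat = {heat:.5f} J, stored = {energy_stored:.5f} J")
```

A larger resistance makes the charging slower but, in this model, does not change the energy dissipated.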

19.2 THE PARALLEL PLATE CAPACITOR

A simple capacitor can be constructed from two metal plates of area A separated
by a distance d in air. A uniform electric field (ignoring edge effects) of
strength E is set up between the plates when a pd V is placed across them.
[Figure: parallel plates carrying charges +Q and −Q, separated by a distance d, with pd V across them.]
The electric field strength is related to the charge density by the equation
(see Section 17.4.4):
\[ E = \frac{\sigma}{\varepsilon_0} = \frac{Q}{A\varepsilon_0} \]
where σ is the charge density on one plate.
The electric field strength is also equal to the (negative) potential gradient:
\[ E = -\frac{V}{d} \]
Equating the two expressions for E (ignoring signs):
\[ \frac{Q}{A\varepsilon_0} = \frac{V}{d} \]
The capacitance of a parallel plate air capacitor is therefore:
\[ C = \frac{Q}{V} = \frac{\varepsilon_0 A}{d} \]
And if the air gap is replaced by a dielectric material of relative dielectric
constant εr the capacitance is given by:
\[ C = \frac{\varepsilon_0 \varepsilon_r A}{d} \]
To create large capacitance, we need a high dielectric constant, large area,
and small separation.
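To give a sense of scale, here is a minimal Python sketch of the parallel plate formula. The plate dimensions and the relative dielectric constant are assumed values for illustration, and ε0 is taken as 8.854 × 10⁻¹² F m⁻¹.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space in F/m


def parallel_plate_capacitance(area, separation, relative_permittivity=1.0):
    """C = epsilon_0 * epsilon_r * A / d, with area in m^2 and separation in m."""
    return EPSILON_0 * relative_permittivity * area / separation


# Assumed geometry: 10 cm x 10 cm plates, 1.0 mm apart.
area = 0.10 * 0.10
separation = 1.0e-3

c_air = parallel_plate_capacitance(area, separation)
# Assumed dielectric with relative dielectric constant 5.
c_dielectric = parallel_plate_capacitance(area, separation, relative_permittivity=5.0)

print(f"air gap:    {c_air * 1e12:.1f} pF")
print(f"dielectric: {c_dielectric * 1e12:.1f} pF")
```

Even with fairly large plates the capacitance is only tens of picofarads, which is why practical capacitors use thin dielectrics and large areas.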
19.3 CAPACITOR CHARGING AND DISCHARGING

In order to find a mathematical description for the variation of charge, current,
and voltage during charging and discharging we must start with a differential
equation for the rate of flow of charge onto or off the capacitor when it already
stores a charge q and has a potential difference V across its plates. It is simpler to set
this up for discharge than for charging so we will begin with the equations for
the discharge of a capacitor through a fixed resistor.
19.3.1 Equations for Capacitor Discharge

A capacitor of capacitance C is connected in parallel with a fixed resistor of
resistance R. The capacitor has an initial charge Q0 and an initial potential dif-
ference V0. When the switch S is closed the voltage across the capacitor and
the resistor are equal so current flows through the resistor and the capacitor
gradually discharges. At a time t after S is closed the charge has fallen to Q
and the voltage across the plates is V. At this instant the discharge current is:
\[ I = -\frac{dQ}{dt} = \frac{V}{R} \]
The negative sign indicates that the charge on the capacitor is falling.
However, V = Q/C so we can form a first-order differential equation that can
be solved by the separation of variables:
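\[ \frac{dQ}{dt} = -\frac{Q}{RC} \]
\[ \int_{Q_0}^{Q(t)} \frac{dQ}{Q} = -\int_{0}^{t} \frac{dt}{RC} \]
\[ \ln\frac{Q(t)}{Q_0} = -\frac{t}{RC} \]
\[ Q(t) = Q_0 e^{-t/RC} \]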
The charge on the capacitor decays exponentially. Since V = Q/C and
I = V/R = Q/RC, the current and voltage are both directly proportional to the
charge and decay at the same rate. The term \( e^{-t/RC} \) is equal to the fraction of
charge remaining after time t.
The graph below shows how the charge, current or pd across the capacitor
changes as it discharges:

[Graph: percentage of Q0, I0, or V0 (0 to 100) against time / s; an exponential decay with t = RC and t = 2RC marked on the time axis.]
The quantity RC is called the “time constant” for the circuit. The dimensions
of RC are the dimensions of time so in SI units it is a time in seconds. After
one time constant (t = RC) the fraction remaining is \( e^{-RC/RC} = e^{-1} \approx 0.37 \) so there
will be 37% of the initial charge remaining after one time constant. After n
time constants, the fraction remaining is \( e^{-n} \). This falls rapidly with n. After
three time constants, there is 0.050 of the original charge so the discharge is
95% complete. After five time constants 0.0067 of the original charge remains
so the discharge is more than 99% complete. As a rule of thumb discharging
(and charging) processes are considered complete when a time t = 5RC (five
time constants) has elapsed.
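The figures quoted above can be reproduced with a few lines of Python. The helper names and the example component values are assumptions made for illustration.

```python
import math


def fraction_remaining(n_time_constants):
    """Fraction of the initial charge left after n time constants: e^(-n)."""
    return math.exp(-n_time_constants)


def time_to_fall_to(fraction, resistance, capacitance):
    """Time for the charge to fall to a given fraction: t = -RC ln(fraction)."""
    return -resistance * capacitance * math.log(fraction)


for n in [1, 2, 3, 5]:
    print(f"after {n} time constant(s): {fraction_remaining(n):.4f} of Q0 remains")

# Assumed example: 1.0 kilohm and 1000 microfarad, so RC = 1.0 s.
half_time = time_to_fall_to(0.5, 1000.0, 1000e-6)
print(f"time to fall to half the initial charge: {half_time:.3f} s")
```

The last line illustrates that the time to halve the charge is RC ln 2, a little under 0.7 of a time constant.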
The time constant controls the rate of charging and discharging.
[Graph: percentage of Q0, I0, or V0 (0 to 100) against time for several discharge curves; the decay becomes slower with increasing time constant (RC).]
19.3.2 Equations for Capacitor Charging

The diagram below shows a circuit used to charge a capacitor.
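[Figure: charging circuit with a supply of voltage VS connected in series with
a resistor R (voltage VR) and a capacitor C (charges +Q and −Q, voltage VC),
carrying current I.]
We can use Kirchhoff’s second law to set up a differential equation for the
charge on the capacitor:
$$V_S = V_R + V_C$$
$$V_S = IR + \frac{Q}{C}$$
$$V_S = R\frac{dQ}{dt} + \frac{Q}{C}$$
$$\frac{dQ}{dt} = \frac{1}{RC}\left(CV_S - Q\right)$$
$$\frac{dQ}{dt} = \frac{1}{RC}\left(Q_F - Q\right)$$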
where QF = CVS is the final charge when the voltage across the capacitor is
equal to the supply voltage.
This is a first-order linear differential equation whose solution is:

$$Q = Q_F\left(1 - e^{-t/RC}\right)$$

The difference between Q and QF decays exponentially toward zero.
The voltage across the capacitor is directly proportional to the charge stored
(V = Q/C) so the voltage rises in a similar way to the charge:

$$V = V_S\left(1 - e^{-t/RC}\right)$$

Current is the rate of change of charge so the charging current decays expo-
nentially from an initial value I0 = VS/R.
$$I = I_0 e^{-t/RC}$$
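As an illustration of the three charging equations, the Python sketch below evaluates the charge, capacitor voltage, and current at a few times. The supply voltage and component values are assumptions picked to give RC = 2.0 s; they do not come from the text.

```python
import math

def charging_state(t, V_S, R, C):
    """Return (Q, V, I) at time t for a capacitor charging through R from V_S."""
    decay = math.exp(-t / (R * C))
    Q = C * V_S * (1.0 - decay)   # Q = Q_F (1 - e^{-t/RC}) with Q_F = C V_S
    V = V_S * (1.0 - decay)       # voltage across the capacitor
    I = (V_S / R) * decay         # current falls from I_0 = V_S / R
    return Q, V, I

V_S = 6.0      # supply voltage in volts (assumed value)
R = 2.0e3      # resistance in ohms (assumed value)
C = 1.0e-3     # capacitance in farads (assumed value), so RC = 2.0 s

for t in (0.0, 2.0, 4.0, 10.0):
    Q, V, I = charging_state(t, V_S, R, C)
    print(f"t = {t:4.1f} s: Q = {Q:.3e} C, V = {V:.3f} V, I = {I:.3e} A")
```

At every instant V + IR equals the supply voltage, which is Kirchhoff's second law for this loop.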
[Figure: percentage of final charge or voltage on the capacitor (0 to 100)
against time / s (0 to 12 s) for RC = 2.0 s.]
19.4 CAPACITORS IN SERIES AND PARALLEL

When capacitors are combined in series or in parallel we can calculate the
effective capacitance of the arrangement, that is, the value of the single capac-
itor that could replace them.
19.4.1 Capacitance of Capacitors in Series

When several capacitors are connected in series the wires connecting the
inside plates together are isolated from the external supply. If the plate at
one end of such a wire gains a positive charge, then the plate connected to the
other end must gain an equal negative charge. As a consequence, all of the
series capacitors must have a charge equal to that on the outermost plates that
are connected to the external supply.
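[Figure: three capacitors C1, C2 and C3 connected in series, each with charges
+Q and −Q on its plates and with voltages V1, V2 and V3 across them.]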
The total charge on the set of series capacitors is therefore Q and the voltage
across the set is equal to the sum of voltages across the individual capacitors.
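$$V = V_1 + V_2 + V_3$$
$$\frac{Q}{C_{\text{series}}} = \frac{Q}{C_1} + \frac{Q}{C_2} + \frac{Q}{C_3}$$
$$\frac{1}{C_{\text{series}}} = \frac{1}{C_1} + \frac{1}{C_2} + \frac{1}{C_3}$$
For n capacitors in series:
$$\frac{1}{C_{\text{series}}} = \sum_{i=1}^{i=n} \frac{1}{C_i}$$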
This has the same form as the equation for resistors in parallel but here it
applies when the capacitors are in series.
19.4.2 Capacitors in Parallel

When capacitors are connected in parallel, they all have the same voltage
across them and the total charge stored is the sum of the charges stored on
each capacitor.
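[Figure: three capacitors C1, C2 and C3 connected in parallel across a
voltage V.]
$$Q = Q_1 + Q_2 + Q_3$$
$$C_{\text{para}} V = C_1 V + C_2 V + C_3 V$$
$$C_{\text{para}} = C_1 + C_2 + C_3$$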
The total capacitance of several capacitors connected in parallel is the sum of
their individual capacitances. For n capacitors in parallel:
$$C_{\text{para}} = \sum_{i=1}^{i=n} C_i$$
This has the same form as the equation for resistors in series but here it applies
when the capacitors are in parallel.
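The two combination rules can be sketched in a few lines of Python. The function names and the three capacitor values below are assumptions used only to illustrate the formulas.

```python
def series_capacitance(capacitances):
    """Effective capacitance of capacitors in series: 1/C = sum of 1/C_i."""
    return 1.0 / sum(1.0 / c for c in capacitances)

def parallel_capacitance(capacitances):
    """Effective capacitance of capacitors in parallel: C = sum of C_i."""
    return sum(capacitances)

caps = [100e-6, 220e-6, 470e-6]   # farads (assumed example values)
print(f"series:   {series_capacitance(caps):.3e} F")
print(f"parallel: {parallel_capacitance(caps):.3e} F")
```

The series value is always smaller than the smallest capacitor in the chain, while the parallel value is larger than the largest.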
19.5 THE CAPACITANCE OF A CHARGED SPHERE

When a charge Q is placed on a spherical conductor of radius a it spreads
evenly over the surface. The potential of the surface is given by:
$$V = \frac{Q}{4\pi\varepsilon_0 a}$$
So the capacitance of the sphere with respect to the earth is:
$$C_{\text{sphere}} = \frac{Q}{V} = 4\pi\varepsilon_0 a$$
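To get a feel for the size of this capacitance, the Python sketch below evaluates the formula for a sphere of radius 0.10 m. The radius is an assumed example value, and the permittivity of free space is taken as approximately 8.854 × 10⁻¹² F m⁻¹.

```python
import math

EPSILON_0 = 8.854e-12   # permittivity of free space in F/m (approximate value)

def sphere_capacitance(a):
    """Capacitance in farads of an isolated conducting sphere of radius a (m)."""
    return 4.0 * math.pi * EPSILON_0 * a

a = 0.10   # radius in metres (assumed example value)
print(f"C = {sphere_capacitance(a):.3e} F")
```

The result is of the order of 10 pF, and it scales in direct proportion to the radius.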
19.6 EXERCISES
1. A 220 µF capacitor is charged from a 6 V DC supply.

(a) Calculate the charge stored on the capacitor.
(b) Calculate the energy stored on the capacitor.
(c) By what factors would the charge and energy stored on the capacitor
change if the supply voltage was doubled to 12 V?
2. A 470 µF capacitor is connected in series with a 220 Ω resistor and charged
from a 10 V DC supply.
(a) Calculate the time constant for this circuit.
(b) Roughly how long will it take to charge the capacitor?
(c) How would your answer to (b) change if the capacitor was charged
to 20 V? Explain your answer.
(d) Once the capacitor is charged up (to 10V) it is disconnected and
discharged through a different resistor. If the discharge current falls
to 50% of its original value in 10 s, what is the value of this resistor,
and what was the original value of the discharge current?
3. When S in the circuit below is closed the capacitor charges.
[Figure: circuit containing switch S, resistor R and capacitor C, with
ammeters A1 and A2 and voltmeters V1, V2 and V3.]
(a) Sketch a graph to show how the readings on ammeters A1 and A2
vary from the moment S is closed to the time that the capacitor is
almost fully charged.
(b) Sketch a graph with three lines on it to show how the readings on
voltmeters V1, V2, and V3 vary from the moment S is closed to the
time that the capacitor is almost fully charged.
4. The circuit below shows two capacitors that can be connected to a supply
and/or to each other by closing one or both of the two switches.
Initially, both switches are open and both capacitors are discharged.
S1 is now closed and C1 charges.
[Figure: circuit with an 8.0 V supply, switches S1 and S2, and capacitors
C1 = 50 µF and C2 = 100 µF.]
(a) Calculate the charge and energy stored on C1.

S1 is now opened and then S2 is closed connecting the two capacitors
together.
(b) Calculate the charge on each capacitor once they reach equilibrium.
(c) Calculate the total energy stored on both capacitors and compare
this to the value you obtained in (a). Account for the difference.
Now consider a different scenario. Both capacitors are discharged
and then:
S1 is now closed and C1 charges. S1 is kept closed and S2 is also closed
connecting the two capacitors together while still connected to the
power supply.
(d) Calculate the charge on each capacitor once they reach equilibrium.
(e) Calculate the total energy stored on both capacitors and compare
this to the value you obtained in (a). Account for the difference.
5. You are provided with three identical capacitors each of capacitance C.
Calculate and list the values of all the capacitances you can make using
one or more of these capacitors, stating which arrangement corresponds
to which value.
6. The diagram below shows a parallel plate air capacitor of area A and plate
separation d connected to a power supply and a resistor.
The plates are pulled apart by an external agent and their separation is
doubled.
Describe and explain what happens:
to the charge stored on the capacitor
to current flow in the circuit
to the energy stored on the capacitor
to the voltage across the capacitor.
7. A 470 µF capacitor is charged to 10 V and then discharged through a 2200
Ω resistor. Calculate the charge and voltage remaining on the capacitor
after 1.5 s.

CHAPTER
20
Magnetic Fields
20.0 THE MAGNETIC FIELD

If a flat card is placed on top of a bar magnet and then iron filings are scattered
on the card they form a distinct pattern as shown in the image.
Each filing is like a small rod of iron that lines up with an invisible force field.
If a magnetic compass is moved nearby it too lines up with this force field and
can be used to trace its pattern.
Michael Faraday explained these effects by saying that a bar magnet has two
poles, north and south, and that these create a magnetic field in the space
surrounding the magnet. The lines of magnetic field begin on north poles and
end on south poles; they cannot start or end in space (although they can form
closed loops). The direction of the field lines is the direction of the force
that would be exerted on a free north pole (even though free north poles do not
exist in nature). This is all very similar to the description of the electric
field and the link between electric field lines and electric charges.
[Figure: magnetic field lines around a bar magnet with poles N and S.]
The behavior of a compass needle can be explained by considering the forces
acting on its poles from the magnetic field. If the needle is not aligned with
the field, then two forces of equal magnitude act in opposite directions along
different parallel lines, creating a couple. The couple is zero when the needle
lines up with the field.
[Figure: compass needle in a magnetic field.]
The Earth has its own magnetic field. The north pole of a compass needle points
toward the Earth’s geographic North Pole. This implies that the magnetic pole
near the geographic North Pole is actually a south magnetic pole. Sometimes the
poles on permanent magnets are called “north-seeking” or “south-seeking” poles
so that the north-seeking pole of a compass always points toward magnetic north.
From a distance, the magnetic field in space created by the Earth has a similar
shape to that of a large bar magnet. However, the center of the Earth is very
hot and this high temperature would destroy any permanent magnetism (it is
above the Curie point for iron and nickel). The Earth’s field is generated by
electric currents.
20.1 PERMANENT MAGNETS

Some materials can be permanently magnetized – for example, iron, steel,
cobalt, or nickel. These are described as ferromagnetic materials, and perma-
nent magnets are constructed from these materials. If the material is easily
magnetized and easily loses its magnetization it is described as magnetically
soft. If it is difficult to magnetize and demagnetize it is described as magneti-
cally hard. Pure iron is a soft magnetic material and steel is a hard magnetic
material. Alloys made of neodymium, iron, and boron are used to make
the most powerful of all permanent magnets. These are called neodymium
magnets.
The atoms of ferromagnetic materials are themselves magnetic, with each
atom having its own dipole (like a tiny bar magnet). In an unmagnetized sam-
ple of the material the atomic dipoles are in random orientations, so there is no
net magnetic field. When the sample is magnetized, the atomic dipoles align
and their fields add together in small regions called domains. The greater the
magnetization the larger the domains grow and the more they align with one
another to create a stronger field.
Magnetization can be achieved by placing a ferromagnetic sample in an exter-
nal field – for example, from another permanent magnet or from an electro-
magnet. Molten rocks from volcanic eruptions often contain ferromagnetic
minerals. As they solidify, they become weakly magnetized in the direction
of the Earth’s own magnetic field. The alternating magnetization directions
of volcanic rocks on the ocean floor have provided strong evidence that the
Earth’s magnetic field has changed polarity many times in the past and is likely
to do so again in the future.
If a magnetized sample is heated, thermal vibrations can cause the atomic
dipoles to change their orientation and destroy the alignment. There is a criti-
cal temperature for each ferromagnetic material above which it loses its per-
manent magnetic properties. This is called the Curie temperature or Curie
point. The Curie temperature for several materials is shown below:
Iron 1043 K
Cobalt 1400 K
Nickel 637 K
Neodymium magnets ∼ 590–640 K

20.2 MAGNETIC FORCES ON ELECTRIC CURRENTS AND MOVING CHARGES

In 1820, a Danish physicist, Hans Christian Oersted, demonstrated that mag-
netic fields are created by moving electric charges (electric currents). He did
this by holding a magnetic compass close to a wire and switching on an elec-
tric current. The compass needle deflected from the Earth’s field – a magnetic
field had been created by the electric current. Further investigation shows
that the magnetic field forms concentric rings around the current, getting
weaker with distance. The magnetic field of a long straight current-carrying
wire is shown below:
The direction of the magnetic field can be predicted using a “right-hand grip
rule.” If you make a “thumbs up” with your right hand and align your thumb
with the conventional current direction, then your fingers are curling in the
direction of the magnetic field lines.
Electric current is a flow of charge so the source of the magnetic field is the
moving charges inside the wire. A beam of charged particles moving through
a vacuum would create a similar magnetic field pattern.
It turns out that ALL magnetic fields originate from moving charges. Even
the magnetic fields of permanent magnets originate from the movements of
electrons inside the atoms of the material.
20.2.1 The Magnetic Force on an Electric Current

Electric currents create magnetic fields and they also experience forces from
them. This can be demonstrated using the apparatus below:
[Figure: apparatus for measuring the magnetic force on a current-carrying wire
of length l. A circuit containing an ammeter controls the current I in the
wire, which passes between the N and S poles of a magnet resting on a top pan
balance.]
In the diagram above, the X on the end of the wire represents a current into
the page.
A permanent magnet with flat pole pieces is placed on a top pan balance. The
magnet creates a uniform horizontal magnetic field (directed from N to S). A
separate circuit is used to control the current through a wire that passes hori-
zontally between the poles perpendicular to the magnetic field lines. As the
current is increased from zero the reading on the top pan balance decreases,
showing that the magnet is being pulled up and the wire (by Newton’s third
law) is being pulled down. The magnitude of the force increases as the current
is increased. Experiments like this can show that the magnetic force F:
• is directly proportional to the current I
• is directly proportional to the length of wire in the field l
• is perpendicular to the current and the magnetic field
$$F \propto Il$$
The constant of proportionality depends on the strength B of the magnetic
field so we can use this effect to define the magnetic field strength:
$$B = \frac{F}{Il}$$
The SI unit of magnetic field strength is therefore the N A⁻¹ m⁻¹. This is
called the tesla (T): 1 T = 1 N A⁻¹ m⁻¹.

The magnetic force on a current-carrying conductor is often called the “motor
effect” because this is used as the driving force in an electric motor. The
direction of the force can be predicted using Fleming’s left-hand rule:
[Figure: Fleming’s left-hand rule. Thumb: ‘thrust’ or force; first finger:
field; second finger: current.]
In the experiment above, the current is perpendicular to the magnetic field.
If the angle between the magnetic field and the current is varied it is found
that the magnetic force only acts on the component of current perpendicular
to the field, so the more general equation for the magnetic force on a
current-carrying conductor is:
$$F = BIl\sin\theta$$
where θ is the angle between the magnetic field and the current.
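A minimal Python sketch of this equation is given below. The field strength, current, and wire length are assumed example values, and the function name is hypothetical.

```python
import math

def force_on_wire(B, I, l, theta_deg=90.0):
    """Magnetic force (N) on a wire: F = B I l sin(theta), theta in degrees."""
    return B * I * l * math.sin(math.radians(theta_deg))

B = 0.20    # magnetic field strength in tesla (assumed value)
I = 3.0     # current in amperes (assumed value)
l = 0.050   # length of wire in the field in metres (assumed value)

for theta in (0.0, 30.0, 60.0, 90.0):
    print(f"theta = {theta:4.1f} deg: F = {force_on_wire(B, I, l, theta):.4f} N")
```

The force is zero when the current is parallel to the field and reaches its maximum value BIl when the two are perpendicular.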
20.2.2 The Force on a Moving Charge

The force on a current-carrying wire is really the sum of forces on the individ-
ual moving charges inside the wire. We can derive an expression for the force
on a moving charge by using the microscopic equation for electric current.
[Figure: magnetic forces f on individual charges q moving with velocity v
along a conductor of length l and cross-sectional area A that carries a
conventional current I, in a magnetic field of strength B directed into the
page.]

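The total magnetic force on a length l of a conductor carrying current I at
right angles to a uniform magnetic field of strength B is:
$$F = BIl$$
The microscopic equation for the current is $I = nqAv$, where n is the number
of charge carriers per unit volume, q is the charge on each carrier, A is the
cross-sectional area of the conductor and v is the drift velocity of the
carriers, so:
$$F = BnqAvl$$
This force arises from the individual forces on N charge carriers where:
$$N = nAl$$
so the magnetic force on each charge carrier is:
$$f = \frac{F}{N} = Bqv$$
If the charge is moving in a direction at an angle θ to the magnetic field
direction the magnetic force on a moving charge is:
$$f = Bqv\sin\theta$$
This can be written using a vector cross product: f = qv∧B.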

The direction of the magnetic force is given by Fleming’s left-hand rule but
bear in mind that current direction refers to conventional current, so if the
moving particle is negatively charged the current is in the opposite direction
to the motion.
Note that magnetic fields only affect moving charges. If v = 0 there is no mag-
netic force on the charge.
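The cross product form f = qv∧B can be evaluated component by component. The Python sketch below does this for an electron moving along the x-axis in a field along the z-axis. The velocity and field values are assumptions, and the electron charge is taken as approximately −1.602 × 10⁻¹⁹ C.

```python
def cross(a, b):
    """Cross product of two 3-component vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def magnetic_force(q, v, B):
    """Magnetic force vector f = q v x B, in newtons."""
    return tuple(q * component for component in cross(v, B))

q_electron = -1.602e-19      # charge in coulombs (approximate value)
v = (1.0e6, 0.0, 0.0)        # velocity in m/s along +x (assumed value)
B = (0.0, 0.0, 0.010)        # field in tesla along +z (assumed value)

print(magnetic_force(q_electron, v, B))
```

A positive charge with the same velocity would be pushed in the opposite direction, which matches the remark above about conventional current for negatively charged particles.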
20.2.3 The Path of a Moving Charged Particle in a Magnetic Field

The magnetic force is always perpendicular to the velocity therefore it:
• cannot do any work on the charged particle so it does not change the
speed of the particle.
• acts as a centripetal force, changing the direction of motion of the charged
particle.
Consider a particle with mass m and positive charge q moving into a region
of uniform magnetic field of strength B with its velocity perpendicular to the
magnetic field:
[Figure: a positive charge q with velocity v follows a circular arc of radius
r under the magnetic force f in a magnetic field of strength B directed into
the page.]
The magnetic field provides a constant centripetal force so the charged parti-
cle moves in an arc of a circle of radius r:
$$f = Bqv = \frac{mv^2}{r}$$
$$r = \frac{mv}{Bq}$$
When particles are created in particle accelerators like the LHC at CERN
particle physicists use strong magnetic fields to deflect them so that they can
determine their momentum and mass.
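The radius equation r = mv/Bq can be illustrated with a short Python sketch. The proton mass and charge below are approximate values, and the speed and field strength are assumptions chosen for the example.

```python
def orbit_radius(m, v, B, q):
    """Radius (m) of the circular path of a charge q moving at speed v
    perpendicular to a uniform field B: r = m v / (B |q|)."""
    return m * v / (B * abs(q))

m_proton = 1.673e-27   # mass in kg (approximate value)
q_proton = 1.602e-19   # charge in C (approximate value)
v = 1.0e6              # speed in m/s (assumed value)
B = 0.50               # field strength in T (assumed value)

r = orbit_radius(m_proton, v, B, q_proton)
print(f"r = {r:.4f} m")
```

Since the radius is proportional to the momentum mv, measuring the curvature of a track in a known field gives the momentum of the particle when its charge is known.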
Lorentz Force Law

The total force on a charged particle in the electromagnetic field has two
parts, the force from the electric field and the force from the magnetic field.
This is expressed by the Lorentz force law:
f = felectric + fmagnetic
f = qE + qv∧B
Note that the electric force is parallel to the electric field whereas the mag-
netic force is perpendicular to the magnetic field (in a direction given by
Fleming’s left-hand rule).
20.2.4 The Velocity-Selector: Crossed Electric and Magnetic Fields

Moving charged particles can be deflected by both electric and magnetic
fields. If the fields are perpendicular to one another and to the velocity of
the incoming beam of charged particles, then the electric and magnetic forces
act along the same line and can be made to oppose one another. By adjusting the
values of the fields the two forces can be made to cancel out so that the beam
is undeflected.
For the beam to be undeflected:
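$$f_{\text{magnetic}} = f_{\text{electric}}$$
$$Bqv = Eq$$
$$v = \frac{E}{B}$$
[Figure: charges q passing undeflected through a uniform electric field of
strength E crossed with a magnetic field of strength B directed into the page,
with the forces f_magnetic and f_electric acting in opposite directions.]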
For any ratio of electric field strength to magnetic field strength there is just
one velocity that will be undeflected. This arrangement is called a velocity
selector. If a stream of charged particles with a range of incident velocities
enters the region of crossed fields then only those satisfying the equation
above go straight through.
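The selection condition can be sketched in Python as follows. The field values are assumptions for illustration, and the sign convention (electric force taken as positive, magnetic force opposing it) is a choice made for this sketch.

```python
def selected_speed(E, B):
    """Speed (m/s) that passes undeflected through crossed fields: v = E / B."""
    return E / B

def net_force(q, v, E, B):
    """Net force (N) along the line of the electric force, with the magnetic
    force assumed to act in the opposite direction: f = qE - Bqv."""
    return q * E - B * q * v

E = 2.0e4         # electric field strength in V/m (assumed value)
B = 0.10          # magnetic field strength in T (assumed value)
q = 1.602e-19     # charge in C (approximate value for a singly charged ion)

v0 = selected_speed(E, B)
for v in (0.5 * v0, v0, 1.5 * v0):
    print(f"v = {v:.2e} m/s: net force = {net_force(q, v, E, B):+.2e} N")
```

Slower particles are deflected in the direction of the electric force and faster particles in the direction of the magnetic force, so only those with v = E/B continue in a straight line.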
Mass spectrometer
A velocity selector is used to send ion beams with a particular velocity into a
mass spectrometer. The beams are then deflected in a constant magnetic field
so that they move in a semi-circular path. The radius of curvature of this path
can be used to determine the masses of the ions. It is also possible to measure
the amount of each type of ion that is present in the beam, so mass spectrom-
eters are ideal for analyzing ionic ratios in samples.
A simplified diagram to illustrate the principle of the mass spectrometer is
shown below. The entire apparatus is evacuated.
[Figure: mass spectrometer. A source emits ions with a range of velocities
into a region of crossed fields E and B1, where faster and slower ions are
deflected. Only ions with velocity v = E/B1 enter the region of uniform
magnetic field B2, where they follow a path of radius r. The radius of
curvature in the uniform magnetic field is proportional to the mass of the
ion.]
The mass of the ion can be calculated using:

$$m = \frac{B_2 q r}{v} = \frac{B_1 B_2 q r}{E}$$
The greater the radius of curvature the greater the mass of the ion.
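As a rough numerical check of the mass equation, the Python sketch below computes the radius expected for an ion of known mass and then recovers the mass from that radius. All field values and the ion mass are assumed example values.

```python
import math

def ion_mass(B1, B2, q, r, E):
    """Ion mass (kg) from the radius r of its path: m = B1 B2 q r / E."""
    return B1 * B2 * q * r / E

E = 5.0e4          # electric field in the velocity selector, V/m (assumed)
B1 = 0.25          # magnetic field in the velocity selector, T (assumed)
B2 = 0.40          # magnetic field in the deflection region, T (assumed)
q = 1.602e-19      # ion charge in C (approximate value, singly charged)
m_true = 3.3e-26   # ion mass in kg (assumed example value)

v = E / B1                  # speed selected by the crossed fields
r = m_true * v / (B2 * q)   # radius of the semi-circular path
print(f"r = {r:.4f} m, recovered mass = {ion_mass(B1, B2, q, r, E):.3e} kg")
```

Two ions with the same charge but different masses leave the selector at the same speed, so their radii differ in direct proportion to their masses.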
20.3 THE MAGNETIC FIELDS CREATED BY ELECTRIC CURRENTS

20.3.1 The Biot–Savart Law
The magnetic field created by a small element of an electric current is given
by the Biot–Savart law. However, small current elements do not exist in isola-
tion so we must use integration to find the magnetic field at a point in space
created by an electric current in a wire. This is analogous to the way we use
Coulomb’s law to find the resultant electric field at a point in space by adding
up contributions from all the charges present.
The diagram below shows the contribution to the magnetic field strength δB
at a point P from a small current element of length δl.
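[Figure: a current element of length δl carrying current I, at a distance x
from the point P, with angle θ between the current element and the line
joining it to P.]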
The direction of δB is found using the right-hand rule and using the continu-
ation of the current element as the current direction. The magnitude of δB is
given by the equation:
$$\delta B = \frac{\mu_0 I \sin\theta}{4\pi x^2}\,\delta l$$
where µ0 is the permeability of free space, a constant that determines the
ability of a vacuum to support a magnetic field.
20.3.2 The Magnetic Field at the Center of a Narrow Coil

This is the simplest example of the use of the Biot–Savart law because the
angle θ is the same (and equal to π/2) for all current elements around the coil.
The diagram on the right has the axis of the coil into the page. The resultant
magnetic field is also into the page, shown by the cross at the center:
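[Figure: circular coil of radius r carrying current I, showing a current
element δl and the magnetic field B at the center directed into the page.]
$$\delta B = \frac{\mu_0 I \sin(\pi/2)}{4\pi r^2}\,\delta l$$
$$B = \int_{l=0}^{l=2\pi r} \left\{\frac{\mu_0 I \sin(\pi/2)}{4\pi r^2}\right\} dl$$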
All the terms in the curly bracket are constant so this is a very simple integral
resulting in:
$$B = \frac{\mu_0 I}{2r}$$
If the coil has N turns the magnetic fields add together:
$$B = \frac{\mu_0 N I}{2r}$$
An expression for the field strength at other points along the axis of the nar-
row coil can also be determined using the Biot–Savart law. The integration is
a little more complicated but the result is:
$$B = \frac{\mu_0 I r^2}{2\left(z^2 + r^2\right)^{3/2}}$$
where z is the distance along the axis from the center of the coil. When z = 0
the expression reduces to the previous equation (as it should).
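The on-axis result can be checked by adding up Biot–Savart contributions numerically. The Python sketch below splits a single-turn coil into short straight elements and sums the axial component of the field from each, using the vector form of the law (δB proportional to δl ∧ s / s³, where s is the vector from the element to the point). The vector form, the function names, and the example values are assumptions of this sketch, and µ0 is taken as approximately 4π × 10⁻⁷ T m A⁻¹.

```python
import math

MU_0 = 4.0e-7 * math.pi   # permeability of free space, T m / A (approximate)

def loop_axis_field(I, r, z):
    """Formula from the text: B = mu0 I r^2 / (2 (z^2 + r^2)^(3/2))."""
    return MU_0 * I * r**2 / (2.0 * (z**2 + r**2) ** 1.5)

def loop_axis_field_numeric(I, r, z, n=2000):
    """Sum the axial Biot-Savart contributions from n short current elements."""
    Bz = 0.0
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        # Position of the element on the coil (coil lies in the x-y plane)
        px, py = r * math.cos(phi), r * math.sin(phi)
        # Element vector dl, tangent to the coil
        dlx, dly = -r * math.sin(phi) * dphi, r * math.cos(phi) * dphi
        # Vector s from the element to the point (0, 0, z) on the axis
        sx, sy, sz = -px, -py, z
        s_cubed = (sx * sx + sy * sy + sz * sz) ** 1.5
        # z-component of dl x s, scaled by mu0 I / (4 pi s^3)
        Bz += MU_0 * I / (4.0 * math.pi) * (dlx * sy - dly * sx) / s_cubed
    return Bz

I, r, z = 2.0, 0.050, 0.030   # amperes, metres, metres (assumed values)
print(loop_axis_field(I, r, z), loop_axis_field_numeric(I, r, z))
```

The two numbers agree closely, and setting z = 0 reproduces the field at the center of the coil.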
20.3.3 The Magnetic Field of a Long Straight Current-Carrying Wire

For an infinitely long straight wire, all of the current elements lie along the
same line.
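[Figure: a long straight wire carrying current I. The point P is a
perpendicular distance r from the wire and a distance x from a current element
δl, which lies a distance l along the wire; the field B is shown at P.]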
The resultant magnetic field forms concentric rings around the line of the
electric current:
$$\delta B = \frac{\mu_0 I \sin\theta}{4\pi x^2}\,\delta l$$
The contribution from current elements to the left and right of P are the same
so the resultant magnetic field strength at P is:
$$B = 2\int_{l=0}^{l=\infty} \frac{\mu_0 I \sin\theta}{4\pi x^2}\,dl$$
The three variables θ, x, and l are all related so in order to carry out the inte-
gration we need to express two of them in terms of just one of them and the
constant distance r:
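$$x = \frac{r}{\sin\theta}$$
$$l = \frac{r}{\tan\theta} \quad\text{so}\quad \frac{dl}{d\theta} = -\frac{r}{\sin^2\theta}$$
The integral now becomes:
$$B = -\frac{\mu_0 I}{2\pi r}\int_{\pi/2}^{0} \sin\theta\,d\theta$$
$$B = \frac{\mu_0 I}{2\pi r}$$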
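The same integral can be approximated numerically by summing the contributions from short elements of a long but finite wire. The Python sketch below uses a midpoint sum over one half of the wire and doubles it, as in the derivation above. The wire length, current, and distance are assumed values, and µ0 is taken as approximately 4π × 10⁻⁷ T m A⁻¹.

```python
import math

MU_0 = 4.0e-7 * math.pi   # permeability of free space, T m / A (approximate)

def wire_field(I, r):
    """Result from the text for an infinitely long wire: B = mu0 I / (2 pi r)."""
    return MU_0 * I / (2.0 * math.pi * r)

def wire_field_numeric(I, r, half_length, n=100000):
    """Midpoint-rule sum of mu0 I sin(theta) dl / (4 pi x^2) over one half of a
    wire of total length 2 * half_length, doubled by symmetry."""
    dl = half_length / n
    total = 0.0
    for k in range(n):
        l = (k + 0.5) * dl        # distance of the element along the wire
        x = math.hypot(r, l)      # distance from the element to the point P
        sin_theta = r / x
        total += MU_0 * I * sin_theta / (4.0 * math.pi * x * x) * dl
    return 2.0 * total

I, r = 5.0, 0.020   # amperes and metres (assumed values)
print(wire_field(I, r), wire_field_numeric(I, r, half_length=4.0))
```

With the wire extending 200 times further than the distance r in each direction, the finite sum already lies very close to the infinite-wire result.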
20.3.4 The Magnetic Field Along the Axis of a Solenoid

A solenoid is a long coil so the magnetic field along its axis can be found by
integrating the contributions from narrow coils along the length of the sole-
noid. The result is that the field strength at the center of a long solenoid is:
$$B = \frac{\mu_0 N I}{l}$$
where N is the number of turns, I is the current in the solenoid and l is the
length of the solenoid.
The field strength drops to half of this value at each end and is small or negli-
gible outside the solenoid (except near the ends).
The external field pattern is similar to that of a dipole bar magnet, one end of
the solenoid acts like a north pole, and the other acts like a south pole.
[Figure: field strength along the axis of a solenoid carrying current I: B at
the center, falling to B/2 at each end.]
An electromagnet consists of a long coil wound around a soft iron core. The
magnetization of the core increases the magnetic field strength and the field
at the center is then given by:
$$B = \frac{\mu_0 \mu_r N I}{l}$$
where µr is the relative permeability of the ferromagnetic core. Some typical
values are shown below but magnetic permeability depends strongly on the
magnetic field strength so these are representative values only.
Iron 5000
Ferrite 600
Nickel 300
Carbon steel 100
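The effect of a core on the field at the center of a long solenoid can be sketched in Python. The number of turns, current, and length are assumed example values, the relative permeability of 5000 is the representative figure for iron from the table, and µ0 is taken as approximately 4π × 10⁻⁷ T m A⁻¹.

```python
import math

MU_0 = 4.0e-7 * math.pi   # permeability of free space, T m / A (approximate)

def solenoid_field(N, I, length, mu_r=1.0):
    """Field (T) at the center of a long solenoid: B = mu0 mu_r N I / l."""
    return MU_0 * mu_r * N * I / length

N = 500          # number of turns (assumed value)
I = 0.010        # current in amperes (assumed value)
length = 0.25    # length of the solenoid in metres (assumed value)

print(f"air core:  B = {solenoid_field(N, I, length):.3e} T")
print(f"iron core: B = {solenoid_field(N, I, length, mu_r=5000):.3e} T")
```

Because the permeability depends strongly on the field strength, the iron-core figure should be read as an order-of-magnitude estimate only.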

20.3.5 Ampère’s Theorem

Ampère’s theorem or “circuit law” states that, for any closed loop, the integral
of the length elements multiplied by the component of magnetic field paral-
lel to each element is proportional to the current enclosed by the path. This
is really a consequence of the Biot–Savart law but can be useful in situations
with simple geometry.

$\mathbf{B}\cdot d\mathbf{l}$ is the component of B in the direction of dl
multiplied by dl (i.e., the scalar product of B and dl).
The constant of proportionality is the permeability µ0.
$$\oint_{\text{closed loop}} \mathbf{B}\cdot d\mathbf{l} = \mu_0 I_{\text{enclosed by loop}}$$
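Ampère’s theorem can be used to provide a simple derivation of the magnetic
field strength a distance r from a long straight wire carrying a current I.
By symmetry, the magnetic field strength B is constant around a circular path
of radius r centered on the wire, and the field lines form concentric rings
with the current at their center.
[Figure: circular path of radius r, made up of length elements δl, around a
current I directed into the page.]
$$\oint_{\text{closed loop}} \mathbf{B}\cdot d\mathbf{l} = 2\pi r B$$
$$\mu_0 I_{\text{enclosed by loop}} = \mu_0 I$$
$$B = \frac{\mu_0 I}{2\pi r}$$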
20.4 ELECTRIC MOTORS

The “motor effect,” where a force is exerted on a current-carrying conductor,
is used in electric motors. This provides a way to transfer electrical energy to
mechanical energy.
20.4.1 The Turning Effect on a Coil in a Uniform Magnetic Field

A simple electric motor consists of a current-carrying coil in a magnetic field.
Consider a rectangular coil of sides a and b, carrying a current I and lying in
the plane of a uniform magnetic field of strength B.
[Figure: rectangular coil ABCD with sides a and b carrying current I, lying
between a north pole and a south pole.]
Sides AB and CD are perpendicular to the magnetic field so magnetic forces
act on these wires. On AB the force is into the page and on CD it is out of the
page.
The magnitude of each force is F = BIb. These two forces create a turning
effect about the central vertical axis (dotted line). The resultant couple or
torque is:
$$\text{torque} = 2 \times BIb \times \frac{a}{2} = BIab = BIA$$
where A = ab is the area of the coil. If the coil has N turns the torque is:
$$\text{torque} = NBIA$$
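The torque can be computed either from the two forces or directly from NBIA, and the Python sketch below confirms that both routes agree. The coil dimensions, number of turns, current, and field strength are assumed example values.

```python
def coil_torque(N, B, I, a, b):
    """Torque (N m) on an N-turn rectangular coil lying in the plane of the
    field: torque = N B I A with A = a b."""
    return N * B * I * a * b

def coil_torque_from_forces(N, B, I, a, b):
    """Same torque built from the force on each vertical side, F = N B I b,
    acting at a perpendicular distance a / 2 from the axis."""
    force_on_side = N * B * I * b
    return 2.0 * force_on_side * (a / 2.0)

N, B, I = 50, 0.20, 1.5   # turns, tesla, amperes (assumed values)
a, b = 0.040, 0.060       # side lengths in metres (assumed values)

print(coil_torque(N, B, I, a, b), coil_torque_from_forces(N, B, I, a, b))
```

This is the torque when the plane of the coil lies along the field, which is the position considered above.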
20.4.2 A Simple DC Electric Motor

To make a working motor using a coil in a uniform magnetic field, the turning
effect on the coil must always be in the same direction. For this to be the case
the direction of current in the coil must be reversed every half rotation. The
reason for this is clear from the diagrams below, which show the end view of
a motor coil as it rotates in the field. The end of the coil is shown in three dif-
ferent positions with side AB on the left and CD on the right. For the coil to
continue to rotate in the same direction when AB moves past the vertical dot-
ted line, the current direction must reverse. The thick black arrows represent
the magnetic forces on the wires as the coil turns.
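[Figure: end view of the motor coil between the north pole and the south pole
in three positions, with side AB on the left and CD on the right. The current
reverses as the coil passes the vertical dotted line.]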
The device used to reverse the current is called a split-ring commutator. For
a simple motor with a single coil, the commutator consists of a conducting
cylinder split in half so that the halves are separated by an insulator. The com-
mutator is attached to the axis of the motor and rotates with it. Current enters
and leaves the coil via brushes that make sliding contact with the surface of
the commutator.
[Figure: split-ring commutator. The coil rotates with the commutator about the
rotation axis. Brushes make contact with the two conducting halves of the
commutator, which are separated by an insulator.]
The commutator acts as a rotating switch, reversing the current in the coil
every half rotation so that the turning effect on the coil always acts in the
same direction.
20.5 EXERCISES
1. The Earth’s magnetic field is very similar to the field of a dipole bar mag-
net. However, geophysicists are sure that the field is not caused by a per-
manent magnet inside the Earth. Explain why not.
2. Two long straight parallel wires separated by 0.045 m each carry a current
of 2.0 A in the same direction.
(a) Draw a diagram showing how the magnetic field around one of the wires
interacts with the other wire, and use this diagram to explain why the
wires exert a force on one another. State the direction of this force.
(b) Calculate the magnitude of the force per unit length on each wire.
(c) Explain why coils of wire carrying a very large current must be able
to withstand large stresses.
3. A mass spectrometer is used to measure the masses and abundances of
different isotopes. It does so by accelerating ions of each isotope to the
same speed and then deflecting them into a semi-circular path in a strong
uniform magnetic field.
(a) Explain how a velocity-selector, consisting of perpendicular electric
and magnetic fields can be used to select ions of the same speed
from a group containing a wide range of different speeds.
(b) Ions of mass m1 and m2 and equal charge q enter the same uniform
magnetic field of strength B at the same velocity v at right angles to
the field lines and both are detected after moving through a complete
semi-circle in the field. Derive an equation for their separation.
4. The diagram below shows an end view of a rectangular coil in a uniform
magnetic field of strength 0.05 T. The dot represents current out of the
page and the cross represents current into the page.
(a) Use a diagram to explain why there is a resultant moment on the coil
and state the direction of this moment.
(b) Describe qualitatively how the moment on the coil changes as θ var-
ies from 0 to 90°.
The coil has 80 turns and an area of 0.012 m². It carries a constant
current of 0.65 A.
(c) Calculate the moment on the coil when θ = 0°, 30°, 45°, 60°, and 90°.
(d) Explain how the coil would move if it was released from a horizontal
position and allowed to move freely about a central axis directed into
the page.
(e) Explain what has to be done to the current in the coil if it is to oper-
ate as a DC motor.
CHAPTER
21
Electromagnetic Induction
21.1 INDUCED EMFS

21.1.1 What Is Electromagnetic Induction?
An electric motor transfers electrical energy into mechanical work. This is
a reversible process. If mechanical work is used to turn a motor, an emf is
generated across its terminals and if these are connected to an external load
electrical energy is generated. Motors and generators/dynamos are like mirror
images of one another.
[Figure: a motor transfers electrical energy to mechanical energy; a generator
transfers mechanical energy to electrical energy.]
The underlying physics is electromagnetic induction – when a conducting wire
cuts through a magnetic field or when the magnetic field passing through a coil
changes, there is an induced emf in the conductor or coil. Electromagnetic
induction was discovered by Michael Faraday in 1831.

21.1.2 Electromagnetic Induction Experiments

The “motor effect” is the force exerted on a current-carrying wire when it is
perpendicular to a magnetic field. The force acts to push the wire in a direc-
tion perpendicular to the field and current. The inverse process is to use an
external force to push the wire so that it cuts perpendicularly across the lines
of the magnetic field. This results in an induced emf in the wire.
Here is a simple experiment that can be used to investigate electromagnetic
induction effects when a wire cuts across the lines of a magnetic field.
[Figure: a wire connected to a galvanometer G is pushed into the page to cross
the magnetic field lines between the poles N and S. The galvanometer deflects
when the wire moves and reads zero when the wire is stationary in or out of
the field.]
Several simple observations can be made:
• The deflection is greater if the wire is moved faster.
• There is no induction if the wire is stationary in or out of the field.
• The sign of the deflection changes if the direction of motion changes.
• If the wire is moved parallel to the field lines, there is no deflection.
We can explain all of these effects in the following way:
• When the wire cuts the lines of the magnetic field there is an induced emf
across the ends of the wire.
• The magnitude of the emf is directly proportional to the rate of cutting
field lines.
• The sign of the emf depends on the direction in which the field lines are
cut.
Here is an experiment that can be used to investigate electromagnetic induc-
tion effects when the magnetic field through a coil is changed.
[Figure: a bar magnet (poles N and S) is moved along the dotted line through a
coil connected to a galvanometer G.]
• The deflection is greater if the magnet is moved faster.
• There is no induction if the magnet is stationary in or out of the coil.
• The sign of the deflection changes if the direction of motion changes.
• The effects are exactly the same if the coil is moved and the magnet is
stationary.

• Moving the magnet toward or away from the coil changes the amount of
magnetic field passing through the coil.
• When the magnetic field in the coil changes there is an induced emf
across the ends of the coil.
• The magnitude of the emf is directly proportional to the rate of change of
the magnetic field through the coil.
• The sign of the emf depends on whether the magnetic field through the
coil is increasing or decreasing.
A third experiment shows that electromagnetic induction does not need relative
motion: a changing magnetic field from an electromagnet can be used
instead. This experiment is similar to Faraday’s original experiments in the
19th century.
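[Figure: a primary coil and a secondary coil wound on a soft iron core. The primary coil is connected to a DC supply through a switch S and the secondary coil is connected to a galvanometer.]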
When S is closed there is a momentary deflection of the galvanometer and
then it returns to zero.
When S is opened there is a momentary deflection of the galvanometer
and then it returns to zero.
If the supply voltage is increased the deflections are larger.
If the experiment is repeated without the iron core the effects are similar
but MUCH weaker.
When current flows in the primary coil it becomes an electromagnet.
The iron core increases the strength of the field and links the two coils
magnetically.
When the magnetic field in the secondary coil changes there is an induced
emf across the ends of the coil.
The magnitude of the emf is directly proportional to the rate of change of
the magnetic field through the secondary coil.
The sign of the emf depends on whether the magnetic field through the
secondary coil is increasing or decreasing.
If the magnetic field passing through the secondary coil is constant or zero
there is no induction.
It is clear that electromagnetic induction only induces an emf in the secondary
coil when the switch is opened or closed and the magnetic field through the
secondary is changing. Opening or closing the switch causes a rapid change
and results in a large induced emf. We can extend this experiment to show the
effect of a continuously changing magnetic field by replacing the DC supply
to the primary coil with an AC supply. The output from the secondary coil is
now an AC emf that can be displayed on an oscilloscope. If a dual beam oscil-
loscope is used the emf in the secondary can be compared with the current in
the primary (which is in phase with the magnetic field).
[Figure: a primary coil and a secondary coil on a soft iron core. The primary coil is connected to an A.C. supply, and connections from the two coils go to oscilloscope inputs 1 and 2.]
Note that the peaks of the induced emf occur at times when the rate of change
(gradient) of the magnetic field through the secondary coil is greatest and that
the induced emf is zero at times when the rate of change of the magnetic field
through the secondary coil is zero (gradient is zero).
21.2 THE LAWS OF ELECTROMAGNETIC INDUCTION

The experimental observations in the previous experiments can be explained
using one simple equation – this is usually referred to as “Faraday’s law”
but the mathematical formulation was actually first found by Neumann and
the sign of the emf was explained by Lenz! This equation involves two new
concepts – magnetic flux and magnetic flux-linkage.
21.2.1 Magnetic Flux and Magnetic Flux Linkage

So far, we have given a qualitative explanation of electromagnetic induction in
terms of changing the amount of magnetic field through a coil or changing the
rate at which the magnetic field lines are cut. To give a quantitative formula
for the induced emf we need to define what we mean by “amount of magnetic
field.” This will depend on three factors, the strength of the field, its orienta-
tion, and the area through which it passes.
Magnetic Flux
If a constant uniform magnetic field of strength B passes normally through a
surface of area A the flux Φ is defined as the product BA.
[Figure: a magnetic field of strength B passing perpendicularly through a surface of area A.]
$$\Phi = BA$$

In general, if the field makes an angle θ with the normal to a surface element
of area δA, that element contributes a flux element δΦ = B δA cos θ
to the total flux through the surface.
[Figure: a magnetic field of strength B passing through a surface element δA at an angle θ to its normal.]
The total flux is found by integrating these contributions across the surface:
$$\Phi = \int_{\text{surface}} B \cos\theta \, dA$$
where B might vary from point to point.

The SI unit for magnetic flux is the tesla-meter-squared (T m²), which is called
the weber (Wb).
1 Wb = 1 T m² or 1 T = 1 Wb m⁻²
Writing the tesla in this way emphasizes the fact that it is the magnetic flux
density.
Magnetic Flux-Linkage
When magnetic flux passes through a coil of N turns it links each turn in the
coil, so it is convenient to define the magnetic flux-linkage as the product of
the magnetic flux and the number of turns linked by that flux:
flux-linkage = NΦ
If the flux is caused by a constant uniform field of strength B along the axis of
the coil the flux linkage is simply:
flux-linkage = NBA
where A is the area of the coil. N is a dimensionless number so the SI unit for
flux-linkage is the weber (Wb).
21.2.2 Faraday’s Law of Electromagnetic Induction

Faraday realized that the induced emf is directly proportional to the rate at which
a conductor cuts the magnetic flux or the rate at which the magnetic flux-linkage
of a coil changes. This can be written mathematically in the following way:
$$E = -\frac{d(N\Phi)}{dt}$$
We will discuss the significance of the minus sign later.
The term $\frac{d(N\Phi)}{dt}$ can be interpreted as the rate of cutting flux or the rate of
change of flux-linkage in a coil, depending on the context. The equivalence
of these two descriptions can be seen by considering the emf induced in a
straight conductor moving at constant velocity perpendicular to a magnetic
field of strength B.
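[Figure: a straight conductor of length d moves at velocity v along two parallel conducting rails connected to a galvanometer G, in a region of magnetic field of strength B directed into the page. In time t the conductor moves a distance vt.]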
The conductor runs along two parallel conducting rails connected back to a
stationary galvanometer. As the conductor moves it cuts through lines of mag-
netic field. In time δt it cuts all the magnetic field lines in the darker shaded
area and the flux through the circuit increases by an amount equal to the flux
through that additional area.
Flux cut in time δt:
$$\delta\Phi = B d v \, \delta t$$
Rate of flux-cutting:
$$\frac{d\Phi}{dt} = Bvd$$
Using Faraday’s law, the induced emf in the moving conductor is:
$$E = (-)Bvd$$
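The result above can be checked numerically. The following is a minimal Python sketch, assuming illustrative values for B, v, and d that do not come from the text; the function name `motional_emf` is hypothetical. It compares the flux cut per unit time with the product Bvd.

```python
def motional_emf(B, v, d):
    """Magnitude of the emf (V) across a conductor of length d (m) moving at
    speed v (m/s) perpendicular to a uniform field of strength B (T)."""
    return B * v * d

# Assumed illustrative values (not from the text)
B = 0.20    # field strength, T
v = 1.5     # speed of the conductor, m/s
d = 0.10    # length of conductor between the rails, m
dt = 1e-3   # a short time interval, s

flux_cut = B * d * v * dt        # flux cut in time dt (delta-Phi = B d v delta-t)
emf_from_rate = flux_cut / dt    # rate of flux cutting

print(f"emf from rate of flux cutting: {emf_from_rate:.4f} V")
print(f"emf from Bvd:                  {motional_emf(B, v, d):.4f} V")
```

Both routes give the same magnitude, which is the equivalence between "rate of cutting flux" and "rate of change of flux through the circuit" described in the text.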
Lenz’s Law
The negative sign in the equation for Faraday’s law can be interpreted in the
following way:
The direction of the induced emf is such as to oppose the change that
caused it.
This sounds a little obscure but becomes clear when we apply it to the exam-
ple above. In order to induce an emf we needed to push the wire to the right.
The existence of an emf in the circuit causes a current to flow. However,
when a current flows in a conductor lying perpendicular to the magnetic
field there is a motor effect force on the conductor that is perpendicular to
both the current and the field. In this case, the force must lie either in the
direction of v or in the opposite direction. If it was in the direction of v then
once we had started the wire moving the induced emf would create a cur-
rent that experienced a force in the direction of motion and the wire would
accelerate with no need for us to do any further work on it. Energy would
be generated from nothing! This violates the law of conservation of energy
so cannot occur. The force on the induced current must oppose the force
moving the wire so that we do work to move it and “pay” for the electrical
energy it generates. Lenz’s law ensures that energy is conserved when an
emf is induced. It is the reason we need to burn fuel to turn the generators
in a power station.
21.2.3 Changing the Flux-Linkage in a Coil

Consider a coil of N turns and area A initially perpendicular to a uniform mag-
netic field of strength B that is rotated through 90° so that it finally lies parallel
to the field. The magnetic flux falls from a maximum value at the start to zero
in a time δt. What is the average induced emf during this rotation?
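[Figure: a coil of N turns and area A in a magnetic field B, rotated through 90° in time δt.]
change in flux-linkage: $\delta(N\Phi) = NBA$
rate of change in flux-linkage: $\frac{\delta(N\Phi)}{\delta t} = \frac{NBA}{\delta t}$
average induced emf: $E = (-)\frac{NBA}{\delta t}$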
The reason that this is an average value rather than a constant value is that the
rate of change of flux-linkage varies as the coil turns, being low at the begin-
ning, and a maximum near the end.
A more interesting situation is one where the flux-linkage through the coil
varies sinusoidally. This is the case in most electrical generators, where a coil
rotates in a magnetic field.
[Figure: a coil of N turns and area A rotating in a magnetic field.]
Flux-linkage: $N\Phi = NB_0 A \cos\omega t$
Induced emf: $E = -\frac{d(N\Phi)}{dt} = \omega N B_0 A \sin\omega t$
This is an AC output with peak value ωNB0A.
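As a rough check of this result, the sketch below differentiates the flux-linkage numerically and compares it with ωNB0A sin ωt. It also works out the average emf over the first quarter turn, which is the case treated at the start of this section. All numerical values and function names are assumptions chosen for illustration.

```python
import math

# Assumed illustrative values (not from the text)
N = 100                   # number of turns
B0 = 0.05                 # field strength, T
A = 0.02                  # coil area, m^2
omega = 2 * math.pi * 50  # angular velocity, rad/s

def flux_linkage(t):
    """Flux-linkage N*Phi = N B0 A cos(omega t)."""
    return N * B0 * A * math.cos(omega * t)

def emf_exact(t):
    """Induced emf from the formula omega N B0 A sin(omega t)."""
    return omega * N * B0 * A * math.sin(omega * t)

def emf_numeric(t, h=1e-7):
    """E = -d(N Phi)/dt estimated with a central difference."""
    return -(flux_linkage(t + h) - flux_linkage(t - h)) / (2 * h)

peak_emf = omega * N * B0 * A

# Average emf during the first quarter turn: change in flux-linkage / time taken
quarter_turn_time = (math.pi / 2) / omega
avg_emf = (flux_linkage(0.0) - flux_linkage(quarter_turn_time)) / quarter_turn_time

print(f"numeric emf at t = 3 ms: {emf_numeric(0.003):.4f} V")
print(f"formula emf at t = 3 ms: {emf_exact(0.003):.4f} V")
print(f"average / peak over a quarter turn: {avg_emf / peak_emf:.4f}")
```

For rotation at constant angular velocity the average over a quarter turn comes out as 2/π of the peak value, which illustrates why NBA/δt is an average rather than a constant emf.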
21.3 INDUCTANCE
When there is current in a coil the coil becomes an electromagnet and creates
a magnetic field. If the magnetic field changes, there is changing flux-linkage
inside the coil so an induced emf is created that opposes the external supply
that is changing the current. This is often called a “back emf.” This opposi-
tion to changing current is called inductance. Coils act as inductors in electric
circuits.
21.3.1 Self-inductance
The tendency for a coil to oppose changes in current is called inductance.
The greater the inductance, the stronger the opposition to changing currents.
Inductance is defined by the equation:
$$E = -L\frac{dI}{dt}$$
$$L = \frac{-E}{\left(\frac{dI}{dt}\right)}$$
Inductance is equal to the back emf per unit rate of change of current.
The SI unit for inductance is V s A⁻¹ or the henry (H): 1 H = 1 V s A⁻¹.
An expression for the self-inductance of a long solenoid can be determined by
comparing the definition of inductance with Faraday’s law:
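$$E = -L\frac{dI}{dt} = -\frac{d(N\Phi)}{dt}$$
$$N\Phi = NBA = \frac{\mu_0 N^2 I A}{l}$$
where $B = \mu_0 N I / l$ is the field inside a long solenoid of length l, so
$$L\frac{dI}{dt} = \frac{\mu_0 N^2 A}{l}\frac{dI}{dt}$$
$$L = \frac{\mu_0 N^2 A}{l}$$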
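To get a feel for the size of the henry, here is a minimal sketch that evaluates the solenoid formula. The dimensions and the rate of change of current are assumed values, and `solenoid_inductance` is a hypothetical helper name.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, H/m

def solenoid_inductance(N, A, length):
    """Self-inductance (H) of a long air-cored solenoid: mu_0 N^2 A / l."""
    return MU_0 * N**2 * A / length

# Assumed illustrative solenoid: 500 turns, 1.0 cm^2 cross-section, 10 cm long
L = solenoid_inductance(N=500, A=1.0e-4, length=0.10)

# Back emf if the current rises at an assumed 100 A/s (E = -L dI/dt)
dI_dt = 100.0
back_emf = -L * dI_dt

print(f"L = {L * 1e3:.3f} mH")
print(f"back emf = {back_emf * 1e3:.1f} mV")
```

An air-cored coil of this size has an inductance of a fraction of a millihenry, so the back emf is small unless the current changes very quickly.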
The symbol for an inductor is:
[Figure: circuit symbol for an inductor.]
21.3.2 The Rise of Current in an Inductor

When an inductor is connected to a DC power supply the current through the
inductor begins to increase, but as it does so there is a back emf that opposes
the supply and that limits the rate at which the current can rise:
$$\frac{dI}{dt} = -\frac{E}{L}$$
This effect means that it takes time for the current to reach its steady final
value and that the supply does work against the back emf as this current is
established. During this time energy is being transferred from the supply to
the magnetic field.
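[Figure: a circuit in which a supply VS and a switch S are connected in series with a resistor (voltage VR) and an inductor (voltage VL), together with a graph of current against time rising from 0 toward VS/R.]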
When S is closed Kirchhoff’s second law gives us:

$$V_S = V_R + V_L$$
$$V_S = IR + L\frac{dI}{dt}$$
This is a first order differential equation that can be solved to give an
expression for I:
$$I = I_0\left(1 - e^{-\frac{R}{L}t}\right)$$
where I0 = VS/R is the steady final current. This has a similar form to the
equation for the charging of a capacitor, and L/R is the effective time constant.
If S is opened the current falls to zero in a very short time so that dI/dt is very
large. This results in a large back emf that can cause sparks across the switch
terminals. Interrupting the current in a coil is a way to generate spikes of high
voltage – for example, for a car’s spark plugs.
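The growth of current can be tabulated directly from the solution I = I0(1 − e^(−Rt/L)), with I0 = VS/R and time constant L/R. The sketch below uses assumed component values and a hypothetical function name `rl_current`.

```python
import math

def rl_current(t, V_s, R, L):
    """Current (A) at time t (s) after a DC supply V_s (V) is connected to a
    resistance R (ohm) in series with an inductance L (H)."""
    I0 = V_s / R                       # steady final current
    return I0 * (1 - math.exp(-R * t / L))

# Assumed illustrative values (not from the text)
V_s = 12.0   # V
R = 4.0      # ohm
L = 0.20     # H
tau = L / R  # time constant, s

for n in range(6):
    t = n * tau
    print(f"t = {n} tau = {t:.3f} s, I = {rl_current(t, V_s, R, L):.3f} A")
```

After one time constant the current has reached about 63% of its final value, and after five time constants it is within 1% of VS/R.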
21.3.3 The Energy Stored in an Inductor

When the current in an inductor is interrupted there is a large back emf and
the energy stored in the magnetic field is rapidly dissipated – for example, by
sparks and heating effects. We can derive an expression for the energy stored
in an inductor of inductance L when a current I0 flows in it by considering the
work done to establish that current.
$$P = IV = IL\frac{dI}{dt}$$
separating variables and integrating:
$$E = \int_{t=0}^{t=\infty} P \, dt = \int_{I=0}^{I=I_0} IL \, dI = \frac{1}{2}LI_0^2$$
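The integral can also be carried out numerically as a check. This is a minimal sketch with assumed values for L and I0; the function names are hypothetical. It adds up the work L I dI in small steps of current and compares the total with ½LI0².

```python
def inductor_energy(L, I0):
    """Energy (J) stored in an inductance L (H) carrying current I0 (A)."""
    return 0.5 * L * I0**2

def energy_by_integration(L, I0, steps=10000):
    """Sum the work L * I * dI over small current steps (midpoint rule)."""
    dI = I0 / steps
    total = 0.0
    for k in range(steps):
        I_mid = (k + 0.5) * dI   # current at the middle of this step
        total += L * I_mid * dI
    return total

# Assumed illustrative values (not from the text)
L, I0 = 0.20, 3.0
print(f"formula:     {inductor_energy(L, I0):.6f} J")
print(f"integration: {energy_by_integration(L, I0):.6f} J")
```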
21.3.4 Mutual Inductance

When two coils are placed close together their magnetic fields can affect
one another. If the current is changed in one coil there will be a changing
flux-linkage in the other coil and an induced emf. The strength of this
coupling is measured by the mutual inductance of the system.
[Figure: two coils placed close together. Coil 1 has current I1, N1 turns, and area A1; coil 2 has current I2, N2 turns, and area A2.]
The mutual inductance of two coils is defined by the equations:
$$E_2 = M\frac{dI_1}{dt} \qquad E_1 = M\frac{dI_2}{dt}$$
The SI unit for mutual inductance is the henry (H).

Mutual inductance depends on the self-inductance of each coil:
$$M_{12} = k\sqrt{L_1 L_2}$$
where k is a coupling constant that depends on how the coils are arranged
and how they are magnetically linked.
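A short sketch of how these relations might be used together is given below. The inductances, coupling constant, and rate of change of current are assumed values, the function names are hypothetical, and the coupling constant is taken to lie between 0 (no shared flux) and 1 (all flux shared).

```python
import math

def mutual_inductance(k, L1, L2):
    """Mutual inductance (H) from M = k * sqrt(L1 * L2), with 0 <= k <= 1."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("coupling constant k must lie between 0 and 1")
    return k * math.sqrt(L1 * L2)

def induced_emf_magnitude(M, dI_dt):
    """Magnitude of the emf (V) induced in one coil when the current in the
    other coil changes at dI_dt (A/s)."""
    return M * abs(dI_dt)

# Assumed illustrative values (not from the text)
M = mutual_inductance(k=0.5, L1=0.04, L2=0.09)
print(f"M = {M * 1e3:.1f} mH")
print(f"emf in coil 2 for dI1/dt = 200 A/s: {induced_emf_magnitude(M, 200.0):.2f} V")
```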
21.4 TRANSFORMERS
Electromagnetic induction is the key principle on which the generation and
distribution of electrical energy depend. Thermal power stations use fossil or
nuclear fuels to transfer chemical or nuclear energy to thermal energy and
then use this to power generators that output AC electricity. Transformers step
up the voltage so that electricity can be transmitted with low losses over large
distances. More transformers are used to step down the voltage for consumers.
21.4.1 An Ideal Transformer

A transformer is a device that uses mutual induction to step up or step down
an AC voltage.
[Figure: an ideal transformer. An A.C. supply of voltage V1 drives a current I in a primary coil of N1 turns wound on a laminated soft iron core; a secondary coil of N2 turns on the same core provides the output voltage across a load R. The arrangement is very similar to that of Faraday’s original experiment.]

AC current in the primary creates an alternating magnetic flux in the core. The
soft iron core enhances the strength of the magnetic field and acts as a
magnetic circuit, linking the primary with the secondary. The changing magnetic
flux in the secondary induces an AC emf in the secondary. The secondary emf
is given by:
$$E = -\frac{d(N\Phi)}{dt}$$
where NΦ is the flux-linkage in the secondary coil and Φ ∝ I1 (primary current).
[Figure: graphs against time of the flux in the core and the induced emf in the secondary.]
The peaks of emf correspond to times when the flux in the core has its greatest
rate of change.
For an ideal transformer all of the flux created by the primary passes through
the secondary. If there are equal numbers of turns on both primary and second-
ary the back emf in both coils will be equal and will equal the supply voltage.
However, the voltage on the secondary is also directly proportional to the num-
ber of turns on the secondary (V2 ∝ N2) so changing the turns ratio (N2/N1) must
change the voltage ratio (V2/V1) in a similar way. This leads to the transformer
equation:
$$\frac{V_2}{V_1} = \frac{N_2}{N_1}$$
“the voltage ratio is equal to the turns ratio.”

If N2 > N1 then V2 > V1 and it is a step-up transformer.
If N2 < N1 then V2 < V1 and it is a step-down transformer.
While a transformer can increase voltage it cannot create energy: the power
supplied to the transformer must be equal to or greater than the power it
delivers to a load. An ideal transformer has 100% efficiency so:
$$P_{in} = P_{out}$$
$$I_1 V_1 = I_2 V_2$$
$$\frac{I_2}{I_1} = \frac{V_1}{V_2} = \frac{N_1}{N_2}$$
“the current ratio is the inverse of the voltage and turns ratios.”
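The two ratios can be combined in a short calculation. The sketch below assumes an ideal (100% efficient) transformer feeding a purely resistive load; the supply voltage, turns, and load power are illustrative values, and `ideal_transformer` is a hypothetical helper.

```python
def ideal_transformer(V1, N1, N2, P_load):
    """Return (V2, I1, I2) for an ideal transformer delivering P_load (W).

    V2 follows from the turns ratio; the currents follow from Pin = Pout.
    """
    V2 = V1 * N2 / N1    # voltage ratio equals turns ratio
    I2 = P_load / V2     # secondary current
    I1 = P_load / V1     # primary current, since I1 V1 = I2 V2
    return V2, I1, I2

# Assumed illustrative values (not from the text)
V2, I1, I2 = ideal_transformer(V1=230.0, N1=2000, N2=100, P_load=46.0)
print(f"V2 = {V2:.1f} V, I1 = {I1:.2f} A, I2 = {I2:.2f} A")
print(f"current ratio I2/I1 = {I2 / I1:.1f}, turns ratio N1/N2 = {2000 / 100:.1f}")
```

Stepping the voltage down by a factor of 20 steps the current up by the same factor, so the power on each side is unchanged.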
21.4.2 Transmission of Electrical Energy

The electrical power generated at power stations is transmitted over long dis-
tances by transmission lines. These are conducting cables that have resistance
so there is ohmic heating and some of the power is dissipated. In order to
reduce the power loss and increase the efficiency of transmission, transformers
are used to step up the voltage in the transmission lines. This reduces the cur-
rent in the cables and therefore reduces the amount of ohmic heating, increas-
ing the efficiency of transmission. Step-down transformers are then used to
reduce the voltage for consumers (very high voltages are difficult to insulate).
[Figure: a generator supplies voltage V1 and current I1 to a step-up transformer. Transmission lines of resistance R carry current I2 at voltage V2 to a step-down transformer, which supplies voltage V3 to the consumer.]
The input power from the generator is:

P1 = I1 V1
The power transmitted is:

$$P_2 = I_2 V_2 = P_1$$
$$I_2 = \frac{P_1}{V_2}$$
The power loss in transmission is:
$$P_R = I_2^2 R = \frac{R P_1^2}{V_2^2} \propto \frac{1}{V_2^2}$$
so increasing the transmission voltage reduces the power loss.

Note that V2 is the potential difference between the transmission lines and
NOT the potential difference across resistor R. The potential difference
across R depends on I2 and falls as V2 increases.
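The inverse-square dependence on transmission voltage is easy to see with numbers. The following sketch assumes an ideal step-up transformer and illustrative values for the transmitted power and line resistance; `transmission_loss` is a hypothetical name.

```python
def transmission_loss(P1, V2, R):
    """Power (W) dissipated in lines of resistance R (ohm) when power P1 (W)
    is transmitted at voltage V2 (V)."""
    I2 = P1 / V2          # current in the transmission lines
    return I2**2 * R      # ohmic heating, I^2 R

# Assumed illustrative values (not from the text)
P1 = 1.0e6   # transmitted power, W
R = 10.0     # resistance of the lines, ohm

loss_low = transmission_loss(P1, 1.0e4, R)    # transmit at 10 kV
loss_high = transmission_loss(P1, 1.0e5, R)   # transmit at 100 kV

print(f"loss at 10 kV:  {loss_low / 1e3:.1f} kW ({100 * loss_low / P1:.1f}% of P1)")
print(f"loss at 100 kV: {loss_high / 1e3:.1f} kW ({100 * loss_high / P1:.1f}% of P1)")
```

Raising the transmission voltage by a factor of 10 cuts the line current by 10 and the power loss by 100.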
21.4.3 Real Transformers

The ideal transformer discussed above is 100% efficient. Real transformers
can have high efficiency but do have some energy losses. There are four main
causes of energy loss in a real transformer:
Copper losses: ohmic heating of the conducting wires forming the coil
(usually copper wires).
Flux losses: the magnetic circuit provided by the transformer core might
not confine all of the magnetic flux so that some of the flux created by the
primary coil does not pass through the secondary coil.
Eddy current losses: transformers work with AC so there is an alternat-
ing magnetic field in the core itself. If this is made from a conducting
material (e.g., soft iron) there will be induced current loops in the core.
These will dissipate heat. In order to reduce these losses the core is usually
laminated – cut into slices that are separated by thin layers of insulator.
High-frequency transformers, for example in radio frequency circuits, use
non-conducting cores (e.g., ferrite cores) to prevent these losses.
Hysteresis losses: the continual magnetization and demagnetization of
the core create alternating stresses inside the core that dissipate energy as
heat. Soft iron has low hysteresis losses so, in addition to its high perme-
ability, it is a good material for low-frequency transformer cores.
Our discussion of transformers has assumed that the loads attached to them
are purely resistive and has ignored the phase relationships between currents
and voltages in the two coils. In practice, this is quite complex and the
inductance and capacitance of the circuits need to be taken into account.
21.5 A SIMPLE AC GENERATOR

A simple AC generator consists of a coil rotating with constant angular veloc-
ity in a constant uniform magnetic field. The diagram shows an end view of
such a coil when it has rotated through an angle θ from the horizontal. The
coil has area A and N turns.

[Figure: end view of the coil between the north and south poles of a magnet, rotated through an angle θ from the horizontal.]
Flux linkage:
$$N\Phi = NBA\sin\theta$$
The coil is rotating with constant angular velocity ω:
$$\theta = \omega t$$
Induced emf:
$$E = (-)\frac{d(N\Phi)}{dt} = (-)\omega NBA\cos\omega t$$
[Figure: graph of induced voltage against time, a sinusoid with peak value ωNBA and period T = 2π/ω.]
The negative sign in the equation has been ignored in the graph. The output
is an AC voltage with peak value E0 = ωNBA.
21.6 ELECTROMAGNETIC DAMPING

Lenz’s law states that the induced emf is always in such a direction as to
oppose the change that caused it. If the induced emf creates an induced cur-
rent the forces that act on this current oppose the motion causing the change
in flux linkage. This acts like a braking force and can be demonstrated using
the apparatus shown below.
A bar magnet is supported by a spring. It is displaced and allowed to oscillate
vertically while the switch is left open. An alternating induced emf is created
in the coil but there are no induced currents because there is not a complete
circuit. The oscillations are undamped (apart from air resistance, etc.) and
the magnet oscillates for a long time.
When the switch is closed the induced emf causes alternating currents in the
coil. By Lenz’s law, these currents flow in a direction that opposes the change
that caused them. As the north pole approaches the top of the coil the current
moves in such a direction that the top of the coil is also a north pole, repelling
the magnet. When the magnet moves away from the coil the induced currents
form a south pole at the top of the coil, attracting the magnet. It is clear that
the electromagnetic forces act in the opposite direction to the motion, like
friction, and provide additional damping. The oscillations decay more rapidly
with the switch closed.
[Figure: a bar magnet (poles S and N) suspended from a spring oscillates vertically above a coil that is connected to a switch.]
Another simple demonstration involves placing a cylindrical neodymium mag-
net on a thick sheet of aluminum or copper and then tipping the sheet until
the magnet slides down. Aluminum and copper are not ferromagnetic metals
so when the magnet is stationary there are no magnetic forces acting on it.
However, when it begins to move the magnetic flux cuts through the conduc-
tor inducing emfs. The induced emfs create current loops and the magnetic
fields of these loops oppose the motion of the magnet. There is a magnetic
drag force that slows the magnet down. This can be a surprisingly large effect
if the metal sheet is tilted at a large angle.
Another intriguing demonstration involves dropping a cylindrical neodymium
magnet down a copper water pipe with an internal diameter slightly greater
than the diameter of the magnet. The magnet falls at a slow terminal velocity.
Electromagnetic damping forces balance its weight.
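[Figure: a neodymium magnet (poles S and N) falling with velocity v inside a copper pipe. Its weight mg acts downward and the electromagnetic force FEM acts upward.]
As the magnet falls its flux cuts the conductor surrounding it. This
induces current loops above and below the falling magnet. These
create magnetic fields that oppose the magnet’s motion.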
21.7 INDUCTION MOTORS

Electromagnetic damping exerts a damping force on a magnet when it moves
past a conductor. By Newton’s third law, there is an equal and opposite force
on the conductor. This can be used to create a motor. If you move a neo-
dymium magnet rapidly past a thin sheet of non-ferromagnetic metal (e.g.,
copper or aluminum) the sheet will begin to accelerate in the direction of the
moving magnet.
[Figure: a magnet moving past a metal sheet, showing the direction of motion of the magnet and the forces F on the magnet and on the sheet.]
The moving magnetic field induces emfs in the metal that create current
loops. By Lenz’s law, these create magnetic fields that oppose the relative
motion, resulting in the forces on the magnet and conductor as shown above.
The effect is that the conductor tends to follow the moving magnetic field.
In an induction motor, coils are used to create a rotating magnetic field. A
non-ferromagnetic rotor is placed in the rotating field and it too rotates. This
has the big advantage that there are no brushes. The idea was first suggested by
Nikola Tesla and induction motors are now widely used from DVD players to
electric cars.
A rotating magnetic field can be created by superposing two alternating mag-
netic fields at right angles to one another and giving them a phase difference
of π/2. If we represent each magnetic field by a phasor then the resultant of
the two phasors represents the rotating field.
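[Figure: phasor diagram showing the components Bx and By and the resultant field B at an angle θ to the x-axis.]
$$B_x = B_0\cos\omega t$$
$$B_y = B_0\sin\omega t$$
Resultant magnetic field strength:
$$B = \sqrt{B_x^2 + B_y^2} = B_0\sqrt{\cos^2\omega t + \sin^2\omega t} = B_0$$
The direction of the magnetic field is:
$$\theta = \tan^{-1}\left(\frac{B_y}{B_x}\right) = \tan^{-1}(\tan\omega t) = \omega t$$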
In other words, this creates a magnetic field of constant strength B that rotates
with constant angular velocity ω. A simple way to create such a field is to use
two pairs of coils arranged as shown below:
[Figure: a rotor placed between two pairs of coils at right angles to one another, driven by the voltages Vx = V0 cos ωt and Vy = V0 sin ωt.]
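The rotating field can be verified numerically by adding the two perpendicular components at a few instants. In this sketch the field amplitude and frequency are assumed values and `rotating_field` is a hypothetical name. It uses `math.atan2` rather than a plain inverse tangent so that the angle is returned in the correct quadrant (between −π and π).

```python
import math

def rotating_field(B0, omega, t):
    """Return (magnitude, angle in rad) of the resultant of
    Bx = B0 cos(omega t) and By = B0 sin(omega t)."""
    Bx = B0 * math.cos(omega * t)
    By = B0 * math.sin(omega * t)
    return math.hypot(Bx, By), math.atan2(By, Bx)

# Assumed illustrative values (not from the text)
B0 = 0.010                # T
omega = 2 * math.pi * 50  # rad/s

for phase in (0.0, 0.5, 1.0, 2.0, 3.0):   # values of omega*t in rad
    t = phase / omega
    magnitude, angle = rotating_field(B0, omega, t)
    print(f"omega*t = {phase:.1f} rad: B = {magnitude * 1e3:.3f} mT, angle = {angle:.3f} rad")
```

The magnitude stays at B0 while the angle increases in step with ωt, which is the rotating field described above.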
21.8 EXERCISES
1. The diagram below shows a bar magnet and a coil.
[Figure: a bar magnet (poles N and S) moved along a dotted line toward a coil connected to a galvanometer G.]
While the magnet is moving toward the coil the galvanometer deflects to the
right.
(a) Explain why this occurs.

(b) State and explain, using Faraday’s law, what happens when:
i. The magnet is moved toward the coil at a higher speed.
ii. The magnet is moved away from the coil.
iii. The magnet is stationary inside the coil.
(c) The experiment is repeated but with the galvanometer removed so that
the ends of the coil are not connected to anything. Discuss whether
the motion of the magnet near the coil has any effect now.
2. The Earth’s field has a strength of about 50 μT. A square coil of side 20 cm
having 200 turns is placed so that its plane is perpendicular to the Earth’s
field. The coil has a total resistance of 50 ohms.
(a) Calculate the flux through the coil.

(b) Calculate the flux linkage through the coil.
(c) The coil is quickly turned (about an axis in the plane of the coil per-
pendicular to the field) through 90° so that its plane is now parallel to
the Earth’s field. It takes 0.50 s to complete the rotation. Explain why
there is a current in the coil as it rotates.
(d) Calculate the average current as the coil is rotated.
(e) Calculate the charge that has moved around the coil during the
process.
(f) How (if at all) would your answers to parts (d) and (e) be affected if the
coil was turned through 90° in 0.10 s instead of 0.50 s? Explain.
3. When a strong magnet is dropped through a copper tube it falls at a con-
stant velocity. Use your knowledge of electromagnetism to explain this as
fully as you can. Use diagrams if this helps.
4. An aircraft is flying due north at 200 m s⁻¹ in a place where the
vertical component of the Earth’s magnetic field is 40 μT. Its wingspan is
50 m and its wings are made of a conducting material.
(a) Explain why there is a potential difference between its wing tips.
(b) Calculate the potential difference between its wing tips.
(c) Discuss whether the voltage generated could be used to power an
electrical device using wires connected to each wing tip.
5. (a) Draw a labeled diagram of a transformer that could step down a mains
supply of 240 V 50 Hz AC to 20 V 50 Hz AC. The primary coil has
1200 turns.
(b) Explain how the transformer works.
(c) Explain the function of the core and explain why soft iron is a suitable
material for it.
(d) Explain how laminating the core reduces energy loss from the trans-
former.
(e) Transformers are very efficient, but there are still losses. State three
ways in which energy can be dissipated by a transformer.
(f) A 20 V, 40 W lamp is connected to the secondary of the transformer.
Calculate the current drawn from the primary (assume that the effi-
ciency is 100% and neglect inductive effects).
(g) Sketch a graph to show how the voltage in the secondary coil is related
to the flux in the core. Explain how the graph illustrates both Fara-
day’s and Lenz’s laws.
6. A coil with inductance and resistance is connected via a switch to a
power supply of 1.0 V. The graph below shows how the current in the coil
increases with time from the moment the switch is closed.
[Figure: graph of current / A (axis from 0 to 2.5) against time / s (axis from 0 to 0.25) for the coil.]
(a) Use the graph to calculate the inductance and resistance of the coil.
(b) Calculate the energy stored when there is a current of 2.0 A in the coil.
(c) When the switch is opened, a spark is observed to jump across its contacts.
Explain why this occurs.
7. The diagram below shows an end view of a simple generator consisting
of a coil rotating at a constant rate in a uniform magnetic field. The AC
output is connected via slip rings to a purely resistive load.
[Figure: end view of the generator coil between the north and south poles of a magnet.]
Here is some data about the generator:

Magnetic field strength: 0.12 T
Area of coil: 80 cm²
Number of turns on coil: 250
Rotation frequency: 60 Hz
Load resistance: 50 Ω
(a) Calculate the peak output emf.
(b) Sketch a graph to show how the output emf varies with time as the coil
completes one rotation.
(c) Calculate the peak current in the load.
(d) Calculate the peak power transferred to the load (assume that the
resistance of the coil is negligible).
(e) Discuss how your answers to (b), (c), and (d) would be affected if the
resistance of the coil was not negligible.
CHAPTER 22
AC
22.1 AC AND DC
DC stands for “direct current” and AC stands for “alternating current.”
DC current flows in one direction around a circuit, so the supply polarity
is constant. AC currents change direction periodically, so the polarity of the
supply alternates. Most AC supplies are sinusoidal and can be represented by
the equations:
$$V = V_0\sin\omega t$$
$$I = I_0\sin\omega t$$
[Figure: graph of voltage or current against time, a sinusoid of amplitude V0 or I0 and period T = 2π/ω.]
22.1.1 AC Power and rms Values

The instantaneous power of an AC supply is given by P = IV and varies
throughout each cycle:
$$P = IV = I_0 V_0 \sin^2\omega t$$
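[Figure: graph of power against time, varying between 0 and I0V0 about a mean value of ½ I0V0, with the supply period T = 2π/ω marked.]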
The power is always positive and peaks twice as frequently as the current or
voltage. The peak power is I0V0 and the average value of the power is ½ I0V0.
This can be shown using the trigonometric relation:
$$\sin^2 A = \frac{1}{2} - \frac{1}{2}\cos 2A$$
$$\sin^2\omega t = \frac{1}{2} - \frac{1}{2}\cos 2\omega t$$
The average value of a cosine term over a whole number of cycles is zero so
the average of a sine-squared term is ½.
The average AC power is therefore:
$$P_{AC} = \frac{1}{2} I_0 V_0$$
The root-mean-square values of current and voltage (for sinusoidal variation) are:
$$I_{rms} = \frac{I_0}{\sqrt{2}}$$
$$V_{rms} = \frac{V_0}{\sqrt{2}}$$
so the average AC power is:

PAC = Irms Vrms
By using the rms values instead of peak values we can express the formula for
average AC power in the same way as we express the formula for DC power.
This means that an AC supply voltage of 240 V rms would light a lamp to the
same brightness as a DC supply of constant value 240 V. In this sense, the
rms value is the “DC equivalent” value. However, an AC voltage of 240 V rms
actually peaks at ± 339 V. We can show where the factor of 1/√2 comes from
as follows.
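$$V = V_0\sin\omega t$$
$$V^2 = V_0^2\sin^2\omega t$$
$$\overline{V^2} = V_0^2\,\overline{\sin^2\omega t} = \frac{1}{2}V_0^2$$
$$V_{rms} = \sqrt{\overline{V^2}} = \sqrt{V_0^2\,\overline{\sin^2\omega t}} = \frac{V_0}{\sqrt{2}}$$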
AC meters are usually calibrated to give rms values.
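The rms value can also be found by brute force: sample one complete cycle, square the samples, take the mean, and take the square root. The sketch below does this for an assumed peak voltage of 339.4 V; the function name `rms` and the number of samples are illustrative choices.

```python
import math

def rms(samples):
    """Root-mean-square of a list of numbers."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

V0 = 339.4   # assumed peak voltage, V
n = 10000    # samples spread evenly over exactly one cycle

phases = [2 * math.pi * k / n for k in range(n)]
voltages = [V0 * math.sin(p) for p in phases]

v_rms = rms(voltages)
mean_sin_squared = sum(math.sin(p) ** 2 for p in phases) / n

print(f"mean of sin^2 over one cycle: {mean_sin_squared:.4f}")
print(f"rms voltage from samples:     {v_rms:.2f} V")
print(f"V0 / sqrt(2):                 {V0 / math.sqrt(2):.2f} V")
```

The sampled rms value agrees with V0/√2, and the mean of sin² over a whole cycle comes out as ½, as used in the expression for average AC power.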
22.2 RESISTANCE AND REACTANCE

While current and voltage in a resistor are always in phase, this is not the case
for capacitors or inductors. These components introduce a phase difference
between the current and voltage so it is not possible to define resistance in the
same way as for a resistor.
22.2.1 Resistors in AC Circuits

The relationship between current and voltage for a pure resistance is defined
by the equation:
$$R = \frac{V}{I}$$
so if $V = V_0\sin\omega t$:
$$R = \frac{V_0\sin\omega t}{I_0\sin\omega t} = \frac{V_0}{I_0}$$
The resistance is constant and independent of the frequency of the AC. The
current and voltage are in phase with one another and the average power
dissipated by the resistor is:
$$P = \frac{V_{rms}^2}{R} = I_{rms}V_{rms} = I_{rms}^2 R$$
22.2.2 Capacitors in AC Circuits

The situation is different for a capacitor. The equation that defines the rela-
tionship between current and voltage for a capacitor derives from the defining
equation for capacitance:
$$Q = CV$$
$$I = \frac{dQ}{dt} = C\frac{dV}{dt}$$
so the current depends on the derivative of the voltage:
$$V = V_0\sin\omega t$$
$$I = \omega C V_0\cos\omega t = I_0\cos\omega t$$
$$I_0 = \omega C V_0 = 2\pi f C V_0$$
This introduces a π/2 phase difference between the voltage and the current
with the current leading the voltage:
[Figure: graphs against time of the current in the capacitor (peak value I0) and the voltage across the capacitor (peak value V0), with the current leading the voltage by π/2.]
The ratio of V/I is of little use because it varies continuously (being zero at some
points and infinite at others). However, the ratio of the peak values of voltage
and current is constant. This is the reactance XC (“kie-cee”) of the capacitor.
$$X_C = \frac{V_0}{I_0} = \frac{1}{\omega C} = \frac{1}{2\pi f C}$$
The SI unit for reactance is the ohm (Ω) but reactance is not the same as
resistance because the peak values occur at different times.
Reactance can be used to find the peak or rms current in a capacitor if we
know the peak or rms AC voltage across it. However, the interesting thing
about reactance is its frequency dependence.
$$X_C \propto \frac{1}{f}$$
At low frequencies the reactance is high, becoming infinite at f = 0 (i.e., for
DC). This makes sense because the capacitor is actually a break in the
circuit so no DC current can pass through it. At high frequencies the reactance
becomes very small, so high frequency signals can pass through a capacitor
almost unimpeded. Capacitors (and inductors) are used in circuits designed
to filter out or separate high or low frequencies.
It is also interesting to consider the energy flow when a capacitor is connected
to AC:
$$P = IV = I_0 V_0\cos\omega t\,\sin\omega t = \frac{1}{2} I_0 V_0\sin 2\omega t$$
The average value of a sine over an integer number of cycles is zero, so the
average power delivered to the capacitor during one cycle of AC is also zero.
For half the time energy flows from the supply to the capacitor (as it charges)
and for the other half energy flows from the capacitor back to the supply (as it
discharges). There is no net energy flow onto the capacitor.
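The zero average power can also be confirmed by summing the instantaneous power over one cycle. This is a minimal numerical sketch; the peak values and the 50 Hz frequency are assumed purely for illustration.

```python
import math

def average_power(i0, v0, omega, n_steps=10_000):
    """Average of P = I*V over one full cycle, using a simple Riemann sum."""
    period = 2.0 * math.pi / omega
    dt = period / n_steps
    total = 0.0
    for k in range(n_steps):
        t = k * dt
        current = i0 * math.cos(omega * t)  # current leads the voltage by pi/2
        voltage = v0 * math.sin(omega * t)
        total += current * voltage * dt
    return total / period

# Assumed example values: I0 = 0.5 A, V0 = 10 V, f = 50 Hz
p_avg = average_power(i0=0.5, v0=10.0, omega=2.0 * math.pi * 50.0)
print(f"average power over one cycle = {p_avg:.2e} W")
```

The result differs from zero only by floating-point rounding, which matches the statement that energy flows in and out of the capacitor in equal amounts.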
22.2.3 Inductors in AC Circuits

The equation that defines the relationship between current and voltage for an
inductor derives from the defining equation for inductance:
$$V = -E = L\frac{dI}{dt}$$
E is the back emf in the inductor which opposes the supply voltage V, hence
the change of sign.
If the current is:
$$I = I_0 \cos \omega t$$
then
$$V = L\frac{dI}{dt} = -\omega L I_0 \sin \omega t = V_0 \sin \omega t$$
$$V_0 = -\omega L I_0$$
Once again there is a π/2 phase difference but this time the voltage leads the
current by π/2.
[Figure: current in the inductor (peak value I0) and voltage across the inductor (peak value V0) plotted against time.]
The reactance XL (“kie-ell”) of the inductor is defined as the ratio of the magnitudes of the peak
voltage and the peak current.
$$X_L = \frac{|V_0|}{I_0} = \omega L = 2\pi f L$$
This is also measured in ohms (Ω) and is frequency dependent. At low frequencies
the inductor has very low reactance, so low-frequency signals pass
through an inductor easily. At high frequencies the reactance becomes very
large, so high-frequency signals are severely impeded. This frequency
dependence is the opposite of that for the capacitor.
In common with the capacitor, however, there is no net energy flow into the
inductor. During one cycle energy is used to build up the magnetic flux in the
inductor but then energy flows out of the inductor as the field collapses.
In practice inductors consist of coils of conducting wire, which has some resistance,
so we do not usually encounter pure inductance. A real inductor is modeled
as a pure inductor in series with a fixed resistor:
[Figure: a pure inductor L in series with a resistor R.]
22.3 RESISTANCE, REACTANCE, AND IMPEDANCE

Resistors, capacitors, and inductors all respond differently to AC signals. The
reason for this is that while current and voltage are in phase in a resistor, the
current leads the voltage by π/2 in a capacitor and lags behind the voltage by π/2
in an inductor. To determine the relationship between the supply voltage and
the voltages across individual circuit components we must consider their relative
phases.
22.3.1 Phasor Diagrams for AC Series Circuits

The simplest way to include relative phases is to use a phasor diagram. The
phasors representing voltages across a resistor VR, capacitor VC, and inductor
VL connected in series are shown below (phasors rotate counterclockwise, and
current is used as the reference because it is the same in all components in a
series circuit). The lengths of phasors representing VC and VL will vary with
frequency.
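[Figure: phasor diagram with the current I as the reference phasor, showing VR = IR in phase with I, VL = IXL leading I by π/2, and VC = IXC lagging I by π/2.]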
The supply voltage is equal to the sum of the phasors across the three
components.
22.3.2 Impedance
The impedance Z of a load that contains resistive, capacitive, and inductive
components is defined as the ratio of the peak voltage across the load to the
peak current in the load:
$$Z = \frac{V_0}{I_0}$$
and is measured in ohms (Ω).

For a series circuit, V0 is found by adding the voltage phasors for the compo-
nents and, except in special cases, the voltage and current in the circuit are
not in phase so the peak voltage and peak current occur at different times.
The general expression for the impedance of a series AC circuit containing all
three types of components is derived below.
[Figure: phasor diagram showing VL = IXL, VC = IXC, VR = IR, the current I, and the resultant supply voltage VS.]
Pythagoras’s theorem can be used to find the magnitude of VS:
$$V_S = \sqrt{V_R^2 + (V_L - V_C)^2} = I\sqrt{R^2 + (X_L - X_C)^2}$$
and the impedance is:

$$Z = \frac{V_S}{I} = \sqrt{R^2 + (X_L - X_C)^2} = \sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2}$$
The phase angle φ is given by:

$$\tan\phi = \frac{V_L - V_C}{V_R} = \frac{X_L - X_C}{R}$$
At a particular frequency (for given values of L and C), VL and VC are equal in magnitude and add to zero
(because they are π out of phase with each other). Under these circumstances, the
phase angle is 0 and the impedance is a minimum and purely resistive: the current
has a maximum value and is in phase with the supply voltage. This is called resonance
and, if the resistance is small, it can result in a large increase in current.
The only component that dissipates energy is the resistor, so the average power
dissipated in this circuit (with I, VR, and VS as rms values) is:
$$P = I V_R = I V_S \cos\phi$$
cos φ is called the “power factor” for the circuit.
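The impedance, phase angle, and power factor expressions can be combined in a few lines of Python. This is only a sketch: the component values, the 400 Hz supply frequency, and the function name are assumptions chosen for illustration, and all voltages and currents are treated as rms values.

```python
import math

def series_rlc(r, l, c, f):
    """Return (impedance, phase angle in radians, power factor) for a series RLC circuit."""
    omega = 2.0 * math.pi * f
    x_l = omega * l            # inductive reactance
    x_c = 1.0 / (omega * c)    # capacitive reactance
    z = math.sqrt(r**2 + (x_l - x_c)**2)
    phi = math.atan2(x_l - x_c, r)  # tan(phi) = (X_L - X_C) / R
    return z, phi, math.cos(phi)

# Assumed example: 50 ohm, 20 mH, 4.7 uF, driven at 400 Hz by a 10 V rms supply
R, L, C, F, V_RMS = 50.0, 20e-3, 4.7e-6, 400.0, 10.0
Z, PHI, PF = series_rlc(R, L, C, F)
I_RMS = V_RMS / Z
P = I_RMS * V_RMS * PF  # should equal I_rms^2 * R

print(f"Z = {Z:.2f} ohm, phase = {math.degrees(PHI):.1f} deg, power factor = {PF:.3f}")
print(f"I_rms = {I_RMS:.4f} A, P = {P:.3f} W, I^2 R = {I_RMS**2 * R:.3f} W")
```

A negative phase angle indicates that XC exceeds XL at the chosen frequency, so the current leads the supply voltage.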
22.4 AC SERIES CIRCUITS

AC electric circuits containing resistors, capacitors, and inductors have inter-
esting frequency-dependent behavior. Some circuits can be made to oscillate
and resonate at particular frequencies. These tuned circuits are extremely
important in communications systems.
In series circuits, the amplitude and phase of the current through each com-
ponent are the same but the voltage can vary both in amplitude and phase.
22.4.1 RC Series Circuit

The behavior of an RC series circuit is quite straightforward. As the frequency
is increased the reactance of the capacitor falls so VC also falls. This reduces
the impedance of the circuit and reduces the phase difference between the
supply voltage and the current. The amplitude of the AC current rises toward
a maximum value determined only by the resistance, Imax = VS/R. At high fre-
quencies, the circuit behaves as if it is purely resistive and the supply voltage
is in phase with the circuit current.
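[Figure: RC series circuit with an AC supply VS connected to a capacitor C (voltage VC) and a resistor R (voltage VR) in series.]
[Figure: phasor diagram showing VR = IR in phase with the current I, VC = IXC, and their resultant VS.]
$$V_S = \sqrt{V_R^2 + V_C^2} = I\sqrt{R^2 + X_C^2}$$
$$Z = \frac{V_S}{I} = \sqrt{R^2 + X_C^2} = \sqrt{R^2 + \left(\frac{1}{2\pi f C}\right)^2}$$
$$\tan\phi = \frac{X_C}{R} = \frac{1}{2\pi f R C}$$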
22.4.2 RL Series Circuit

The behavior of an RL series circuit is also straightforward. As the frequency
is increased the reactance of the inductor increases so VL, the circuit imped-
ance, and the phase difference between the supply voltage and the cur-
rent also increase, and the amplitude of the AC current falls toward zero.
The maximum current is approached at low frequencies. In the DC limit,
Imax = VS/R and the supply voltage and current are in phase.
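[Figure: RL series circuit with an AC supply VS connected to an inductor L (voltage VL) and a resistor R (voltage VR) in series.]
[Figure: phasor diagram showing VR = IR in phase with the current I, VL = IXL, and their resultant VS.]
$$V_S = \sqrt{V_R^2 + V_L^2} = I\sqrt{R^2 + X_L^2}$$
$$Z = \frac{V_S}{I} = \sqrt{R^2 + X_L^2} = \sqrt{R^2 + (2\pi f L)^2}$$
$$\tan\phi = \frac{X_L}{R} = \frac{2\pi f L}{R}$$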
22.4.3 RCL Series Circuit

In an RCL series circuit, the frequency dependences of the capacitor reactance
and inductor reactance oppose one another: as frequency increases, the
reactance of the capacitor falls and the reactance of the inductor rises. This
leads to resonance behavior at the frequency at which they are equal.
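[Figure: RCL series circuit with an AC supply VS connected to a capacitor C (voltage VC), a resistor R (voltage VR), and an inductor (voltage VL) in series.]
[Figure: phasor diagram showing VL = IXL, VC = IXC, VR = IR, the current I, and the resultant supply voltage VS.]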
Previously, we derived expressions for the impedance and phase of an RCL
circuit:
$$Z = \frac{V_S}{I} = \sqrt{R^2 + (X_L - X_C)^2} = \sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2}$$
The phase angle φ is given by:
$$\tan\phi = \frac{V_L - V_C}{V_R} = \frac{X_L - X_C}{R}$$
Resonance occurs at a frequency f0 when the reactance of the capacitor is
equal to the reactance of the inductor.
$$X_L = X_C$$
$$\omega L = \frac{1}{\omega C}$$
$$2\pi f_0 L = \frac{1}{2\pi f_0 C}$$
$$f_0 = \frac{1}{2\pi\sqrt{LC}}$$
At this frequency, the current in the circuit has its maximum value
$$I_{max} = \frac{V_S}{R}$$
The circuit acts as a purely resistive load and the current and supply voltage
are in phase.
The graph below indicates how the resistance, impedance, and circuit current
vary with frequency for an RCL series circuit.
[Figure: graph of resistance, impedance, and current against frequency for an RCL series circuit, with the resonant frequency f = f0 marked on the frequency axis.]
For f < f0 the reactance of the capacitor is greater than that of the inductor and
the circuit is said to be capacitive. For f > f0 the reactance of the inductor is
greater than that of the capacitor and the circuit is said to be inductive.
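The resonance behavior of the series circuit can be explored with a short frequency sweep. The sketch below uses assumed component values (5 Ω, 1 mH, 1 µF, and a 1 V supply) and hypothetical function names; it illustrates the formulas above and is not a model of any particular circuit in the text.

```python
import math

def resonant_frequency(l, c):
    """f0 = 1 / (2*pi*sqrt(L*C)) for a series RCL circuit."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))

def current_amplitude(v_s, r, l, c, f):
    """Current amplitude I = V_S / Z at frequency f."""
    omega = 2.0 * math.pi * f
    z = math.sqrt(r**2 + (omega * l - 1.0 / (omega * c))**2)
    return v_s / z

# Assumed example values
R, L, C, V_S = 5.0, 1.0e-3, 1.0e-6, 1.0
F0 = resonant_frequency(L, C)
print(f"resonant frequency f0 = {F0:.1f} Hz")

for ratio in (0.5, 0.9, 1.0, 1.1, 2.0):
    f = ratio * F0
    kind = "capacitive" if f < F0 else ("resistive" if f == F0 else "inductive")
    print(f"f = {ratio:.1f} f0   I = {current_amplitude(V_S, R, L, C, f):.4f} A   ({kind})")
```

The current peaks at V_S/R when f = f0 and falls away on either side, with the circuit capacitive below resonance and inductive above it.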
22.4.4 Parallel Circuits Containing Resistors, Capacitors, and Inductors

The analysis of parallel circuits follows a similar procedure to the analysis of
series circuits. However, the voltage across each component is now the same
and can be used as the reference phasor. The current in the resistor is in phase
with the voltage, the current in the capacitor leads by π/2, and the current in
the inductor lags by π/2.
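[Figure: parallel circuit with an AC supply VS driving currents IR, IC, and IL through the resistor, capacitor, and inductor.]
[Figure: phasor diagram with VS as the reference phasor, showing IR = VS/R, IC = VS/XC, and IL = VS/XL.]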
A voltage resonance occurs when XL = XC. The circuit then behaves as a purely
resistive load and the impedance is R. The resonant frequency is again f0:
$$f_0 = \frac{1}{2\pi\sqrt{LC}}$$
At resonance, the currents in the capacitor and inductor are equal in magni-
tude but opposite in direction at every moment.
22.5 ELECTRIC OSCILLATORS

A simple oscillator circuit can be constructed from a capacitor and an inductor
as shown below. If the capacitor is charged from the DC supply and then
connected across the inductor, the current in the circuit undergoes a series of
oscillations.
[Figure: oscillator circuit containing a capacitor C and an inductor L.]
When the capacitor is connected to the inductor the voltages across them
must be equal so that:
$$\frac{Q}{C} = -L\frac{dI}{dt}$$
Since I = dQ/dt, we can form a second-order differential equation for Q and t:
$$\frac{Q}{C} = -L\frac{d^2Q}{dt^2}$$
$$\frac{d^2Q}{dt^2} = -\left(\frac{1}{LC}\right)Q$$
This equation has exactly the same form as the equation of motion for simple
harmonic motion so the solutions will have the same form too:
$$Q = Q_0 \sin \omega t$$
where
$$\omega = \frac{1}{\sqrt{LC}}$$
The charge on the capacitor oscillates with a frequency
$$f = \frac{1}{2\pi\sqrt{LC}}$$
which is the same as the resonant frequency for an RCL circuit.

The voltage across the capacitor oscillates at the same frequency:
$$V_C = \frac{Q_0}{C}\sin\omega t$$
and the current is
$$I = \frac{dQ}{dt} = \omega Q_0 \cos\omega t$$
In practice, it is not possible to connect a capacitor to a perfect inductor.
There is always some resistance in the circuit, so energy is dissipated as the
oscillations occur. This leads to damping. The rate of decay of the amplitude
of the oscillations depends on the resistance in the circuit: the more resistance,
the greater the rate of decay.
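The oscillation frequency of the ideal (undamped) LC circuit can be checked by integrating the differential equation for Q numerically. The sketch below uses the velocity-Verlet scheme, which is a choice made here rather than a method from the text; the component values, time step, and function names are assumptions. The capacitor starts fully charged with zero current, so the charge follows a cosine rather than the sine solution quoted above, but the period is the same.

```python
import math

def simulate_lc(l, c, q0, dt, n_steps):
    """Integrate d2Q/dt2 = -Q/(LC) with velocity Verlet; returns the list of Q values."""
    q, i = q0, 0.0              # initial charge q0, initial current zero
    a = -q / (l * c)            # "acceleration" of the charge
    charges = [q]
    for _ in range(n_steps):
        q += i * dt + 0.5 * a * dt * dt
        a_new = -q / (l * c)
        i += 0.5 * (a + a_new) * dt
        a = a_new
        charges.append(q)
    return charges

def estimate_period(charges, dt):
    """Time between the first two downward zero crossings of Q (linear interpolation)."""
    crossings = []
    for k in range(1, len(charges)):
        if charges[k - 1] > 0.0 >= charges[k]:
            frac = charges[k - 1] / (charges[k - 1] - charges[k])
            crossings.append((k - 1 + frac) * dt)
    return crossings[1] - crossings[0]

# Assumed example values: L = 10 mH, C = 1 uF, Q0 = 1 uC
L, C, DT = 10e-3, 1.0e-6, 1.0e-7
charges = simulate_lc(L, C, q0=1.0e-6, dt=DT, n_steps=15_000)
T_SIM = estimate_period(charges, DT)
T_THEORY = 2.0 * math.pi * math.sqrt(L * C)
print(f"simulated period = {T_SIM:.6e} s, 2*pi*sqrt(LC) = {T_THEORY:.6e} s")
```

Adding a resistance term to the equation would produce the damped oscillations described above; that extension is not included in this sketch.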
22.5.1 A Mechanical Analogy

The equations for electrical oscillations have the same form as those for the
oscillation of a mass on a spring.
Compare the equations for electrical and mechanical oscillations:
$$\frac{d^2Q}{dt^2} = -\left(\frac{1}{LC}\right)Q$$
$$\frac{d^2x}{dt^2} = -\left(\frac{k}{m}\right)x$$
This identity of mathematical form allows us to make an analogy
between mechanical oscillators and electrical oscillators using the following
correspondences:
Mass-spring oscillator      LCR electrical oscillator
Displacement, x             Charge, Q
Velocity, dx/dt             Current, dQ/dt
Mass (inertia), m           Inductance, L
Spring constant, k          Inverse capacitance, 1/C
Analogies like this occur all over physics. They help us to understand new phe-
nomena in terms of ones we are already familiar with but they are also useful
in their own right. For example, this correspondence between the mechanical
and the electrical means that we can model mechanical systems using electri-
cal circuits – for example, to test new designs for structures.
22.6 EXERCISES
1. (a) Discuss the advantages and disadvantages of using DC batteries
compared to AC mains as sources of electrical energy.
(b) Explain the advantages of using AC for long-distance transmission of
electricity.
2. The diagram below enables the same lamp to be connected to either a DC
or an AC supply.
[Figure: circuit in which a lamp can be connected to either a 20 V DC supply or a variable AC supply.]
The AC supply is adjusted until the lamp lights equally brightly from both
supplies.
(a) Calculate the peak value of the AC voltage.
(b) State the rms value of the AC voltage.
(c) Discuss whether the lifetime of the bulb (a filament lamp) will
depend on the type of supply used to light it.
3. Copy and complete the table below by adding resistance or reactance
values at each frequency:
Component/frequency    0 Hz (DC)    100 Hz    1000 Hz    10 000 Hz
100 Ω resistor
100 µF capacitor
100 µH inductor
4. The circuit below contains a resistor and an inductor.
[Figure: series circuit with a 16 V rms AC supply, a 22 Ω resistor (voltage VR), and a 6.5 mH inductor (voltage VL).]
The frequency of the AC supply is 1000 Hz.
(a) Calculate the impedance of the circuit.
(b) Calculate the rms current in the circuit.
(c) Calculate the peak voltage across the inductor.
(d) Calculate the power dissipated by the resistor.
(e) Explain why no power is dissipated by the inductor.
(f) Calculate the phase difference between the supply voltage and the
current.
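5. The circuit below has a resistor, a capacitor, and an inductor connected
in series.
[Figure: series circuit with a 12 V rms AC supply, a 10 Ω resistor (voltage VR), a 22 µF capacitor (voltage VC), and a 50 µH inductor (voltage VL).]
(a) Calculate the impedance of the circuit at 0 Hz, 1.0 kHz, 5.0 kHz,
10 kHz and 100 kHz.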
(b) Explain what is meant by resonance and calculate the resonant
frequency for this circuit.
(c) Sketch a graph to show how the rms current varies with frequency
from 0 Hz to 100 000 Hz.
(d) Sketch a graph to show how the phase difference between the supply
voltage and the current varies as the frequency changes from 0 Hz to
100 000 Hz.
(e) Describe the energy transfers that take place in the circuit at
resonance.
CHAPTER
23
The Gravitational Field
23.1 GRAVITATIONAL FORCES AND GRAVITATIONAL FIELD STRENGTH

Gravity is one of the four fundamental forces. It has an infinite range and
obeys a similar inverse-square law to electrostatics. All masses create gravita-
tional fields but unlike the electrostatic forces between charges, which can be
attractive or repulsive, gravitational forces are always attractive. The gravita-
tional force acting on a mass close to the surface of the Earth is called weight.
23.1.1 Newton’s Law of Gravitation

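Newton stated that two point masses exert an attractive force on one
another that is directly proportional to the product of the masses and inversely
proportional to the square of their separation.
[Figure: two point masses m1 and m2 a distance r apart, each attracted toward the other by a force of magnitude F.]
$$F = -\frac{G m_1 m_2}{r^2}$$
The minus sign indicates attraction.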

G is the universal constant of gravitation, G = 6.674 × 10⁻¹¹ N m² kg⁻².
Newton was also able to show that the force of attraction between spheres of
uniform density is the same as the attraction between two point masses placed at
their centers. This means that we can treat objects like planets and stars as point
masses when considering their orbital motion. It is also important to note that,
by Newton’s third law, the forces on each mass have the same magnitude, even
if the masses are different. For example, the weight of an apple in the Earth’s
gravitational field is the same as the weight of the Earth in the apple’s gravita-
tional field. It is also the case that the gravitational force exerted on the Earth by
the Moon is equal in magnitude to the gravitational force exerted on the Moon
by the Earth.
The resultant gravitational force on a body affected by the gravitational fields of
several other objects (e.g., the Earth affected by the Sun, Moon, and other planets)
is the vector sum of the gravitational forces from each of the other objects.
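Newton's law of gravitation is easy to evaluate directly. The sketch below estimates the force between the Earth and the Moon; the masses and mean separation are commonly quoted approximate values that do not appear in the text, and the function name is an assumption.

```python
import math

G = 6.674e-11  # universal constant of gravitation, N m^2 kg^-2

def gravitational_force(m1, m2, r):
    """Magnitude of the attractive force between two point (or uniform spherical) masses."""
    return G * m1 * m2 / r**2

# Approximate values, assumed for illustration
M_EARTH = 5.97e24       # kg
M_MOON = 7.35e22        # kg
R_EARTH_MOON = 3.84e8   # m, mean center-to-center distance

F = gravitational_force(M_EARTH, M_MOON, R_EARTH_MOON)
print(f"Earth-Moon force = {F:.3e} N")

# The same magnitude acts on each body (Newton's third law), and doubling the
# separation reduces the force to one quarter (inverse-square law).
print(math.isclose(F, gravitational_force(M_MOON, M_EARTH, R_EARTH_MOON)))
print(gravitational_force(M_EARTH, M_MOON, 2 * R_EARTH_MOON) / F)
```

Only magnitudes are computed here; finding the resultant force from several bodies would require adding the individual forces as vectors.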
23.1.2 Gravitational Field Strength

The idea that gravitational forces arise from a gravitational field removes the
difficulty of an action-at-a-distance explanation. The Moon is attracted to the
Earth because it experiences a force from the gravitational field where it is,
that is, a “local force.”
The gravitational field strength g at a point in space is defined as the gravita-
tional force per unit mass at that point.
$$g = \frac{\text{gravitational force}}{\text{mass}} = \frac{F}{m}$$
The SI unit for gravitational field strength is the newton per kilogram (N kg⁻¹).
For a point or uniform spherical mass, the field strength at a distance r from
the center of mass M can be determined by considering the force per unit
mass acting on a small mass m placed at that distance:
$$g = \frac{F}{m} = -\frac{GM}{r^2}$$
This is an inverse-square law.

The gravitational field can be represented using field lines in a similar way to
the electric field.
The fact that this is an inverse-square law and gravitational field lines can
only end on masses allows us to create a form of Gauss’s theorem for the
gravitational field.
The gravitational flux through a closed surface is:
$$\Phi_G = \oint_{\text{closed surface}} \mathbf{g}\cdot d\mathbf{A} = -4\pi G \sum m_{\text{enclosed}}$$
The negative sign again arises because of the attractive nature of the gravitational
field: the flux entering through a closed surface is equal to 4πG times the
total mass enclosed by the surface. As in electrostatics, this
form of Gauss’s law is particularly useful for situations with spherical symmetry.
For example, we can use it to show that the gravitational field strength
inside a hollow uniform spherical shell is zero everywhere.
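[Figure: a uniform spherical shell of mass M with two concentric spherical Gaussian surfaces of radii r1 and r2. The outer Gaussian surface encloses mass M, so by symmetry 4πr²g = −4πGM. The inner Gaussian surface encloses no mass, so by symmetry 4πr²g = 0 and g = 0.]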
23.1.3 The Gravitational Field Strength of the Earth

We will model the Earth as a spherical mass of uniform density. This is not
strictly correct – the Earth is actually an oblate spheroid, with a larger equa-
torial diameter than polar diameter, and its density increases toward its core.
However, this simple model is useful to give an idea of how the Earth’s field
varies outside and inside its surface. Assume that the Earth has mass ME and
radius RE and uniform density ρ.
The field outside the surface is that of a point mass ME located at the center
of the Earth:
$$g(r > R_E) = -\frac{G M_E}{r^2}$$
We can determine an expression for the field strength inside the Earth by
recalling that the field strength inside a hollow sphere is zero. This implies
that the field strength at distance r < RE from the Earth’s center is that of the
mass inside radius r.
[Figure: a point P at distance r from the center of the Earth (radius RE). The gravitational field strength at P is created by the mass contained inside the spherical surface of radius r.]
Mass inside radius r is m:
$$m = \frac{4}{3}\pi r^3 \rho$$
$$\rho = \frac{M_E}{\frac{4}{3}\pi R_E^3}$$
$$m = \frac{4}{3}\pi r^3 \times \frac{M_E}{\frac{4}{3}\pi R_E^3} = \frac{r^3}{R_E^3} M_E$$
$$g(r < R_E) = -\frac{Gm}{r^2} = -\frac{G M_E r}{R_E^3}$$
The magnitude of g is zero at the center of the Earth and increases linearly to
the surface. The gravitational field strength at the Earth’s surface is:
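$$g_{surface} = -\frac{G M_E}{R_E^2}$$
Its magnitude is about 9.8 N kg⁻¹, although it varies slightly (by less than 1%) between locations.
[Figure: graph of the magnitude of g against distance r from the center of the Earth, with r = RE marked on the r axis.]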
In many situations on Earth, we assume that g is constant. This is a realistic
assumption when vertical heights h are much smaller than the radius of the
Earth. When this is not the case we must use the inverse-square law. However,
even at the height of the Hubble Space Telescope (559 km), the gravitational
field strength is about 8.2 N kg⁻¹, that is, over 80% of its surface value. The
apparent weightlessness of astronauts inside an orbiting spacecraft is not due
to g being zero. They are in free fall with the same acceleration as their
spacecraft, so they do not experience a reaction force to their own weight. The
very fact that they are orbiting is because of their weight!
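The two expressions for the Earth's field, inside and outside the surface, can be combined into one function under the uniform-density model used above. The values of ME and RE below are rounded approximations assumed for illustration, so the printed figures depend slightly on that rounding.

```python
G = 6.674e-11   # N m^2 kg^-2
M_E = 5.97e24   # kg, approximate mass of the Earth (assumed value)
R_E = 6.37e6    # m, approximate mean radius of the Earth (assumed value)

def field_strength(r):
    """Magnitude of g at distance r from the Earth's center (uniform-density model)."""
    if r < R_E:
        return G * M_E * r / R_E**3   # rises linearly inside the Earth
    return G * M_E / r**2             # inverse-square law outside

g_surface = field_strength(R_E)
g_hubble = field_strength(R_E + 559e3)  # altitude quoted in the text

print(f"g at the surface     = {g_surface:.2f} N/kg")
print(f"g at half the radius = {field_strength(R_E / 2):.2f} N/kg")
print(f"g at 559 km altitude = {g_hubble:.2f} N/kg "
      f"({100 * g_hubble / g_surface:.0f}% of the surface value)")
```

The field at orbital altitude remains a large fraction of the surface value, which supports the point that orbiting astronauts are not weightless because g is zero.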
23.2 GRAVITATIONAL POTENTIAL ENERGY AND GRAVITATIONAL POTENTIAL

Gravitational potential energy is the energy a body has because of its position
in the gravitational field. When it is moved from one position to another,
energy is transferred either to or from gravitational potential energy. For
example, lifting a case from the floor and placing it on a table requires an
external agent to apply an upward force through a vertical distance, so work is
done on the case, and its gravitational potential energy increases.
If an apple falls from a tree to the ground a gravitational force (its weight)
acts on the apple and does work on it, so its gravitational potential energy
decreases.
23.2.1 Change in Gravitational Potential Energy

Consider a particle of mass m moving from point A to B in a gravitational field
as shown below.
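[Figure: a particle of mass m, acted on by the gravitational force mg, moves along the line from A to B; x is measured along AB.]
The work done by the gravitational field to move it a short distance δx along
the line AB is:
$$\delta W = mg\cos\theta\,\delta x$$
where θ is the angle between the gravitational force and the line AB, so the total work done is:
$$W_{AB} = \int_A^B m\mathbf{g}\cdot d\mathbf{x}$$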
where mg.dx is the scalar product of the two vectors mg and δx, that is, the
sum of the component of the gravitational force in the direction of motion
along each line element. The gravitational forces do work on the particle so its
gravitational potential energy falls:
$$\Delta GPE_{AB} = -\int_A^B m\mathbf{g}\cdot d\mathbf{x}$$
The differential form of this equation is:
$$\frac{d(GPE)}{dx} = -mg$$
The rate of change of gravitational potential energy with distance is equal in
magnitude but opposite in direction to the gravitational force.
Uniform Gravitational Field
Changes of gravitational potential energy in a uniform field of strength g pro-
vide a simple (and familiar) formula.
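[Figure: a mass m moves a distance x from A to B in a uniform field, falling through a vertical height h.]
$$\Delta GPE_{AB} = -\int_A^B m\mathbf{g}\cdot d\mathbf{x} = -mgx\cos\theta = -mgh$$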
While this is only exact when g is constant it is a good approximation for verti-
cal displacements close to the surface of the Earth that are small compared to
the radius of the Earth.
23.2.2 Gravitational Potential

The gravitational potential VG at a point in the field is equal to the gravita-
tional potential energy per unit mass at that point.
$$V_G = \frac{GPE}{m}$$
The SI unit for gravitational potential is the joule per kilogram (J kg⁻¹).
The zero of gravitational potential is taken to be at infinity.
This is the same convention used to define the zero of electrical potential and
it makes sense because when particles are separated by very large distances
their interactions, under an inverse-square law, become negligible. Once we
have defined the zero of potential, we can determine the absolute potential
(and potential energy) at any point. This is equal to the work that must be
done per unit mass to move a small mass from infinity and to place it at point P.
Gravitational Potential at P a Distance r from a Point or Uniform Spherical Mass M
A small mass m is moved from infinity and placed a distance r from a mass M.
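[Figure: a small mass m at point P, a distance r from mass M, acted on by the gravitational force mg; x is measured from M along the line through P to infinity.]
The difference in gravitational potential energy of the small mass between infinity and r is:
$$\Delta GPE = GPE(\infty) - GPE(r) = -\int_{x=r}^{x=\infty} m\mathbf{g}\cdot d\mathbf{x}$$
where
$$g = -\frac{GM}{x^2}$$
so that
$$\Delta GPE = GPE(\infty) - GPE(r) = -\int_{x=r}^{x=\infty} m\left(-\frac{GM}{x^2}\right)dx = \frac{GMm}{r}$$
Since GPE(∞) = 0:
$$GPE(r) = -\frac{GMm}{r}$$
$$V_G(r) = \frac{GPE(r)}{m} = -\frac{GM}{r}$$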
All gravitational potential energies are negative because work would have to
be done to move any mass to infinity. This is a consequence of the attractive
nature of gravitational forces.
23.2.3 Gravitational Field Lines and Equipotentials

When a mass moves in a direction perpendicular to the lines of the gravita-
tional field the component of the gravitational field in the direction of motion
is zero, so no work is done on or by the gravitational field. The gravitational
potential energy of the mass is constant and it is moving along a gravitational
equipotential. For a point or uniform spherical mass, the equipotentials are
concentric spherical surfaces.
The field lines are perpendicular to the equipotentials.
[Figure: equipotential surfaces for a point or uniform spherical mass M. For equal increases in potential the separation of the surfaces increases with radius. This is because the field strength is decreasing with radius.]
Close to the surface of the Earth the gravitational field is approximately uni-
form. The equipotentials are planar surfaces parallel to the surface of the
Earth and are equally spaced.
[Figure: equally spaced equipotentials above the surface of the Earth, with potential increasing with height; at the surface of the Earth VG ≈ −6.3 × 10⁷ J kg⁻¹.]
23.2.4 Gravitational Potential Energy in the Earth’s Field

The gravitational potential at the Earth’s surface is:
$$V_G(R_E) = -\frac{GM}{R_E} = -6.25 \times 10^7\ \text{J kg}^{-1}$$
The change in potential energy when a mass is moved from one place (A) to
another (B) in the field is given by:
$$\Delta GPE(AB) = m\Delta V_G = m\left(V_G(B) - V_G(A)\right)$$

Consider a mass m moved through a vertical height h from the surface of the
Earth:
$$\Delta GPE(AB) = m\Delta V_G = m\left(-\frac{GM}{R_E + h} - \left(-\frac{GM}{R_E}\right)\right) = m\left(-\frac{GM}{R_E + h} + \frac{GM}{R_E}\right)$$
This can be simplified to:
$$\Delta GPE(AB) = \frac{GMmh}{R_E(R_E + h)}$$
When h << RE this becomes:
$$\Delta GPE(AB) \approx \frac{GMmh}{R_E^2} = mgh$$
which is the familiar result for situations where g is constant.
As h → ∞:
$$\Delta GPE(AB) \to \frac{GMm}{R_E}$$
This is equal to the work that must be done against gravitational forces to
move a mass m from the surface of the Earth to infinity, that is, to completely
escape from the Earth’s field.
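The range of validity of the mgh approximation can be seen by comparing it with the exact expression for several heights. In this sketch the Earth's mass and radius are rounded approximate values assumed for illustration, and the function names are hypothetical.

```python
G = 6.674e-11   # N m^2 kg^-2
M_E = 5.97e24   # kg, approximate (assumed value)
R_E = 6.37e6    # m, approximate (assumed value)

def delta_gpe_exact(m, h):
    """Exact change in GPE for a mass m raised a height h above the surface."""
    return G * M_E * m * h / (R_E * (R_E + h))

def delta_gpe_uniform(m, h):
    """Uniform-field approximation m*g*h, with g evaluated at the surface."""
    g_surface = G * M_E / R_E**2
    return m * g_surface * h

for h in (100.0, 1.0e4, 1.0e6, 1.0e8):
    exact = delta_gpe_exact(1.0, h)
    approx = delta_gpe_uniform(1.0, h)
    print(f"h = {h:9.0f} m   exact = {exact:.4e} J   mgh = {approx:.4e} J   "
          f"ratio = {exact / approx:.4f}")

# Limiting value as h tends to infinity, per kilogram
print(f"GM/R_E = {G * M_E / R_E:.3e} J per kg")
```

The ratio of the exact result to mgh equals RE/(RE + h), so the approximation overestimates the change increasingly as h becomes comparable to RE.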
23.2.5 Escape Velocity

The escape velocity is the minimum initial velocity required for a body projected
from the surface of a planet or star to escape from the gravitational
field. In order to escape, the total energy of the projectile must be zero or positive.
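[Figure: a projectile of mass m launched with velocity v from the surface of a body of mass M and radius r.]
Total energy at surface: TE = KE + GPE
$$TE = \frac{1}{2}mv^2 - \frac{GMm}{r} \geq 0$$
The escape velocity is therefore given by:
$$\frac{1}{2}mv_{esc}^2 - \frac{GMm}{r} = 0$$
$$v_{esc} = \sqrt{\frac{2GM}{r}}$$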
Note that this is independent of the mass m of the projectile.

The escape velocity from the surface of the Earth is:
$$v_{esc} = \sqrt{\frac{2GM_E}{R_E}} = 1.12 \times 10^4\ \text{m s}^{-1}$$
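The escape velocity formula can be applied to any spherical body. The sketch below evaluates it for the Earth and, for comparison, the Moon; the masses and radii are approximate values assumed for illustration, and the function name is hypothetical.

```python
import math

G = 6.674e-11  # N m^2 kg^-2

def escape_velocity(mass, radius):
    """v_esc = sqrt(2GM/r), launched from the surface of a body of given mass and radius."""
    return math.sqrt(2.0 * G * mass / radius)

# Approximate values, assumed for illustration
M_EARTH, R_EARTH = 5.97e24, 6.37e6   # kg, m
M_MOON, R_MOON = 7.35e22, 1.74e6     # kg, m

print(f"Earth: {escape_velocity(M_EARTH, R_EARTH):.3e} m/s")
print(f"Moon:  {escape_velocity(M_MOON, R_MOON):.3e} m/s")
```

The mass of the projectile does not appear anywhere in the calculation, consistent with the note above.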
Black Holes
The speed of light is a limiting speed in the universe so if the escape velocity
reaches this value then nothing, not even light, can escape. Black holes are
objects that are sufficiently massive and compact that the escape velocity at a
certain distance from the object is equal to the speed of light. This distance is
called the Schwarzschild radius and while a thorough analysis of black holes
requires general relativity, it is possible to derive some useful results from
Newton’s law of gravitation.
Consider a black hole of mass M. At a certain distance from the center, called the
“Schwarzschild radius” RS, the escape velocity is equal to the speed of light, c:
$$v_{esc} = c = \sqrt{\frac{2GM}{R_S}}$$
$$R_S = \frac{2GM}{c^2}$$
Another way to look at this is to realize that a body of mass M will become a
black hole if all of its mass is compressed inside its Schwarzschild radius, RS.
For the Earth this radius is just under 1 cm, so the Earth would have to be
compressed into a sphere smaller than a table tennis ball to become a black hole.
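The Schwarzschild radius is a one-line calculation. The sketch below evaluates it for the Earth and the Sun using approximate masses that are assumed here for illustration; as noted above, the Newtonian derivation is only a guide and a full treatment requires general relativity.

```python
G = 6.674e-11   # N m^2 kg^-2
C_LIGHT = 2.998e8  # speed of light, m/s

def schwarzschild_radius(mass):
    """R_S = 2GM/c^2, the radius at which the escape velocity equals c."""
    return 2.0 * G * mass / C_LIGHT**2

# Approximate masses, assumed for illustration
M_EARTH = 5.97e24  # kg
M_SUN = 1.989e30   # kg

rs_earth = schwarzschild_radius(M_EARTH)
rs_sun = schwarzschild_radius(M_SUN)
print(f"Earth: R_S = {rs_earth * 1000:.2f} mm")
print(f"Sun:   R_S = {rs_sun / 1000:.2f} km")
```

Because RS is proportional to M, a body with twice the mass has twice the Schwarzschild radius.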
A sphere of radius RS surrounding the black hole is called the “event horizon.”
This is because no events inside this radius can communicate with the outside
Universe. In a sense everything inside the event horizon has been cut off from
the rest of the Universe. In May 2022, astronomers using the Event Horizon
Telescope created the first image of the supermassive black hole at the center
of the Milky Way galaxy.
23.3 ORBITAL MOTION

Newton realized that circular motion requires a centripetal force and suggested
that if an object was projected horizontally at a high enough speed (and drag
was negligible) then the object would go into orbital motion around the Earth.
[Figure: paths of objects projected horizontally from a point above the Earth’s surface.]
Newton derived his inverse-square law of gravitation by analyzing the motion
of the Moon. He argued that the gravitational force of attraction to the Earth
provides a centripetal force so that the Moon is in free fall. However, the Moon
orbits at a distance of about 60 Earth radii and its centripetal acceleration is
about 60² (3600) times smaller than the free-fall acceleration (9.8 m s⁻²) of a
mass dropped near the surface of the Earth, so if both objects fall because of
the Earth’s gravity, its strength must get weaker with distance and obey an
inverse-square law.
Newton’s law of gravitation was published in Principia Mathematica in 1687
and provided astronomers and physicists with the mathematical tools to
explain the motions of objects in the Solar System and beyond. It is hard
to overemphasize the importance of this work: it provided a single elegant
explanation for motions across the Universe. In our own Solar System, it
explained the motions of all the planets and other bodies (such as comets and
asteroids) using a single principle.
23.3.1 Early Ideas About Planetary Motion

The planets have been observed and identified since ancient times because
they move against the background of “fixed stars.” In many cultures, the abil-
ity to predict their motions and interpret their meaning held great political
and religious significance so that kings and emperors were prepared to sup-
port astronomers and astrologers.
The ancient Greeks proposed several models to explain their observations of
planetary motion, but the most influential was Aristotle’s geocentric (Earth-
centric) model. This was developed by Ptolemy and became the basis for
astronomical and astrological predictions for nearly 2000 years. The Ptolemaic
system placed the Earth at the center of the Universe with the Sun, Moon,
and planets moving around it. The fixed stars were embedded in a crystal
sphere surrounding the solar system. The details of the Ptolemaic system
were refined over the years and while it was physically incorrect it was still
useful. The real problem with the Ptolemaic system was that it did
not explain the motions of the planets. Each planet’s motion had to be set up
independently of the others, so there was no simple unifying principle for the
system.
In 1543, Nicolaus Copernicus published his De Revolutionibus Orbium
Coelestium (“On the Revolutions of the Heavenly Spheres”) in which he
argued that planetary motions could be explained more simply by assuming
that the Sun was at the center of the system (a heliocentric model). This idea
was controversial because it moved the Earth from the center of the Universe
and conflicted with contemporary religious views about the nature of the
Universe.
Johannes Kepler was a German mathematician and astronomer and strong
proponent of Copernicus’s heliocentric model. Kepler used the detailed astro-
nomical observations of the Danish astronomer Tycho Brahe to develop three
laws of planetary motion. He replaced the complex cycles and epicycles of
Ptolemy and Copernicus with elliptical orbits. The fact that all of the planets
could then be described by the same mathematical laws suggested that there
was an as-yet-undiscovered law that governed all of their motions.
Kepler’s Laws of Planetary Motion
1. All of the planets move in elliptical orbits with the Sun at one focus of the
ellipse.
2. A line drawn from the planet to the Sun sweeps out equal areas in equal
times.
3. The ratio r³/T² is the same for all planets, where r is the mean orbital radius and
T is the period of the orbit (for an elliptical orbit r is the length of the semi-major
axis of the ellipse).
Galileo Galilei used the newly invented telescope to make astronomical obser-
vations. In particular, he discovered four moons of Jupiter. This was powerful
evidence to suggest that the Earth is not the center of all orbital motion in
the solar system. He was a strong supporter of the Copernican model and
Galileo’s ideas in astronomy and physics were powerful influences on Newton.
When Newton published his theory of gravitation he was able to show that
an inverse-square law led to elliptical orbits and could be used to explain and
derive all of Kepler’s laws. For the first time, there was a single simple theory
that could explain planetary motions (and much more).
[Figure: elliptical orbit of a planet around the Sun. Perihelion: closest point to Sun, greatest orbital speed. Aphelion: farthest point from Sun, lowest orbital speed. Equal areas in equal times: speed is greater when the planet is closer to the Sun.]
23.3.2 Circular Orbits

A full mathematical analysis of elliptical orbits is beyond the scope of this
book. However, most planetary orbits and many satellite orbits are approxi-
mately circular so we will use Newton’s law to analyze circular orbits. Gravity
provides a centripetal force for circular motion.
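Assume that the central body has a much greater mass than the orbiting body
so that the motion of that central body can be neglected (in fact, both objects
orbit about their mutual center of mass). The center of the orbit is then the
center of the central mass.
[Figure: a body of mass m moving with speed v in a circular orbit of radius r around a central mass M.]
$$\frac{mv^2}{r} = \frac{GMm}{r^2}$$
$$v = \frac{2\pi r}{T}$$
$$\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$$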
This is Kepler’s third law. Note that it is independent of m, so any body in a
circular orbit of radius r about the same central mass will orbit with the same
period. The law applies to any central body; all that changes is the constant
$GM/4\pi^2$, which depends on the mass of the central body. For example, all of
Jupiter’s moons have the same ratio of $r^3$ to $T^2$ as each other, but this is
different from the ratio for the planets around the Sun. All Earth satellites
will also have the same ratio of $r^3$ to $T^2$ as the Earth’s Moon.
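The claim that every satellite of the same central body shares one value of r³/T² can be checked numerically. The sketch below is illustrative only: the values of G and the Earth's mass are assumed standard values, and the orbital data (the Moon's radius and period, and a 24-hour orbit of radius 42 200 km) are rounded figures, so agreement is only expected to within a few percent.

```python
import math

G = 6.67e-11        # gravitational constant, N m^2 kg^-2 (assumed standard value)
M_EARTH = 5.97e24   # mass of the Earth, kg (assumed standard value)
DAY = 86400.0       # length of one day, s


def kepler_constant(central_mass):
    """Return GM / (4 pi^2), the predicted value of r^3 / T^2."""
    return G * central_mass / (4 * math.pi ** 2)


def orbit_ratio(radius_m, period_s):
    """Return r^3 / T^2 for a single orbiting body."""
    return radius_m ** 3 / period_s ** 2


# Illustrative, rounded orbital data: (name, orbital radius in m, period in s)
earth_satellites = [
    ("Moon", 3.84e8, 27.3 * DAY),
    ("24-hour satellite", 4.22e7, 1.0 * DAY),
]

predicted = kepler_constant(M_EARTH)
print(f"GM/(4 pi^2)       : {predicted:.3e} m^3 s^-2")
for name, r, T in earth_satellites:
    print(f"{name:18s}: {orbit_ratio(r, T):.3e} m^3 s^-2")
```

Both ratios should come out close to the predicted constant even though the two orbits differ in radius by almost a factor of ten.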
The total energy of an orbiting body is constant. This is because gravity is a
conservative field. When an object moves from one place to another (in the
absence of any frictional forces) energy is transferred between potential and
kinetic. For a circular orbit, both values are constant.

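$$\text{Total energy} = KE + GPE = \frac{1}{2}mv^2 - \frac{GMm}{r}$$
$$\frac{mv^2}{r} = \frac{GMm}{r^2}$$
so
$$KE = \frac{1}{2}mv^2 = \frac{GMm}{2r} = -\frac{1}{2}\,GPE$$
$$TE = \frac{GMm}{2r} - \frac{GMm}{r} = -\frac{GMm}{2r}$$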
The total energy is negative. This makes sense because an orbiting mass is in
a bound state. It would require an external agent to do work to remove the
mass to infinity.
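The energy relations for a circular orbit can be illustrated with a short calculation. This is a minimal sketch, not part of the original text: the satellite mass (1000 kg) and orbital radius are arbitrary illustrative choices, and G and the Earth's mass are assumed standard values.

```python
G = 6.67e-11        # gravitational constant, N m^2 kg^-2 (assumed standard value)
M_EARTH = 5.97e24   # mass of the Earth, kg (assumed standard value)


def circular_orbit_energies(m, r, M=M_EARTH):
    """Return (KE, GPE, total energy) in joules for mass m in a circular orbit of radius r."""
    v_squared = G * M / r            # from m v^2 / r = G M m / r^2
    ke = 0.5 * m * v_squared         # kinetic energy, GMm / 2r
    gpe = -G * M * m / r             # gravitational potential energy, -GMm / r
    return ke, gpe, ke + gpe


# Hypothetical 1000 kg satellite at r = 42 200 km
ke, gpe, total = circular_orbit_energies(m=1000.0, r=4.22e7)
print(f"KE    = {ke:.3e} J")
print(f"GPE   = {gpe:.3e} J")
print(f"Total = {total:.3e} J")
```

The total comes out negative and equal in size to the kinetic energy, which is the work an external agent would have to do to remove the satellite to infinity.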
23.3.3 Artificial Satellites

The first artificial satellite was Sputnik 1 launched by the Soviet Union in
1957. It moved in a slightly elliptical orbit at an altitude that varied between
200 and 900 km. Its orbital period was 96 minutes. In 2016, there were over
1000 operational satellites in orbit around the Earth, more than half of which
were launched by the United States.
The orbital period of an artificial satellite is determined by its orbital radius:
$$\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$$
$$T = \sqrt{\frac{4\pi^2 r^3}{GM}}$$
$$T \propto r^{3/2}$$
Satellites in low Earth orbits, like Sputnik, have periods of about 90 minutes.
Low Earth polar orbits are useful for weather satellites and other satellites
that need to scan the entire surface of the Earth because they pass over a dif-
ferent strip of the surface on each orbit and can observe the entire surface in
a 24-hour period.

GPS satellites are further out, at an altitude of about 20 000 km, and have an
orbital period of about 12 hours. The GPS system creates a constellation of
satellites around the globe and relies on several satellites being visible to a
GPS receiver at any point on the Earth’s surface at any time. The minimum
number of operational satellites required to achieve this is 24 but the target
number is 33. Over 70 GPS satellites have been launched but not all of them
are still operational.
“Geostationary satellites” are placed in an equatorial orbit at an altitude of
about 35 800 km (r = 42 200 km) so that they have an orbital period of 24
hours. This ensures that they remain stationary above the same point on the
Earth’s equator because they complete one orbit as the Earth itself rotates
once. This is called a “geostationary orbit” (a geosynchronous orbit above the equator) and is used for communications
satellites. The fact that they remain stationary in the sky means that ground-
based antennae do not have to track them.
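The periods quoted in this section can be reproduced from Kepler's third law. The sketch below assumes standard values for G and the Earth's mass, uses a rounded Earth radius of 6.4 × 10⁶ m, and takes 300 km as a representative low-Earth-orbit altitude (an assumption; the text does not specify one).

```python
import math

G = 6.67e-11        # gravitational constant, N m^2 kg^-2 (assumed standard value)
M_EARTH = 5.97e24   # mass of the Earth, kg (assumed standard value)
R_EARTH = 6.4e6     # mean radius of the Earth, m (rounded)


def orbital_period(radius_m):
    """Period in seconds of a circular orbit of the given radius about the Earth."""
    return 2 * math.pi * math.sqrt(radius_m ** 3 / (G * M_EARTH))


orbits = {
    "low Earth orbit (assumed 300 km altitude)": R_EARTH + 3.0e5,
    "GPS (20 000 km altitude)": R_EARTH + 2.0e7,
    "geostationary (r = 42 200 km)": 4.22e7,
}

for name, r in orbits.items():
    T = orbital_period(r)
    print(f"{name}: T = {T / 60:.0f} min = {T / 3600:.2f} h")
```

With these inputs the three periods come out at roughly 1.5 hours, 12 hours, and 24 hours, in line with the figures given above.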
23.4 TIDAL FORCES

Tidal forces occur because the strength of the gravitational field varies across
an extended body. This has the effect of stretching or compressing the body
parallel or perpendicular to the field. The ocean tides on Earth are an exam-
ple of this and their regularity can be explained by changing tidal forces as
the Earth’s orbital position changes with respect to the Moon (and, to a lesser
extent, the Sun). However, tidal forces are a very common occurrence and
can have very important consequences. Jupiter’s moon Io, for example, is
compressed and stretched by tidal forces as it orbits close to the giant planet.
These forces provide the energy to drive seismic activity on Io and active
volcanoes were observed on the surface of Io when the Voyager 1 spacecraft
passed it in 1979.
23.4.1 The Origin of Tidal Forces

The diagram on the next page shows a spherical body (body 1) in the gravita-
tional field of another body (body 2) of mass M.
The four squares A, C, D and E represent small masses of size m on the
surface of body 1 along diameters parallel and perpendicular to the line con-
necting it to the center of body 2. Body 1 is only affected by the gravitational
field so it can be considered to be in free fall toward body 2. In the free-
falling reference frame there will be a tension along line AC (C is in a stronger
field than B and A is in a weaker field so both are pulled away from mass B).
There will also be a smaller compression along line DE because of the inward
components of the gravitational field at the edges of body 1. The effect is to
stretch body 1 along AC and to compress it along DE. This is shown in
the diagram below.
[Figure: body 1, of radius r, in the gravitational field of body 2, of mass M.
Small squares represent small masses m at positions A, B, C, D, and E. The
distance from the center of body 1 to the center of body 2 is R.]
The tidal forces shown are those that act in the freely falling reference frame.
Tidal forces are differential forces and are defined as the difference between
the actual gravitational force $F_A$ on a small mass m and the force $F_B$ that
would be exerted on it if it were at the center of the body. For example, the
tidal force on the mass at A is given by:
$$\text{tidal force}(A) = F_A - F_B = \frac{GMm}{(R+r)^2} - \frac{GMm}{R^2} = \frac{-GMm\,(2Rr + r^2)}{(R+r)^2 R^2}$$

For $r \ll R$ this simplifies to:
$$\text{tidal force}(A) = \frac{-2GMmr}{R^3}$$
The negative sign here indicates that the direction of the tidal force is away
from the center of the body. This obeys an inverse-cube law with respect to
distance, so tidal forces will only be significant for bodies that are relatively
close together.
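The simplification step can be made explicit. The following is a short sketch of the algebra, using only the condition r << R already stated:

$$2Rr + r^2 = 2Rr\left(1 + \frac{r}{2R}\right) \approx 2Rr, \qquad (R + r)^2 = R^2\left(1 + \frac{r}{R}\right)^2 \approx R^2$$
$$F_A - F_B \approx \frac{-GMm \cdot 2Rr}{R^2 \cdot R^2} = \frac{-2GMmr}{R^3}$$

The same result follows from treating r as a small change in distance: differentiating $F = GMm/R^2$ gives $dF/dR = -2GMm/R^3$, so the force changes by about $-2GMmr/R^3$ across a distance r.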
23.4.2 The Earth’s Ocean Tides

The ocean tides on planet Earth are driven by tidal forces. While both the
Moon and the Sun affect the tides, the Moon’s effect is significantly greater.
This can be shown by comparing the maximum tidal force on a 1 kg mass due
to each body.

Tidal force from the Moon:
$M = 7.3 \times 10^{22}$ kg
$m = 1$ kg
$R = 3.9 \times 10^{8}$ m (mean distance to the Moon)
$r = 6.4 \times 10^{6}$ m (mean radius of Earth)
$$\left|\text{tidal force}(A)\right| = \frac{2GMmr}{R^3} = 1.1 \times 10^{-6}\ \text{N}$$
Tidal force from the Sun:
$M = 2.0 \times 10^{30}$ kg
$m = 1$ kg
$R = 1.5 \times 10^{11}$ m (mean distance to the Sun)
$r = 6.4 \times 10^{6}$ m (mean radius of Earth)
$$\left|\text{tidal force}(A)\right| = \frac{2GMmr}{R^3} = 5.1 \times 10^{-7}\ \text{N}$$
The Moon’s tidal effect is roughly double that of the Sun, but both are signifi-
cant. As the Earth, Moon, and Sun change their relative positions, the tidal
effects of the Moon and Sun on the Earth’s oceans can reinforce or weaken
one another. The largest or “spring tides” occur when the three bodies are
aligned and the smallest or “neap tides” occur when the lines from the Sun
and Moon to the Earth are perpendicular. These effects are exaggerated in
the diagrams below, which are not to scale.
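The comparison between the Moon and the Sun can be reproduced with a few lines of code. The sketch below uses the masses and distances quoted above, with G = 6.67 × 10⁻¹¹ N m² kg⁻² as an assumed standard value. It evaluates both the exact difference of forces and the inverse-cube approximation, so the quality of the approximation can be seen as well.

```python
G = 6.67e-11       # gravitational constant, N m^2 kg^-2 (assumed standard value)
R_EARTH = 6.4e6    # mean radius of the Earth, m


def tidal_force_exact(M, R, r=R_EARTH, m=1.0):
    """F_A - F_B: force on m at distance R + r minus the force it would feel at distance R."""
    return G * M * m / (R + r) ** 2 - G * M * m / R ** 2


def tidal_force_approx(M, R, r=R_EARTH, m=1.0):
    """Inverse-cube approximation -2GMmr/R^3, valid when r << R."""
    return -2 * G * M * m * r / R ** 3


# (mass in kg, mean distance from Earth in m)
bodies = {"Moon": (7.3e22, 3.9e8), "Sun": (2.0e30, 1.5e11)}

for name, (M, R) in bodies.items():
    print(f"{name}: exact = {tidal_force_exact(M, R):.3e} N, "
          f"approx = {tidal_force_approx(M, R):.3e} N")

ratio = tidal_force_approx(*bodies["Moon"]) / tidal_force_approx(*bodies["Sun"])
print(f"Moon/Sun tidal force ratio = {ratio:.2f}")
```

The ratio comes out close to 2. The approximation is slightly less accurate for the Moon than for the Sun because r/R is larger for the Moon.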
[Figure: relative positions of the Earth, Moon, and Sun for spring tides.]

The period of the Earth’s rotation is significantly less than the period of
the Moon’s orbit so the Earth effectively rotates under the tidal bulges
and there are approximately two high tides in each 24-hour period.
However, the Earth’s rotation pushes the bulge slightly ahead of the
Moon’s position. This has the effect of creating a retarding torque on
the Earth that slows its rotation, making day length gradually
increase, and reducing its angular momentum. However, angular momentum
for the system must be conserved so the radius of the Moon’s orbit
also increases. The present rate of increase in day length is about 1.7 ms
per century and the Moon’s orbital radius increases by about 3.8 cm per
year. When the Moon first formed, the length of a day on Earth was only
about 5 hours. The long-term effect of this interaction is that the Moon’s
orbital period and the Earth’s rotation period will eventually become the
same – they will be tidally locked.
[Figure: relative positions of the Earth, Moon, and Sun for neap tides.]
23.5 EINSTEIN’S THEORY OF GRAVITATION

Newton’s theory of gravitation is actually an approximation to a more fun-
damental theory discovered by Einstein in 1915. This is the general the-
ory of relativity, and while the mathematics of that theory are well beyond
the scope of this book some of the key ideas are accessible without the
mathematics.

23.5.1 Space–Time Curvature

The key idea in Einstein’s theory is that gravity is not really a force field but a
distortion of the geometry of space and time. What we experience as a gravi-
tational field is actually the curvature of the space–time continuum where we
are. This sounds rather obscure but can be understood by analogy. Imagine
stretching a rubber sheet out so that it is completely flat and then rolling a
ball across its surface. In the absence of friction, the ball would continue in
a straight line at constant velocity. However, if a heavy mass is placed onto
the surface and the experiment is repeated, the ball follows a curved path
around the mass. The geometry of the rubber sheet has changed and the path
followed by the ball has changed too.
[Figure: a ball projected across the surface of a stretched rubber sheet, shown
without and with a mass placed in the center of the sheet.]
As seen above, the presence of the central mass has caused the deflection of
the ball. As far as the ball is concerned it has simply followed the curvature of
the surface. John Wheeler summarized Einstein’s theory of general relativity
by saying that:
Matter tells space how to curve.
Space tells matter how to move.
In Newtonian mechanics, an object continues to move in a straight line
unless acted upon by a resultant external force. In Einstein’s Universe objects
move along the shortest paths in curved space–time. These paths are called
geodesics. A good analogy is found in the paths followed by aircraft on long-
haul flights around the Earth. The shortest path between two points on the
Earth’s surface is an arc of a great circle around the center of the Earth.
According to Einstein’s theory, the Earth’s orbit around the Sun is actually
a geodesic in space–time. The Earth is not being deflected by a gravitational
force; it is simply following the local curvature of space–time, and that
curvature has been caused by the presence of the Sun.
23.5.2 The Equivalence Principle

One powerful idea that led Einstein to the general theory of relativity is the
equivalence principle. He realized that the effects of gravity and the effects of
acceleration are actually equivalent. To understand what this means imagine
you are on a fairground ride that allows you to free-fall vertically before bring-
ing you safely to rest. While you are in free fall you cease to feel the reaction
from your seat and you feel “weightless.” From an outsider’s reference frame,
you are falling because of your weight but from your own reference frame,
the gravitational field strength seems to be zero. To Einstein, this implied that
freely falling reference frames at each point in space–time could be treated as
if the gravitational field were zero.
Alternatively, it means that the physical effects in a uniformly accelerated
reference frame must be identical to the physical effects in a uniform gravi-
tational field. An observer in a uniformly accelerated reference frame with
acceleration a cannot distinguish this from a reference frame in a uniform
gravitational field of strength g where g = a. This led to a powerful thought
experiment which showed that light must be deflected by gravity.
Imagine a laboratory inside a rocket that is accelerating vertically upwards.
The acceleration of the rocket is a and observers inside the rocket feel exactly
the same as if they were in a uniform gravitational field of strength g = a. Now
imagine that a horizontal beam of light enters the rocket through a porthole
on one side and leaves through a porthole on the other side. Since the rocket
has constant acceleration, the beam will follow a parabolic path inside the
rocket. Now imagine that the same rocket is at rest on the surface of a planet
with surface gravity g. If a beam of light enters horizontally through the same
porthole it must, according to the equivalence principle, follow the same
parabolic path as before and deflect downwards toward the surface of the planet.
If this were not the case then an observer inside the rocket could tell whether
the rocket was at rest in a gravitational field or accelerating simply by
observing the motion of light beams.
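The size of this deflection in an everyday setting can be estimated with a rough calculation. The sketch below treats the beam as crossing a cabin of width w in a time w/c while the floor accelerates upward at a, so the beam appears to drop by ½a(w/c)². The 10 m cabin width and a = 9.8 m s⁻² are illustrative assumptions, not values from the text.

```python
C = 3.0e8   # speed of light, m/s


def beam_drop(width_m, acceleration):
    """Apparent vertical drop of a horizontal light beam across a cabin of the given width."""
    crossing_time = width_m / C                      # time for light to cross the cabin
    return 0.5 * acceleration * crossing_time ** 2   # distance the floor gains on the beam


# Hypothetical cabin 10 m wide, accelerating at 9.8 m/s^2 (equivalent to g at the Earth's surface)
drop = beam_drop(width_m=10.0, acceleration=9.8)
print(f"Beam drop across the cabin: {drop:.2e} m")
```

The result is of order 10⁻¹⁵ m, far too small to observe in a laboratory, which is why the prediction had to be tested with starlight passing close to the Sun.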

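[Figure: light beam paths in a rocket with acceleration a and in a rocket at
rest in a gravitational field of strength g.]
The British astronomer and mathematician Arthur Eddington tested this
prediction in 1919. He looked for the apparent shift in the positions of stars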
close to the disc of the Sun during a total eclipse. He needed to do this during
a total eclipse because stars close to the Sun’s disc can only be observed when
the bright disc of the Sun is eclipsed and effects on stars farther from the disc
would be too small to measure. The measured shifts were in agreement with
Einstein’s theory. The theory has been tested many times and in many differ-
ent ways since then and the results have all been in agreement with the theory.
General relativity is regarded as one of the fundamental theories of physics.
23.5.3 Gravitational Time Dilation

Einstein realized that the distortion of space–time by gravity would lead to
time dilation effects. Time in a stronger gravitational field runs more slowly
than time in a weaker gravitational field. This is important for GPS satellites
because they are in a weaker gravitational field than the receivers on the
surface and a correction must be built in to retain location accuracy. (They
must also be corrected for another time dilation effect caused by the relative
motion between the satellite and the receiver).

A simple formula for time dilation can be derived using Newton’s theory of
gravity. Consider a photon emitted vertically from a source on the surface of
a star of mass M. As the photon travels away from the star its gravitational
potential energy increases, so photon energy and frequency fall. The change
in the period of the photon at different positions in the gravitational field
can be used to compare the rates of clocks at these positions. The period
of the photon at A is T and at B is T′.
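[Figure: star of mass M with the photon traveling from A to B.]
$$E_B - E_A = mV_B - mV_A = m\Delta V$$
m is the mass equivalent of the photon energy and ΔV is the change in
gravitational potential between A and B, so mΔV is the GPE gained by the photon.
If the change in potential energy of the photon is small compared to its energy,
mΔV << hf, then T′ ~ T. The photon energy falls by the GPE gained, so we can
simplify the expression as follows:
$$hf - hf' = \frac{hf\,\Delta V}{c^2}$$
$$\frac{h}{T} - \frac{h}{T'} = \frac{h\,\Delta V}{Tc^2}$$
$$\frac{\Delta T}{T'T} = \frac{\Delta V}{Tc^2} \qquad (\Delta T = T' - T)$$
$$\Delta T \sim \frac{T\,\Delta V}{c^2}$$
$$T' = T\left(1 + \frac{\Delta V}{c^2}\right)$$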
The greater the change in potential the greater the difference in clock rates.
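To get a feel for the size of the effect, the weak-field result ΔT/T ≈ ΔV/c² can be applied to a GPS satellite. This is a rough sketch under stated assumptions: G and the Earth's mass are assumed standard values, the Earth's radius is rounded to 6.4 × 10⁶ m, the satellite altitude is taken as 20 000 km, and the separate time dilation due to the satellite's motion is ignored.

```python
G = 6.67e-11        # gravitational constant, N m^2 kg^-2 (assumed standard value)
M_EARTH = 5.97e24   # mass of the Earth, kg (assumed standard value)
R_EARTH = 6.4e6     # mean radius of the Earth, m (rounded)
C = 3.0e8           # speed of light, m/s


def potential(r):
    """Gravitational potential (J/kg) at distance r from the center of the Earth."""
    return -G * M_EARTH / r


def fractional_rate_difference(r_low, r_high):
    """Delta V / c^2 between two radii; positive when r_high is farther from the Earth."""
    return (potential(r_high) - potential(r_low)) / C ** 2


r_gps = R_EARTH + 2.0e7                     # GPS orbital radius for a 20 000 km altitude
frac = fractional_rate_difference(R_EARTH, r_gps)
gain_per_day = frac * 86400                 # seconds gained by the satellite clock per day

print(f"Delta V / c^2 = {frac:.3e}")
print(f"Satellite clock gains about {gain_per_day * 1e6:.0f} microseconds per day")
```

The gravitational effect alone comes to roughly 45 microseconds per day. Light travels about 300 m in one microsecond, so an uncorrected drift of this size would quickly ruin position fixes.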

Gravitational time dilation becomes extreme in very strong gravitational
fields (where the equation above is not valid). If we were to observe clocks
close to the event horizon of a black hole they would tick very slowly indeed
and time would stop at the event horizon. In this sense, anything that actually
enters the black hole has passed beyond the infinity of time in the external
Universe.
23.5.4 Gravitational Waves

Matter determines the curvature of the space–time continuum, so when the
distribution of matter changes (e.g., an orbiting binary system), so does the
local geometry. Einstein’s field equations link these local changes to the rest of
space–time, so that a disturbance travels outwards at the speed of light. This
is a gravitational wave. The existence of gravitational waves was predicted by
Einstein in 1916 as a consequence of his general theory of relativity.
When a gravitational wave passes, it causes periodic changes in the geometry
of the objects in its path. The diagram below shows how a spherical object
distorts along two perpendicular axes as a gravitational wave passes.
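[Figure: distortion of a spherical object along two perpendicular axes, with
the direction of the gravitational wave indicated.]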
However, the predicted amplitude of these vibrations is tiny, even from powerful
astronomical events such as the collision of black holes. Typical amplitudes
change the dimensions of an object by about 1 part in $10^{20}$. For an
object 1 m long this is 10 billion times smaller than the diameter of an atom.
This makes them extremely difficult to detect and the first evidence for the
existence of gravitational waves was indirect.
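The scale of the distortion can be checked with a one-line calculation: the change in length is the fractional amplitude multiplied by the original length. In the sketch below the atomic diameter (10⁻¹⁰ m) is an assumed typical value and the 4 km length is a hypothetical example of a kilometer-scale object.

```python
STRAIN = 1e-20           # fractional change in length, 1 part in 10^20
ATOM_DIAMETER = 1e-10    # typical atomic diameter, m (assumed order of magnitude)


def length_change(length_m, strain=STRAIN):
    """Change in length (m) of an object of the given length for a given fractional amplitude."""
    return strain * length_m


dl_1m = length_change(1.0)
print(f"1 m object: change = {dl_1m:.1e} m")
print(f"Atom diameter / change = {ATOM_DIAMETER / dl_1m:.1e}")

# Hypothetical 4 km long object
print(f"4 km object: change = {length_change(4.0e3):.1e} m")
```

Even over several kilometers the change is many orders of magnitude smaller than an atom.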
Gravitational waves transfer energy, so an orbiting binary system loses energy
as it radiates and its period gradually changes. Russell Hulse and Joseph Taylor
analyzed the period of a binary system that consisted of a pulsar and a companion
neutron star over a period of 30 years. They showed that the reduction in the period
was within 0.2% of the change predicted on the basis of gravitational radiation.
The power radiated is about $7.35 \times 10^{24}$ W, about 2% of the Sun’s luminosity.
Their work won them the 1993 Nobel Prize in Physics and provided the first
convincing experimental evidence for the existence of gravitational waves.
The most sensitive terrestrial detectors use an interferometer to detect small
changes in light path over two perpendicular arms.
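[Figure: interferometer with a laser beam split along two perpendicular arms,
each ending in a mirror; the superposed returning beams pass to a detector.]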
Monochromatic light is split so that half of the beam travels along the
“vertical” arm and half along the “horizontal” arm (in reality these will both be
horizontal – for example, NS and EW, and perhaps several kilometers long).
The returning beams superpose and interfere. When a gravitational wave
passes, the lengths of the arms fluctuate introducing a periodic phase differ-
ence that makes the interference pattern change at the frequency of the wave.
These fluctuations were finally detected by LIGO, the Laser Interferometer
Gravitational-Wave Observatory, in 2015 and announced in February 2016 (100
years after they were predicted). The signal detected by LIGO was consistent
with the source being the inward spiral and coalescence of a pair of black
holes of around 36 and 29 solar masses. This discovery opens the door to a
new age of astronomy in which the most violent cosmic events are observed
using gravitational wave “telescopes.”

23.6 EXERCISES
1. Newton arrived at the law of gravitation by assuming that the centripetal
force required to keep the Moon in a circular orbit was provided by
gravity. In this question, we will show how this leads to an inverse-square
law. The first step is to compare the force on a 1.0 kg mass near the Earth’s
surface with the centripetal force on 1.0 kg of the Moon’s mass.
(a) Calculate the weight of a 1.0 kg mass close to the Earth’s surface.
$g = 9.8\ \text{N kg}^{-1}$.
(b) Calculate the centripetal force required to keep 1.0 kg of the Moon’s
mass in its orbit around Earth. Moon’s orbital period = 27.3 days. The
mean radius of Moon’s orbit = $3.84 \times 10^{8}$ m.
Your answer in (b) is much smaller than in (a), but the Moon is further
from the Earth, so if both forces arise because of gravitational
attraction to the Earth then the strength of this attraction must fall
with distance.
(c) Show that the forces in (a) and (b) are consistent with the idea that
gravitational forces obey an inverse-square law with distance. The
radius of the Earth = $6.4 \times 10^{6}$ m.
(d) Give an argument to support the idea that gravitational attraction is
proportional to the mass of the object attracted (think of the Moon in
its orbit).
(e) Give an argument to support the idea that gravitational attraction is
proportional to the mass of the attracting object (think of Newton’s
third law).
(f) Use your answers to (c), (d), and (e) to justify the form of Newton’s law
of gravitation.
2. Consider a mass m suspended at a point P just above the surface of a
planet of mass M and radius R. Write down expressions for each of the
following quantities:
(a) The gravitational field strength and potential at P.

(b) The gravitational field strength and potential at a point Q a height 3R
above the planet’s surface.

(c) The work that must be done to raise mass m from P to Q.

(d) The final velocity of mass m, just before it strikes the planet’s surface,
if it falls freely from Q (ignore atmospheric drag).
(e) The escape velocity if the mass is projected away from the planet from
rest at point P.
(f) The escape velocity if the mass is projected away from the planet from
rest at point Q.
(g) The kinetic energy that must be supplied to m if it is to be projected
tangentially from Q and then enter a circular orbit.
3. (a) Derive an equation for g as a function of radius r from the center
of the Earth assuming the Earth’s density ρ is constant and that the
Earth is a perfect sphere.
(b) The Earth is actually an oblate spheroid, flattened at the poles. Discuss
how this affects the value of g at different points on the surface.
(c) The fact that the Earth rotates on its axis means that part of the gravi-
tational force holding us to its surface must provide the centripetal
force needed for us to rotate with the Earth. This affects our measured
weight. Calculate the % change in weight of a man at the equator.
$G = 6.7 \times 10^{-11}\ \text{N m}^2\,\text{kg}^{-2}$, $M_E = 6.0 \times 10^{24}$ kg, $R_E = 6.4 \times 10^{6}$ m,
$T = 86\,400$ s (rotation period)
4. Consider the Earth and Moon as an isolated system. At what distance
from the Earth (along a line connecting it to the Moon) will a spacecraft
experience no resultant gravitational force?
$G = 6.7 \times 10^{-11}\ \text{N m}^2\,\text{kg}^{-2}$, $M_E = 6.0 \times 10^{24}$ kg, $M_M = 7.2 \times 10^{22}$ kg,
$R_E = 6.4 \times 10^{6}$ m, $r = 3.8 \times 10^{8}$ m (separation of centers of Earth and Moon)
5. The Moon orbits the Earth in about 27.3 days at a distance of 380 000 km.
(a) State Kepler’s three laws of planetary motion.
(b) Derive Kepler’s third law for a planet in a circular orbit.
(c) Kepler’s laws were originally stated about the planets in our Solar
System but they can be applied to any similar system where satellites
orbit a central body. Explain how the constant in Kepler’s third law is
affected when the law is applied to different systems.

(d) Use Kepler’s third law to calculate the distance from the Earth at
which an artificial satellite would be geostationary (i.e., have an orbital
period of 24 hours or 1 day).
(e) Why must geostationary satellites be placed in equatorial orbits?
(f) Use Kepler’s third law to derive a lower limit for the period of an artificial
Earth satellite.
6. Ganymede is Jupiter’s largest moon. Its orbit is circular with a radius of
$1.07 \times 10^{9}$ m. The orbital period is 7 days, 3 hours and 43 minutes. Use
Kepler’s third law to find Jupiter’s mass.
7. If the Olympics were to be held on the Moon, suggest how the shot put
record on the Moon would compare with the record on Earth. Explain
your answer.
8. Both the Sun and Moon exert tidal forces on Earth’s oceans.
(a) Explain, qualitatively, how tidal forces arise.
(b) Show that the tidal effects caused by the Sun are significantly less than
those caused by the Moon.
9. Astronomers think that black holes may exist at the center of most galaxies.
The evidence is from the high orbital speeds of stars near the center of
the galaxies (including our own Milky Way). The star orbits because of
attraction to the central object and that object must have a radius smaller
than that of the star’s orbit. Use this information to show that the object is
likely to be a black hole if:
$$v \geq \frac{c}{\sqrt{2}}$$
where v is the orbital speed (assume the orbit is circular) and c is the
speed of light.
The equation for the Schwarzschild radius is:
$$R_S = \frac{2GM}{c^2}$$
where M is the mass of the central object.

CHAPTER
24
Special Relativity
24.1 THE POSTULATES OF SPECIAL RELATIVITY

Classical physics consists of Newtonian mechanics (and gravity) and
Maxwell’s electromagnetic theory. However, by the end of the 19th cen-
tury, it became clear that mechanics and electromagnetism were in conflict,
especially when physicists tried to understand what was meant by the speed
of light. Up to that point, the speed of a wave was measured relative to the
medium through which it moved, but light is an electromagnetic wave
that travels through a vacuum so what exactly do we measure the speed of
light against?
24.1.1 Absolute Space

Galileo realized that the laws of mechanics are exactly the same whether you
are at rest or in a uniformly moving laboratory. In other words, if you find
yourself inside a closed spacecraft with no windows there is no mechanical
experiment that you can carry out inside that spacecraft to determine whether
it is at rest or moving with constant velocity. This is why we can comfortably
drink coffee, read a book and walk about inside a jet aircraft on a long-haul
flight. This idea is called Galilean relativity:
The laws of mechanics are the same in all uniformly moving reference
frames.
These reference frames are called “inertial frames.”
If the laws of mechanics cannot distinguish rest from uniform motion, can
anything? Is there a non-mechanical experiment that can distinguish motion
from rest? This is an important question because if the answer is “yes” then
there is one special reference frame that is at rest and all other reference
frames are in motion. If the answer is “no” then the idea of being at rest is not
special at all. The idea of a privileged reference frame at rest in the Universe
leads to the concept of “absolute space,” a fixed background or reference
frame against which all other motions can be measured.
This is where the speed of light comes in. This speed emerges naturally
from Maxwell’s equations and represents the speed of a disturbance in the
electromagnetic field. Other kinds of waves are all vibrations in some mate-
rial medium so physicists naturally thought that light must also have its own
medium that fills the vacuum of space and is at rest in absolute space. They
called this all-pervasive medium the “luminiferous ether.” The speed of light
would then be the speed relative to the ether and to absolute space. They
immediately began to think about how they might detect the ether, or at least
detect our motion relative to the ether.
In principle, it should be relatively easy to detect the effect of the Earth’s
motion relative to the ether. The Earth orbits the Sun so its motion through
the ether ought to vary periodically during the year. If the speed of light is con-
stant relative to the ether, then the speed of light measured on Earth would
be a relative velocity and should also vary, sometimes above and sometimes
below the value given by Maxwell’s equations. Several ingenious experiments
were carried out to try to test this idea, the most famous of which was the
Michelson-Morley experiment in 1887.
In the Michelson-Morley experiment a light beam is divided in two and each
half is sent on a round trip perpendicular to the other half of the beam. The
two beams are then combined and form an interference pattern. Any differ-
ence between the time of flight on the two paths shows up as a shift in the
interference pattern when the returning beams superpose. Since the Earth
orbits the Sun the experimenters expected the interference pattern to shift
during the year as the Earth’s motion relative to the ether changes and this
changes the relative velocity of light, producing different time delays for each
path. For example, if the Earth moves through the ether in the same direc-
tion as light then the relative velocity is lower, and if the Earth moves in the
opposite direction the relative velocity is higher.
The arrangement of the apparatus for the Michelson-Morley experiment is
shown (simplified) below. This is an example of an “interferometer” consisting
of two arms of equal length. The motion of the laboratory through the ether
delays the light along both arms but the delay is greater along the path parallel
to motion through the ether. If the apparatus is rotated through 90° the greater
delay should shift to the other arm of the interferometer and the interference
pattern should shift. The apparatus used by Michelson and Morley was sensi-
tive enough to detect shifts caused by relative velocities comparable to the
Earth’s orbital speed, so they expected to detect the Earth’s motion through
the ether.
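The size of the effect the experimenters expected can be estimated from the ether hypothesis itself. In that picture, light travels at c − v and c + v relative to the laboratory along the arm parallel to the motion, and at √(c² − v²) across it. The sketch below is illustrative only: the effective arm length (11 m), the wavelength (550 nm), and the Earth's orbital speed (3.0 × 10⁴ m s⁻¹) are assumed values, not taken from the text.

```python
import math

C = 3.0e8            # speed of light relative to the ether, m/s
V_ORBIT = 3.0e4      # Earth's orbital speed, m/s (assumed)
ARM_LENGTH = 11.0    # effective arm length, m (assumed)
WAVELENGTH = 5.5e-7  # wavelength of the light, m (assumed)


def round_trip_parallel(L, v, c=C):
    """Ether-theory round-trip time along the arm parallel to the motion."""
    return L / (c - v) + L / (c + v)


def round_trip_perpendicular(L, v, c=C):
    """Ether-theory round-trip time along the arm perpendicular to the motion."""
    return 2 * L / math.sqrt(c ** 2 - v ** 2)


delay = (round_trip_parallel(ARM_LENGTH, V_ORBIT)
         - round_trip_perpendicular(ARM_LENGTH, V_ORBIT))

# Rotating the apparatus through 90 degrees swaps the arms, doubling the change in delay.
fringe_shift = 2 * C * delay / WAVELENGTH

print(f"Expected extra delay on the parallel arm: {delay:.2e} s")
print(f"Expected fringe shift on rotation: {fringe_shift:.2f} fringes")
```

Under these assumptions the predicted shift is a sizeable fraction of one fringe, which an interferometer of this kind could resolve.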
[Figure: Earth moving relative to the luminiferous ether, which is at rest in
absolute space. Earth velocity v relative to ether; light beam velocity c
relative to ether; light beam velocity (c − v) relative to Earth.]
[Figure: Michelson-Morley apparatus at rest in a laboratory that moves at
velocity v through the ether. Monochromatic light meets a half-silvered mirror
(beam splitter), travels to two mirrors, and recombines to form an interference
pattern.]
Despite carrying out the experiment carefully and repeating it at different
times of the year, they could not detect any shift in the position of the
interference fringes. Similar experiments have been carried out using more modern
equipment but with the same results – that is, no shift in interference fringes.

This is called a “null result” and is probably the most famous null result in the
history of physics. All attempts to detect the ether have failed and the fact that
there is no delay in experiments such as the Michelson-Morley experiment is
very hard to explain using the concept of an ether.
This failure to detect the ether is also a failure to detect the Earth’s motion in
absolute space. It seems that the Earth’s motion does not affect the speed of
light.
24.1.2 Einstein’s Ideas About the Laws of Physics

Einstein may well have been unaware of the Michelson-Morley experiment
and its results but he was concerned about the meaning of the speed of light.
As a young man, he had wondered what light would look like if he could travel
alongside it at the same speed. In principle, the light wave would be like a fixed
disturbance of the electromagnetic field in his reference frame. However,
Maxwell’s equations did not allow for any such solution, so an observer mov-
ing at the speed of light would not be able to use the same laws of physics as
one at rest. This seemed odd to Einstein, especially since Galilean relativity
suggested that the laws of mechanics would still be the same. Einstein began
to wonder whether he should extend Galilean relativity to all of the laws of
physics.
This led to the first postulate of the special theory of relativity:
The laws of physics are the same in all inertial reference frames.
This also helped explain why, in electromagnetic induction experiments, it
does not matter whether a magnet is moved toward a coil or a coil is moved
toward a magnet – the same induced emf is produced in both cases. Absolute
motion is not important, but relative motion is.
Einstein thought that Maxwell’s equations must be one of the fundamental
laws of physics and so should apply in the same way in all inertial reference
frames. Since Maxwell’s equations lead to a fixed value for the speed of light
Einstein proposed a second postulate:
The speed of light is the same for all inertial observers.
If the laws of physics are the same for all inertial observers then being at rest
has no special significance to physics and the idea of an absolute space must
be abandoned. If the speed of light is the same for all inertial observers then
it will be the same along each arm of a Michelson-Morley interferometer no
matter how the apparatus is moving (as long as it is not accelerating) so no
delay should occur and the null result is explained.
While these simple postulates explain the null result of the Michelson-Morley
experiment and make the speed of light into a universal constant they under-
mine the idea of absolute space and, as we shall see, the idea that there is an
absolute time.
24.2 TIME IN SPECIAL RELATIVITY

Newton thought that time was the same for all observers throughout the
Universe – an “absolute time,” as if one clock could be used to measure all
time intervals. Einstein’s postulate that the speed of light is the same for
all inertial observers undermines this idea. For the speed of light to be the
same for two observers in relative motion they cannot agree on the distance
traveled by the light and the time taken to travel from one point to another.
Distances and time intervals must depend on the motion of the observer. This
is the price that must be paid if observers in relative motion are to experience
the same laws of physics.
24.2.1 Time Dilation

A simple thought experiment can be used to show how relative motion affects
time intervals. It uses a hypothetical clock based on the constant speed of
light. The light clock ticks as a light beam bounces up and down between
two parallel mirrors. If the distance between the mirrors is L then the clock
ticks once in a time 2L/c relative to an observer in the same reference frame.
However, if the clock is in motion relative to that observer, the light path is
longer. But the speed of light is the same (according to special relativity) so,
for this observer, the clock takes longer to tick. Time in the moving reference
frame has slowed down. The diagrams below show everything from the point
of view of observer A who is at rest with respect to a light clock. He sees
another observer, B, in a laboratory that moves past at velocity v relative to A’s
reference frame.
We can use Pythagoras’s theorem to calculate the length of the light path
in B’s moving laboratory as seen by observer A. It is very important to state
which observer we are using because, for B who is at rest with respect to his
own light clock, the clock will appear to tick at its normal rate with the light
pulses moving up and down at the speed of light.
[Figure: light clocks with mirrors a distance L apart, drawn from observer A's point of view. A's own clock is at rest; B's clock moves at velocity v, so its light pulse travels at speed c along a diagonal path while the clock moves a horizontal distance vT’/2 on each leg.]

For A, the time between ticks of his own light clock is T where:
$$T = \frac{2L}{c}$$
Also for A, let the time between ticks on the moving (B’s) clock be T’. Now use
Pythagoras’s theorem to find the relationship between T and T’.
Considering either one of the two right angled triangles above:
$$\frac{c^2 T'^2}{4} - \frac{v^2 T'^2}{4} = L^2$$
For A’s own light clock:
$$L = \frac{cT}{2}$$
Therefore:
$$\frac{c^2 T'^2}{4} - \frac{v^2 T'^2}{4} = \frac{c^2 T^2}{4}$$
which can be rearranged to give:
$$T' = \frac{T}{\sqrt{1 - \frac{v^2}{c^2}}} = \gamma T$$
where:
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
This is Einstein’s “time dilation” formula, showing how time slows down in
a moving reference frame, that is, the time between “ticks” on the “moving”
clock is increased relative to the “rest” clock.
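The light-clock argument can be checked numerically. The sketch below is illustrative only: the function names, the mirror separation of 1 m, and the speed 0.6c are assumptions chosen for the example, not values from the text. It solves the Pythagoras relation for T’ directly and compares the result with γT.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v, c=C):
    """Gamma-factor for relative speed v (requires |v| < c)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def moving_tick_time(L, v, c=C):
    """Tick time T' of the moving light clock as seen by observer A.

    Each leg of the light path is the hypotenuse of a right-angled triangle
    with sides L and v*T'/2, so (c*T'/2)**2 = L**2 + (v*T'/2)**2.
    Solving for T' gives 2L / sqrt(c**2 - v**2).
    """
    return 2.0 * L / math.sqrt(c ** 2 - v ** 2)

L = 1.0        # mirror separation in m (assumed value)
v = 0.6 * C    # speed of B's laboratory relative to A (assumed value)

T = 2.0 * L / C                    # tick time of A's own clock
T_prime = moving_tick_time(L, v)   # tick time of B's clock according to A

print(f"gamma = {gamma(v):.6f}")
print(f"T'/T  = {T_prime / T:.6f}")
```

The two printed numbers should agree, since the geometric result and the gamma-factor are the same expression written in two ways.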
The “gamma-factor” determines the significance of relativistic effects. For
v << c it approaches 1 and T’ = T, which is consistent with an idea of an
absolute time. However, as v increases the factor grows so that time in the
moving reference frame (B’s frame) slows down when observed from the
rest frame (A’s frame). As v approaches the speed of light the gamma-factor
increases without limit so that time would slow to a halt at v = c. Note that
we are not just talking about clocks here. As far as B is concerned time in
his reference frame is measured correctly by his own light clock so when
observed by A it is not just that the clock in B’s frame slows down, but so
does B’s aging process – time slows down.
[Figure: graph of the gamma-factor against the velocity of the moving reference frame, starting at γ = 1 when v = 0 and increasing without limit as v approaches c.]
For relative velocities small compared to the speed of light, the gamma-factor
is so close to 1 that relativistic effects can be neglected. This allows us to assume
that we all share the same time and space and can agree over measurements
of time intervals and distances. For example, even at the speed of a jet airliner
(about 300 m s⁻¹) the value of the gamma-factor is just 1.0000000000005!
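The airliner figure can be reproduced in a few lines. This sketch assumes the rounded value c = 3.00 × 10⁸ m s⁻¹, and the function name is illustrative. Because γ is so close to 1 at everyday speeds, the code reports γ − 1 using the first two terms of the binomial expansion, which avoids losing precision when two nearly equal numbers are subtracted.

```python
import math

C = 3.00e8  # speed of light in m/s (rounded value, assumed)

def gamma_minus_one(v, c=C):
    """Return gamma - 1 for speed v.

    For v << c the leading terms of the binomial expansion are used:
    gamma - 1 ~ (1/2)(v/c)**2 + (3/8)(v/c)**4.
    """
    beta2 = (v / c) ** 2
    if beta2 < 1e-6:
        return 0.5 * beta2 + 0.375 * beta2 ** 2
    return 1.0 / math.sqrt(1.0 - beta2) - 1.0

excess = gamma_minus_one(300.0)   # jet airliner at about 300 m/s
print(f"gamma - 1 = {excess:.3e}")
```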
If A sees B’s time slow down then what does B see when he looks back at A?
Both of the observers are in inertial reference frames so the laws of physics
are the same for both of them and the speed of light is constant in their refer-
ence frames. B will see A moving in the opposite direction at speed v and will
see the light path in A’s clock stretched out. B will see A’s time run slow by
the same time dilation factor that A saw B’s clock run slow. The effect is the
same for both observers. At first sight, this seems to lead to a contradiction
but while the clocks are in motion they cannot be placed at rest next to one
another to compare them to find out which one has gained or lost time. To do
this we would have to take at least one of them on a round trip.
24.2.2 The “Twin Paradox”

What happens when two clocks move relative to one another and then come
back together so that the clocks can be compared? An observer moving with
either clock ought to see the other clock running slow so that less time passes
in the moving reference frame than in their own. This would lead to a para-
dox – how can clock A record less time than clock B AND clock B record less
time than clock A? If the theory really predicts this then there is a problem
with relativity.
This is illustrated by the “Twin Paradox.” Imagine two twins A and B who are
together on their 21st birthday. Twin A remains on Earth while twin B under-
takes a high-speed round-trip journey to a distant star. A sees B travel away
and return and B’s time runs slow on both the outward and return journeys. A
concludes that B should be younger than A when they are reunited.
[Figure: twin A remains on Earth while twin B travels at speed v to the star and back.]
However, all inertial reference frames are equivalent, so B might argue that
from her point of view A and the Earth move away and return and the star
approaches and recedes. During the motion, she sees A’s time run slowly and
so concludes that A should be the younger one when they reunite. Is this a
valid way to describe the journey? In fact, it is not. If we ignore the relatively
low velocity of the Earth then A stays in the same inertial reference frame
throughout and so A’s point of view is valid. B, however, undergoes three separate
periods of acceleration: as she leaves Earth, as she turns around at the distant
star and accelerates back toward the Earth, and then again as she
comes to rest back on Earth. To describe the journey from B’s point of view
we must take into account the effects of these periods of acceleration. From
B’s point of view, she will experience inertial forces during these periods that
are just like increases and decreases in the local gravitational field. These
changes introduce gravitational time dilation effects (see Section 23.5.3) that
are not present for A; so B’s view is not equivalent to A’s.
The traveling twin (B) will be younger when they reunite and the apparent
paradox is resolved.
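A rough numerical illustration of this outcome, worked entirely in A's inertial frame, is sketched below. The distance (4 light years) and speed (0.8c) are assumed values chosen to give round numbers, and the brief acceleration phases are ignored, so this is an illustration rather than a full treatment.

```python
import math

def round_trip_times(distance_ly, beta):
    """Return (earth_years, traveler_years) for an out-and-back trip.

    distance_ly: one-way distance in light years, measured in Earth's frame
    beta: cruising speed as a fraction of c
    Acceleration phases are assumed to be short enough to ignore.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    earth_years = 2.0 * distance_ly / beta   # time recorded by twin A
    traveler_years = earth_years / gamma     # time recorded by twin B
    return earth_years, traveler_years

earth, traveler = round_trip_times(distance_ly=4.0, beta=0.8)
print(f"Twin A ages {earth:.1f} years; twin B ages {traveler:.1f} years")
```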
24.2.3 The Relativity of Simultaneity

If time is absolute, then two distant events that occur at the same moment for
one observer will also occur at the same time for any other observers wher-
ever they are and however they are moving in the universe (once they have
corrected for the time of flight of light to reach them from each event). Once
we abandon the notion of an absolute space, as we have had to do, we cannot
assume that simultaneous events in one reference frame will be simultaneous
in all reference frames. It turns out that simultaneity is relative.
Consider an experimental method to synchronize two different clocks. They
are both zeroed and then a flash of light is sent from a point midway between
them. When the light hits a sensor on the side of each clock the clock starts.
In this way, the clocks start simultaneously and then keep the same time in
that reference frame.
[Figure: a light source midway between clock A and clock B emits a flash that travels toward each clock at speed c.]
This method could be extended throughout a three-dimensional space so that
the times of all events in that space could be measured. If two events occur
at the same time according to local clocks in this space then those events are
simultaneous in this reference frame.
The question is: will clocks synchronized in one inertial reference frame be
synchronized in all inertial reference frames, regardless of their motion?
Consider the process above carried out in a moving laboratory and observed
from another inertial reference frame. For simplicity, we will assume that the
motion is parallel to the separation of the clocks and that the flash occurs at
the moment the light source passes the stationary observer. The dotted clocks
show their initial positions for the observer.
For the stationary observer, the flash of light leaves the source and travels
away in both directions at the speed of light. However, clock A is moving
toward the source position and clock B is moving away from it. Light reaches
clock A first so that clock starts first. Clock B starts later. The two clocks are not
synchronized in the stationary observer’s reference frame even though they
are synchronized in the moving reference frame. For the stationary observer
the synchronization “error” increases with the separation of the clocks.
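[Figure: the moving reference frame, containing the light source midway between clock A and clock B, travels at velocity v past the stationary observer; the flash travels toward each clock at speed c.]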
This is a profound result. Two events that occur near A and B when the clocks
show the same time are simultaneous in the moving reference frame but occur
at different times for the stationary observer. This means that one moment in
a particular inertial reference frame is spread over different times in different
inertial reference frames. This destroys the concept of absolute time which
assumes a unique progression of moments for all observers.
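The size of the synchronization offset can be estimated with a short calculation. The sketch below works in the stationary observer's frame with units in which c = 1; the clock separation and speed are assumed values, and it uses the length contraction result from Section 24.3 for the separation of the clocks in that frame.

```python
import math

def clock_start_times(d_proper, beta, c=1.0):
    """Times at which the flash reaches clocks A and B, in the stationary frame.

    d_proper: clock separation measured in the moving laboratory
    beta: laboratory speed as a fraction of c
    The flash is emitted at t = 0 from the midpoint. In the stationary frame
    the separation is contracted to d_proper / gamma (see Section 24.3).
    """
    v = beta * c
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    half = d_proper / (2.0 * gamma)   # source-to-clock distance in the stationary frame
    t_a = half / (c + v)   # clock A moves toward the point of emission
    t_b = half / (c - v)   # clock B moves away from the point of emission
    return t_a, t_b

t_a, t_b = clock_start_times(d_proper=10.0, beta=0.6)
offset = t_b - t_a
print(f"A starts at t = {t_a:.3f}, B starts at t = {t_b:.3f}, offset = {offset:.3f}")
```

Algebraically the offset works out to γvd/c², where d is the separation in the moving frame, so it grows in proportion to the separation of the clocks.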
It should again be noted that there is nothing special about the so-called “sta-
tionary observer.” He is just in a different inertial reference frame and if he
were to synchronize two distant clocks in his own reference frame then an
observer in the “moving” reference frame would think that there was a syn-
chronization error.
24.3 LENGTH CONTRACTION

Our galaxy is over 100 000 light years in diameter. A space craft traveling at
close to the speed of light would take more than 100 000 years to cross it as
measured by an observer who remains on Earth. However, for space travel-
ers, the time that passes in their reference frame would be much less than this
because of time dilation. Let’s assume they travel at a constant velocity v that
is so fast that the gamma-factor is 20 000. The time that passes in the space
craft is then only 100 000 / 20 000 = 5 years.
If the observers on board the space craft work out the distance they have
traveled, it will be 5 (v/c) light years rather than the 100 000 (v/c) light years
as observed from the Earth. The distance traveled has contracted by the same
gamma-factor as the time. This is an example of relativistic length contraction.
[Figure: two views of the spacecraft crossing the galaxy. From the Earth’s reference frame (at rest in the galaxy) the diameter is L but the spacecraft is contracted because of its motion. From the spacecraft’s reference frame (moving at velocity v relative to the galaxy) the diameter is contracted to L’ but the spacecraft has its proper length.]
The length contraction formula is:
$$L' = \frac{L}{\gamma}$$
where:
$$\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$$
L is the length of an object measured in its rest frame: its “proper length.”
L’ is the length of the same object when it moves past the observer at velocity v.
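The galaxy-crossing example can be reproduced as follows. This is a minimal sketch using the numbers quoted in the text (a diameter of 100 000 light years and a gamma-factor of 20 000); the function and variable names are illustrative.

```python
import math

def contracted_length(proper_length, gamma):
    """Length measured by an observer moving past the object: L' = L / gamma."""
    return proper_length / gamma

def beta_from_gamma(gamma):
    """Speed as a fraction of c that gives the stated gamma-factor."""
    return math.sqrt(1.0 - 1.0 / gamma ** 2)

GALAXY_DIAMETER_LY = 100_000.0   # proper diameter in light years (from the text)
GAMMA = 20_000.0                 # gamma-factor assumed in the text

diameter_for_crew = contracted_length(GALAXY_DIAMETER_LY, GAMMA)
beta = beta_from_gamma(GAMMA)
crossing_time_years = diameter_for_crew / beta   # time elapsed on board

print(f"Contracted diameter: {diameter_for_crew:.1f} light years")
print(f"1 - v/c = {1.0 - beta:.2e}")
print(f"On-board crossing time: {crossing_time_years:.6f} years")
```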
24.4 THE LORENTZ TRANSFORMATION

Length contraction, the relativity of simultaneity, and time dilation, all show
that measurements of time and space in one inertial reference frame will
not agree with measurements in another inertial reference frame in relative
motion. The Lorentz transformation equations relate the coordinates of an
event in one inertial reference frame to the coordinates of the same event in
another inertial reference frame moving at constant velocity with respect to
the first.
24.4.1 The Lorentz Transformation Equations

For simplicity, we will define our axes so that the relative motion is along the x
axis, so that the y and z coordinates are not affected by the motion.
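[Figure: two sets of axes, (x, y, z) and (x’, y’, z’). The primed inertial reference frame is moving in the positive x-direction at velocity v. Event coordinates: (x, y, z, t) in the unprimed frame and (x’, y’, z’, t’) in the primed frame.]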
The diagram above shows the coordinate systems used in two inertial refer-
ence frames. The primed frame is moving at velocity v in the positive x- and
x’-direction. At time t = 0 the origins of both coordinate systems coincide.
An event with coordinates (x, y, z, t) in the unprimed frame has coordinates
(x’, y’, z’, t’) in the primed frame. The Lorentz transformation allows us to
transform from one frame to the other.
Lorentz transformation - to transform from the unprimed to the primed
reference frame:
$$x' = \gamma (x - vt)$$
$$y' = y$$
$$z' = z$$
$$t' = \gamma \left( t - \frac{vx}{c^2} \right)$$
Inverse Lorentz transformation - to transform from the primed to
the unprimed reference frame:
$$x = \gamma (x' + vt')$$
$$y = y'$$
$$z = z'$$
$$t = \gamma \left( t' + \frac{vx'}{c^2} \right)$$
These equations can be used to derive the formulae for time dilation, length
contraction, and synchronization differences.
We will use them to derive an equation for relativistic velocity addition.
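As a quick check on the transformation equations, the sketch below implements the x and t parts in units where c = 1. The function names and the sample event are assumptions made for illustration. Here `lorentz` takes unprimed coordinates (x, t) to primed coordinates (x’, t’) and `inverse_lorentz` goes the other way, so applying one after the other should return the original event.

```python
import math

def gamma(beta):
    """Gamma-factor for speed beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

def lorentz(x, t, beta, c=1.0):
    """Unprimed (x, t) -> primed (x', t') for a frame moving at +v along x."""
    g = gamma(beta)
    v = beta * c
    return g * (x - v * t), g * (t - v * x / c ** 2)

def inverse_lorentz(x_p, t_p, beta, c=1.0):
    """Primed (x', t') -> unprimed (x, t)."""
    g = gamma(beta)
    v = beta * c
    return g * (x_p + v * t_p), g * (t_p + v * x_p / c ** 2)

# Sample event lying on the path x = 0.6 t of the primed frame's origin
x, t = 3.0, 5.0
x_p, t_p = lorentz(x, t, beta=0.6)
x_back, t_back = inverse_lorentz(x_p, t_p, beta=0.6)

print(f"primed coordinates: x' = {x_p:.3f}, t' = {t_p:.3f}")
print(f"round trip:         x  = {x_back:.3f}, t  = {t_back:.3f}")
```

Because the sample event sits at the primed origin (x’ = 0), the ratio t/t’ is just the gamma-factor, which is the time dilation result.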
24.4.2 The Velocity Addition Equation

A simple example involving light shows that we cannot simply add veloci-
ties. Imagine a car passing you with its headlights switched on. The car is
moving at velocity v and the light leaves the headlamps at velocity c relative
to an observer inside the car. If space and time were absolute then the light
would have a velocity v’ = c + v relative to you. This cannot be the
case because the speed of light is the same for all inertial observers, so it
must still be c. This is the same problem that we ran into when discussing the
Michelson–Morley experiment.
Consider an object moving at velocity u relative to an observer in a laboratory
that is itself moving at velocity v relative to a second inertial observer. What is
the velocity of the object w relative to this second observer?
[Figure: two sets of axes, (x, y, z) and (x’, y’, z’). The primed inertial reference frame is moving in the positive x-direction at velocity v, and the object is moving with velocity u relative to the moving reference frame.]
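We are trying to find the relationship between w = dx/dt, u = dx’/dt’ and v.
$$u = \frac{dx'}{dt'}$$
$$w = \frac{dx}{dt} = \gamma \left( \frac{dx'}{dt} + v \frac{dt'}{dt} \right) = \gamma \left( \frac{dx'}{dt'} + v \right) \frac{dt'}{dt} = \gamma (u + v) \frac{dt'}{dt}$$
$$\frac{dt}{dt'} = \gamma \left( 1 + \frac{v}{c^2} \frac{dx'}{dt'} \right) = \gamma \left( 1 + \frac{uv}{c^2} \right)$$
$$w = \frac{\gamma (u + v)}{\gamma \left( 1 + \frac{uv}{c^2} \right)}$$
$$w = \frac{u + v}{1 + \frac{uv}{c^2}}$$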
If u and v are small compared to the speed of light then this reduces to the
familiar equation for velocity addition: w = (u+v). The relativistic velocity
addition equation has the interesting property that no two sub-light speeds
can be added together to produce a speed greater than the speed of light.
However long something accelerates, it will approach but never quite reach
the speed of light. Consider the limiting case where u = v = c:
$$w = \frac{u + v}{1 + \frac{uv}{c^2}} = \frac{c + c}{1 + \frac{c^2}{c^2}} = \frac{2c}{2} = c$$
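The behavior of the velocity addition equation is easy to explore numerically. The sketch below works in units where c = 1; the function name and the sample speeds are assumptions made for illustration.

```python
def add_velocities(u, v, c=1.0):
    """Relativistic velocity addition: w = (u + v) / (1 + u*v/c**2)."""
    return (u + v) / (1.0 + u * v / c ** 2)

half_plus_half = add_velocities(0.5, 0.5)     # two sub-light speeds
headlights = add_velocities(1.0, 0.3)         # light emitted from a moving car
light_plus_light = add_velocities(1.0, 1.0)   # limiting case u = v = c
slow = add_velocities(1e-6, 2e-6)             # everyday speeds, u and v << c

print(f"0.5c + 0.5c -> {half_plus_half:.3f}c")
print(f"c + 0.3c    -> {headlights:.3f}c")
print(f"c + c       -> {light_plus_light:.3f}c")
print(f"slow speeds -> {slow:.3e}c")
```

The last case shows the reduction to w = u + v when both speeds are small compared to the speed of light.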
24.5 MASS, VELOCITY, AND ENERGY

24.5.1 Mass and Velocity
When an object is accelerated at a constant rate, as measured in reference
frames momentarily at rest with respect to the object, its acceleration meas-
ured by a stationary observer will gradually fall. This is because the time taken
for the same increase in velocity is longer for the stationary observer as a
result of time dilation.
Consider a particle of charge q and mass m0 (as measured in a reference frame
at rest with respect to the particle) accelerated by a constant electric field of
strength E. The particle increases its velocity from v to v + δv in a time δt’ in
a reference frame moving with the particle at speed v relative to a stationary
observer. The same increase in speed will take a longer time δt = γδt’ for the
stationary observer.
In the moving reference frame:
$$\delta v = \frac{Eq\,\delta t'}{m_0}$$
In the reference frame of the stationary observer:
$$\delta v = \frac{Eq\,\delta t}{m} = \frac{Eq\gamma\,\delta t'}{m}$$
Combining these two equations:
$$m = \gamma m_0$$
This shows that the mass will increase with velocity. The mass measured in
the rest frame of the particle is m0 and this is the fundamental mass of the
particle. The additional mass is related to the kinetic energy of the particle as
we shall see.
24.5.2 Mass and Energy

The equation for relativistic mass can be expanded as a series in terms of (v²/c²)
using the binomial theorem:
$$m = \gamma m_0 = \frac{m_0}{\sqrt{1 - \frac{v^2}{c^2}}} = m_0 \left( 1 - \frac{v^2}{c^2} \right)^{-\frac{1}{2}}$$
$$= m_0 \left[ 1 + \frac{1}{2} \left( \frac{v^2}{c^2} \right) + \frac{\left( -\frac{1}{2} \right) \left( -\frac{3}{2} \right)}{2!} \left( \frac{v^2}{c^2} \right)^2 + \dots \right]$$
If this is multiplied throughout by c² then every term has the dimensions of
energy:
$$E = mc^2 = \gamma m_0 c^2 = m_0 c^2 + \frac{1}{2} m_0 v^2 + \dots \text{ (terms converging to zero)}$$
The second term on the right is the classical kinetic energy and the converg-
ing series of terms beyond it (which are all negligible for v << c) are the rela-
tivistic corrections to the equation for kinetic energy.
The term E = mc² on the left-hand side of the equation must represent the
particle’s total energy when it is moving at velocity v. This becomes E = m0c²
when the particle is at rest so this is called the rest energy of the particle and
it suggests that mass and energy are equivalent in some sense.
The kinetic energy of the particle is the difference between its total energy
and its rest energy:
$$KE = mc^2 - m_0 c^2$$
This approximates to the classical formula KE = ½ mv² when the velocity is
small compared to the speed of light, as we have seen.
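To see how quickly the classical formula breaks down, the relativistic and classical kinetic energies can be compared at a few speeds. The sketch below uses an electron as the example particle; the rounded rest mass, the chosen speeds, and the function names are assumptions made for illustration.

```python
import math

C = 299_792_458.0        # speed of light in m/s
M_ELECTRON = 9.109e-31   # electron rest mass in kg (rounded)

def relativistic_ke(m0, v, c=C):
    """Kinetic energy as total energy minus rest energy: (gamma - 1) m0 c^2."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (g - 1.0) * m0 * c ** 2

def classical_ke(m0, v):
    """Classical kinetic energy: (1/2) m0 v^2."""
    return 0.5 * m0 * v ** 2

for beta in (0.01, 0.1, 0.5, 0.9):
    v = beta * C
    ratio = relativistic_ke(M_ELECTRON, v) / classical_ke(M_ELECTRON, v)
    print(f"v = {beta:.2f}c  relativistic/classical = {ratio:.4f}")
```

The ratio stays close to 1 at low speeds and grows rapidly as v approaches c, which is the effect of the higher-order terms in the series.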
The equivalence of mass and energy implies that all energy transfers are also
mass transfers. However, the constant that relates energy to mass is the speed
of light squared.
$$\Delta E = c^2 \Delta m$$
c² is so large (9 × 10¹⁶ m² s⁻²) that most energy transfers in everyday life make
negligible difference to the mass. The only place where we see measurable mass
changes because of energy transfers is in nuclear physics (see section 26.1).
On the other hand, the conversion of even a tiny amount of matter would
release a huge amount of energy. The only process that converts mass com-
pletely to energy is the annihilation of matter and antimatter, for example, the
annihilation that occurs when an electron meets a positron (anti-electron).
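Two quick estimates illustrate the scale of these energies. The sketch below applies ΔE = c²Δm to an electron–positron annihilation and to the complete conversion of one gram of matter; the one-gram example and the names used are assumptions made for illustration, and the constants are standard values.

```python
C = 299_792_458.0            # speed of light in m/s
M_ELECTRON = 9.1093837e-31   # electron (and positron) rest mass in kg
EV = 1.602176634e-19         # joules per electronvolt

def mass_to_energy(delta_m, c=C):
    """Energy equivalent of a mass change: delta_E = c^2 * delta_m."""
    return delta_m * c ** 2

# Electron meets positron at rest: both rest masses are converted to energy
annihilation_J = mass_to_energy(2 * M_ELECTRON)
annihilation_MeV = annihilation_J / EV / 1e6

# Complete conversion of one gram of matter (illustrative)
one_gram_J = mass_to_energy(1e-3)

print(f"Electron-positron annihilation: {annihilation_MeV:.3f} MeV")
print(f"One gram of matter: {one_gram_J:.2e} J")
```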
24.6 SPECIAL RELATIVITY AND GEOMETRY

Einstein published the special theory of relativity in 1905. In 1908, Hermann
Minkowski, Einstein’s old mathematics teacher, published a paper in which
he showed that the Lorentz transformations of special relativity are equivalent
to rotations in a four-dimensional continuum he called space–time. This
effectively made special relativity into a geometric theory and ultimately led
to Einstein’s general theory of relativity in which gravitation is explained as a
disturbance of the geometry of space–time. Here is a simple example to show
how a geometric approach works.
24.6.1 Invariants
The fact that measurements of lengths and times made in different inertial
reference frames differ from one another can be disconcerting. Physicists like
to discover quantities that are the same for all observers. Such quantities are
called “invariants.”
Consider two points on a 2D surface. The location of each point can be
described by stating the coordinates of each point relative to a fixed coordinate
system consisting of two perpendicular axes. However, there is an arbitrary
choice about which set of perpendicular axes to use and the coordinates will
be different for different choices, as shown in the diagram below.
While the coordinates change when the axes are rotated, the distance AB does
not. It is an invariant under rotation of the axes:
$$AB = \sqrt{(x_B - x_A)^2 + (y_B - y_A)^2} = \sqrt{(x'_B - x'_A)^2 + (y'_B - y'_A)^2}$$
[Figure: points A and B plotted against two sets of perpendicular axes, (x, y) and the rotated (x’, y’). A has coordinates (xA, yA) or (x’A, y’A) and B has coordinates (xB, yB) or (x’B, y’B).]
It is also possible to write down a set of transformation equations to convert
measurements made relative to the unprimed axes into measurements rela-
tive to the primed axes. It was the similarity between these transformation
equations and the Lorentz transformation equations that alerted Minkowski
to the idea that changing from one inertial reference frame to another is like
a geometric rotation.
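The invariance of the distance AB under a rotation of the axes can be verified directly. The text does not write out the rotation equations, so the sketch below assumes the standard ones for axes rotated anticlockwise by an angle θ: x’ = x cos θ + y sin θ and y’ = −x sin θ + y cos θ. The points and the angle are arbitrary illustrative values.

```python
import math

def rotate_axes(x, y, angle):
    """Coordinates of a fixed point relative to axes rotated anticlockwise by angle (radians)."""
    x_p = x * math.cos(angle) + y * math.sin(angle)
    y_p = -x * math.sin(angle) + y * math.cos(angle)
    return x_p, y_p

def distance(p, q):
    """Pythagorean distance between points p and q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

A = (1.0, 2.0)              # assumed coordinates in the unprimed axes
B = (4.0, 6.0)
angle = math.radians(30)    # assumed rotation of the primed axes

A_p = rotate_axes(*A, angle)
B_p = rotate_axes(*B, angle)

print(f"Unprimed: A = {A}, B = {B}, AB = {distance(A, B):.6f}")
print(f"Primed:   AB = {distance(A_p, B_p):.6f}")
```

The individual coordinates change but the two distances agree, which is what it means for AB to be an invariant.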
24.6.2 Space–Time
Space–time is a four-dimensional continuum where the fourth dimension is
related to time. Whereas a point in space is defined by three coordinates (x, y,
and z) a point in space–time, called an event, has four (x, y, z, and t). In the
SI system, distances and times are measured in different units, meters and
seconds, so instead of simply using the time in seconds in a space–time dia-
gram we multiply the time by the speed of light so that the units on the time
axis are also meters. Minkowski realized that while measurements of time
intervals and distances will differ for observers in different inertial reference
frames, the Lorentz transformation ensures that the 4D “distance” between
two events is the same for all inertial observers, regardless of their velocity.
This 4D distance is calculated by Pythagoras’s theorem but by treating the
time differences as if they are mathematically imaginary quantities, in other
words, the fourth dimension is actually the ict dimension where i is the square
root of minus one.
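[Figure: space–time diagram with x on the horizontal axis and ict on the vertical axis, showing the worldline of a rocket moving at velocity v through events A (xA, ictA) and B (xB, ictB).]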
The diagram above shows a space–time diagram. Only one spatial dimension
is shown (x). Each point on the diagram represents an event – something
that happens at a particular moment at a particular point in space. Lines on
space–time diagrams are called “worldlines”; they represent a connected
series of events. In this diagram, the worldline has a constant gradient so it
represents an object moving at constant velocity in the positive x-direction
relative to the origin of the reference frame, for example, a rocket. Two events,
A and B lie on the worldline, so these represent two separate moments when
the rocket is in two different locations.
The interval AB is calculated in this reference frame by:
$$AB = \sqrt{(x_B - x_A)^2 + (ict_B - ict_A)^2}$$
Since the interval is an invariant it must be the same using the coordinates of
events A and B in any inertial reference frame, including that of the rocket.
However, in the rocket’s reference frame both A and B occur where the rocket
is, in other words at x’ = 0 but at different times t′A and t′B. We cannot assume
that the times are the same as those in the other reference frame. The space–
time diagram in the rocket’s reference frame looks like this:
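[Figure: space–time diagram in the rocket’s own inertial reference frame, with x’ on the horizontal axis and ict’ on the vertical axis. The worldline of the rocket, at rest in this frame, runs vertically at x’ = 0 through events A (at ict’A) and B (at ict’B).]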
The interval AB in the rocket frame is:
$$AB = \sqrt{(ict'_B - ict'_A)^2}$$
We can equate the two expressions for the interval (because it is an invariant):
$$(ict'_B - ict'_A)^2 = (x_B - x_A)^2 + (ict_B - ict_A)^2$$
We can also express the distance traveled in the un-primed frame in terms of
the velocity:
$$(x_B - x_A) = v (t_B - t_A)$$
After some algebraic rearrangement, we can express the time elapsed in the
rocket frame (primed frame) in terms of the time elapsed in the rest frame
(un-primed frame):
$$-c^2 (t'_B - t'_A)^2 = v^2 (t_B - t_A)^2 - c^2 (t_B - t_A)^2$$
$$(t'_B - t'_A) = \sqrt{1 - \frac{v^2}{c^2}}\,(t_B - t_A) = \frac{(t_B - t_A)}{\gamma}$$
This is time dilation. The time elapsed in the moving reference frame is less
than the time elapsed in the rest frame by the gamma-factor.
This illustrates how a geometrical approach in flat space–time reproduces the
standard results of special relativity.
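The invariance of the interval can also be confirmed numerically. The sketch below works in units where c = 1 and uses Python's imaginary unit `1j` to mirror the ict coordinate; the rocket speed, the elapsed time, and the names are assumed for illustration. It computes the squared interval between two events on the rocket's worldline in both frames, obtaining the rocket-frame coordinates from the Lorentz transformation.

```python
import math

def interval_squared(dx, dt, c=1.0):
    """Squared interval (dx)**2 + (i*c*dt)**2, returned as a real number."""
    return (dx ** 2 + (1j * c * dt) ** 2).real

beta = 0.6                                 # rocket speed as a fraction of c (assumed)
g = 1.0 / math.sqrt(1.0 - beta ** 2)       # gamma-factor

# Separation of events A and B in the rest (un-primed) frame
dt = 5.0
dx = beta * dt                             # the rocket covers v * dt

# The same separation in the rocket (primed) frame, via the Lorentz transformation
dx_rocket = g * (dx - beta * dt)
dt_rocket = g * (dt - beta * dx)

s2_rest = interval_squared(dx, dt)
s2_rocket = interval_squared(dx_rocket, dt_rocket)

print(f"rest frame:   dx = {dx:.3f}, dt = {dt:.3f}, interval^2 = {s2_rest:.3f}")
print(f"rocket frame: dx = {dx_rocket:.3f}, dt = {dt_rocket:.3f}, interval^2 = {s2_rocket:.3f}")
```

The squared interval is negative here because the time term dominates; what matters is that it has the same value in both frames, and that the elapsed time in the rocket frame equals dt/γ.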
24.6.3 Mass, Energy, and Momentum

The interval is just one of the important invariants in relativity. We have
already seen that when an object moves past us its mass increases with veloc-
ity according to the gamma-factor. We can also form an invariant quantity
from momentum and energy:
Energy–momentum invariant = E² − p²c²
E is the total energy of the object
p is the momentum of the object
Since this is an invariant it will have the same value in the rest frame of the
moving object, but in this frame, the object’s mass is its rest mass m0 and its
momentum is zero:
$$E^2 - p^2 c^2 = (m_0 c^2)^2$$
This is a particularly useful relationship in particle physics.
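A short check shows that the combination E² − p²c² does not depend on the speed of the object. The sketch below assumes the total energy E = γm0c² from Section 24.5.2 and takes the momentum to be p = γm0v (the relativistic mass multiplied by the velocity), in units where c = 1 and m0 = 1; the names and sample speeds are illustrative.

```python
import math

def energy_and_momentum(m0, beta, c=1.0):
    """Return (E, p) with E = gamma*m0*c^2 and p = gamma*m0*v (assumed form)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    E = g * m0 * c ** 2
    p = g * m0 * beta * c
    return E, p

def invariant(E, p, c=1.0):
    """Energy-momentum invariant E^2 - p^2 c^2."""
    return E ** 2 - (p * c) ** 2

m0 = 1.0
speeds = (0.0, 0.3, 0.6, 0.9, 0.99)
values = [invariant(*energy_and_momentum(m0, beta)) for beta in speeds]

for beta, value in zip(speeds, values):
    print(f"v = {beta:.2f}c  E^2 - p^2c^2 = {value:.6f}")
```

Each value should equal (m0c²)², which is 1 in these units.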
24.7 EXERCISES
1. (a) Explain what is meant by each of the following:
Absolute space
Absolute time
An inertial reference frame
Galilean relativity
The luminiferous ether
(b) Describe the Michelson-Morley experiment and explain why the
null result could not be explained in terms of absolute space and
absolute time.
(c) State the postulates of special relativity and show how the null result of
the Michelson-Morley experiment is consistent with these postulates.
2. A spacecraft manufactured on Earth has a total length of 200 m when
it is measured at rest on the surface of the Earth. During space trials, it
passes the Earth at a velocity of 0.90 c. The Earth can be assumed to be a
sphere of diameter 13 000 km.
(a) Calculate the length of the spaceship as seen by an observer on Earth.
(b) Calculate the diameter of the Earth as seen by an observer on the
spaceship.
(c) Calculate the time it would take for the spacecraft to reach a planet
orbiting a star 20 light years from Earth:
(i) As measured by an observer on the Earth.
(ii) As measured by an observer on the spacecraft.
(d) Calculate the distance traveled by the spaceship as measured by an
onboard observer during the journey.
(e) The astronauts spend 10 Earth years on the planet and then return
to Earth at the same speed as on their outward journey. One of the
astronauts has a twin brother who remained on Earth. Both were
21 at the start of the mission. How old will each twin be when the
spaceship returns to Earth and they are reunited?
3. (a) Write down an equation for the relativistic “gamma-factor.”
(b) At what speeds does the gamma-factor become 1.01, 1.10, 1.50, 5.0?
(c) Sketch a graph to show how the “gamma-factor” varies with velocity.
(d) What is the significance of the case v << c?
The inertial mass of a moving object increases with velocity according
to the equation:
$$m = \gamma m_0$$
(e) Show that as the velocity of a particle approaches the speed of light
the following approximation for its momentum can be used:
$$p \sim \frac{E}{c}$$
where p is the momentum of the particle.
(f) High-energy electrons have very short wavelengths and can be used
to probe the internal structure of nucleons. Use the equation in (d)
and the de Broglie relation to calculate the approximate accelerating
voltage needed for an electron to be able to resolve details on the
femtometer scale (10⁻¹⁵ m).
4. Two atomic clocks are synchronized and one of them is taken on a round
trip lasting 10 hours at a steady velocity of 300 m/s in a jet aircraft. At
the end of the journey, it is placed beside the other clock and compared.
(Ignore the effects of gravity and assume the “stay-at-home” clock is at
rest throughout the experiment).
(a) Explain why there is a time difference between the two clocks at the
end of the experiment and say clearly which clock will be fast or slow
with respect to the other.
(b) Calculate the expected time difference.
5. A spacecraft carrying observer A has clocks, X and Y, at each end and is
passing an observer B at speed v.
[Figure: the spacecraft, carrying observer A with clock X at one end and clock Y at the other, moves at speed v past observer B, who has clock Z.]
(a) Explain what is meant by “the relativity of simultaneity.”
(b) Observer A synchronizes clocks X and Y. Suggest a method by which
he could do this.
(c) Explain why the clocks will not be synchronized for observer B.
(d) Both observers measure the distance between the two clocks.
Observer A obtains a result l0. What is the distance between the
clocks according to observer B?
A’s method to measure the distance was to fire a pulse of light from Y
(at time tY) and to record when it arrives at X (tX). The distance according
to A is then d = c (tX - tY).
B’s method for measuring the length was as follows. He used clock Z and
recorded the time at which each clock passed immediately above P. The
distance according to B is then d′ = v (t′X - t′Y).
Explain carefully why these two methods must give different results and
explain why both measurements are equally valid for the observers who
made them.

CHAPTER 25
Atomic Structure and Radioactivity
25.1 THE NUCLEAR ATOM
The concept of an “atom” comes from the ancient Greeks and derives from
the idea of an uncuttable smallest part of each element. While we still consider
each atom to be the smallest part of an element it is certainly not “uncuttable.”
At the end of the 19th century J.J. Thomson showed that electrons were
small parts of all atoms and at the start of the 20th century Ernest Rutherford
probed the atom with high energy alpha particles and showed that all atoms
have a tiny, massive core or nucleus. Later, in 1919, he managed to split a
nitrogen atom. Rutherford’s work resulted in the nuclear model of the atom.
25.1.1 The Rutherford Scattering Experiment

The famous scattering experiment was actually carried out by Rutherford’s assistants,
Geiger and Marsden, under his direction. They used a radioactive
source to fire a narrow collimated beam of alpha particles at very thin gold foil.
Rutherford expected the alpha particles to penetrate the foil and to hardly be
deflected because any charge present in the atom would be spread out and
the forces on alpha particles, which are positively charged, would be too small
to cause large deflections. The experimental arrangement is shown below:
A moveable detector was used to record how many alpha particles scattered
through each angle.
The results were surprising:
Most alpha particles passed through the gold foil with little or no deflection
(as expected).
Some alpha particles were scattered through large angles.
A small number (about 1 in 10⁴) scattered through angles greater than 90°
(backscattered).
[Figure: experimental arrangement. An alpha source inside lead shielding creates a collimated beam that strikes a gold foil in a vacuum; a detector records the alpha particles scattered through angle θ.]
Rutherford assumed that any deflections must be electrostatic in nature and
explained the results in the following way:
The atom is mainly empty space, so most alpha particles do not pass close
to any concentrated charge centers.
There is a tiny electrostatically charged central nucleus. The small proportion
of alpha particles that pass close to a gold nucleus will be deflected
through large angles and those that make an almost direct hit will
backscatter.
The nucleus must contain most of the mass of the atom otherwise the
nucleus would recoil strongly and the alpha deflections would be less
significant.
Rutherford used the scattering data and Coulomb’s law to calculate an upper
limit for the radius of the nucleus and found that the nuclear radius was of the
order of 10⁻¹⁴ m or smaller, about 10⁴ times smaller than the atomic radius.
Later work showed that the charge on a nucleus is equal to +Ze where Z is
the atomic number, equal to the position of the atom in the Periodic Table of
elements. The forces acting on the scattering nucleus and the scattered alpha
particle are shown below.
The diagram below shows typical alpha particle trajectories close to a nucleus.
On this scale, the outside of the atom would be about 10 m away! Therefore
the vast majority of alpha particles, which pass much further away from the
nucleus than those shown in this diagram, experience weak electrostatic forces
and suffer small deflections.
The greater the impact parameter (the perpendicular distance between the
initial path of the alpha particle and the nucleus) the greater the minimum
distance between the alpha particle and the target nucleus and the smaller the
angle of deflection. An alpha particle traveling directly toward the center of
the target nucleus makes the closest approach to it and is deflected back along
its original path (deflection angle 180°).
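[Figure: forces on the alpha particle (charge +2e) and the nucleus (charge +Ze) at separation r. Each experiences a repulsive force of magnitude $F = \dfrac{2Ze^2}{4\pi\varepsilon_0 r^2}$.]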
25.1.2 Closest Approach and Nuclear Size

At closest approach, the incident alpha particle momentarily comes to rest as
it deflects back along its original path. At this point, all of the incident kinetic
energy Eα of the alpha particle has been stored as electric potential energy.
If the distance of the closest approach is d then:
$$E_\alpha = \frac{2Ze^2}{4\pi\varepsilon_0 d} \quad\text{and}\quad d = \frac{2Ze^2}{4\pi\varepsilon_0 E_\alpha}$$
So the radius of the nucleus, $r_n$, must be less than this value:
$$r_n < \frac{2Ze^2}{4\pi\varepsilon_0 E_\alpha}$$
[Figure: incident alpha particles approaching a gold nucleus with increasing impact parameters; the head-on trajectory reaches the closest approach distance d.]
The alpha particle energy in Rutherford’s experiment was 4.7 MeV and the
atomic number of gold is 79. This gives an upper limit for the radius of the
gold nucleus:
$$r_n < 4.8 \times 10^{-14}\ \text{m}$$
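The closest-approach estimate can be checked numerically. The short sketch below evaluates d = 2Ze²/(4πε₀Eα) for the values quoted in the text (Z = 79, Eα = 4.7 MeV). The function name `closest_approach` is a hypothetical helper introduced for illustration, and the constants are standard SI values.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge, C
EPSILON_0 = 8.8541878128e-12   # permittivity of free space, F/m


def closest_approach(z_target, alpha_energy_mev):
    """Head-on distance of closest approach (m) for an alpha particle.

    Assumes a purely electrostatic interaction and a target nucleus
    that does not recoil.
    """
    energy_j = alpha_energy_mev * 1e6 * E_CHARGE  # MeV -> J
    return 2 * z_target * E_CHARGE**2 / (4 * math.pi * EPSILON_0 * energy_j)


d_gold = closest_approach(79, 4.7)
print(f"Upper limit on gold nuclear radius: {d_gold:.1e} m")  # about 4.8e-14 m
```

Doubling the alpha energy halves the distance, which is why higher-energy probes give a tighter upper limit.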
25.1.3 Using Electron Diffraction to Measure Nuclear Diameter

Increasing the energy of incident alpha particles reduces the distance of the
closest approach. However, alpha particles are actually helium nuclei consist-
ing of two protons and two neutrons. Neutrons and protons interact by the
short-range strong nuclear force so when the alpha particles get very close to
a target nucleus the forces are no longer simply electrostatic and Rutherford’s
analysis begins to break down. This limits the usefulness of alpha particles for
precise measurements of nuclear size. However, it is possible to use electrons
instead. A high-energy electron beam behaves like waves of very short wave-
length and these are diffracted by the target nuclei just like light waves are
diffracted by small spherical particles (e.g., lycopodium powder).
[Figure: a high-energy electron beam strikes a target; the first minimum of the diffraction pattern lies at angle θ to the beam direction.]
The first minimum of the diffraction pattern for a spherical object is at:
$$\sin\theta = \frac{1.22\lambda}{D}$$
where D is the diameter of the target nucleus and λ is the de Broglie wavelength
of the electrons. Rearranging:
$$D = \frac{1.22\lambda}{\sin\theta}$$
The wavelength of the electrons is calculated from the de Broglie equation
$$\lambda = \frac{h}{p}$$
However, in order to obtain a wavelength comparable to nuclear dimensions
the electron must be accelerated to very high energy so relativistic equations
must be used to determine the electron momentum. In particular:
$$E^2 = p^2c^2 + (m_0c^2)^2$$
The energy required, E, is much greater than the rest energy $m_0c^2$, so the equation
simplifies to:
$$p = \frac{E}{c}$$
giving:
$$D = \frac{1.22\,hc}{E\sin\theta}$$
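As a rough illustration of the final formula, the sketch below evaluates D = 1.22hc/(E sin θ) in convenient nuclear units. The electron energy (420 MeV) and the angle of the first minimum (30°) are assumed example values, not data from the text, and `nuclear_diameter_fm` is a hypothetical helper name.

```python
import math

HC_MEV_FM = 1239.84198  # Planck constant times speed of light, in MeV fm


def nuclear_diameter_fm(electron_energy_mev, theta_min_deg):
    """Nuclear diameter (fm) from the first diffraction minimum.

    Uses the ultra-relativistic approximation p = E/c, so the
    de Broglie wavelength is hc/E.
    """
    wavelength_fm = HC_MEV_FM / electron_energy_mev
    return 1.22 * wavelength_fm / math.sin(math.radians(theta_min_deg))


# Assumed example: 420 MeV electrons, first minimum at 30 degrees.
d = nuclear_diameter_fm(420.0, 30.0)
print(f"D is approximately {d:.1f} fm")
```

The approximation is reasonable here because 420 MeV is far greater than the electron rest energy of about 0.511 MeV.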
25.1.4 The Nuclear Atom

The Rutherford nuclear atom consists of a small positively charged nucleus
containing protons and neutrons (collectively called “nucleons”) and most of
the mass of the atom, surrounded by orbiting electrons. The number of protons
in the nucleus is called the atomic number and corresponds to the position
of the element in the Periodic table. The atomic number is also equal to the
number of orbiting electrons in the neutral atom. If an atom gains or loses an
electron it becomes an ion. The ratio of the atomic radius to the nuclear radius
is about 20 000 in most atoms.
Nuclear nomenclature
Z: the atomic number, equal to the number of protons in the nucleus and
the position of the element in the Periodic table
A: the atomic mass number or nucleon number, equal to the number of
protons plus the number of neutrons in the nucleus
N = A – Z: the neutron number, equal to the number of neutrons in the
nucleus
A nucleus of element X can be represented in the following way:

$$^{A}_{Z}\mathrm{X}$$
For example, $^{12}_{6}\mathrm{C}$ represents the nucleus of carbon-12 with 6 protons and 6
neutrons.
Elements are determined by their chemical behavior and this depends on
the arrangement of their outer electrons. Since the electronic configuration is
determined by the charge on the nucleus, atoms with the same atomic number
are chemically identical even if the number of neutrons in the nucleus varies.
Atoms with the same atomic number and different neutron numbers (and
mass numbers) are isotopes of the same element. For example:
$$^{12}_{6}\mathrm{C},\quad ^{13}_{6}\mathrm{C},\quad \text{and}\quad ^{14}_{6}\mathrm{C}$$
are all isotopes of carbon. They are chemically identical but have slightly dif-
ferent physical properties because of their differing masses. Carbon-14 is also
unstable and, in common with many neutron-rich nuclei, it decays by beta-
minus emission.
25.2 IONIZING RADIATION

In 1896, the French physicist, Henri Becquerel, discovered that certain com-
pounds of uranium emitted penetrating radiation that could cause ioniza-
tion. He realized that this radiation was different from X-rays (discovered by
Roentgen a year earlier) because the new ionizing radiation could be deflected
by electric fields. Marie Curie coined the term “radioactivity” to describe the
emission of this new type of radiation. She also discovered two more elements
that were more radioactive than uranium: radium and polonium.
25.2.1 Types of Ionizing Radiation Emitted by Radioactive Sources

Three distinct types of ionizing radiation are emitted by radioactive sources:
alpha, beta, and gamma. The nature and properties of each type are summarized
below:
Radiation    Charge     Rest mass    Range in air∗∗    Stopped by∗∗          Nature of emission
Alpha (α)    +2e        4u           Few cm            Card/skin             Helium nucleus
Beta∗ (β)    −e         1/1840 u     ~1 m              Few mm of Al          Electron
Gamma (γ)    Neutral    0            Indefinite        Several cm of lead    Photon
∗ There are two types of beta emission, beta-minus (electrons) and beta-plus
(positrons). Here we are describing beta-minus.
∗∗ These depend on energy, so they vary between sources and are given here as
a typical indication only.

The diagrams below show how each type of radiation is affected by electric
and magnetic fields. Alpha particles deflect less than beta particles, and in the
opposite direction, because they have more momentum and opposite charge.
[Figure: paths of α, β, and γ radiation in an E-field (directed upwards) and in a B-field (directed into the page).]
25.3 ATTENUATION OF IONIZING RADIATION

When ionizing radiation passes through matter it transfers energy to the mate-
rial and the ionizing beam is attenuated. When ionizing radiation is emitted in
a vacuum it spreads out and the intensity of the radiation falls as a result of this.
25.3.1 Inverse-Square Law of Absorption

Radioactive emission is a random process so ionizing radiation is emitted
equally (on average) in all directions from a source. If the radiation is not
absorbed by the medium into which it is emitted then its intensity will fall as
an inverse-square law. This will apply to all radioactive emissions in a vacuum
but also to gamma-rays in air (since they are only weakly ionizing).
Consider a radioactive source that emits ionizing radiation at a rate R.
The intensity I of radiation at radius r will be:
$$I = \frac{R}{4\pi r^2}$$
This is an inverse-square law.
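A quick numerical sketch makes the inverse-square behavior concrete. The emission rate used here is an assumed example value, and `intensity` is a hypothetical helper; the only physics is I = R/(4πr²) for an unabsorbed point source.

```python
import math


def intensity(emission_rate, r):
    """Intensity (per m^2 per s) at distance r (m) from a point source
    emitting uniformly in all directions with no absorption."""
    return emission_rate / (4 * math.pi * r**2)


R = 3.7e4  # assumed emission rate, emissions per second
for r in (0.1, 0.2, 0.4):
    print(f"r = {r:.1f} m: I = {intensity(R, r):.0f} per m^2 per s")
```

Each doubling of the distance reduces the intensity to one quarter of its previous value.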
[Figure: a sphere of radius r centered on the source.]
25.3.2 Exponential Absorption and the Attenuation Coefficient

When gamma-ray photons travel through matter the probability of absorption per unit
length of their path is constant. This means that the proportional change
in beam intensity is always the same for the same thickness of material. For
example, if 1.2 cm of material reduces the beam intensity from I to 0.5 I then
the next 1.2 cm of the same material will reduce it to 0.25 I, and there is a
constant half-thickness.
[Figure: a beam of intensity I enters a slab of thickness δx at depth x and leaves with intensity I + δI.]
The diagram above shows part of the path of a gamma-ray beam through
matter. The intensity of the beam changes by an amount δI as it passes
through a short thickness δx of material. Since the beam is being absorbed, δI
is negative. The proportion absorbed per unit length is constant so:
$$\frac{\delta I}{I\,\delta x} = -\mu$$
where μ is the absorption coefficient (a constant) for the medium, with units $\text{m}^{-1}$.
In the limit that δx → 0 this becomes the first-order differential equation:
$$\frac{dI}{dx} = -\mu I$$
whose solution is:
$$I = I_0 e^{-\mu x}$$
Intensity falls exponentially from its initial value I0. As with all exponential
changes, it has a constant proportion property in that the intensity will always
fall by the same fraction in the same distance. We can therefore derive an
expression for the half-thickness x½ of the material, that is, the thickness of the
material that will reduce any initial intensity of the radiation by 50%.
$$e^{-\mu x_{1/2}} = \frac{1}{2}$$
$$x_{1/2} = \frac{\ln 2}{\mu}$$
The absorption coefficient and half-thickness depend on the energy of the
gamma-rays and the nature of the medium. Half-thickness (or “half value
layer, HVL”) is a useful quantity to use in radiological protection when com-
paring the effectiveness of different shielding materials or the penetrating
power of radiation.
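The link between half-thickness and the absorption coefficient can be sketched in a few lines. The example reuses the 1.2 cm half-thickness from the illustration earlier in this section; the function names are hypothetical helpers.

```python
import math


def absorption_coefficient(half_thickness):
    """Absorption coefficient mu from the half-thickness (mu = ln 2 / x_half).
    The result has units of 1 / (units of half_thickness)."""
    return math.log(2) / half_thickness


def transmitted_fraction(mu, x):
    """Fraction I / I0 of the beam intensity remaining after thickness x."""
    return math.exp(-mu * x)


mu = absorption_coefficient(1.2)  # per cm, for a 1.2 cm half-thickness
for x in (1.2, 2.4, 3.6):
    print(f"x = {x:.1f} cm: I/I0 = {transmitted_fraction(mu, x):.3f}")
```

Each additional half-thickness halves the remaining intensity, so the fractions run 0.5, 0.25, 0.125.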
X-ray penetration also decays exponentially with distance and the table below
gives half thicknesses of human tissue, aluminum, and lead for X-rays of three
different energies.
Half thickness, mm
Medium          30 keV photons    60 keV photons    120 keV photons
Human tissue    20                35                45
Aluminum        2.3               9.3               17
Lead            0.02              0.13              0.15
25.3.3 Absorption of Beta Radiation

Beta-radiation has a continuous range of energies even from the same source,
so while the transmission of monoenergetic beta particles does obey an
exponential decay law, the behavior of the continuous beta-ray spectrum is
more complex. There are also two distinct ways in which the electrons lose
energy – by collision and scattering and by radiating photons as they decelerate
(bremsstrahlung). In practice, we usually refer to the range of beta particles as
a function of their energy. However, instead of giving the range as a distance,
it is usually given in terms of surface density. This is the density of the material
multiplied by the range in that material. The reason for this is that the range
depends on the material of the absorber, being shorter in denser media, but
the surface density is equal to the product of range and density so it is approxi-
mately the same for a variety of different media.
Here are some values of surface density for beta particles of various energies:
Surface density, $\text{kg m}^{-2}$
Beta particle (electron) energy, MeV    Aluminum    Copper
0.1                                     0.188       0.221
1.0                                     5.55        6.29
10                                      585         615
The relationship between range R, surface density σ, and density ρ is:
$$R = \frac{\sigma}{\rho}$$
While the surface densities for aluminum and copper are similar, the range of
a 1.0 MeV beta particle is very different because the two elements have quite
different densities.
Range in aluminum:
$$R = \frac{\sigma}{\rho} = \frac{5.55}{2700} = 2.1 \times 10^{-3}\ \text{m}$$
Range in copper:
$$R = \frac{\sigma}{\rho} = \frac{6.29}{8960} = 0.70 \times 10^{-3}\ \text{m}$$
Copper is roughly three times as dense as aluminum so the range in copper is
roughly a third of the range in aluminum.
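The same arithmetic can be written as a small sketch. It uses the 1.0 MeV surface densities and the material densities quoted in the worked example; the dictionary layout and the name `beta_range` are illustrative choices only.

```python
# Surface densities for 1.0 MeV beta particles (kg per m^2) and
# material densities (kg per m^3), as quoted in the worked example.
SURFACE_DENSITY_1MEV = {"aluminum": 5.55, "copper": 6.29}
DENSITY = {"aluminum": 2700.0, "copper": 8960.0}


def beta_range(surface_density, density):
    """Range in metres: R = sigma / rho."""
    return surface_density / density


ranges = {
    material: beta_range(SURFACE_DENSITY_1MEV[material], DENSITY[material])
    for material in DENSITY
}
for material, r in ranges.items():
    print(f"{material}: R = {r * 1e3:.2f} mm")
print(f"ratio aluminum/copper = {ranges['aluminum'] / ranges['copper']:.1f}")
```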
25.3.4 Absorption of Alpha Particles

Alpha particles lose their energy incrementally as they ionize atoms in the
material, so alpha particles of a particular energy will have a particular range.
The nature of alpha particle emission (as we shall see) results in each source
emitting alpha particles of one or a small number of distinct energies. This
means that all alpha particles from a particular source will have one or more
distinct ranges in each medium.
Alpha particle tracks can be shown up using a device called a Wilson cloud
chamber. A cloud chamber contains a super-cooled vapor that is on the point
of condensing. When an alpha particle passes through the vapor tiny droplets
condense around the ions it creates. This leaves a visible track rather like
the vapor trail behind a jet aircraft. The diagram below shows tracks from an
alpha source that emits alpha particles of two distinct energies.
[Figure: tracks in a cloud chamber from an alpha source emitting particles of two distinct energies, E1 and E2.]
The high positive charge and large mass of an alpha particle make it interact
strongly with matter so that alpha particles transfer energy quickly and have
very short ranges in anything other than gas. Even in air they are stopped
within a few centimeters.
25.4 THE BIOLOGICAL EFFECTS OF IONIZING RADIATION

When ionizing radiation is absorbed by matter it transfers energy to the atoms
and molecules with which it interacts. This can cause ionization and the
breaking of atomic and molecular bonds. If ionizing radiation is absorbed by
living tissue it can cause cell damage. However, living things have evolved in a
low-level radioactive environment and damaged cells are continually repaired
and replaced. This suggests that the effects of low doses of radiation may be
small or even negligible and some scientists have suggested there might be
a threshold level of background radiation below which there are no harmful
effects, but this has not been proven. In the absence of certainty, we have to
assume that low levels of radiation exposure pose a low level of risk to the
health of living cells and we judge what is a low level by comparison with lev-
els of natural background radiation. If an experiment or a medical procedure
increases your annual dose by an amount that is small compared to the dose
you would receive naturally then it is acceptable. If it increases your annual
dose by much more than natural background radiation then it may be danger-
ous and cause a significant increase in the likelihood that you suffer from a
radiation-induced illness, such as cancer.
25.4.1 The Natural Background Radiation

Radioactive isotopes are present in minerals in the Earth’s rocks, soil, and
atmosphere, in building materials, in the food we eat, and in our own bod-
ies. Ionizing radiation is also produced by cosmic rays coming in from space
and from artificial sources such as medical and dental X-rays, nuclear weap-
ons testing, and the nuclear industry. We have evolved and live in a natu-
rally radioactive environment and our annual radiation dose varies depending
on where we live and what we do. Taking a transatlantic flight, for exam-
ple, increases our dose because there is less atmosphere above us to absorb
incoming cosmic rays.
[Figure: pie chart of the sources of the typical annual radiation dose: radon, rocks and soil, cosmic rays, food, medical, nuclear weapons, and nuclear industry.]
The pie chart above shows the origins of the typical annual radiation dose for
a UK citizen. By far the largest contribution comes from radon gas, a natural
product of the uranium decay series that seeps into the atmosphere from the
ground and can accumulate in cellars and ground floor rooms if they are not
properly ventilated.
25.4.2 Measuring Radiation Dose

Several different units are used to measure the absorbed radiation dose. The
gray is used to measure the total energy absorbed per kilogram of living tissue:
Radiation dose in gray = energy absorbed (J) / mass of tissue (kg)
1 gray (Gy) = 1 $\text{J kg}^{-1}$
This does not take into account the nature of the radiation absorbed. The
same dose in gray can produce very different effects depending on whether
the tissue has absorbed alpha, beta, or gamma radiation.
The equivalent dose takes into account the nature and energy of the radiation
absorbed by multiplying by a weighting or quality factor WR determined by
the type and energy of the radiation absorbed. Equivalent dose is measured
in sieverts (Sv):
Equivalent dose in sieverts = radiation dose in gray × weighting factor
The sievert is also equivalent to $\text{J kg}^{-1}$ because the weighting factor is dimensionless.
Typical weighting factors are:
X-rays, γ-rays, β-particles: WR = 1
Neutrons: WR = 2–5 (depending upon energy)
Protons: WR = 2
α-particles, fission products: WR = 20
When considering radiation hazards, we must also bear in mind penetrat-
ing power. Alpha particles have a very high weighting factor but are strongly
absorbed and are stopped by the outer layers of our skin. These layers are
mainly dead cells so alpha radiation is not particularly dangerous from out-
side the body. However, if an alpha source gets inside the body, for example,
breathing in a radioactive gas, it is particularly dangerous.
An older unit for equivalent dose is the rem (Roentgen equivalent man). This
is still used in many textbooks (especially in the USA). 1 rem = 0.01 Sv
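The dose definitions above can be combined in a short sketch. The weighting factors are the typical values listed in this section; the absorbed energy (2 mJ) and tissue mass (0.5 kg) are assumed example numbers, and the function names are hypothetical.

```python
# Typical radiation weighting factors, as listed in this section.
WEIGHTING_FACTOR = {"x-ray": 1, "gamma": 1, "beta": 1, "proton": 2, "alpha": 20}


def absorbed_dose_gy(energy_j, mass_kg):
    """Absorbed dose in gray: energy absorbed per kilogram of tissue."""
    return energy_j / mass_kg


def equivalent_dose_sv(dose_gy, radiation):
    """Equivalent dose in sieverts: absorbed dose times weighting factor."""
    return dose_gy * WEIGHTING_FACTOR[radiation]


def sv_to_rem(dose_sv):
    """Convert sieverts to rem, using 1 rem = 0.01 Sv."""
    return dose_sv / 0.01


# Assumed example: 2 mJ absorbed by 0.5 kg of tissue.
dose = absorbed_dose_gy(2e-3, 0.5)
for radiation in ("gamma", "alpha"):
    h = equivalent_dose_sv(dose, radiation)
    print(f"{radiation}: {dose:.3f} Gy -> {h:.3f} Sv ({sv_to_rem(h):.1f} rem)")
```

The same absorbed dose in gray gives an equivalent dose twenty times larger for alpha particles than for gamma-rays.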
25.4.3 The Effect of Radiation Dose on Human Health

There are many different factors that need to be considered when estimating
the risks to human health from ionizing radiation. These include the nature
and energy of the radiation, the tissues that are exposed to the radiation, the
radiation dose, and the rate at which that dose is received. Most of the infor-
mation about the effects of large absorbed doses comes from studies on the
survivors of the nuclear attacks on Hiroshima and Nagasaki and while we have
a pretty good idea about the immediate effects of very high exposures, the
long-term and random effects of lower doses are not well understood. The
table below gives an idea of the effects of different radiation doses.
According to UK government data, the annual average equivalent radiation
dose for a UK citizen is 2.7 mSv (0.27 rem) while in the USA this is 6.2 mSv
(0.62 rem).
Equivalent dose (in one day)
Dose, mSv      Dose, rem    Effects
0–250          0–25         No observable damage
250–1000       25–100       Mild symptoms, damage to bone marrow and spleen; short illness is likely; the higher the dose the more serious the effects
1000–3000      100–300      Nausea, radiation sickness, suppression of the immune system, and susceptibility to disease, more severe damage to spleen, lymph nodes and bone marrow; recovery often possible if treated
3000–10 000    300–1000     More severe than above but also hair loss, skin burns, diarrhea, hemorrhaging, damage to the central nervous system, sterilization … death likely (and almost certain if not treated)
Patients who survive high radiation doses have a significantly increased risk of
developing cancers such as leukemia in the future. Those who have not been
sterilized by the radiation also risk passing on genetic damage to their chil-
dren, resulting in stillbirths and birth defects. Approximately 50% of people
exposed to 5000 mSv (500 rem) will die.
The annual exposure limit for nuclear industry employees in the UK is 20
mSv, about double the dose from a CT scan of the spine.
25.4.4 Reducing Risks in the Laboratory

There are several simple precautions that should always be followed when
carrying out experiments with radioactive sources in a laboratory. The first
and most obvious is that a risk assessment should be carried out before start-
ing! This should include an estimate of the likely dose that will be received
during the duration of the experiment and a list of actions that will be taken
to minimize this dose. A judgment must also be made – do the benefits of
carrying out the experiment (e.g., educational or medical benefits) outweigh
the increase in the risk of damage to your or other people’s health? In a school
laboratory, sources are usually weak and well protected so that their use will
result in a negligible increase in dose compared to the natural background;
however, it is always important to be confident that this is the case.
Here are some additional procedures that should be used:
Keep the source inside its shielded container except when carrying out
the experiment
Maximize the distance between the source and the experimenters
Minimize the time of the experiment
Do not direct a collimated beam toward anyone
Always handle sources remotely, for example, using tongs
Include shielding between the experiment and the experimenters
Do not eat or drink while carrying out the experiment
Wash your hands thoroughly after the experiment
Cover any open cuts or scratches with a plaster
If you drop the source or suspect it is damaged in any way report this
immediately
25.5 RADIOACTIVE DECAY AND HALF-LIFE

Radioactive decay takes place inside the nucleus. It is a “random process”;
we cannot predict when a particular nucleus will decay or in which direction
the ionizing radiation will be emitted. It is also a “spontaneous process,” no
external factors (e.g., temperature or pressure) affect when a decay will occur.
Even though the process is random it is still possible to predict the pattern of
decay for a sample containing a large number of unstable nuclei. This is because the
probability of decay per unit of time remains constant, so the same
proportion of nuclei decays in each equal time interval.
Radioactive decay can be modeled quite simply using dice. Each die has 6
faces so the probability of landing with a “6” showing upwards is 1/6. If a large
number of dice are rolled together, we would expect about 1/6 of them to
show a “6”. This prediction becomes better the larger the number of dice.
In a similar way, the fraction of nuclei that decay in a particular time inter-
val is always the same for any particular type of unstable nucleus (particular
nuclide) and the rate of decay will be directly proportional to the number of
nuclei in the sample. This leads to a differential equation:
$$\frac{dN}{dt} \propto -N$$
$$\frac{dN}{dt} = -\lambda N$$
N is the number of nuclei present at time t and the minus sign indicates that
the number is falling with time.
λ is the decay constant and depends on the nuclide being considered. It has the
SI unit $\text{s}^{-1}$ and represents the probability of decay per unit of time in the limit
of small time intervals.
This is a first-order differential equation that can be solved by the separation
of variables. Its solution represents how the number of nuclei in the sample
varies with time:
$$N = N_0 e^{-\lambda t}$$
This is exponential decay. The term $e^{-\lambda t}$ is equal to the fraction of the initial
number of nuclei remaining after time t.
The “half-life” of the nuclide is the time taken for the number of nuclei in the
sample to halve.
$$e^{-\lambda t_{1/2}} = \frac{1}{2}$$
$$t_{1/2} = \frac{\ln 2}{\lambda}$$
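The half-life relation is easy to explore numerically. The sketch below uses three hypothetical decay constants (2.0, 1.0, and 0.5 per year), chosen purely for illustration, and checks that the fraction remaining after n half-lives is 1/2ⁿ.

```python
import math


def half_life(decay_constant):
    """Half-life from the decay constant: t_half = ln 2 / lambda."""
    return math.log(2) / decay_constant


def fraction_remaining(decay_constant, t):
    """Fraction N / N0 of nuclei remaining after time t."""
    return math.exp(-decay_constant * t)


# Hypothetical decay constants, per year.
for lam in (2.0, 1.0, 0.5):
    t_half = half_life(lam)
    after_three = fraction_remaining(lam, 3 * t_half)
    print(f"lambda = {lam} per y: t_half = {t_half:.2f} y, "
          f"fraction after 3 half-lives = {after_three:.3f}")
```

A larger decay constant means a shorter half-life, but after three half-lives the fraction remaining is 1/8 in every case.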
The graph below shows the decay curves for three nuclides, A, B, and C.
Their half-lives are shown on the time axis. A has a half-life of about 0.35 y, B
has a half-life of about 0.69 y and C has a half-life of about 1.4 y.
The fraction remaining after n half-lives is $1/2^n$. While the mathematical
model suggests that the sample never completely decays, the number of nuclei
remaining will eventually become so small that the model does not apply and
the random nature of radioactive decay will result in significant fluctuations in
decay rate until the last nucleus eventually decays.
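The dice model of decay described earlier can be sketched as a simple simulation. Each remaining die "decays" when it shows a 6, so the survival probability per throw is 5/6. The function name, the number of dice, and the fixed random seed are illustrative assumptions.

```python
import random


def simulate_dice_decay(n_dice, n_throws, seed=1):
    """Return the number of undecayed dice after each throw.

    A die 'decays' (is removed) when it shows a 6, so each die has a
    constant decay probability of 1/6 per throw.
    """
    rng = random.Random(seed)
    remaining = n_dice
    history = [remaining]
    for _ in range(n_throws):
        decayed = sum(1 for _ in range(remaining) if rng.randint(1, 6) == 6)
        remaining -= decayed
        history.append(remaining)
    return history


history = simulate_dice_decay(6000, 10)
for throw, count in enumerate(history):
    expected = 6000 * (5 / 6) ** throw  # exponential model prediction
    print(f"throw {throw:2d}: simulated {count:5d}, model {expected:7.1f}")
```

With thousands of dice the simulated numbers stay close to the exponential model; repeating the run with only a few dozen dice shows the much larger relative fluctuations expected when few nuclei remain.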
[Figure: decay curves for nuclides A, B, and C, plotting the percentage of N0 remaining (0 to 100) against time in years (0 to 5), with the half-lives of A, B, and C marked on the time axis.]
The “activity” of a radioactive source is defined as the number of disintegrations
(decays) per second taking place inside the source. This is not the same
as the count rate in a detector because many emissions will miss the detector
or not be detected and some will be absorbed before even leaving the source.
Activity = number of decays per second inside the source
$$A = -\frac{dN}{dt}$$
The SI unit for activity is the becquerel (Bq). 1 Bq = 1 decay per second.
This can be rewritten in terms of the number of nuclei in the source:
$$A = -\frac{dN}{dt} = \lambda N_0 e^{-\lambda t} = \lambda N$$
$$A = \lambda N$$
This is a useful relationship that can be used to determine the half-lives of
long-lived radioisotopes such as uranium. N can be determined from the mass
of the sample and A can be calculated from measurements of ionizing radia-
tion emitted by the sample – that is, a detector is placed close to the sample
and the count rate is measured and scaled up to find the activity. Once A
and N are known the decay constant can be calculated and used to find the
half-life.
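The procedure just described can be sketched numerically, taking the activity as a positive quantity so that λ = A/N. The sample below is an assumed illustration: 1.0 g of uranium-238 with a measured activity of 1.24 × 10⁴ Bq, a figure chosen to be close to the accepted specific activity. The function name and the approximate year length are also assumptions.

```python
import math

AVOGADRO = 6.02214076e23   # nuclei per mole
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds


def half_life_from_activity(mass_g, molar_mass_g, activity_bq):
    """Half-life (s) from sample mass and activity.

    N comes from the mass, lambda = A / N, and t_half = ln 2 / lambda.
    """
    n_nuclei = mass_g / molar_mass_g * AVOGADRO
    decay_constant = activity_bq / n_nuclei
    return math.log(2) / decay_constant


# Assumed illustrative measurement: 1.0 g of uranium-238, A = 1.24e4 Bq.
t_half_years = half_life_from_activity(1.0, 238.0, 1.24e4) / SECONDS_PER_YEAR
print(f"Estimated half-life: {t_half_years:.2e} years")
```

The result is a few billion years, far too long to measure by waiting for the activity of a sample to halve.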
25.6 NUCLEAR TRANSFORMATIONS

When a nucleus decays it changes to become a nucleus of a different element.
The nuclear transformation equations for alpha, beta-minus, and gamma
decays are shown below along with two additional decays: beta-plus and
electron-capture. The numbers at the top of each equation are nucleon numbers
(protons plus neutrons). The numbers at the bottom of each equation are
charges. Both nucleon numbers and charge numbers must balance on either
side of the equation.
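The balancing rule can be expressed as a small check. In this sketch each particle is represented by a hypothetical (A, Z) tuple of nucleon number and charge number, and two well-known decays (alpha decay of uranium-238 and beta-minus decay of carbon-14) are used as examples.

```python
def balances(before, after):
    """True if total nucleon number and total charge number are the same
    on both sides. Each particle is an (A, Z) tuple."""
    same_nucleons = sum(a for a, _ in before) == sum(a for a, _ in after)
    same_charge = sum(z for _, z in before) == sum(z for _, z in after)
    return same_nucleons and same_charge


# (nucleon number A, charge number Z)
U238, TH234, ALPHA = (238, 92), (234, 90), (4, 2)
C14, N14 = (14, 6), (14, 7)
BETA_MINUS, ANTINEUTRINO = (0, -1), (0, 0)

print(balances([U238], [TH234, ALPHA]))                  # alpha decay
print(balances([C14], [N14, BETA_MINUS, ANTINEUTRINO]))  # beta-minus decay
print(balances([C14], [N14]))                            # charge not balanced
```

The first two checks succeed, while the third fails because the emitted electron is needed to balance the charge.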
25.6.1 Alpha Decay

An alpha particle is a helium nucleus consisting of two protons and two
neutrons:
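$$^{4}_{2}\mathrm{He} \quad\text{or}\quad ^{4}_{2}\alpha$$
Here is the transformation equation for the alpha decay of uranium-238:
$$^{238}_{92}\mathrm{U} \rightarrow\ ^{234}_{90}\mathrm{Th} +\ ^{4}_{2}\alpha$$
The general equation for the alpha decay of X to Y would be:
$$^{A}_{Z}\mathrm{X} \rightarrow\ ^{A-4}_{Z-2}\mathrm{Y} +\ ^{4}_{2}\alpha$$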
A has reduced by 4 and Z has reduced by 2.
25.6.2 Beta-Minus Decay

A beta-particle is a high-energy electron created inside the nucleus during
beta-decay.

$$^{0}_{-1}e \quad\text{or}\quad ^{0}_{-1}\beta$$
This is a subtle process involving the weak nuclear force and occurs when a neutron
decays to become a proton inside the nucleus. In addition to the creation of
an electron, conservation of lepton number (see Section 26.4.1) demands that
an anti-neutrino is also emitted. This is the anti-particle of the neutrino; it is
neutral, has a tiny rest mass, and is like an uncharged electron. Its symbol is: $^{0}_{0}\bar{\nu}$
The bar over the symbol indicates that this is an anti-neutrino.
Here is the transformation equation for beta-minus decay of carbon-14:

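$$^{14}_{6}\mathrm{C} \rightarrow\ ^{14}_{7}\mathrm{N} +\ ^{0}_{-1}\beta +\ ^{0}_{0}\bar{\nu}$$
The general equation for the beta-minus decay of X to Y would be:
$$^{A}_{Z}\mathrm{X} \rightarrow\ ^{A}_{Z+1}\mathrm{Y} +\ ^{0}_{-1}\beta +\ ^{0}_{0}\bar{\nu}$$
There is no change in mass number A because one type of nucleon, a neutron,
has changed into another type, a proton. Z increases by 1. The underlying
nucleon transformation is:
$$^{1}_{0}n \rightarrow\ ^{1}_{1}p +\ ^{0}_{-1}\beta +\ ^{0}_{0}\bar{\nu}$$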
Beta decay results in three particles moving away from the position of the
original nucleus. This can occur in a variety of ways with almost all the energy
released being shared by the electron and anti-neutrino. The way in which
this energy is shared is random so the beta-particles emitted from a source
have a continuous range of kinetic energies up to some maximum value.
This differs from alpha decay where, with only two emerging particles, the
energy is shared in a definite way and the alpha particles have a discrete
energy spectrum.
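[Figure: beta-particle energy spectrum, plotting the number of beta particles against kinetic energy; the continuous distribution ends at a maximum (cut-off) energy.]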
25.6.3 Gamma Emission

Gamma-rays are high-frequency electromagnetic photons. Gamma-ray emis-
sion occurs after some alpha or beta decay when the resultant nucleus is left
in an excited state. The excited nucleus loses energy by making discrete quan-
tum jumps to lower states until it reaches its lowest or ground state. Each
quantum jump results in the emission of a gamma-ray photon whose energy is
equal to the difference in energy between the nuclear energy levels:
$$\Delta E = hf$$
This results in a discrete energy spectrum for the gamma-rays.

Cobalt-60 decays to nickel-60 by emitting a beta-minus particle. However, the
decay can leave the nickel nucleus in one of two excited states (indicated by
asterisks below), each of which then decays by emitting a gamma-ray.
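[Figure: energy-level diagram for the decay of $^{60}_{27}\mathrm{Co}$. A 0.31 MeV beta-minus decay leads to the higher excited state $^{60}_{28}\mathrm{Ni}^{**}$, which emits a 1.17 MeV gamma-ray to reach $^{60}_{28}\mathrm{Ni}^{*}$. A 1.48 MeV beta-minus decay leads directly to $^{60}_{28}\mathrm{Ni}^{*}$, which emits a 1.33 MeV gamma-ray to reach the ground state $^{60}_{28}\mathrm{Ni}$.]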
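The relation ΔE = hf can be applied to the two cobalt-60 gamma energies quoted above (1.17 MeV and 1.33 MeV). The sketch below converts each energy to a photon frequency and also checks that the quoted beta and gamma energies are mutually consistent; `photon_frequency` is a hypothetical helper name.

```python
PLANCK = 6.62607015e-34     # Planck constant, J s
E_CHARGE = 1.602176634e-19  # joules per electronvolt


def photon_frequency(energy_mev):
    """Photon frequency (Hz) from its energy, using f = delta_E / h."""
    return energy_mev * 1e6 * E_CHARGE / PLANCK


for energy in (1.17, 1.33):
    print(f"{energy} MeV gamma-ray: f = {photon_frequency(energy):.2e} Hz")

# Consistency check on the level scheme: the 0.31 MeV beta decay followed
# by the 1.17 MeV gamma-ray releases the same energy as the 1.48 MeV beta.
print(abs((0.31 + 1.17) - 1.48) < 1e-9)
```

Both frequencies are of order 10²⁰ Hz, far above the visible range (about 10¹⁴ to 10¹⁵ Hz).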
25.6.4 Beta-Plus Emission

Some proton-rich nuclei can decay by emitting a positron, an anti-electron.
This converts a proton in the nucleus into a neutron. The emission of the
positron is accompanied by the emission of a neutrino and the positron and
neutrino share the energy of the decay in a random way. This results in a con-
tinuous spectrum of positron energies. The nuclide oxygen-15 is a beta-plus
emitter:

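$$^{15}_{8}\mathrm{O} \rightarrow\ ^{15}_{7}\mathrm{N} +\ ^{0}_{+1}\beta +\ ^{0}_{0}\nu$$
The general equation for beta-plus decay is:
$$^{A}_{Z}\mathrm{X} \rightarrow\ ^{A}_{Z-1}\mathrm{Y} +\ ^{0}_{+1}\beta +\ ^{0}_{0}\nu$$
and the underlying nucleon decay is:
$$^{1}_{1}p \rightarrow\ ^{1}_{0}n +\ ^{0}_{+1}\beta +\ ^{0}_{0}\nu$$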
The mass number A is unchanged but atomic number Z decreases by 1.
25.6.5 Electron-Capture
Electron-capture is another way that a proton-rich nucleus can decay. The
nucleus captures one of the atom’s inner electrons and combines it with a pro-
ton to create a neutron and emit a neutrino, so the net effect on the nucleus
is the same as beta-plus decay. However, the loss of an inner electron allows
other electrons to cascade down to lower energy levels inside the atom emit-
ting characteristic X-rays.
The general equation for electron-capture is:

$$^{A}_{Z}\mathrm{X} +\ ^{0}_{-1}e \rightarrow\ ^{A}_{Z-1}\mathrm{Y} +\ ^{0}_{0}\nu$$
and the underlying nucleon transformation is:
$$^{1}_{1}p +\ ^{0}_{-1}e \rightarrow\ ^{1}_{0}n +\ ^{0}_{0}\nu$$
Electron capture is an alternative mode of decay for nuclei that undergo beta-
plus decay.
25.7 RADIATION DETECTORS

Alpha, beta, and gamma emissions are all forms of ionizing radiation so ioni-
zation is the key to detection. A single alpha particle, for example, can create
tens of thousands of ion pairs before it is stopped in the air. Air is usually an
insulator but when ionizing radiation passes through the air the ion pairs cre-
ated make the air slightly conducting. This change in its electrical properties,
or in the electrical properties of other gases can be used to make a radiation
detector.
Alpha particles are very strongly ionizing, much more so than beta particles
or gamma rays and if an alpha source is held close to a charged electroscope
the electroscope is discharged. Ions created in the air close to the cap create a
weak conducting path to earth. A more sensitive detection method is required
for beta particles and gamma rays unless the source is particularly intense.
[Figure: an alpha source held near the cap of a negatively charged electroscope creates ion pairs in the air; positive ions are attracted to the cap and negative ions are repelled from it.]
25.7.1 The Spark Counter

If the electric field strength in air is high enough the air becomes ionized and
it begins to conduct: a spark is formed. The breakdown field strength in air
is about $3 \times 10^6\ \text{V m}^{-1}$. In the spark counter the field strength between a fine
wire and a metal grille is adjusted so that it is just less than this value. When
an alpha source is brought nearby so that ionizing radiation enters the gap
between the electrodes, electrons are accelerated in the field, collide with
air molecules and cause further ionization. This results in an avalanche of
charge that forms a spark. The greater the intensity of the ionizing radia-
tion the higher the spark rate. Spark detectors only work well with strongly
ionizing radiation such as alpha particles, but the principle behind the spark
counter is used in many particle and radiation detectors.
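A rough order-of-magnitude sketch shows why a spark counter needs a supply of a few kilovolts. It assumes a gap of about 1 mm between the wire and the grille and treats the field as uniform (V = Ed). This is only a crude approximation, because the field close to a fine wire is strongly non-uniform.

```python
BREAKDOWN_FIELD = 3e6  # breakdown field strength of air, V per m


def breakdown_voltage(gap_m, field=BREAKDOWN_FIELD):
    """Voltage needed to reach the breakdown field across a gap,
    assuming a uniform field (V = E * d)."""
    return field * gap_m


gap = 1e-3  # assumed gap of about 1 mm
print(f"Approximate breakdown voltage: {breakdown_voltage(gap):.0f} V")
```

In practice the operating voltage is found experimentally by raising the supply until sparking just begins and then reducing it slightly.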
[Figure: spark counter. A variable H.T. supply (0-5000 V) is connected between a fine wire and a fine metal grille separated by a gap of about 1 mm; an alpha source is held nearby.]
25.7.2 The Geiger Counter

The Geiger counter is the most familiar radiation detector. It is really a more
sophisticated version of the spark counter. However, it is much more sensitive
than the basic spark counter and it counts individual ionization events inside
a Geiger-Müller tube.
The tube itself contains an inert gas at low pressure and a fine wire anode lies
along the axis of the tube. When a potential difference is applied between this
anode and the surrounding cylindrical cathode there is a radial electric field
inside the tube. When ionizing radiation passes through the low-pressure gas
some of the gas molecules become ionized. The electric field between the anode
and cathode accelerates these ions and there is a small pulse of current in the
external circuit. This generates a voltage pulse that can be detected and counted.
Alpha and beta radiation enter the Geiger-Müller tube through a thin end
window but gamma-radiation can also enter through the walls of the tube.
All of the pulses are identical, regardless of the type of radiation so a Geiger
counter does not indicate the type of radiation detected. There is also a “dead
time” following each voltage pulse. During this time the detector will not reg-
ister any further ionization events. This limits the maximum count rate that
can be measured accurately.
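[Figure: Geiger-Müller tube. A cylindrical metal cathode filled with low-pressure inert gas (e.g., helium, neon, or argon) surrounds a central wire anode (+); radiation enters through a thin end window, and a variable 400-600 V supply is connected across the tube, producing a voltage pulse for each detected event.]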
The count rate from a Geiger counter should not be confused with the activity
of a source. The latter is the number of disintegrations per second inside the
source. The count rate on the Geiger counter is related to this but is usually
only a small fraction of the activity because of the emissions that miss the
counter, are not detected by it, or are absorbed inside the source or between
the source and the detector.
Count rates are usually recorded as counts per minute (cpm) and often have
to be corrected for the average background count where the experiment is
being carried out.
25.7.3 Using a Geiger Counter to Measure Count Rates

Before measuring the count rate from a radioactive source we must measure
the average background count from the surroundings. This is done by setting
up all of the apparatus except the source and then taking a series of measure-
ments using the Geiger counter. Typically, we might make five readings, each
over five minutes, and then use the average number of counts per minute
(cpm) as the average background count. After this has been done we can
carry out the experiment with the Geiger counter in the same position and
the source present and measure the total counts per minute in a similar way
(based on an average of five readings of five minutes each). The corrected
count is then the difference between the total count rate and the average
background count rate.
Corrected count rate (ccpm) = total count rate (tcpm) - average background
count rate (bcpm)
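The background correction described above can be sketched in a few lines of Python. The readings below are hypothetical values chosen only to illustrate the arithmetic, and the function name `corrected_count_rate` is an illustrative choice, not part of any standard library.

```python
from statistics import mean


def corrected_count_rate(total_counts, background_counts, minutes_per_reading=5.0):
    """Return (tcpm, bcpm, ccpm) from lists of raw counts, one entry per reading."""
    tcpm = mean(total_counts) / minutes_per_reading        # total counts per minute
    bcpm = mean(background_counts) / minutes_per_reading   # background counts per minute
    return tcpm, bcpm, tcpm - bcpm                         # corrected = total - background


# Hypothetical data: five 5-minute readings without the source, then five with it.
background = [98, 105, 101, 96, 100]
with_source = [905, 890, 912, 897, 896]

tcpm, bcpm, ccpm = corrected_count_rate(with_source, background)
print(f"total = {tcpm:.1f} cpm, background = {bcpm:.1f} cpm, corrected = {ccpm:.1f} cpm")
```

Averaging several readings before subtracting reduces the effect of the random fluctuations in both the source and the background counts.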
The diagram below shows a possible arrangement of apparatus to measure the
penetrating power of beta rays through aluminum:
[Figure: a sealed source and a GM tube held a constant distance apart, with an absorber of variable thickness between them; the GM tube is connected to a counter or rate meter.]

25.8 USING RADIOACTIVE SOURCES

All of the properties of ionizing radiation have useful applications.
The exponential decay and constant half-life can be used like a clock to
date archaeological samples and minerals on Earth and elsewhere in the
solar system (as described below).
The penetrating power of ionizing radiation can be used to measure the
thickness of materials and to locate medical tracers inside the body.
The energy of ionizing radiation can be used to sterilize equipment and
to kill cancer cells.
The absorption of ionizing radiation can be used to form images of pipe-
lines and structures and to locate cracks and defects.
25.8.1 Radiological Dating

The best-known dating technique uses the isotope carbon-14, a beta-minus
emitter with a half-life of 5730 years. Approximately 1 in $10^{12}$ atoms of carbon
in the Earth’s atmosphere is carbon-14. The rest are carbon-12 and carbon-13,
both of which are stable.
Atmospheric carbon-14 is continually replenished by cosmic rays from space
which bombard the atmosphere resulting in a flux of neutrons. When neutrons
strike nitrogen atoms (78% of the Earth’s atmosphere is nitrogen) carbon-14
is created:

$${}^{1}_{0}\mathrm{n} + {}^{14}_{7}\mathrm{N} \rightarrow {}^{14}_{6}\mathrm{C} + {}^{1}_{1}\mathrm{p}$$
This maintains a roughly constant proportion of carbon-14.

Living things continually exchange carbon with the atmosphere so, while they
are alive, they contain the same proportion of carbon-14 as the atmosphere.
When a living thing dies the exchange ceases so the proportion of carbon-14
decays exponentially. Radiocarbon dating measures the fraction of carbon-14
in the carbon content of a sample of once-living material and uses this to esti-
mate the age of the sample.
If the fraction in the atmosphere is $f_0$ and the fraction in the sample is $f$:
$$f = f_0 e^{-\lambda t}$$
where $\lambda$ is the decay constant for carbon-14.

The age of the sample is then:
$$t = \frac{\ln(f/f_0)}{-\lambda} = -\frac{t_{1/2}\,\ln(f/f_0)}{\ln 2}$$
where $t_{1/2} = \ln 2/\lambda$ is the half-life of carbon-14.
This method assumes that there has been no contamination of the sample
and that the fraction of carbon-14 in the atmosphere has remained constant
over the time period being measured. Radiocarbon dating methods are cali-
brated against other methods such as dendrochronology (tree ring counting).
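As a minimal sketch, the age calculation can be written as a short Python function. It uses the relation t = −ln(f/f0)/λ with λ = ln 2 / (half-life) and the 5730-year half-life quoted above. The function name and the sample ratios are illustrative assumptions, and no calibration correction is applied.

```python
import math

HALF_LIFE_C14_YEARS = 5730.0  # half-life of carbon-14 quoted in the text


def radiocarbon_age(fraction_remaining, half_life=HALF_LIFE_C14_YEARS):
    """Age in years from the ratio f/f0, assuming a constant atmospheric fraction f0."""
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("f/f0 must lie in the interval (0, 1]")
    decay_constant = math.log(2) / half_life           # lambda, per year
    return -math.log(fraction_remaining) / decay_constant


# Illustrative ratios: one half-life, two half-lives, and 10% remaining.
for ratio in (0.5, 0.25, 0.10):
    print(f"f/f0 = {ratio:.2f} -> age = {radiocarbon_age(ratio):.0f} years")
```

A ratio of 0.5 returns one half-life and 0.25 returns two, which is a quick check that the logarithms have been handled with the correct sign.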
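[Figure: graph of the percentage of carbon-14 remaining (f/f0 × 100%, from 0 to 100) against time in years, with the time axis marked at successive half-lives (5730, 11 460, and 17 190 years).]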
25.8.2 Radiological Dating of Rocks

Radiocarbon dating cannot be used to date non-organic material unless
there is some organic material known to be of the same age associated
with it. However, radioactive dating methods (radiometric dating) using
other isotopes can be used to date rocks. One method uses two unstable
isotopes of uranium, uranium-238 (with a half-life of 4.5 billion years) and
uranium-235 (with a half-life of 700 million years). Both decay, via many steps,
to stable isotopes of lead (lead-206 and lead-207 respectively) so the ratio of
lead to uranium grows as time goes on and this ratio can be used as a clock.
Another method uses the decay of potassium-40 to argon-40 (with a half-life
of 1.3 billion years). All such methods rely on assumptions about the initial
state of the rocks, for example, that there was no argon present in a rock
when it solidified so that all of the argon now present must have come from
the decay of potassium, and about the isolation of the rock since its forma-
tion, that is, so that no potassium or argon has been added from or lost to
external sources.
25.9 EXERCISES
1. The discovery of the electron by J. J. Thomson in 1897 showed that atoms
are not fundamental particles. Thomson thought that atoms were like
“plum puddings” with negatively charged electrons embedded in a sphere
of positive matter. However, Rutherford’s interpretation of the alpha par-
ticle scattering experiment carried out by Geiger and Marsden showed
that Thomson’s model was not correct and led to the nuclear model of
the atom.
(a) Use a diagram to describe Rutherford’s alpha particle scattering
experiment.
(b) List the main observations from the experiment.
(c) Explain how evidence from the experiment supports the idea that:
i. Atoms are mainly empty space.
ii. The nucleus is charged.
iii. The nucleus contains most of the mass of the atom.
(d) Explain how, if we know the incident energy of the alpha particles, it
is possible to estimate an upper limit for the size of a nucleus.
(e) Calculate the closest approach of 5.0 MeV alpha particles to a gold
nucleus (Z = 79).
(f) State an assumption that was made to carry out the calculation in (e).
2. State and explain two advantages of using high-energy electron beams

from an accelerator rather than alpha particles from radioactive sources
to measure nuclear diameters.

3. (a) Copy and complete the table below for neutral atoms.
Isotope Symbol Atomic Nucleon protons electrons neutrons
number number
Hydrogen 1
12
Carbon - 12 6 C
Carbon -13
Carbon - 14
Oxygen - 16 8
56
Iron - 26 Fe
Gold - 79 197
Uranium - 235 92
Uranium - 238
(b) Use examples from the table to explain what is meant by an “isotope.”
(c) The heavier nuclei, such as iron, gold, and uranium, have an excess of
neutrons. Suggest a reason for this.
4. A radioactive rock is tested in a school laboratory. Here are the results:
Set-up Number of counts in 5 minutes
No rock present 100
Rock alone present 900
Rock behind card 402
Rock behind 2 mm aluminum sheet 398
Rock behind 1 cm lead 123
Rock alone present 24 hours later 197
(a) What is the average background count (in cpm)?
(b) Why must we use the term “average”?

(c) Roughly, what is the half-life of the source?
(d) What is/are the main type(s) of radiation emitted by the rock? Give
reasons for your answer.
5. Carbon-14 has a half-life of 5700 years. While alive, organisms exchange
carbon with their surroundings and maintain a constant small fraction of
the isotope carbon-14. When a living creature dies its carbon-14 content
is not replenished so this fraction falls. If the activity due to carbon-14 is

measured, the time since the organism died can be calculated.
(a) Calculate the decay constant of carbon-14.

(b) A sample taken from an archaeological relic contains only 1/16 of the
carbon-14 that would be present in a sample of the same mass taken
from a living creature. How old is the relic? What assumptions have
been made?
(c) How long does it take after death for the ratio of carbon-14 to carbon-12
to fall to 10% of its original value?
6. Two radioisotopes, X and Y, have half-lives of 1 hour and 2 hours respec-
tively. N atoms of X have an activity equal to that of M atoms of Y. What
is the ratio M/N?
7. Nuclide P decays to nuclide Q with half-life of 1000 years. Q is stable. At
t = 0, there are N atoms of P and none of the Q present.
Work out the number of nuclei of P and Q after 1000, 2000, 3000, and
5000 years. Tabulate your results and use them to sketch a graph showing
how the number of atoms of both nuclides varies with time.
8. Gamma-ray photons of a particular energy from a radioactive source have
a 98% probability of penetrating 0.10 mm of lead.
(a) A lead shield is placed in front of the source. Show that the intensity of
transmitted radiation falls off exponentially with the thickness of the
lead shield.
(b) Calculate the half-thickness of lead for this radiation.
(c) For safety reasons the intensity of gamma-rays from this source must
be reduced to 1 % of its original value. Calculate the thickness of lead
shielding required to do this.
9. Discuss the advantages and disadvantages of using radioactive sources as
tracers in medical diagnosis.
10. There are four naturally occurring radioactive series, all of which end with a
stable isotope of lead. The uranium series starts with uranium-238 (Z = 92)
and ends with lead-206 (Z = 82).
(a) All of the decays in the series are either alpha decays or beta-minus
decays. State the number of each type of decay in the entire series.

(b) The first four nuclides in the series are: uranium-238 (atomic
number 92), thorium-234, protactinium-234, and uranium-234. Write
down balanced nuclear transformation equations for the first three
decays.
(c) The half-lives of the four nuclides above are: $4.51 \times 10^{9}$ years, 24.1 days,
77 s, and $2.47 \times 10^{7}$ years. Suggest how the relative abundances of each
nuclide compare.
(d) The isotope bismuth-214 (Z = 83) can decay by either alpha or beta-minus
decay, but whichever decay it undergoes the series reaches lead-210
(atomic number 82) in two steps. Write down the nuclear transforma-
tion equations to show how this occurs by each route. (The element with
Z = 81 is thallium and the element with Z = 84 is polonium).
11. An experiment was carried out using a Geiger counter to monitor the
activity of a radioactive source. The table below gives the average counts
per minute, corrected for background, recorded during the experiment.
Time, s    Activity, $10^{13}$ s$^{-1}$
0          1.64
50         1.36
100        1.13
150        0.935
200        0.775
250        0.643
300        0.533
350        0.422
400        0.366
450        0.304
500        0.252
Use a log-linear graph to determine the decay constant and half-life of
this source.

CHAPTER
26
Nuclear Physics
26.1 NUCLEAR ENERGY CHANGES

Nucleons inside nuclei are in bound states. This means that their total energy
is negative and work would have to be done to separate them. The work that
must be done is called the binding energy of the nucleus. Note that this is not
energy in the nucleus, it is energy that must be supplied to break it apart or
that would be released if the nucleus formed from separate nucleons. When
nuclear transformations such as radioactive decay, nuclear fission, or nuclear
fusion occur, the total binding energy of the system changes and energy can
be released, for example, in the form of the kinetic energy of the products of
the reaction. Einstein’s mass-energy relationship, $E = mc^2$, links these energy
changes to mass changes.
26.1.1 Nuclear Binding Energy

The binding energy of a nucleus is equal to the work that must be done to
separate all of the nucleons in the nucleus so that they are no longer interact-
ing. In theory, this would mean separating them to infinite distances but the
strong nuclear force that binds them together is extremely short range and
effectively falls to zero at distances greater than a few times $10^{-15}$ m.
The energy of a nucleon is lower when it is inside the nucleus than when it is
a free particle so its mass is also lower. The total mass of a nucleus is there-
fore less than the sum of the rest masses of the free nucleons from which it is
made. The difference in mass between the mass of the free nucleons and the
mass of the nucleus is called the mass defect ∆m for the nucleus. The binding
energy B.E. of the nucleus is equal to the mass defect multiplied by the speed
of light squared.
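[Figure: variation of the strong nuclear force with separation, showing repulsion at very short range and attraction out to about $3 \times 10^{-15}$ m.]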
Mass defect = (sum of masses of free nucleons) − (mass of nucleus)
$$\Delta m = Z m_p + N m_n - M_{\text{nucleus}}$$
Binding energy: $\text{B.E.} = c^2 \Delta m$
Binding energy per nucleon: $\text{B.E.}/A = c^2 \Delta m / A$
When carrying out calculations of nuclear binding energy it is important to
use nuclear mass and not atomic mass. Most tables of data give atomic masses
so the mass of Z electrons must be subtracted from this. The data below has
been used to calculate the binding energy and binding energy per nucleon of
an oxygen-16 nucleus.
Particle           Mass, kg
electron           $9.1094 \times 10^{-31}$
proton             $1.6726 \times 10^{-27}$
neutron            $1.6750 \times 10^{-27}$
Oxygen-16 atom     $26.5676 \times 10^{-27}$
Mass of oxygen-16 nucleus = (mass of oxygen-16 atom) − 8 × (mass of electron)
= $26.5603 \times 10^{-27}$ kg
Mass of nucleons = 8 × (mass of neutron) + 8 × (mass of proton)
= $26.7808 \times 10^{-27}$ kg
Mass defect = $\Delta m$ = (mass of nucleons) − (mass of nucleus) = $0.2205 \times 10^{-27}$ kg
Binding energy = $c^2 \Delta m$ = $1.9845 \times 10^{-11}$ J = 124.0 MeV
Binding energy per nucleon = B.E./16 = 7.752 MeV/nucleon
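The oxygen-16 calculation can be checked with a short Python sketch. It uses the masses from the table above and assumes the rounded constants c = 3.00 × 10^8 m/s and 1 eV = 1.602 × 10^−19 J, which approximately reproduce the figures in the worked example. The function name is an illustrative choice.

```python
C = 3.00e8            # speed of light in m/s (rounded; an assumption of this sketch)
E_CHARGE = 1.602e-19  # joules per electronvolt (rounded; an assumption of this sketch)

# Masses in kg, taken from the table in the text.
M_ELECTRON = 9.1094e-31
M_PROTON = 1.6726e-27
M_NEUTRON = 1.6750e-27
M_O16_ATOM = 26.5676e-27


def binding_energy_mev(atom_mass, z, n):
    """Binding energy in MeV from an atomic mass, proton number z and neutron number n."""
    nuclear_mass = atom_mass - z * M_ELECTRON                   # strip the z electrons
    mass_defect = z * M_PROTON + n * M_NEUTRON - nuclear_mass   # free nucleons minus nucleus
    return mass_defect * C**2 / E_CHARGE / 1e6                  # joules -> MeV


be = binding_energy_mev(M_O16_ATOM, z=8, n=8)
print(f"B.E. = {be:.1f} MeV, B.E. per nucleon = {be / 16:.2f} MeV")
```

Because the mass defect is a small difference between two much larger numbers, the result is sensitive to the last digits of the input masses, so tabulated masses should not be rounded before subtracting.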
26.1.2 Atomic Mass Units (amu)

Atomic masses are usually stated in terms of unified atomic mass units, u.
1 u = 1/12 × (mass of an unbound carbon-12 atom in its ground state)
1 u = $1.660\,539\,040 \times 10^{-27}$ kg
The mass of a carbon-12 atom in these units is 12 u and 1 u is approximately
equal to the mass of a nucleon (proton or neutron).
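The energy equivalent of 1 u is $1.494 \times 10^{-10}$ J = 932.9 MeV (taking $c = 3.00 \times 10^{8}$ m s$^{-1}$; with the precise value of $c$ it is closer to 931.5 MeV).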
Nuclear energy calculations are often carried out by finding the mass defect in
atomic mass units and then multiplying by the energy equivalent to 1 u.
26.1.3 Energy Released by Nuclear Decays

To calculate the energy released in a radioactive decay process we must find
the mass defect for the reaction. Here are some examples.
Alpha decay of uranium-238

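$${}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\alpha$$
The relevant atomic masses are:
Atom                          Mass
${}^{4}_{2}\mathrm{He}$       4.002603 u
${}^{234}_{90}\mathrm{Th}$    234.04364 u
${}^{238}_{92}\mathrm{U}$     238.05082 u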

The mass defect for the reaction is the difference between the nuclear masses
on each side of the equation. To calculate the nuclear masses we need to subtract
$92m_e$ from the left-hand side and $(90 + 2)m_e$ from the right-hand side.
These electron masses cancel so we can work directly with the atomic masses:
∆m = 238.05082 u − (234.04364 + 4.002603) u = 0.00458 u
The energy released is: E = 932.9 × 0.00458 = 4.27 MeV
This is shared between the alpha particle and the recoiling nucleus and linear
momentum must be conserved so (in the reference frame of the original ura-
nium-238 atom) the two must travel in opposite directions and:
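$$m_{\text{nucleus}} v_{\text{nucleus}} = m_\alpha v_\alpha$$
so that:
$$v_\alpha = \frac{m_{\text{nucleus}}}{m_\alpha} v_{\text{nucleus}} = \frac{234}{4} v_{\text{nucleus}}$$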
The alpha particle travels much faster than the recoiling nucleus and carries
away most of the kinetic energy.
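$$E_\alpha = \frac{1}{2} m_\alpha v_\alpha^2 = \frac{1}{2} m_\alpha \left(\frac{m_{\text{nucleus}}}{m_\alpha}\right)^2 v_{\text{nucleus}}^2 = \frac{m_{\text{nucleus}}}{m_\alpha} E_{\text{nucleus}} = \frac{234}{4} E_{\text{nucleus}}$$
Since the total energy released is $E = E_\alpha + E_{\text{nucleus}}$, the alpha particle gets $\frac{234}{238}E$, about 98% of the energy released.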
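The energy sharing can be illustrated numerically. The sketch below assumes non-relativistic motion, a parent nucleus at rest, and masses approximated by the nucleon numbers 4 and 234, as in the argument above. The function name is hypothetical.

```python
def share_decay_energy(total_energy, mass_alpha=4.0, mass_nucleus=234.0):
    """Split the decay energy between alpha particle and recoiling nucleus.

    Momentum conservation for a parent at rest gives kinetic energies
    in inverse proportion to the masses of the two products.
    """
    total_mass = mass_alpha + mass_nucleus
    e_alpha = total_energy * mass_nucleus / total_mass
    e_nucleus = total_energy * mass_alpha / total_mass
    return e_alpha, e_nucleus


# Energy released in the alpha decay of uranium-238, in MeV (from the text).
e_alpha, e_nucleus = share_decay_energy(4.27)
print(f"alpha: {e_alpha:.2f} MeV, recoiling nucleus: {e_nucleus:.3f} MeV")
```

The lighter particle always takes the larger share, which is why alpha particles from a given decay emerge with a well-defined kinetic energy just below the total energy released.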
Beta-minus decay of carbon-14
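$${}^{14}_{6}\mathrm{C} \rightarrow {}^{14}_{7}\mathrm{N} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$$
Atom                         Mass
${}^{0}_{-1}\beta$           0.000549 u
${}^{14}_{6}\mathrm{C}$      14.003242 u
${}^{14}_{7}\mathrm{N}$      14.003074 u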
The neutrino has negligible rest mass.

The mass defect for the reaction is the difference between the total rest masses
on each side of the equation. To calculate the nuclear masses of carbon-14 and
nitrogen-14 we must subtract $6m_e$ from the atomic mass of carbon-14 and
$7m_e$ from the atomic mass of nitrogen-14.
Mass defect = (14.003242 − 6 × 0.000549) u − ((14.003074 − 7 × 0.000549)
+ 0.000549) u
When the beta-minus is included, the electron masses cancel:
∆m = (14.003242 − 14.003074) u = 0.000168 u
The energy released is: E = 932.9 × 0.000168 = 0.157 MeV = 157 keV
This energy is shared between the recoiling nitrogen-14 nucleus, the beta-particle,
and the neutrino. The masses of the neutrino and beta-particle are negligible
compared to the mass of the nucleus so virtually all the energy goes
to the two light particles. However, this energy is shared randomly between
them so beta-particles from a carbon-14 source are emitted with a continuous
range of kinetic energies up to a cut-off value of 157 keV.
26.2 NUCLEAR STABILITY

Inside an atomic nucleus the nucleons are bound together by the strong
nuclear force. However, the protons are all positively charged and repel one
another so the strong nuclear forces and the electrostatic forces act in oppo-
sition. These two forces are very different – the nuclear force becomes very
strong at short distances, binding nucleons together when they are close, but
it has a very short range so it only strongly affects nearest neighbors. The elec-
trostatic force, however, obeys an inverse-square law, so all the protons exert
significant repulsive forces on all other protons in the nucleus.
For very large nuclei the cumulative effect of Coulomb repulsion outweighs
the attractive force between nearest neighbor nucleons and the nucleus
becomes unstable. Ultimately this sets a limit to the size of stable nuclei and
explains why the Periodic Table contains only about 100 elements.
26.2.1 Nuclear Configuration and Stability

If a graph of neutron number against proton number is drawn the stable
nuclei form a narrow band. On each side of this band, there are unstable
nuclei with specific decay modes that result in product nuclei lying closer to
the band of stability.
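[Figure: plot of neutron number (N = A − Z) against proton number (Z) showing the band of stability, with beta-minus emitters on one side of the band, beta-plus emitters and electron capture nuclei on the other, and alpha emitters among the heaviest nuclei.]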
Beta-minus emitters: N/Z too high – neutron-rich nuclei that approach
stability by converting a neutron to a proton.
Beta-plus emitters and electron capture nuclei: N/Z too low – proton-rich
nuclei that approach stability by converting a proton to a neutron.
Alpha emitters: N/Z too low – heavy proton-rich nuclei that approach stability
by reducing both N and Z by 2. Since N > Z this increases the ratio of
N to Z.
Nuclear fission: some heavy nuclei close to the top of the band can
undergo induced or spontaneous nuclear fission to create pairs of neu-
tron-rich daughter nuclei which lie about half-way down the band and are
beta-minus emitters.

The diagram below shows the effect of alpha and beta decays on a plot of
proton number against neutron number:
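[Figure: grid of neutron number (N − 2 to N + 2) against proton number (Z − 2 to Z + 2) showing the displacement of a nuclide caused by alpha decay, beta-minus decay, and beta-plus decay or electron capture (ec).]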
26.2.2 Nuclear Binding Energy and Stability

The diagram below shows the variation of nuclear binding energy per nucleon
with nucleon number across the Periodic Table.
[Figure: graph of binding energy per nucleon (in MeV, from 0 to 10) against nucleon number (from 0 to 250).]

It is better to compare binding energy per nucleon rather than total nuclear
binding energy because the latter depends on the number of nucleons in the
nucleus so that a large value does not necessarily mean that the nucleus is
particularly stable.
Binding energy per nucleon increases rapidly with nucleon number for
light nuclei.
The curve has a peak value in the region of iron-56 (nickel-62 has a marginally
higher value), so nuclides close to iron-56 are the most stable.
For nuclides heavier than iron-56 the binding energy per nucleon gradu-
ally falls.
Most nuclides (from oxygen to uranium) have a binding energy per
nucleon between 7.5 and 8.5 MeV/nucleon.
Some light nuclides, such as helium-4, carbon-12 and oxygen-16 have
particularly large binding energy per nucleon compared to other nearby
nuclides.
26.3 NUCLEAR FISSION AND NUCLEAR FUSION

Einstein’s equation $E = mc^2$ explains where the energy released in radioactive
decay comes from but it also raised the question of whether there were other
ways to release nuclear potential energy. There are two important processes
that can be used to do this, nuclear fission and nuclear fusion. Nuclear fission
is the splitting of a heavy nucleus to create two lighter daughter nuclei and
nuclear fusion is the combination of two light nuclei to create a more massive
nucleus. The binding energy per nucleon curve shows that both processes will
release energy.
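[Figure: sketch of the binding energy per nucleon (B.E./A) curve showing nuclear fusion of light nuclei and nuclear fission of a heavy nucleus into ‘daughter nuclei’, both of which move the products toward higher B.E./A.]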

The initial steepness of the curve shows that nuclear fusion releases more
energy per kilogram of fuel than nuclear fission. However, the high nucleon
number of the fissioning nucleus shows that nuclear fission releases more
energy per reaction than nuclear fusion.
26.3.1 Nuclear Fission
Nuclear fission was discovered by Otto Hahn and Lise Meitner in 1938. They
were bombarding uranium with neutrons and noticed that lighter nuclei with
mass numbers approximately half that of uranium were being formed. They
came to the correct conclusion that the neutrons had induced fission reactions
in nuclei of uranium. However, natural uranium consists almost entirely of
two isotopes, uranium-238 (99.3%) and uranium-235 (0.7%). The fissionable
isotope is uranium-235 while uranium-238 is more likely to absorb neutrons
than fission.
There are many ways that the uranium-235 nucleus can split when it absorbs a
neutron. Here is one nuclear equation for the induced fission of uranium-235:
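$${}^{1}_{0}\mathrm{n} + {}^{235}_{92}\mathrm{U} \rightarrow {}^{144}_{56}\mathrm{Ba} + {}^{90}_{36}\mathrm{Kr} + 2\,{}^{1}_{0}\mathrm{n}$$
Atom                          Mass
${}^{1}_{0}\mathrm{n}$        1.009 u
${}^{235}_{92}\mathrm{U}$     235.044 u
${}^{144}_{56}\mathrm{Ba}$    143.923 u
${}^{90}_{36}\mathrm{Kr}$     89.920 u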
We need to work with nuclear masses, but since there are equal numbers of
electrons to subtract from each side of the equation we can, once again, sim-
ply use the atomic masses:
Mass defect = (235.044 + 1.009) u − (143.923 + 89.920 + 2×1.009) u = 0.192 u
The energy released is: E = 932.9 × 0.192 = 179 MeV
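The same bookkeeping applies to any nuclear reaction, so it can be wrapped in a small helper. The sketch below uses the atomic masses from the table above and the energy equivalent of 932.9 MeV per u used in this chapter. The function name is an illustrative assumption.

```python
U_TO_MEV = 932.9  # energy equivalent of 1 u in MeV, as used in the text


def energy_released_mev(reactant_masses_u, product_masses_u):
    """Energy released in MeV from the masses (in u) on each side of a reaction."""
    mass_defect = sum(reactant_masses_u) - sum(product_masses_u)
    return mass_defect * U_TO_MEV


# Masses in u from the table in the text.
M_N, M_U235, M_BA144, M_KR90 = 1.009, 235.044, 143.923, 89.920

# n + U-235 -> Ba-144 + Kr-90 + 2n
q = energy_released_mev([M_N, M_U235], [M_BA144, M_KR90, M_N, M_N])
print(f"energy released = {q:.0f} MeV ({q / 236:.2f} MeV per nucleon)")
```

Dividing by the 236 nucleons involved gives the energy per nucleon, which is the more useful figure when comparing fission with other energy sources.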
The fact that additional neutrons are emitted could lead to further nuclear
fission reactions. The Hungarian physicist Leo Szilard realized that it would
be possible to initiate a chain reaction if more than one neutron per fission on
average went on to cause further fission reactions.

A chain reaction can release a huge amount of energy. If it is allowed to run

out of control this energy is released explosively – this is the principle behind
an “atom bomb.” However, if the chain reaction can be controlled at a con-
stant power, it can be used to generate electricity. This is the principle behind
thermal nuclear power stations. The main problem facing both approaches
is that 99.3% of natural uranium is uranium-238, a neutron absorber. This
reduces the average number of neutrons per fission that can initiate further
fission and effectively stops the reaction.
[Figure: a chain reaction started by an incident neutron striking a uranium-235 nucleus, which undergoes fission and releases further neutrons.]
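The condition for a chain reaction can be illustrated with a very simple model. In the sketch below, k is the average number of neutrons per fission that go on to cause a further fission. The symbol k, the function name, and the choice of 50 generations are assumptions made for illustration; the model ignores timing and the geometry of the material.

```python
def neutron_population(k, generations, initial=1.0):
    """Relative number of fissions in each generation for a fixed value of k."""
    return [initial * k**g for g in range(generations + 1)]


# k < 1: the reaction dies away; k = 1: steady; k > 1: exponential growth.
for k in (0.9, 1.0, 1.1):
    final = neutron_population(k, 50)[-1]
    print(f"k = {k:.1f}: relative fission rate after 50 generations = {final:.3g}")
```

Even a value of k only slightly greater than 1 leads to rapid growth, because the number of fissions is multiplied by k in every generation.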
One way around this problem is to use enriched uranium, that is, uranium
with a higher content of U-235, but this is difficult to obtain in large quanti-
ties because U-235 and U-238 are isotopes of the same chemical element and
so have the same chemical properties. Various separation techniques have
been used but the most successful involves centrifuging uranium hexafluor-
ide, a gaseous uranium compound, to increase the concentration of U-235.
Since this method can be used to produce fuel for nuclear reactors (about
4% enrichment) and for weapons (about 80% enrichment) it is very difficult
to distinguish between the peaceful production of enriched uranium and its
production for atomic bombs.
26.3.2 The Principle of the Atomic Bomb

The first atomic bombs were developed in the USA during the Second World
War. In August 1945, atomic bombs were dropped on the Japanese cities of
Hiroshima and Nagasaki. The Hiroshima bomb used fission of uranium-235

and the Nagasaki bomb used fission of plutonium-239.
To get an uncontrolled chain reaction a minimum or critical amount of fission-
able material had to be produced. This is because the surface area to volume
ratio decreases with the volume of material. The rate at which neutrons are
produced depends on the volume of the sample, whereas the number lost
from the surface depends on its surface area, so decreasing this ratio tips the
balance toward the chain reaction. When a critical assembly is produced, the
reaction proceeds very rapidly and there is a huge explosion.
In both the Hiroshima and Nagasaki bombs sub-critical amounts of fission-
able material were made critical when the bomb detonated. However, the
techniques used were quite different. The Hiroshima bomb used a gun design
where a sub-critical amount of uranium-235 was fired (by conventional explosives)
into another sub-critical amount so that the combination became critical
and exploded.
[Figure: gun design, in which a conventional explosive (detonator) fires one sub-critical mass of uranium-235 into a second sub-critical mass of uranium-235.]
In the Nagasaki bomb, a spherical lump of plutonium-239 was surrounded by

shaped conventional explosives. When it detonated shock waves compressed
the sphere so that its surface area reduced and it became critical. It too
exploded. The energy released by each bomb was equivalent to the explosion
of roughly 15 000 to 20 000 tonnes of TNT (conventional high explosive).
[Figure: implosion design, in which shaped explosives surround a sub-critical assembly of plutonium-239 and compress it into a critical assembly.]
While atom bombs are incredibly powerful weapons they release far less
energy than a hydrogen bomb, which is based on nuclear fusion reactions.
In order to initiate the intense conditions of temperature and pressure under
which nuclear fusion reactions take place, a fission weapon is detonated first.
Because the fusion reactions depend on these extreme temperatures, such
weapons are called “thermonuclear devices.”
26.3.3 Nuclear Reactors

In the core of a nuclear reactor the chain reaction is allowed to grow steadily
until it reaches a constant power level and then it is kept at that level. Under
these conditions, on average, exactly one neutron per fission goes on to initiate further
fission reactions. These reactions heat the core and energy is extracted by
pumping a coolant, usually water, through the core. The heated water is used
to generate steam to drive turbogenerators. In many respects, this is the same
principle as in a fossil-fueled power station. The main difference is that the
source of the energy is nuclear rather than chemical.
To achieve a stable critical assembly a “moderator” is used. This is a material
that slows down the neutrons emitted when a uranium-235 nucleus undergoes
fission. The fast neutrons emitted by fission have a chance of being absorbed
by a U-238 nucleus or initiating fission in a U-235 nucleus. Slowing them down
increases both probabilities but has a much greater effect on the probability
of initiating fission in U-235 so it allows a chain reaction to proceed at lower
levels of enrichment than are needed for the spontaneous chain reaction in a
bomb. The moderator contains nuclei of relatively light elements (e.g., car-
bon, water, or heavy water) so that fast neutrons make a number of collisions
with these nuclei and transfer energy and momentum to them, slowing down
as they do so. Eventually, the mean kinetic energy of the neutron is compa-
rable to the thermal kinetic energy of particles in the moderator, so they are
called “thermal neutrons.” Fuel rods are surrounded by moderating material:
[Figure: fuel rods surrounded by moderator. A fast neutron from fission is slowed in the moderator, and the resulting slow (thermal) neutron initiates fission in another fuel rod.]

The reaction is controlled using neutron-absorbing material such as boron or

cadmium in control rods that can be raised or lowered inside the core of the
reactor. In an emergency, the control rods are dropped into the core and the
chain reaction stops.
Coolant is pumped through the core of the reactor. This extracts energy to
generate electricity but also stops the reactor core overheating and melting
(“melt down”). Failure of the coolant system activates fail-safe systems that
immediately lower the control rods and switch off the reactor.
The diagram shows a simplified arrangement for a pressurized water reactor.
This is the most common type of nuclear reactor and is in use in many coun-
tries including the USA, France, Russia, Japan, and China.
[Figure: simplified pressurized water reactor showing the containment, pressure vessel, control rods, fuel rods, core, light water moderator, pump, steam generator (heat exchanger), steam to the turbogenerator, and cool water condensed from steam.]

There are two cooling circuits. In the primary circuit, water is pumped up
through the core and then returns to the core via a heat exchanger. In the heat
exchanger energy is transferred to water in the secondary circuit. This gener-
ates steam that is used to drive a turbogenerator to generate electricity. The
steam is then condensed and returned to the heat exchanger. Cold water from
a lake or river is needed to operate the condenser and large cooling towers are
used to cool this water once it has returned from the condenser.
26.3.4 Plutonium
Plutonium is a fissile material that can be used in bombs and reactors. It
does not occur naturally on Earth in any significant quantity, but it is created
as a by-product of nuclear fission reactions in a reactor core. Neutrons
that are absorbed by uranium-238 create an unstable and short-lived nuclide,
uranium-239, which undergoes a beta-minus decay to form neptunium-239.
This is also unstable with a relatively short half-life and it undergoes a second
beta-minus decay to form plutonium-239:
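$${}^{1}_{0}\mathrm{n} + {}^{238}_{92}\mathrm{U} \rightarrow {}^{239}_{92}\mathrm{U}$$
$${}^{239}_{92}\mathrm{U} \rightarrow {}^{239}_{93}\mathrm{Np} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$$
$${}^{239}_{93}\mathrm{Np} \rightarrow {}^{239}_{94}\mathrm{Pu} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$$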
Plutonium can be harvested from spent fuel rods. This is called “reprocessing.”
26.3.5 Nuclear Fusion

Nuclear fusion is the combination of two light nuclei to form a more mas-
sive nucleus with the release of a great deal of energy. Here is an example of a
nuclear fusion reaction which combines two isotopes of hydrogen (deuterium
and tritium) to form helium-4 and release a neutron. This reaction might be
used in future fusion reactors:
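$${}^{2}_{1}\mathrm{H} + {}^{3}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{1}_{0}\mathrm{n}$$
Atom                        Mass
${}^{1}_{0}\mathrm{n}$      1.008664 u
${}^{2}_{1}\mathrm{H}$      2.014102 u
${}^{3}_{1}\mathrm{H}$      3.0160492 u
${}^{4}_{2}\mathrm{He}$     4.002603 u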
We need to work with nuclear masses, but since there are equal numbers of
electrons to subtract from each side of the equation we can, once again, sim-
ply use the atomic masses.
Mass defect = (2.014102 + 3.0160492) u − (4.002603 + 1.008664) u = 0.0189 u
The energy released is: E = 932.9 × 0.0189 = 17.6 MeV
This is about 3.5 MeV/nucleon compared to about 0.76 MeV/nucleon from
nuclear fission (combustion releases less than 1 eV per nucleon!)
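The comparison between fusion and fission can be reproduced with a few lines of Python. The sketch uses the atomic masses tabulated above, the 932.9 MeV per u conversion used in this chapter, and the 179 MeV fission energy from Section 26.3.1. The variable names are illustrative.

```python
U_TO_MEV = 932.9  # energy equivalent of 1 u in MeV, as used in the text

# Atomic masses in u from the table in the text.
M_D, M_T, M_HE4, M_N = 2.014102, 3.0160492, 4.002603, 1.008664

# Deuterium + tritium -> helium-4 + neutron
mass_defect = (M_D + M_T) - (M_HE4 + M_N)
fusion_energy = mass_defect * U_TO_MEV        # MeV per reaction
fusion_per_nucleon = fusion_energy / 5        # 2 + 3 nucleons take part

# Induced fission of uranium-235: 179 MeV shared among 236 nucleons.
fission_per_nucleon = 179 / 236

print(f"fusion:  {fusion_energy:.1f} MeV, {fusion_per_nucleon:.1f} MeV per nucleon")
print(f"fission: 179 MeV, {fission_per_nucleon:.2f} MeV per nucleon")
print(f"ratio per nucleon: {fusion_per_nucleon / fission_per_nucleon:.1f}")
```

Per reaction, fission releases roughly ten times more energy, but per nucleon (and so per kilogram of fuel) fusion releases several times more.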
For nuclear fusion to take place the reacting nuclei must come close enough
(a few times $10^{-15}$ m) for the short-range strong nuclear force to bind them
together. However, all nuclei are positively charged and repel one another.
To approach close enough for fusion to take place they must have a very
large kinetic energy. This can be achieved by accelerating the nuclei and
then crashing them together in a device such as the Large Hadron Collider
(LHC) at CERN or by confining the reactants and heating them to extreme
temperatures.
Three situations that involve nuclear fusion reactions are:
Nucleosynthesis – the formation of heavy nuclei from light nuclei in the
cores of stars.
Thermonuclear weapons – the nuclear fusion of isotopes of hydrogen
in a bomb.
Fusion reactors – commercial reactors designed to produce electrical
energy from nuclear fusion.
26.3.6 Nucleosynthesis
Soon after the Big Bang the early Universe consisted mainly of hydro-
gen with some helium and trace amounts of other nuclei. Nuclei of all the
heavier elements were formed (and are still being formed) by nuclear fusion
reactions taking place in stars. Nuclei of elements up to iron-56 are formed
in the cores of stars during most of their “normal life” (when they are on the
“main sequence”). Nuclei beyond iron-56 are formed when very massive stars
explode at the end of their lives (forming supernovae).
Stars form when clouds of gas and dust collapse under their own gravita-
tional forces. As gravitational potential energy falls the gas and dust heat up
and when the temperature and pressure at the core of the collapsing mass
become high enough nuclear fusion reactions begin. At this point, a proto-
star is formed. The higher the mass of the star the more extreme the core
conditions and elements higher up the Periodic table can form. This process
of nucleosynthesis stops at iron-56 because this is the most stable nuclide.
Nuclides lighter than this are formed by exothermic reactions whereas those
beyond iron-56 are only formed in endothermic fusion reactions and so need
an external source of energy. The energy released in the core by exothermic
nuclear fusion reactions generates an outward radiation pressure that sup-
ports the star against further gravitational collapse during most of its “life.”
[Figure: three stages in the early life of a star. Star formation: gravitational collapse of gas and dust. Nuclear fusion reactions ignite in the core: a protostar forms. Star in equilibrium: nucleosynthesis reactions in the core; outward radiation pressure balances pressure from gravitational forces.]
As fuel for the fusion reactions in the core runs out gravitational forces cause
the core to collapse. What happens next depends on the mass of the star
(see Section 28.1.1) but when stars of mass greater than about 10 times the
mass of the Sun collapse, they undergo a sequence of fusion reactions and
create all the nuclides up to iron-56 and then explode in a supernova. Some of
the energy released in the explosion creates heavier nuclei up to uranium and
the explosion distributes them throughout space.
Our Sun is a medium-sized star and will spend almost all of its life synthesizing
helium from the hydrogen in its core. The net effect is to convert four protons
into a helium nucleus but the probability of this happening in one step by a
fortunate collision of four particles with enough energy to get close enough
to fuse is effectively zero. The main process by which helium is created in the
Sun is called the proton-proton cycle which proceeds in three steps.
Step 1: two protons collide to form a deuteron, a positron (anti-electron),
and a neutrino:
$$ {}^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{2}_{1}\mathrm{H} + {}^{0}_{+1}\mathrm{e} + {}^{0}_{0}\nu $$
Step 2 (twice): a proton collides with a deuteron to form a nucleus of
helium-3 and emit a gamma-ray:
$$ {}^{1}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + {}^{0}_{0}\gamma $$
Step 3: two helium-3 nuclei collide to form a helium-4 nucleus and
release two protons:
$$ {}^{3}_{2}\mathrm{He} + {}^{3}_{2}\mathrm{He} \rightarrow {}^{4}_{2}\mathrm{He} + 2\,{}^{1}_{1}\mathrm{H} $$
This process releases about 26 MeV per helium-4 nucleus produced. The
overall reaction for the proton-proton cycle is then:
$$ 4\,{}^{1}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + 2\,{}^{0}_{+1}\mathrm{e} + 2\,{}^{0}_{0}\nu $$
The two positrons created in the core almost immediately annihilate with
electrons creating high energy gamma-rays that contribute to the outward
radiation pressure that supports the star. The neutrinos are very weakly inter-
acting and pass through the outer layers of the Sun and into space. The flux
of solar neutrinos detected on Earth gives astronomers a way to monitor the
fusion processes going on in the Sun’s core. A different sequence of fusion
reactions (the CNO cycle) creates about 1% of the Sun’s energy and is dominant
in stars above about 1.3 solar masses. Astronomers estimate that there
is enough hydrogen left in the core for the Sun, which was formed about 5
billion years ago, to continue to shine for another 5 billion years.
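The figure of about 26 MeV per helium-4 nucleus can be checked from the mass defect of the overall reaction. The sketch below uses the helium-4 atomic mass from Section 26.3.5; the atomic mass of hydrogen-1 (1.007825 u) and the conversion factor of 931.5 MeV per u are assumed values that do not appear in the tables above.

```python
MEV_PER_U = 931.5          # energy equivalent of 1 u in MeV (assumed rounded value)
MASS_H1_ATOM = 1.007825    # atomic mass of hydrogen-1 in u (assumed value)
MASS_HE4_ATOM = 4.002603   # atomic mass of helium-4 in u (from Section 26.3.5)

# Four hydrogen atoms carry four electrons; the helium atom keeps two of them
# and the other two annihilate with the two positrons. Atomic masses therefore
# account for the annihilation energy as well as the fusion energy.
mass_defect_u = 4 * MASS_H1_ATOM - MASS_HE4_ATOM
energy_per_helium_mev = mass_defect_u * MEV_PER_U

print(f"Mass defect: {mass_defect_u:.6f} u")
print(f"Energy per helium-4 nucleus: {energy_per_helium_mev:.1f} MeV")
```

A small part of this energy is carried away by the neutrinos and does not heat the Sun.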
26.3.7 Thermonuclear Weapons

The need to form a critical mass and for a significant amount of the fissile
material in that mass to fission in a chain reaction limits the maximum yield
of an atom bomb (fission weapon). A thermonuclear weapon uses a fission
explosion to create the extreme temperatures and pressures under which isotopes
of hydrogen can fuse. This makes the yield of a fusion weapon (hydrogen
bomb) effectively unlimited. The most powerful thermonuclear bomb
ever exploded was the RDS-220 or “Tsar Bomba” detonated by the Soviet Union in 1961.
It released an energy equivalent to about 50 million tonnes of TNT, that is
more than 3000 times the energy of the atom bomb dropped on Hiroshima.
The Teller-Ulam design of a hydrogen bomb, as shown below, has two dis-
tinct parts: a primary device rather like the implosion weapon dropped on
Nagasaki, and a secondary device containing the fusion fuel that is imploded
by a focused shock wave from detonation of the primary device. A fissile
“spark plug” runs through the center of the second device and both enhances
the reaction and propagates the shock wave that compresses the fusion fuel.
The implosion creates the extreme conditions needed for the fuel (ultimately
deuterium and tritium) to undergo nuclear fusion reactions and release a
huge amount of energy.
[Figure: Teller-Ulam design. Primary device: shaped explosives surrounding a fissile core. Secondary device: fusion fuel surrounding a fissile “spark plug.”]
26.3.8 Fusion Reactors

All existing commercial nuclear reactors use nuclear fission reactions in ura-
nium or plutonium. While this is a well-established technology there are sig-
nificant disadvantages to the use of nuclear fission to generate electricity:
Fission reactors produce high level nuclear waste consisting of a complex
cocktail of radioactive nuclides. Proposed strategies for the long-term safe
storage of nuclear waste (where these exist) are controversial.
The fuel for fission reactors, uranium and plutonium, is limited, and
sources of fuel are not accessible to all nations equally.
Equipment needed to enrich uranium for reactors can also be used to
enrich it to weapons-grade so it is difficult to distinguish peaceful from
military nuclear programs.
When a reactor fails, for example, in a melt-down of the core, there is the
potential for massive environmental damage (e.g., Chernobyl, Fukushima)
as radioisotopes escape from the reactor.
Nuclear fission reactors are complex and the cost of decommissioning
reactors at the end of their working life adds a significant amount to the
cost of the electricity they generate.
If a reactor could be constructed that would generate electricity commercially
from nuclear fusion reactions it would avoid most of the problems associated
with nuclear fission.
Fusion reactions involve light nuclei and produce very little radioactive
waste.
The fuel for fusion reactors, isotopes of hydrogen, is plentiful and spread
all around the world.
Nuclear fusion reactors work in a very different way to thermonuclear
weapons so it would be easier to distinguish peaceful from military pro-
jects.
When fusion reactions fail the reactor simply switches off; there is no core
and no possibility of a meltdown.
While the cost of a commercial fusion reactor is likely to be high the
decommissioning costs should be significantly less than for a fission reac-
tor because there is no highly radioactive core or spent fuel rods to deal
with.
The challenge of creating a commercial fusion reactor is very great and stems
from the need to create and maintain the extreme conditions under which
fusion reactions can take place. The two main approaches are:
Tokamaks – reactors that confine a plasma (a gas of ionized particles) in
a toroidal magnetic field and then heat it to a high temperature to initiate
the fusion reactions.
Inertial confinement – focusing intense lasers onto a pellet of fusion
fuel so that it implodes, reaches extreme temperatures and pressures, and
fuses.
Both methods have achieved some success in producing fusion reactions but
have a long way to go before they can be used in a commercial reactor. The
largest current project is the International Thermonuclear Experimental
Reactor (ITER) which is being built in France by an international collabora-
tion of countries representing over half the world’s population. Its aim is to
test the feasibility of magnetic fusion reactors (Tokamaks).
The tokamak method is likely to use the fusion of deuterium (about 0.015%
of all hydrogen atoms on Earth are deuterium) and tritium, which can be cre-
ated from lithium (a common element in the Earth’s crust):
$$ {}^{2}_{1}\mathrm{H} + {}^{3}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{1}_{0}\mathrm{n} $$
The reactor would be surrounded by a lithium blanket. Once self-sustaining

fusion reactions are taking place in the plasma a constant flux of neutrons can
be absorbed in the blanket where they convert lithium to tritium, effectively
creating more fuel for the reactor, for example,
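$$ {}^{7}_{3}\mathrm{Li} + {}^{1}_{0}\mathrm{n} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{3}_{1}\mathrm{H} + {}^{1}_{0}\mathrm{n} $$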
The blanket would also heat up and a suitable coolant (water) could be pumped
through the lithium blanket and used to raise steam to drive turbo-generators.
26.4 PARTICLE PHYSICS

Throughout the 20th century experiments with cosmic rays and in nuclear
physics revealed the existence of many new subatomic particles, with a range
of different properties. As more and more were discovered, patterns and rela-
tionships between them were identified and physicists suspected that there
might be something like the Periodic Table for these particles. They were
right, and, as with the Periodic Table, the patterns were present because of an
underlying structure to the particles.
The different types of atom are all constructed from just three types of parti-
cle: protons, neutrons, and electrons. It turns out that all of the subatomic par-
ticles can be explained in terms of two simple classes of particles: the quarks
and the leptons. The grand scheme that was constructed to explain particle
physics is called the Standard Model. Here we will simply describe the main
characteristics of each type of particle and the structure of the model.
26.4.1 Leptons
The anti-neutrino, emitted in beta-minus decay, and the neutrino, emitted in
beta-plus decay, are closely related to the electron and positron that are also
emitted in those decays. These are all “leptons,” particles that interact by the
weak nuclear force. Surprisingly it turns out that there are also heavier ver-
sions of these particles so that there are three generations of leptons:
Generation   Charged lepton/anti-particle   Neutrino/anti-neutrino
First        Electron and positron          Electron-neutrino and anti-electron-neutrino
Second       Muon and anti-muon             Muon-neutrino and anti-muon-neutrino
Third        Tau and anti-tau               Tau-neutrino and anti-tau-neutrino
The muon and tau are effectively more massive versions of the electron and
they tend to undergo decays that eventually produce electrons. The neutrinos
are all neutral and have very low rest mass. Beams of neutrinos of any one type
begin to oscillate between the different “flavors” of neutrino so that soon the
beam contains equal numbers of electron-, muon-, and tau-neutrinos. The
discovery that this occurs solved the so-called “solar neutrino problem” where
only about 1/3 of the electron-neutrinos emitted by nuclear fusion reactions in
the Sun were detected here on Earth. The other 2/3 had oscillated to
muon-neutrinos or tau-neutrinos en route.
26.4.2 Hadrons and Quarks

Hadrons are strongly interacting particles like protons and neutrons, but they
are not fundamental: they consist of combinations of smaller particles called
quarks. Most hadrons contain quark pairs (mesons) or quark triplets (baryons)
and the quarks, like the leptons, come in three generations, each with a pair
of different “flavors.”
Generation   Particle/anti-particle   Particle/anti-particle
First        Up and anti-up           Down and anti-down
Second       Charm and anti-charm     Strange and anti-strange
Third        Top and anti-top         Bottom and anti-bottom
The up, charm, and top quarks have a charge of + 2/3 e.

The down, strange, and bottom quarks have a charge of − 1/3 e.
Mass increases as we go down the table so that hadrons formed from bottom
quarks tend to be very heavy (and highly unstable).
Theory suggests that quarks cannot exist as free particles and we do not see
fractional charges in nature. Quarks bind together with the “color force” which
is the origin of the strong nuclear force, and they do so into triplets (baryons)
or pairs (mesons) with integer or no charge.
Protons consist of two up quarks and one down quark: uud. Charge = (2/3 + 2/3 − 1/3) e = +e
Neutrons consist of one up quark and two down quarks: udd. Charge = (2/3 − 1/3 − 1/3) e = 0
While quarks interact with the strong force they are also affected by the weak
nuclear force and this can cause them to change their flavor. For example,
when a beta-minus decay occurs and a neutron changes to a proton, one down
quark inside the neutron changes to an up quark (this is a weak interaction)
creating an electron and anti-neutrino in the process.
Mesons are particles made from a quark and antiquark pair. For example, the
positive pi-meson π+ consists of an up quark and an anti-down quark: $u\bar{d}$.
Virtual pi-mesons are exchanged between protons and neutrons to bind them
together in nuclei.
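The charge of any hadron follows by adding the charges of its quarks, with each antiquark carrying the opposite charge to its quark. The sketch below does this with exact fractions. The function name `hadron_charge` and the use of a trailing `~` to mark an antiquark are illustrative conventions, not standard notation.

```python
from fractions import Fraction

# Quark charges in units of the elementary charge e.
QUARK_CHARGE = {
    "u": Fraction(2, 3), "c": Fraction(2, 3), "t": Fraction(2, 3),
    "d": Fraction(-1, 3), "s": Fraction(-1, 3), "b": Fraction(-1, 3),
}


def hadron_charge(quarks):
    """Sum the quark charges; a trailing '~' (illustrative notation) marks an antiquark."""
    total = Fraction(0)
    for quark in quarks:
        if quark.endswith("~"):
            total -= QUARK_CHARGE[quark[:-1]]  # an antiquark has the opposite charge
        else:
            total += QUARK_CHARGE[quark]
    return total


proton = hadron_charge(["u", "u", "d"])
neutron = hadron_charge(["u", "d", "d"])
pi_plus = hadron_charge(["u", "d~"])

print(f"proton: {proton} e, neutron: {neutron} e, pi+: {pi_plus} e")
```

Every triplet of quarks and every quark-antiquark pair gives an integer result, which is consistent with the absence of fractional charges in nature.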
26.4.3 The Fundamental Interactions

There are four fundamental interactions in nature:
Gravitation – infinite range; described by Einstein’s general theory of
relativity.
Electromagnetism – infinite range; described by Maxwell’s equations
and quantum electrodynamics.
The weak nuclear force – short range; described by electroweak theory.
The strong nuclear force (color force) – short range; described by
quantum chromodynamics.
Three of these have been described using quantum theory and two of them,
the weak nuclear force and electromagnetism, have been unified into a sin-
gle theory, electroweak theory, so that they are seen as manifestations of the
same underlying interaction. The color interaction is described by a similar
quantum mechanism so it too is expected to unify with the electroweak inter-
action to form a single super interaction at very high energies.
[Figure: two electrons interacting through the exchange of virtual photons.]
The underlying process that explains how these three quantum forces work is
based on the “exchange of virtual particles.” A virtual particle can be created
by “borrowing” energy from the Universe for a short time and then “paying
it back” when the particle disappears. This is possible because of the energy-
time Uncertainty principle in quantum theory. For example, the electromag-
netic repulsion between two electrons comes about as a result of an exchange
of virtual photons.
The exchange particles for electromagnetism are photons, for the weak force
they are W+, W−, and Z0 particles and for the strong force they are differ-
ent kinds of gluons. Richard Feynman developed a pictorial way to represent
interactions. His method was useful because it provided a link to the mathe-
matical methods needed to solve problems in quantum electrodynamics. The
diagrams are known as “Feynman diagrams” and the diagram below shows
two of the many ways a pair of electrons might interact:
[Figure: two Feynman diagrams for a pair of electrons: exchange of a single photon, and exchange of two photons.]
In quantum electrodynamics (QED), all possible ways in which an interaction
could take place contribute to the probability of how it does take place,
and Feynman diagrams provide a way to identify and order the different
possibilities.
Gravity is still described by Einstein’s general theory of relativity. This is an
elegant and highly successful mathematical theory but it is based on continu-
ous variations in the geometry of space-time and so far, no one has succeeded
in finding an acceptable theory of quantum gravity. This is one of the most
important outstanding questions in physics – how to connect general relativity
and quantum theory.
26.4.4 The Conservation Laws

We are familiar with the conservation of momentum, mass-energy, and charge
but there are several other important conservation laws in particle physics, in
particular:
Conservation of baryon number: the total baryon number of the Universe
cannot change. Each baryon, for example, a proton or neutron, has
baryon number B = 1. Anti-baryons have number B = −1. Mesons have
baryon number zero. Effectively quarks have a baryon number +1/3 and
anti-quarks have baryon number −1/3.
Conservation of lepton number: the total lepton number of the Universe
cannot change. Each lepton, for example, an electron, neutrino, or tau,
has lepton number L = 1. Anti-leptons have lepton number L = −1.
(According to the Standard Model, lepton numbers in each generation are
separately conserved and neutrinos are massless. However, recent discoveries
have shown that neutrinos have a very small mass and can oscillate between
the three flavors. While this does not affect the conservation of total lepton
number it does undermine the conservation of lepton number by flavor.)
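These conservation laws can be applied as a simple bookkeeping check on any proposed reaction. The sketch below tests charge, baryon number, and total lepton number for beta-minus decay of a neutron. The particle labels and the function names `totals` and `is_allowed` are illustrative, and the check ignores energy and momentum conservation, which a real decay must also satisfy.

```python
from collections import namedtuple

# Charge (in units of e), baryon number B, and lepton number L of each particle.
Particle = namedtuple("Particle", "charge baryon lepton")

PARTICLES = {
    "n": Particle(0, 1, 0),
    "p": Particle(1, 1, 0),
    "e-": Particle(-1, 0, 1),
    "e+": Particle(1, 0, -1),
    "nu_e": Particle(0, 0, 1),
    "anti_nu_e": Particle(0, 0, -1),
}


def totals(names):
    """Return (total charge, total B, total L) for a list of particle labels."""
    return tuple(
        sum(getattr(PARTICLES[name], field) for name in names)
        for field in Particle._fields
    )


def is_allowed(initial, final):
    """True if charge, baryon number, and lepton number are all conserved."""
    return totals(initial) == totals(final)


# Beta-minus decay: n -> p + e- + anti-neutrino conserves all three quantities.
print(is_allowed(["n"], ["p", "e-", "anti_nu_e"]))  # True

# Without the anti-neutrino, lepton number would change from 0 to 1.
print(is_allowed(["n"], ["p", "e-"]))  # False
```

The second case shows why an anti-neutrino must accompany the electron in beta-minus decay.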
26.4.5 The Standard Model

The fact that everything can in principle be explained by a few leptons, quarks,
and force carriers is amazing, as is the achievement of physicists in discovering
that this is the case.
The only particle that has been added to the Standard Model in recent years
is the Higgs boson, discovered at the LHC in 2012. This particle is a quantum
of the Higgs field that fills the Universe. The interaction of each particle with
the Higgs field is responsible for setting the mass of the particle.
The table below summarizes all the particles in the Standard Model. All of
them have anti-particles (or are their own anti-particle).
Quarks (spin ½)
                 First generation    Second generation   Third generation
Charge +2/3 e    Up                  Charm               Top
Charge −1/3 e    Down                Strange             Bottom
(Mass increases from the first generation to the third.)

Leptons (spin ½)
                 First generation    Second generation   Third generation
Charge −e        Electron            Muon                Tau
Charge 0         Electron-neutrino   Muon-neutrino       Tau-neutrino

Force carriers
Particle     Charge   Spin
Gluon        0        1
Photon       0        1
Z boson      0        1
W boson      ±e       1

Higgs boson: charge 0, spin 0
The fact that there are three generations of quarks and three generations of
leptons suggests that there is an underlying symmetry linking the quarks and
the leptons. This has led to several hypotheses about new particles and mech-
anisms for changing leptons to quarks and vice versa, but so far there has been
no experimental evidence to support these ideas. While the Standard Model
is incredibly impressive, it is unlikely to be the last word on particle physics;
there are too many arbitrary constants that have to be put into the model to
make it work.
26.4.6 Dark Matter and Dark Energy

Despite the success of physicists in constructing the Standard Model, discoveries
in cosmology have raised the strong possibility that there are as yet
undiscovered forms of matter. Careful measurements of the rotation rates of
galaxies suggest that there is nowhere near enough visible matter in these gal-
axies (i.e., stars) to provide the gravitational force needed to maintain the rota-
tion. In other words the total gravitational force from visible matter inside the
galaxy is much less than the required centripetal force for the outer parts to
rotate as they do. This was first pointed out by Fritz Zwicky in 1933 when he
tried to understand the motion of the Coma cluster of galaxies.
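The argument can be made quantitative by equating the gravitational force on a star of mass m at radius r to the centripetal force it needs, $GMm/r^2 = mv^2/r$, so that the mass enclosed by the orbit is $M = v^2 r / G$. The sketch below applies this to round, purely illustrative numbers: the visible mass, radius, and orbital speed are assumptions chosen to show the size of the effect and are not measurements of any particular galaxy.

```python
import math

G = 6.674e-11          # gravitational constant, N m^2 kg^-2
SOLAR_MASS = 1.989e30  # kg
KILOPARSEC = 3.086e19  # m

# Illustrative assumptions (not measured data for a real galaxy):
visible_mass = 1e11 * SOLAR_MASS  # mass in stars, kg
radius = 30 * KILOPARSEC          # distance of an outer star from the center, m
observed_speed = 200e3            # orbital speed of that star, m/s

# Speed expected if only the visible mass supplied the centripetal force.
expected_speed = math.sqrt(G * visible_mass / radius)

# Mass that must lie inside the orbit to hold the star at the observed speed.
required_mass = observed_speed**2 * radius / G
required_mass_solar = required_mass / SOLAR_MASS

print(f"Expected speed from visible matter: {expected_speed / 1e3:.0f} km/s")
print(f"Mass needed for observed speed: {required_mass_solar:.1e} solar masses")
```

With these assumed values the visible matter alone would give an orbital speed well below the one assumed for the outer star, and nearly three times as much mass is needed inside the orbit.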
These observations led physicists to suggest that there must be a lot of invis-
ible or “dark matter” in the galaxies to provide the additional centripetal force.
This idea has been supported by evidence from gravitational lensing – the
deflection of light close to galaxies because their mass distorts the local space-
time geometry. The mass required to account for the observed lensing effects
is much greater than the mass of the visible matter in the galaxies. While a
small proportion of the dark matter is probably cool baryonic matter (i.e.,
“ordinary matter”) the rest is yet to be identified. Dark matter is thought to
make up about 27% of the mass-energy of the Universe whereas ordinary matter is
thought to make up just 5%. The remaining 68% is thought to be “dark
energy.” (Estimates have shifted as measurements have improved; earlier surveys
gave about 23% dark matter and 72% dark energy.)
[Figure: pie chart of the composition of the Universe: ordinary matter 5%, dark matter 23%, dark energy 72%.]
Dark energy is a relatively new idea put forward to explain an unexpected
discovery in cosmology. In 1998, Saul Perlmutter and others measured the
red-shifts of distant Type Ia supernovae and came to the conclusion that the
expansion of the Universe is accelerating. This was a surprise because gravi-
tational attraction between the galaxies would be expected to slow the expan-
sion. Other observations seemed to support the conclusion that the expansion
rate is increasing so theoreticians needed to come up with a new model to
explain this. Their idea is that an unknown form of energy fills the Universe
and creates an outward pressure, rather like a negative gravitational force,
causing space-time to expand at an accelerating rate. The energy density is
such that the mass equivalent of dark energy in the Universe accounts for 72%
of the total mass.
26.5 EXERCISES
1. (a) Explain what is meant by nuclear binding energy.
(b) Sketch a graph to show how nuclear binding energy per nucleon var-
ies with nucleon number and use it to explain:
(i) Why iron-56 is regarded as the most stable nucleus.
(ii) How nuclear fission of some heavy nuclei can release a
large amount of energy.
(iii) How nuclear fusion of some light nuclei can release a large
amount of energy.
(c) Suggest why a nucleus of an element with atomic number 120 is likely
to be highly unstable.
2. Use data from the table below to calculate the binding energy and binding
energy per nucleon for iron-56 (Z = 26).
Particle Mass
electron 0.000549 u
proton 1.007276 u
neutron 1.008665 u
iron-56 atom 55.934934 u
3. Uranium-235 (atomic number 92) is unstable and decays to thorium-231
by emitting an alpha particle of energy 4.77 MeV.
(a) Write down a nuclear transformation equation for the decay.
(b) Calculate the kinetic energy of the thorium-231 nucleus as it recoils.
(c) Calculate the mass defect in kg for the decay.
4. Here is data for the beta-minus decay of the rare isotope carbon-16.
Particle                                   Mass
Electron, $^{0}_{-1}\mathrm{e}$            0.000549 u
$^{16}_{6}\mathrm{C}$ (atom)               16.01470 u
$^{16}_{7}\mathrm{N}$ (atom)               16.006103 u
(a) Write down a balanced nuclear equation for this decay.

(b) Calculate the maximum kinetic energy of an emitted beta-particle in
this decay.
(c) Explain why this is a maximum-value and beta-minus particles from
different decaying carbon-16 nuclei will have a range of kinetic ener-
gies.
(d) Carbon-16 has a half-life of just 0.74 s. Explain why it is not surprising
that this nucleus is highly unstable.
5. Nitrogen (atomic number 7) has 7 isotopes from nitrogen-12 to nitro-
gen-18. 99.63% of all nitrogen is the stable isotope nitrogen-14 and the
rest is almost entirely nitrogen-15, which is also stable. Here is some data
that will be useful when answering the questions that follow.
Atom                        Mass
Electron                    0.000549 u
$^{12}_{7}\mathrm{N}$       12.01864 u
$^{13}_{7}\mathrm{N}$       13.005738 u
$^{14}_{7}\mathrm{N}$       14.003074 u
$^{15}_{7}\mathrm{N}$       15.000108 u
$^{16}_{7}\mathrm{N}$       16.006103 u
$^{17}_{7}\mathrm{N}$       17.00845 u
$^{18}_{7}\mathrm{N}$       18.0142 u
$^{13}_{6}\mathrm{C}$       13.003354 u
$^{13}_{8}\mathrm{O}$       13.0248 u
Proton                      1.007276 u
Neutron                     1.008665 u
Nitrogen-12 and nitrogen-13 decay by beta-plus emission while the other

unstable isotopes of nitrogen decay by beta-minus emission.
(a) Write down nuclear transformation equations for the beta-plus and
hypothetical beta-minus decay of nitrogen-13.
(b) Calculate mass defects for each of the reactions in (a) and use the val-
ues to explain why nitrogen-13 cannot decay by beta-minus emission.
(c) Calculate the binding energy per nucleon for nitrogen-14 and nitro-
gen-15. Both isotopes are stable but 99.63% of all nitrogen is nitro-
gen-14. Comment on this in the light of your calculation.
(d) Suggest, with reasons, how the half-lives of nitrogen-16, nitrogen-17,
and nitrogen-18 are likely to compare.
6. Beta-decay is a weak interaction in which the flavor of a quark inside a
baryon changes, causing a change in the type of baryon in the nucleus. In
beta-minus decay a neutron changes to a proton, and in beta-plus decay a
proton changes to a neutron.
(a) Sketch a graph of proton number against neutron number for the
stable nuclides and use it to explain where on the chart beta-plus or
beta-minus emitting nuclides are likely to be found.
(b) (i) Write down the nuclear equation for the beta-minus decay of
carbon-14 to nitrogen-14.
(ii) Write down an equation for the decay of a neutron to a proton
inside the carbon-14 nucleus in the decay in (b)(i).
(iii) Write down an equation for the change of flavor of a quark in
the neutron in the decay in (b)(ii).
(c) Discuss whether a quark changes flavor when electron capture
occurs.
CHAPTER 27
Quantum Theory
27.1 PROBLEMS IN CLASSICAL PHYSICS

Classical physics is based on Newtonian mechanics (and gravitation) and
Maxwell’s electromagnetism. By the end of the 19th century, these two theo-
ries had been incredibly successful in explaining diverse phenomena from the
laws of thermodynamics and the propagation of radio waves to the paths of
planets in their orbits. However, a few problems were beginning to emerge
that resisted an explanation using classical physics. Here are three examples:
The Black-Body Radiation Spectrum
[Figure: black-body radiation spectrum, plotted as power radiated per unit wavelength against wavelength, together with the prediction of classical theory (the ultraviolet catastrophe).]
The shape of the black body radiation spectrum was well known, but all
attempts to use classical physics to derive a formula for the spectrum failed.
In fact, they agreed with the spectrum at long wavelengths (low frequencies)
but diverged drastically from it at short wavelengths (high frequencies), pre-
dicting an infinite amount of high-frequency radiation from a hot body. This
was clearly wrong. It was called the “ultraviolet catastrophe.”
Heat Capacities of Gases

According to classical equipartition theory (see Section 9.3.5) thermal energy
at temperature T should be distributed equally amongst all the degrees of
freedom available to the particles of the gas such that each degree of freedom
gains an energy of ½ kT. For a monatomic gas, there are just three degrees
of freedom corresponding to translations in the x-, y-, and z-directions, so the
energy per molecule is 3/2 kT. For more complex molecules, vibrational and
rotational degrees of freedom can also be excited and these should, according
to classical theory, get ½ kT as well. The greater the energy per molecule at
a particular temperature the higher the heat capacity of the gas, so measure-
ments of heat capacity allowed physicists to test the predictions of classical
equipartition theory. They found that while the predictions were usually good
at high temperatures they sometimes broke down at lower temperatures, as if
some degrees of freedom were not contributing to the heat capacity. This was
unexpected and could not be explained using classical theory.
The Photoelectric Effect

Heinrich Hertz discovered the photoelectric effect in 1887. He noticed that
when ultraviolet light was shone onto a pair of electrodes it was easier to form
sparks between them. Ultraviolet light seemed to be able to knock electrons
out of the surface of the metal electrode. Further investigation showed that
the effect depended on the frequency of the absorbed radiation but not on its
intensity. This was a complete surprise. According to classical theory ejecting
an electron from the surface of a metal should depend on the energy delivered by
the incoming radiation (its intensity) and not on its frequency.
[Figure: incident EM radiation of sufficiently high frequency strikes a metal surface and ejects an electron (“photoelectron”).]
27.1.1 Planck and the Black Body Radiation Spectrum

In 1900, Max Planck managed to derive the correct equation for the black
body radiation spectrum, but to do so he had to make a revolutionary assump-
tion. He assumed that the atomic oscillators vibrating in the black body can
only have discrete amounts of energy and that the energy of any particular
oscillator is a multiple of a smallest quantity or “quantum of energy” equal to
a constant multiplied by its frequency. The energy of an atomic oscillator is:
E = nhf
where n is an integer, f is the frequency of the atomic oscillator and h is the Planck
constant:
$$ h = 6.62607004 \times 10^{-34}\ \mathrm{J\,s} $$
This was a radical suggestion. In classical physics, energy is a continuous
variable so atomic oscillators at any frequency could have any energy. By
quantizing energy in this way Planck showed that the quantum of energy for
a high-frequency oscillator is much larger than for a low-frequency oscillator.
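[Figure: allowed energy levels (n = 1, 2, 3, 4) of a low frequency oscillator and a high frequency oscillator; the levels of the high frequency oscillator are much more widely spaced.]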
This makes it far less likely that a high-frequency oscillator will be excited
because it needs a large energy to start vibrating. The effect is to suppress
high-frequency vibrations and dramatically reduce the high-frequency
electromagnetic radiation they emit, thus preventing the ultraviolet catastrophe.
At low frequencies, the allowed energies are very close together and behave
more like the classical continuum of allowed energies. That is why the long-
wavelength, low-frequency part of the curve can be explained classically.
Planck’s quantization of energy was the first of several ad hoc quantizations
discovered in the early part of the 20th century. This was the beginning of
quantum theory, but at that time no one understood why quantization worked.
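The suppression of high-frequency oscillators can be illustrated numerically. For an oscillator with allowed energies nhf in thermal equilibrium at temperature T, the standard result for the mean energy is $hf/(e^{hf/kT} - 1)$, whereas classical theory gives kT at every frequency. The sketch below compares the two. The temperature of 1500 K and the two frequencies are assumed example values, and the function names are illustrative.

```python
import math

H = 6.62607004e-34  # Planck constant, J s
K_B = 1.380649e-23  # Boltzmann constant, J K^-1


def mean_energy_quantized(frequency_hz, temperature_k):
    """Mean energy of an oscillator whose energy can only be a multiple of hf."""
    x = H * frequency_hz / (K_B * temperature_k)
    # expm1(x) = e^x - 1, accurate even when x is small.
    return H * frequency_hz / math.expm1(x)


def mean_energy_classical(temperature_k):
    """Classical mean energy of an oscillator: kT, whatever its frequency."""
    return K_B * temperature_k


temperature = 1500.0  # assumed temperature of the hot body, K
ratio_low = mean_energy_quantized(1e12, temperature) / mean_energy_classical(temperature)
ratio_high = mean_energy_quantized(1e15, temperature) / mean_energy_classical(temperature)

print(f"Low frequency (1e12 Hz):  quantized/classical = {ratio_low:.3f}")
print(f"High frequency (1e15 Hz): quantized/classical = {ratio_high:.1e}")
```

At the low frequency the quantized result is within a few percent of the classical value, while at the high frequency the oscillator is almost never excited.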
27.1.2 Explaining Heat Capacities

Planck’s idea, that the energy of an atomic oscillator is quantized, can also be
used to explain why, at low temperatures, some modes of molecular rotation
or vibration are suppressed. The argument is the same as for high-frequency
atomic vibrations in a black body. The energy of a vibrational mode is quantized
and if the mean thermal energy, ½ kT, is less than the minimum energy needed
to excite a mode of vibration, hf, then those modes will not be excited and the
heat capacity will be lower than expected. At high temperature ½ kT > hf so
the modes are excited and do contribute to the heat capacity. Measurements
of heat capacity as a function of temperature show that it increases toward the
value predicted by equipartition as temperature increases.
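The way a vibrational mode “switches on” as temperature rises can be sketched using the standard result for a quantized oscillator, whose contribution to the molar heat capacity is $R\,x^2 e^x/(e^x - 1)^2$ with $x = hf/kT$. In the sketch below the vibration frequency of $6.5 \times 10^{13}$ Hz is an assumed, illustrative value for a diatomic molecule, and the function name is hypothetical.

```python
import math

H = 6.62607004e-34  # Planck constant, J s
K_B = 1.380649e-23  # Boltzmann constant, J K^-1


def vibrational_heat_capacity_over_r(frequency_hz, temperature_k):
    """Heat capacity contribution of one vibrational mode, as a fraction of R."""
    x = H * frequency_hz / (K_B * temperature_k)
    # Equivalent to x^2 e^x / (e^x - 1)^2, written to avoid overflow at large x.
    return x * x * math.exp(-x) / math.expm1(-x) ** 2


frequency = 6.5e13  # assumed vibration frequency, Hz (illustrative)
for temperature in (300, 1000, 3000, 30000):
    fraction = vibrational_heat_capacity_over_r(frequency, temperature)
    print(f"T = {temperature:>5} K: contribution = {fraction:.3f} R")
```

At room temperature the mode contributes almost nothing, and its contribution approaches the full equipartition value of R only when kT is much greater than hf.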
27.1.3 Explaining the Photoelectric Effect

Einstein realized that the photoelectric effect could be explained by assuming
that electromagnetic radiation can only exchange energy with matter in
discrete amounts or quanta. These quanta or “photons” have an energy:
E = hf
where f is the frequency of the radiation and h is the Planck constant.

There is a minimum energy, Φ, needed to remove an electron from the surface
of a metal. This is called the “work function” for the metal and it depends on the
element used, being lower for more reactive metals. Einstein assumed that when
radiation is absorbed by matter one photon transfers all its energy to one electron.
The photon energy must be at least equal to the work function for the electron
to be ejected (some electrons will require more energy than Φ because they are
not at the surface of the metal). For electrons to be ejected from the surface:
hf ≥ Φ
This explains why there is a threshold frequency f0 below which photoelectric
emission does not occur:
hf0 = Φ
This was another radical departure from classical physics where light was
considered part of the electromagnetic spectrum and was assumed to be a
continuous wave that can take any energy. Einstein’s theory treated electromagnetic
radiation as if it consisted of discrete packets of energy and transferred
that energy discretely too. The photon theory treated light more like a
stream of particles than a wave.
27.1.4 Characteristics of Photoelectric Emission

To understand the significance of Einstein’s photon theory we must first
review the characteristics of the photoelectric effect. One way to do this
is to use light sources of various frequencies to illuminate the negatively
charged plate of a gold-leaf electroscope. If the leaf falls the light source is
ejecting electrons. Typical experiments that can be carried out quite simply
in a laboratory are shown below.
[Figure: a gold-leaf electroscope with a negatively charged cap made of a metal with work function Φ. Left panel: f < f0. Right panel: f > f0, the electroscope is discharged.]
If the metal is zinc, then visible light does not discharge the electroscope
even if it is very intense but ultraviolet radiation will discharge it even if
its intensity is low.
Increasing the intensity of the ultraviolet radiation discharges the
electroscope more rapidly.
When ultraviolet radiation is used, the leaf begins to fall as soon as it is
illuminated: there is no delay.
When ultraviolet radiation is used, the kinetic energy of the ejected
electrons has a continuous spectrum up to a maximum value and this
maximum kinetic energy increases if higher frequency ultraviolet radiation
is used.
These observations cannot be explained using a classical wave model of light.
According to the wave model, energy is spread across the wavefront and
delivered continuously to the metal surface. If this was the case, then
energy would need to accumulate close to a surface electron before it
was ejected. For typical light intensities used in these experiments, this
would take a long time, but no delay is observed. Electrons are emitted as
soon as the light of high enough frequency strikes the surface. The wave
model cannot explain how this energy suddenly gets concentrated into a
few different places to eject the electrons.
In the wave model the parameter that determines how much energy is
delivered is the intensity of the radiation and not the frequency, so the
wave model cannot account for the threshold frequency or the increase of
kinetic energy with frequency and not intensity.
Einstein’s photon model can explain all the observations. Photons are dis-
tributed randomly in the arriving radiation but transfer all their energy to
individual electrons. We have already seen that this explains the existence of
a threshold frequency f0. It also explains the immediate ejection of electrons:
as soon as the first photon strikes the surface it can eject an electron (if f > f0).
Increasing the intensity of the radiation increases the number of photons per
second arriving at the metal surface so the electroscope discharges more rap-
idly (again only if f > f0). Finally, increasing the frequency above the threshold
frequency gives photons more than enough energy to eject an electron from
the surface so the excess energy gives the electron kinetic energy and the
maximum kinetic energy will be:
KEmax = hf − Φ = hf − hf0 = h(f − f0)
so the maximum kinetic energy is directly proportional to (f − f0) as observed.
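As a rough numerical illustration of KEmax = hf − Φ, the sketch below evaluates the photon energy and the maximum kinetic energy for two frequencies. The work function of 4.3 eV is an assumed illustrative value for zinc, and the function names are hypothetical helpers, not part of the text.

```python
# Physical constants (SI exact values)
H = 6.62607015e-34          # Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C (also J per eV)


def threshold_frequency_hz(work_function_ev):
    """Threshold frequency f0 from h*f0 = work function."""
    return work_function_ev * E_CHARGE / H


def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Return KEmax = hf - work function in eV, or None if no electrons are emitted."""
    photon_energy_ev = H * frequency_hz / E_CHARGE
    if photon_energy_ev < work_function_ev:
        return None  # below the threshold frequency: no emission at any intensity
    return photon_energy_ev - work_function_ev


PHI_ZINC_EV = 4.3  # assumed illustrative work function for zinc

if __name__ == "__main__":
    print("f0 =", threshold_frequency_hz(PHI_ZINC_EV), "Hz")
    for f in (6.0e14, 1.2e15):  # a visible frequency and an ultraviolet frequency
        print(f, "Hz ->", max_kinetic_energy_ev(f, PHI_ZINC_EV))
```

With these assumed numbers the visible frequency lies below the threshold and gives no emission, while the ultraviolet frequency gives electrons a fraction of an electronvolt of kinetic energy.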
27.1.5 Measuring the Planck Constant

The Planck constant can be determined using a photocell. This is an electrical
component that uses the photoelectric effect to produce an output voltage
when light is shone onto it. The symbol for a photocell is shown below. It
consists of a metal emitter and a collector. When light strikes the emitter
electrons are ejected and travel across to the collector. If the photocell is
connected to a circuit it provides an emf that can transfer energy to other
components.
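[Figure: circuit symbol for a photocell, with terminals marked − (collector) and + (emitter) and an arrow showing the direction in which electrons are driven.]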
The circuit shown below can be used to find the maximum kinetic
energy of the emitted electrons by applying an opposing voltage to the
cell and increasing this until the current in the circuit is reduced to zero.
Measurements of this “stopping voltage” for incident light with a range of
different frequencies can be used to find the Planck constant and the work
function of the emitter.
[Figure: photocell circuit with an opposing voltage applied across the cell and a meter reading the photocurrent.]

By making the emitter positive with respect to the collector electrons must do
work to move between them. When the voltage between the two electrodes
is V the work that must be done is eV. Increasing V increases the work that
must be done and the photocurrent falls to zero when the work is equal to
the maximum kinetic energy. The voltage at which this occurs is called the
stopping voltage VS:
$$eV_S = KE_{max} = hf - \Phi$$
If the stopping voltage is measured for light sources with a range of frequencies
(e.g., by using different colored filters in front of a white light source) then a
graph of VS against f can be used to find both the Planck constant and the work
function.
h Φ
=VS   f −  
e  e
comparing this with

=
y mx + c
the gradient is h/e, the intercept on the voltage axis is – F/e, and the intercept
on the frequency axis is f0:
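[Figure: graph of stopping voltage against frequency. The straight line has gradient h/e, crosses the frequency axis at f0, and meets the voltage axis at −Φ/e.]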
Robert Millikan used a similar method to make the first precise photoelectric
determination of the Planck constant in 1916.
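The graphical method can be sketched numerically as a straight-line fit of stopping voltage against frequency. The data below are synthetic: they are generated from the stopping-voltage equation with an assumed work function of 2.0 eV, so this only demonstrates the analysis, not a real measurement. The helper name fit_line is hypothetical.

```python
H = 6.62607015e-34          # Planck constant, J s (used only to make synthetic data)
E_CHARGE = 1.602176634e-19  # elementary charge, C
PHI_EV = 2.0                # assumed work function of the emitter, eV


def fit_line(xs, ys):
    """Least-squares straight line y = m*x + c; returns (m, c)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Synthetic "measurements": V_S = (h/e) f - PHI/e for five light frequencies
frequencies = [5.5e14, 6.0e14, 6.5e14, 7.0e14, 7.5e14]
stopping_voltages = [H * f / E_CHARGE - PHI_EV for f in frequencies]

gradient, intercept = fit_line(frequencies, stopping_voltages)
h_estimate = gradient * E_CHARGE     # gradient is h/e
phi_estimate_ev = -intercept         # voltage-axis intercept is -PHI/e
f0_estimate = -intercept / gradient  # frequency-axis intercept

if __name__ == "__main__":
    print(h_estimate, phi_estimate_ev, f0_estimate)
```

With real data the points scatter about the line, and the uncertainty in the gradient sets the uncertainty in h.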

27.2 MATTER WAVES

Einstein’s photon theory showed that a particle model can be used to explain
some aspects of the behavior of light. However, superposition effects such
as interference and diffraction can only be explained using a wave model.
Neither model completely describes the nature of light and physicists sometimes
say that light exhibits “wave-particle duality.” In 1924, Louis de Broglie
suggested that if radiation exhibits wave-particle duality then perhaps matter
does too and he proposed an equation that could link the wave and particle
models together. This is known as the de Broglie relation.
27.2.1 The de Broglie Relation

de Broglie showed that the characteristic wave-like property of light, its wave-
length, is linked to a characteristic particle-like property of the photon, its
momentum. Here is an argument that shows how they are linked.
Photon energy: $E = hf$
Mass equivalent to this energy: $m = \frac{E}{c^2}$
Momentum associated with this mass: $p = mc = \frac{E}{c} = \frac{hf}{c} = \frac{h}{\lambda}$
$$p = \frac{h}{\lambda}$$
This is the de Broglie relation, the link between wavelength and momentum.
While we have derived it for the photon, de Broglie’s hypothesis was that it also
applied to matter. This would imply that a moving particle has an associated
wavelength given by:
$$\lambda = \frac{h}{p}$$
At velocities small compared to the speed of light this becomes:
$$\lambda = \frac{h}{mv}$$
The faster the particle moves the shorter its de Broglie wavelength.

27.2.2 Electron Diffraction

If de Broglie’s hypothesis is correct then matter ought to exhibit wave-like
properties such as interference and diffraction just like light. The question
is, on what scale? If a person walking through a doorway behaves like a wave,
shouldn’t they diffract like light going through a narrow slit? A rough calculation
shows that even if they do, the diffraction effects would be undetectable.
Recall that the first minimum of a single slit diffraction pattern occurs
at an angle whose sine is λ/b where λ is the wavelength and b is the width of
the slit. For a 70 kg human walking at 1 m s⁻¹ through a doorway about 1 m wide this is:
$$\sin\theta = \frac{\lambda}{b} = \frac{h}{mvb} \approx \frac{6.6 \times 10^{-34}}{70 \times 1 \times 1} \sim 10^{-35}$$
This is unbelievably tiny so the diffraction pattern would effectively have zero
width and so would not affect the classical expectation that the person contin-
ues through the doorway with no diffraction at all.
However, if an electron is traveling at 5% of the speed of light its de Broglie
wavelength is about 5 × 10⁻¹¹ m. This is comparable to the spacing of atoms
in a crystal lattice so we might expect to detect interference and diffraction
effects when electron beams accelerated to comparable speeds are directed at
crystals. This would only require an accelerating voltage of about 600 V.
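The two estimates above can be checked with a short calculation. This is a minimal sketch using the non-relativistic relation λ = h/(mv), which is only approximate at 5% of the speed of light; the function name is a hypothetical helper.

```python
H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
C = 2.99792458e8            # speed of light, m/s


def de_broglie_wavelength(mass_kg, speed_m_s):
    """Non-relativistic de Broglie wavelength, lambda = h / (m v)."""
    return H / (mass_kg * speed_m_s)


# A 70 kg person walking at 1 m/s
human_wavelength = de_broglie_wavelength(70.0, 1.0)

# An electron at 5% of the speed of light
electron_speed = 0.05 * C
electron_wavelength = de_broglie_wavelength(M_E, electron_speed)

# Accelerating voltage that gives this speed: eV = (1/2) m v^2
accelerating_voltage = 0.5 * M_E * electron_speed**2 / E_CHARGE

if __name__ == "__main__":
    print(human_wavelength, "m")
    print(electron_wavelength, "m")
    print(accelerating_voltage, "V")
```

The person's wavelength comes out some 25 orders of magnitude smaller than the electron's, which is why diffraction is observable for electron beams but not for everyday objects.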
The first physicists to detect electron diffraction were Davisson and Germer
in 1927 when they fired electron beams at a crystalline sample of nickel. They
noticed that the electrons were scattered into a pattern of maxima and minima
similar to the one that would be obtained if X-rays were passed through the
structure. Furthermore, the wavelength of X-rays needed to obtain the same
pattern corresponded to the de Broglie wavelength of the electrons.
At almost the same time G.P. Thomson, the son of J.J. Thomson (who had
shown that the electron can be treated as a particle), produced electron dif-
fraction rings by passing a beam of electrons through a thin slice of graphite.
The experimental arrangement is shown below.
Graphite consists of planes of atoms arranged in a hexagonal pattern. Rows of
atoms within these planes act like the lines in a diffraction grating. For planes
separated by a distance d, there will be diffraction maxima at angles given by:
$$\sin\theta = \frac{n\lambda}{d}$$

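[Figure: electron diffraction tube. A low-voltage heater supply heats the cathode; electrons from the hot cathode are accelerated through a voltage V to the anode, pass through a graphite target in a vacuum, and form a diffraction pattern (rings) on a fluorescent screen.]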
The fact that there are many such planes in all orientations relative to the
incident beam results in diffraction maxima being formed around the surface
of a cone with a half angle θ for each set of planes and each order of diffraction.
Where these cones of diffracted electrons hit the end of the vacuum tube they
form a pattern of concentric rings.
[Figure: atomic arrangement in graphite, with planes of atoms separated by a distance d acting like the lines in a diffraction grating.]

The radius of diffraction rings can be used to find the spacing of atomic planes
in the crystalline structure. Changing the accelerating voltage changes the
wavelength of the electrons and the radius of each ring. Higher voltage gives
the electrons greater momentum and a smaller de Broglie wavelength so the
rings get smaller.
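[Figure: the electron beam passes through the graphite and is diffracted through an angle θ, reaching the screen a distance L away at a radius r from the center.]
$$\sin\theta = \frac{\lambda}{d} = \frac{h}{dmv} = \frac{h}{d\sqrt{2meV}}$$
For small angles, $\sin\theta \approx \theta \approx \frac{r}{L}$ so that:
$$r \approx \frac{hL}{d\sqrt{2meV}}$$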
where d is the spacing of a particular set of atomic planes and V is the acceler-
ating voltage for the electrons. The radius of a diffraction ring is approximately
proportional to the reciprocal of the square root of the accelerating voltage.
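The dependence of ring radius on accelerating voltage can be sketched numerically. The momentum follows from ½mv² = eV, so mv = √(2meV). The plane spacing d = 2.13 × 10⁻¹⁰ m and the target-to-screen distance L = 0.135 m are assumed illustrative values, not figures from the text, and the small-angle approximation is used throughout.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C

PLANE_SPACING = 2.13e-10    # assumed spacing d of one set of graphite planes, m
SCREEN_DISTANCE = 0.135     # assumed distance L from target to screen, m


def electron_wavelength(voltage):
    """de Broglie wavelength after acceleration through `voltage` (non-relativistic)."""
    momentum = math.sqrt(2 * M_E * E_CHARGE * voltage)
    return H / momentum


def ring_radius(voltage, d=PLANE_SPACING, length=SCREEN_DISTANCE):
    """First-order ring radius, r ~ h L / (d sqrt(2 m e V)), small-angle approximation."""
    return electron_wavelength(voltage) * length / d


if __name__ == "__main__":
    for v in (1000, 2000, 4000):
        print(v, "V ->", ring_radius(v), "m")
```

Quadrupling the voltage halves the radius, which is the inverse square root dependence stated above.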
27.2.3 The Compton Effect

In 1923, Arthur Compton showed that when X-rays are scattered by electrons
the process behaves just like a collision between two particles. This was an
important result because the wavelength shift in the scattered X-rays could
not be explained using a wave model for the radiation. In Compton’s experi-
ment, the X-ray photons had much greater energy than the ionization energy
of the atom so the electrons behaved like free particles.
Compton treated this as a collision between particles and used the equations
for conservation of energy and momentum to find the relationship between
the X-ray scattering angle and the change in wavelength of the X-rays:
$$\lambda' - \lambda = \frac{h}{m_e c}(1 - \cos\theta)$$

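[Figure: Compton scattering. An incident photon of wavelength λ strikes an electron. The scattered photon leaves at angle θ with wavelength λ′ > λ and the scattered electron leaves at angle φ with momentum p.]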
This was verified experimentally and showed once again that the particle
model must be used for some interactions between radiation and matter.
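To get a feel for the size of the Compton shift, the sketch below evaluates λ′ − λ = (h/mₑc)(1 − cos θ) at a few scattering angles. The function name is a hypothetical helper.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron mass, kg
C = 2.99792458e8        # speed of light, m/s


def compton_shift(angle_deg):
    """Wavelength increase (m) for a photon scattered through `angle_deg` degrees."""
    theta = math.radians(angle_deg)
    return H / (M_E * C) * (1 - math.cos(theta))


if __name__ == "__main__":
    for angle in (0, 45, 90, 180):
        print(angle, "degrees ->", compton_shift(angle), "m")
```

The shift is of order 10⁻¹² m whatever the incident wavelength, so it is a noticeable fraction of the wavelength only for X-rays and gamma rays.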
27.3 WAVE-PARTICLE DUALITY

The de Broglie equation applies to both electromagnetic radiation and
matter – sometimes a wave model is needed and sometimes a particle model
is needed. Neither model gives a complete description of the underlying
phenomenon, but there is a deeper problem because the two models seem to
contradict one another.
We have already seen that the wave model of light, with energy spread contin-
uously across the wavefront, cannot be used to explain photoelectric emission.
However, if instead of directing the light onto a metal plate we had passed it
through a double slit apparatus we would have to assume that the energy is
spread continuously to explain the resulting interference pattern.
How can these two apparently irreconcilable models be related?

                                  EM radiation (e.g., light)        Matter (e.g., electrons)
Evidence for the wave model       Young’s double slit experiment    Electron diffraction
                                  Diffraction patterns              Electron standing waves in atoms
Evidence for the particle model   Photoelectric effect              Discrete nature of the electric charge
                                  Compton effect                    Momentum of individual electrons
27.3.1 Young’s Double Slit Experiment Revisited

The double slit experiment, first used to support the wave model of light,
is an ideal example to use when trying to reconcile the wave and particle
models.
In the double slit experiment, a monochromatic light source is directed at a
double slit and an interference pattern consisting of regularly spaced maxima
and minima appears on a screen. The maxima occur in positions where waves
arriving from each slit are in phase and interfere constructively. The minima
occur where the waves arrive in antiphase (π phase difference) and interfere
destructively. The resultant intensity at any point on the screen is calculated
by adding the phasors from each slit and then squaring the resultant ampli-
tude. This approach, the wave model, works (up to a point). It explains the
intensity variation across the screen.
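The phasor calculation described above can be sketched with complex numbers: each slit contributes a phasor, the phasors are added, and the intensity is the squared magnitude of the sum. Equal unit amplitudes from the two slits are an assumption made to keep the example simple.

```python
import cmath
import math


def phasor(amplitude, phase):
    """Complex phasor with the given amplitude and phase (radians)."""
    return cmath.rect(amplitude, phase)


def intensity(*phasors):
    """Resultant intensity: add the phasors, then square the resultant amplitude."""
    return abs(sum(phasors)) ** 2


# Equal amplitudes from each slit (an assumption of this sketch)
one_slit = intensity(phasor(1.0, 0.0))
at_maximum = intensity(phasor(1.0, 0.0), phasor(1.0, 0.0))      # waves in phase
at_minimum = intensity(phasor(1.0, 0.0), phasor(1.0, math.pi))  # waves in antiphase

if __name__ == "__main__":
    print(one_slit, at_maximum, at_minimum)
```

Two slits give four times the single-slit intensity at a maximum and essentially zero at a minimum, even though each slit alone would send light to both positions.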
If we now assume that the light is emitted and absorbed as individual
photons we run into difficulty. Consider, for example, a minimum posi-
tion. When both slits are open the minimum is dark, so no photons arrive
at this position. If we cover either one of the slits, then light does arrive at
this position. So we have a problem. It seems that opening the second
slit and allowing light to reach that position from either slit results in less
light arriving. How can identical photons cancel one another out? How
can energy disappear?
The diagrams below illustrate this. In the top diagram, only the
top slit is open and NA photons reach point P. In the middle diagram, only
the bottom slit is open and NB photons reach point P. With both slits open,
one might expect (NA + NB) photons to reach point P but in fact zero photons
are detected at P.
It seems that opening the second slit affects where photons passing through
the first slit can go. On the face of it, this is bizarre.
[Figure: three arrangements of a laser, single slit, slits A and B, and a screen with point P marked. Top: bottom slit covered, giving a single slit diffraction pattern with NA photons from slit A reaching P. Middle: top slit covered, giving a single slit diffraction pattern with NB photons from slit B reaching P. Bottom: both slits open, giving an interference pattern with zero photons reaching P.]

One, unlikely, possibility is that when both slits are open photons from A
and B somehow interact on their way to the screen and this prevents them
reaching P. To rule this out the experiment has been repeated using a filter
in front of the source to reduce the intensity so far that only one photon at a
time is interacting with the apparatus. Now photons arrive one at a time at the
screen and cannot interact with one another en route. What happens?
At first, the photons seem to be arriving completely randomly but after a short
while, it becomes apparent that the same patterns as before are produced
with single or double slits. If we insist that the photons behave like particles
then each photon can only pass through one of the slits. If it passes through
slit A when B is closed it can reach P. If it passes through slit B when A is
closed it can also reach P. However, if it passes through slit A when slit B is
open it cannot reach P! Similarly, if it passes through slit B when A is open it
cannot reach P! To maintain the particle model we would need to assume that
when the photon passes through one of the slits its future path is affected by
the state of the slit through which it did not pass.
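The gradual build-up of the pattern from individual arrivals can be mimicked with a simple random simulation. This is only a sketch under strong assumptions: the arrival probability is taken to be an ideal cos² fringe pattern (the single slit envelope is ignored), the fringe spacing and screen width are arbitrary, and the function names are hypothetical.

```python
import math
import random


def arrival_probability(x, fringe_spacing):
    """Relative probability of a photon arriving at position x (ideal cos^2 fringes)."""
    return math.cos(math.pi * x / fringe_spacing) ** 2


def simulate_arrivals(n_photons, fringe_spacing, half_width, seed=0):
    """Generate photon arrival positions one at a time by rejection sampling."""
    rng = random.Random(seed)
    arrivals = []
    while len(arrivals) < n_photons:
        x = rng.uniform(-half_width, half_width)
        # Accept the candidate position with probability proportional to the intensity
        if rng.random() < arrival_probability(x, fringe_spacing):
            arrivals.append(x)
    return arrivals


def count_near(arrivals, centre, window):
    """Number of arrivals within `window` of `centre`."""
    return sum(1 for x in arrivals if abs(x - centre) < window)


if __name__ == "__main__":
    hits = simulate_arrivals(5000, fringe_spacing=1.0, half_width=3.0, seed=1)
    print("near central maximum:", count_near(hits, 0.0, 0.1))
    print("near first minimum:  ", count_near(hits, 0.5, 0.1))
```

Each simulated arrival is a single localized event, yet the accumulated counts cluster around the maxima and avoid the minima.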
[Figure: single-photon double slit experiment with a laser, filter, single slit, double slit, and screen. Individual photons pass through slit A or slit B; photons cluster around the maxima and do not go to the minima, so the interference pattern builds up gradually.]
A single photon passing through a double slit apparatus cannot reach any of
the minimum positions, even though it could reach all of them if it passed
through either slit when the other one is closed! This shows that photons do
not interfere with each other but every photon interferes with itself. Where
does this leave us?
The wave model can be used to explain the intensity distribution in super-
position effects.
The particle model can be used to explain the discrete emission and
absorption of radiation.
A similar experiment can be carried out with electron beams. The results are
exactly the same. Wave-particle duality affects matter and radiation in the
same way.
27.3.2 Interpreting Wave-Particle Duality

Einstein suggested that the solution to the problem of wave-particle dual-
ity might be to treat the waves as waves of probability so that the higher the
intensity of the radiation the higher the probability of finding a photon (or
electron) there. Max Born developed this idea into the “statistical interpre-
tation” of quantum mechanics and this became a cornerstone of the most
popular interpretation of the theory – the “Copenhagen Interpretation” (see
Section 27.5.1).
When light or electrons pass through a double slit apparatus each photon
or electron is associated with a wave (the “wavefunction”) whose intensity
is directly proportional to the probability of finding a photon or an electron
at each point in space. When an observation is made, for example a photon
is detected on the screen, the wavefunction “collapses” to zero everywhere
except at the position where the photon or electron is detected. This interpre-
tation is radical in several respects.
Prior to the observation the wavefunction changes in a continuous way but
when the observation is made there is a discontinuous collapse so that the
probability changes suddenly even at great distances from the observation.
There is no known physical process that can explain the “collapse of the
wavefunction.” This is sometimes called “the measurement problem.”
The idea that distant parts of the wavefunction can change when an
observation is made shows that quantum theory is “non-local.”
The wavefunction is the most complete description of the system but it
deals only with probability. This means that even if we could know the
initial conditions of a system with absolute precision (e.g., how a particular
electron is approaching a double slit apparatus) we can only make statistical
predictions about the future state of the system (e.g., where the electron
will hit the screen). Quantum theory is “indeterministic”: the future of
the Universe is not uniquely determined by its present state. This is a
complete departure from the determinism of classical physics.
27.3.3 The Schrödinger Equation

Waves are solutions to differential equations. The one-dimensional wave
equation is:
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}$$
where y is the wave disturbance and v is the speed of the wave in the x-direction.
A solution to this is:
$$y = A\cos(\omega t - kx)$$
where ω = 2πf and k = 2π/λ.

A similar pair of equations for the electric field (E) and magnetic field (B) in
a vacuum can be derived from Maxwell’s equations. These form the electro-
magnetic wave equations (in one dimension):
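$$\frac{\partial^2 E}{\partial x^2} = \varepsilon_0 \mu_0 \frac{\partial^2 E}{\partial t^2}$$
$$\frac{\partial^2 B}{\partial x^2} = \varepsilon_0 \mu_0 \frac{\partial^2 B}{\partial t^2}$$
ε0 is the permittivity of free space and μ0 is the permeability of free space.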

Comparing the electromagnetic equations with the original one-dimensional
wave equation we can see that the speed of electromagnetic waves is:
$$c = \frac{1}{\sqrt{\varepsilon_0 \mu_0}}$$
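This result is easy to check numerically. The sketch below uses the CODATA 2018 values of the two constants, which are assumptions brought in for the example.

```python
import math

EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m (CODATA 2018)
MU_0 = 1.25663706212e-6       # permeability of free space, N/A^2 (CODATA 2018)

# Speed of electromagnetic waves: comparing the wave equations gives 1/v^2 = eps0 * mu0
speed_of_light = 1.0 / math.sqrt(EPSILON_0 * MU_0)

if __name__ == "__main__":
    print(speed_of_light, "m/s")
```

The value agrees with the defined speed of light, 2.99792458 × 10⁸ m s⁻¹, to within the precision of the constants.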
The electromagnetic wave equations provide a way to derive the wavefunction
for light, but is there a corresponding wave equation for the de Broglie
waves of an electron? Erwin Schrödinger thought that there should be and
in 1925 he found it. This is the most important equation in quantum theory:
“the Schrödinger equation.” However, unlike the equations above, the
Schrödinger equation deals with complex quantities (combinations of real and
imaginary numbers). The details of how the Schrödinger equation was derived
are beyond the scope of this book but here is the time-dependent equation:
$$\frac{ih}{2\pi}\frac{\partial \Psi}{\partial t} = -\frac{h^2}{8\pi^2 m}\frac{\partial^2 \Psi}{\partial x^2} + V\Psi$$
where Ψ is the wavefunction, i is the square root of minus one, and m is the
mass of the electron. V is the potential energy that might vary with position
and time. If the electron is moving freely in space then V = 0.
It is possible to find wave-like solutions to this equation. These represent the
de Broglie waves of the electron. Once Schrödinger had published his equa-
tion physicists could use it to solve a vast range of problems, and Schrödinger
himself showed how it could explain the energy level structure and spectrum
of the hydrogen atom. In Schrödinger’s atomic model the electron orbitals are
three-dimensional standing wave solutions to the equation.
According to the Copenhagen Interpretation, the square of the magnitude
of the wavefunction at each point in space is equal to the probability per unit
volume of finding the electron at that point. Since the wavefunction is itself a
complex quantity it is not directly observable, and the square of its magnitude
is found by multiplying it by its own complex conjugate:
probability of finding electron in small region δxδyδz = Ψ(x, y, z) Ψ*(x, y, z) δxδyδz
If this is integrated over all of space it must equal 1: the electron will be found
somewhere in the Universe.
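The normalization condition can be illustrated numerically in one dimension. The wavefunction below is an assumed example (a Gaussian envelope multiplied by a complex phase), not a solution taken from the text, and the integral is approximated with the trapezium rule.

```python
import cmath
import math


def gaussian_wavefunction(x, sigma=1.0, k=5.0):
    """Assumed example: normalized Gaussian envelope times a complex phase exp(ikx)."""
    norm = (1.0 / (math.pi * sigma**2)) ** 0.25
    return norm * math.exp(-x**2 / (2 * sigma**2)) * cmath.exp(1j * k * x)


def total_probability(psi, x_min, x_max, steps=2000):
    """Integrate psi * conj(psi) over [x_min, x_max] using the trapezium rule."""
    dx = (x_max - x_min) / steps
    total = 0.0
    for i in range(steps + 1):
        value = psi(x_min + i * dx)
        density = (value * value.conjugate()).real  # real and never negative
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * density * dx
    return total


if __name__ == "__main__":
    print(total_probability(gaussian_wavefunction, -10.0, 10.0))
```

The wavefunction itself is complex, but the product with its complex conjugate is real, and its integral comes out as 1 to within numerical error.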
27.4 THE QUANTUM ATOM

The Rutherford nuclear model of the atom consists of electrons in orbit
around a positively charged nucleus. Such a model would allow the electrons
to orbit at any distance from the nucleus and to have any energy. However,
there is a serious problem with this model. According to Maxwell’s equations,
charged particles radiate energy when they are accelerated and according to
Newtonian mechanics, a particle moving in orbital motion has a centripetal
acceleration. Orbiting electrons ought to radiate energy and fall rapidly into
the nucleus. Classical physics leads us to the conclusion that Rutherford’s
nuclear atom is unstable.
It was also known that excited atoms emit a line spectrum, that is, the radiation
emitted by isolated atoms contains a set of discrete frequencies. If the photon
model is considered this suggests that the emission of radiation involves discrete
energy jumps ΔE = hf for each spectral line. This further suggests that
the electrons inside an atom cannot have any value of energy but can only
exist at certain discrete energy levels.
27.4.1 Bohr’s Model of the Hydrogen Atom

The fact that atoms exist and are stable shows that something is missing from
Rutherford’s nuclear atom. Niels Bohr realized that quantization might solve
this problem and at the same time give a way to calculate the frequencies
present in atomic line spectra.
Bohr assumed that:
The electron can move in one of several discrete circular orbits and that
each orbit has a specific energy (energy levels)
The angular momentum of the electron is quantized in integer units of
h/2π.
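[Figure: Bohr model of the hydrogen atom. An electron (−) with momentum mvn moves in a circular orbit of radius rn around the proton (+), held by the force F.]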
The diagram above shows an electron in the nth orbit inside the hydrogen
atom. The dotted lines indicate other quantized orbits. The energy of the nth
orbit is derived below.
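Angular momentum:
$$m v_n r_n = \frac{nh}{2\pi}$$
Centripetal force:
$$\frac{m v_n^2}{r_n} = \frac{e^2}{4\pi\varepsilon_0 r_n^2}$$
Kinetic energy:
$$KE = \frac{1}{2} m v_n^2 = \frac{e^2}{8\pi\varepsilon_0 r_n}$$
Potential energy:
$$PE = -\frac{e^2}{4\pi\varepsilon_0 r_n}$$
Energy:
$$E_n = KE + PE = \frac{1}{2} m v_n^2 - \frac{e^2}{4\pi\varepsilon_0 r_n} = -\frac{e^2}{8\pi\varepsilon_0 r_n}$$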
Bohr used these equations to eliminate rn from the energy equation and to
express the energy of an electron in the nth level in terms of the quantum
number n.
$$E_n = -\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{n^2}\right) = \frac{-13.6\ \text{eV}}{n^2}$$
This is a very important result as it provides an accurate method to calculate
the ionization energy of hydrogen and the frequencies of the lines in
the hydrogen line emission spectrum. The number n is called the “principal
quantum number” and n = 1 represents the lowest allowed energy state for an
electron in the hydrogen atom: the “ground state.” All the energies are negative
because the electron is bound to the nucleus.
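The figure of 13.6 eV can be checked by putting numbers into the Bohr energy formula. The sketch below uses CODATA 2018 values for the constants (assumptions brought in for the example) and a hypothetical helper name.

```python
H = 6.62607015e-34          # Planck constant, J s
M_E = 9.1093837015e-31      # electron mass, kg
E_CHARGE = 1.602176634e-19  # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m


def bohr_energy_ev(n):
    """Energy of the nth Bohr level in eV: E_n = -m e^4 / (8 eps0^2 h^2 n^2)."""
    energy_joules = -M_E * E_CHARGE**4 / (8 * EPSILON_0**2 * H**2 * n**2)
    return energy_joules / E_CHARGE  # convert from joules to electronvolts


if __name__ == "__main__":
    for n in range(1, 5):
        print("n =", n, "E =", bohr_energy_ev(n), "eV")
```

The ionization energy from the ground state is the magnitude of the n = 1 value.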

The diagram below is an energy level diagram for the hydrogen atom based
on the Bohr model.
[Figure: energy level diagram for hydrogen, energy increasing upward.]
n = ∞: 0
n = 4: −13.6/4² = −0.85 eV
n = 3: −13.6/3² = −1.5 eV
n = 2: −13.6/2² = −3.4 eV
n = 1: −13.6/1² = −13.6 eV
As n increases the energy levels get closer and closer together, becoming a
continuum of states as n approaches infinity. The energy needed to remove
an electron from the ground state of the hydrogen atom (its first ionization
energy) is 13.6 eV: this is the energy needed to move from n = 1 to n = ∞, where
the energy would be zero (a free electron).
27.4.2 Explaining the Hydrogen Line Spectrum

It had been well known since the end of the 19th century that the wave-
lengths of the lines in the hydrogen spectrum fall into several series which
are linked by simple mathematical formulae. The spectral series with the
most lines in the visible part of the spectrum is called the Balmer series after
the Swiss schoolmaster who first worked out the formula that linked individ-
ual lines in the series. Balmer’s formula (in a form later adopted by Johannes
Rydberg) is:
$$\frac{1}{\lambda_n} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right)$$
R is the Rydberg constant, R = 1.097373157 × 10⁷ m⁻¹, and n is an integer greater
than 2.

Rydberg generalized this formula for all the series in the hydrogen spectrum:
$$\frac{1}{\lambda_{mn}} = R\left(\frac{1}{m^2} - \frac{1}{n^2}\right)$$
where m and n are integers with n > m.
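The Rydberg formula is straightforward to evaluate. The sketch below lists the first few Balmer wavelengths using the value of R quoted above, and also compares that value with the standard Bohr-model expression R = me⁴/(8cε0²h³) evaluated with CODATA 2018 constants. The constants and function names are assumptions brought in for the example.

```python
R_QUOTED = 1.097373157e7  # Rydberg constant as quoted in the text, m^-1

H = 6.62607015e-34            # Planck constant, J s
M_E = 9.1093837015e-31        # electron mass, kg
E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
C = 2.99792458e8              # speed of light, m/s


def wavelength(m, n, rydberg=R_QUOTED):
    """Wavelength (m) of the line for a jump from level n down to level m (n > m)."""
    if n <= m:
        raise ValueError("n must be greater than m")
    return 1.0 / (rydberg * (1.0 / m**2 - 1.0 / n**2))


def rydberg_from_constants():
    """Bohr-model expression for the Rydberg constant, m e^4 / (8 c eps0^2 h^3)."""
    return M_E * E_CHARGE**4 / (8 * C * EPSILON_0**2 * H**3)


if __name__ == "__main__":
    for n in range(3, 7):  # first four lines of the Balmer series (m = 2)
        print("n =", n, "wavelength =", wavelength(2, n) * 1e9, "nm")
    print("R from constants:", rydberg_from_constants(), "m^-1")
```

The first four Balmer lines fall in the visible range, and the value of R computed from the constants agrees closely with the quoted one.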

Bohr was able to derive this formula from his atomic model and to find an
expression for the Rydberg constant. He assumed that the spectral lines correspond
to quantum jumps made by electrons moving from excited states
to less excited states so that the Rydberg formula corresponds to a quantum
jump from the nth energy level to the mth energy level (with n > m). When
the electron loses an energy ΔEmn it emits a photon of frequency fmn given by:
ΔEmn = hfmn. The diagram below shows how a photon is emitted when an electron
jumps from n = 4 to m = 2.
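[Figure: energy level diagram for hydrogen showing the levels n = 1 (−13.6 eV), n = 2 (−3.4 eV), n = 3, n = 4 (−0.85 eV), and n = ∞ (0), with a photon emitted in the jump from n = 4 to n = 2.]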
For the transition above (from n = 4 to n = 2):
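$$\Delta E_{24} = E_4 - E_2 = -\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{4^2}\right) - \left[-\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{2^2}\right)\right] = \frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{2^2} - \frac{1}{4^2}\right) = \frac{3me^4}{128\varepsilon_0^2 h^2}$$
$$hf_{24} = \frac{3me^4}{128\varepsilon_0^2 h^2}$$
$$f_{24} = \frac{3me^4}{128\varepsilon_0^2 h^3}$$
$$\lambda_{24} = \frac{c}{f_{24}} = \frac{128c\varepsilon_0^2 h^3}{3me^4} = 4.9 \times 10^{-7}\ \text{m}$$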
The Rydberg formula is derived in a similar way but using a transition from n
to m.
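$$\Delta E_{mn} = E_n - E_m = -\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{n^2}\right) - \left[-\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{m^2}\right)\right] = \frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{m^2} - \frac{1}{n^2}\right)$$
$$\Delta E_{mn} = hf_{mn} = \frac{hc}{\lambda_{mn}}$$
$$\frac{1}{\lambda_{mn}} = \frac{me^4}{8c\varepsilon_0^2 h^3}\left(\frac{1}{m^2} - \frac{1}{n^2}\right)$$
The Rydberg constant is:
$$R = \frac{me^4}{8c\varepsilon_0^2 h^3}$$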
27.4.3 Electron Waves in Atoms

The success of the Bohr model was impressive but it still rested on an arbitrary
quantization rule. However, an alternative way to think about the quantization
rule gives us a clue to the real reason that atoms have energy levels. Bohr’s
quantization of angular momentum is:
$$m v_n r_n = \frac{nh}{2\pi}$$
which can be rearranged to give:
$$2\pi r_n = n\left(\frac{h}{m v_n}\right)$$
The de Broglie relation can be used to introduce a wavelength:
$$\lambda_n = \frac{h}{m v_n}$$
$$2\pi r_n = n\lambda_n$$
The left-hand side of this equation is equal to the circumference of the cir-
cular orbit. The right-hand side of the equation is an integer number of de
Broglie waves. Bohr’s quantum condition is equivalent to saying that energy
levels can only occur when a circumference is an integer number of electron
wavelengths. This guarantees only a discrete set of energy levels in the atom
in much the same way that a stretched string can vibrate at only a discrete set of
frequencies (the fundamental and harmonics).
The energy levels of an atom correspond to standing wave patterns for the de
Broglie electron waves. The lowest or ground state is when the circumference
of the orbit corresponds to a single electron wavelength.
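The standing wave picture can be checked numerically for the Bohr orbits. Combining the angular momentum and centripetal force conditions gives the orbit radius rn = ε0h²n²/(πme²), a standard result that is stated here rather than derived. The sketch then counts how many de Broglie wavelengths fit around each orbit. Constants are CODATA 2018 values and the function names are hypothetical.

```python
import math

H = 6.62607015e-34            # Planck constant, J s
M_E = 9.1093837015e-31        # electron mass, kg
E_CHARGE = 1.602176634e-19    # elementary charge, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m


def orbit_radius(n):
    """Radius of the nth Bohr orbit, r_n = eps0 h^2 n^2 / (pi m e^2)."""
    return EPSILON_0 * H**2 * n**2 / (math.pi * M_E * E_CHARGE**2)


def orbit_speed(n):
    """Electron speed from the angular momentum condition m v r = n h / (2 pi)."""
    return n * H / (2 * math.pi * M_E * orbit_radius(n))


def waves_in_orbit(n):
    """Circumference of the nth orbit divided by the electron's de Broglie wavelength."""
    wavelength = H / (M_E * orbit_speed(n))
    return 2 * math.pi * orbit_radius(n) / wavelength


if __name__ == "__main__":
    for n in range(1, 5):
        print("n =", n, "radius =", orbit_radius(n), "m, waves =", waves_in_orbit(n))
```

Each orbit holds a whole number of wavelengths equal to its quantum number, with one wavelength in the ground state orbit of radius about 5.3 × 10⁻¹¹ m.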
27.4.4 The Schrödinger Atom

While Bohr’s model was surprisingly successful in explaining the energy levels
and spectrum of the hydrogen atom it was quite obviously not the final word
on the atom. In Bohr’s model, the electron is restricted to a planar circular
orbit whereas an electron in a real atom could be anywhere in the three-
dimensional space surrounding the nucleus and the orbits could be different
shapes. Bohr’s model was adapted to include elliptical orbits but the solution
for a three-dimensional atom was found by Schrödinger using the Schrödinger
equation.
The mathematical derivation of formulae for electron orbitals is beyond the
scope of this book but it is interesting to outline the method and to describe
some of the important results. First of all, Schrödinger needed to put a poten-
tial energy term V into his equation. This is simply the electrostatic potential
energy of the electron in the field of the nucleus:
$$PE = -\frac{e^2}{4\pi\varepsilon_0 r}$$
Then he needed to solve the equation in spherical polar coordinates (r, θ, φ).
These coordinates are used because of the symmetry of the problem – solving
a spherically symmetric problem in Cartesian coordinates is much harder and
the solutions are unlikely to be easy to interpret.
The equation separates into three separate equations in r, θ and φ. This
results in three separate quantum numbers, one associated with each of the
coordinates.
n = principal quantum number – (linked to r) this determines the energy of
the electron as before.
l = orbital quantum number – (linked to θ) this determines the angular
momentum and shape of the orbit.
m = magnetic quantum number – (linked to φ) this determines the orientation
of the orbit.
There is another property of electrons that is not included in the Schrödinger
equation: their spin. Electrons have spin angular momentum of magnitude h/4π.
This is usually referred to as spin-½ because the unit of angular momentum
used in atomic physics is h/2π (or “h-bar”). Spin is a quantum mechanical
property and when it is measured along any particular axis a spin-½ particle
is found to have either +½ or −½ a unit of spin. This introduces another
quantum number:
s = spin quantum number – if this is +1 the spin is “up” and if it is − 1 the spin
is “down” relative to the axis of measurement.
Pulling all of this together we can see that there will be four integer quantum
numbers associated with each allowed state. The equation also constrains the
values of these quantum numbers so that:
For any n: l can have any value from 0 to (n − 1)
For any l: m can take integer values from − l to + l.
s can only be + 1 or − 1.
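The rules above can be turned into a short enumeration of the allowed states for a given n. This sketch follows the text's convention of labeling the two spin states +1 and −1; the function name is a hypothetical helper.

```python
def allowed_states(n):
    """List every allowed (n, l, m, s) combination for principal quantum number n."""
    states = []
    for l in range(n):                 # l runs from 0 to n - 1
        for m in range(-l, l + 1):     # m runs from -l to +l
            for s in (+1, -1):         # spin "up" or "down" (text's convention)
                states.append((n, l, m, s))
    return states


if __name__ == "__main__":
    for n in (1, 2, 3):
        print("n =", n, "number of states =", len(allowed_states(n)))
    print(allowed_states(2))
```

The number of states for each n comes out as 2n², so there are two states for n = 1 and eight for n = 2.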
The three-dimensional shapes of the n = 1 (1s) and n = 2 (2s and 2p) orbitals
are shown approximately (and not to scale) below. The s-orbitals (with l = 0)
are spherical. The p-orbitals (l = 1) form three dumbbell shaped orbitals with
different orientations (m = −1, 0, +1). Two electrons can occupy each orbital.
The shapes represent probability distributions for the electrons.
[Figure: approximate shapes of the 1s, 2s, and 2p orbitals.]
Wolfgang Pauli realized that no two electrons in the same atom can have the
same set of quantum numbers. This is an example of the “Pauli Exclusion
Principle” and the consequence is incredibly important. If it were not the
case then all the electrons in a multi-electron atom would fall into the lowest
energy level and different elements would have very similar chemical prop-
erties. The Exclusion principle prevents this and ensures that the electrons
must fill up each energy level in turn. In a very general sense this is what gives
us chemistry. It certainly accounts for the Periodic Table. For example:
When n = 1: l = 0, m = 0 and s = ±1. Two electrons can occupy these states.
These are the 1s states in the atom and correspond to a spherically symmetric
wavefunction. Hydrogen, with one electron has a half-filled 1s shell. This can
be represented by:
1s¹
The first number is the energy level or principal quantum number, the letter
refers to the type of orbital (related to the value of l) and the superscript is the
number of electrons in the shell.
Hydrogen’s half-filled 1s shell accounts for its reactivity – it can complete the shell by reacting with
other elements with one or more outer electrons.
Helium has two electrons, so the 1s shell is full and the atom is very stable.
Its electronic configuration is:
1s²
Lithium has three electrons. The n = 1, l = 0, m = 0 and s = ±1 states are
filled by two electrons, so the 1s shell is full and the third electron has to go
into one of the eight available states with n = 2. The lowest
energy state is the n = 2, l = 0, m = 0, s = +1 (or −1) state in the 2s shell. This leaves the 2s shell
partially filled and lithium is again reactive. Its electronic configuration is:
1s² 2s¹
This filling of energy shells continues as we move to larger and larger atoms.
The periodicity of the Periodic Table comes about as successive shells fill up
and electrons start to fill the next shell.
The fifth element in the Periodic Table is boron. This must accommodate five
electrons so both the 1s and 2s shells are full and the next electron must have
an orbital quantum number l = 1. This is called a p-shell and has a different
shape to the spherical s-shells. There are three p-orbitals (m = −1, 0, and +1)
in different orientations, and each is shaped like a dumbbell. Two electrons
can go into each orbital (with s = +1 or s = −1). The electronic configuration for
boron is:
1s² 2s² 2p¹
The table below shows the order of the first few electronic energy
levels and the corresponding electronic configurations. Just a few examples
will show how the electronic configuration, determined by solutions to the
Schrödinger equation, affects chemical properties.
Atoms with filled shells tend to be very stable – for example, helium and
neon.
Atoms with a single electron in an otherwise empty shell tend to be very
reactive because they easily lose that electron to an atom that needs one
or more electrons to complete its own outer shell – for example, sodium.
Atoms with an almost full shell are also very reactive, easily gaining elec-
trons from other atoms that have only one or a few electrons in their outer
shell – for example, fluorine.
Carbon has two electrons in the 2p shell, so to complete this shell carbon
must gain four electrons. This makes it “4-valent.” Hydrocarbons (compounds
of carbon and hydrogen) form the framework of the organic compounds on
which living things depend and illustrate how valency is linked to electronic
configuration. In methane, CH₄, each hydrogen atom shares one electron with
the carbon atom so that the carbon 2p shell is completed and the hydrogen 1s
shell is completed. Both achieve more stable (lower energy) configurations and
four covalent bonds are formed.
Shell      n   l   m    s    Configuration      Element (ground state)
1s shell   1   0   0    +1   1s¹                Hydrogen
           1   0   0    −1   1s²                Helium
2s shell   2   0   0    +1   1s² 2s¹            Lithium
           2   0   0    −1   1s² 2s²            Beryllium
2p shell   2   1   −1   +1   1s² 2s² 2p¹        Boron
           2   1   −1   −1   1s² 2s² 2p²        Carbon
           2   1   0    +1   1s² 2s² 2p³        Nitrogen
           2   1   0    −1   1s² 2s² 2p⁴        Oxygen
           2   1   +1   +1   1s² 2s² 2p⁵        Fluorine
           2   1   +1   −1   1s² 2s² 2p⁶        Neon
3s shell   3   0   0    +1   1s² 2s² 2p⁶ 3s¹    Sodium
           3   0   0    −1   1s² 2s² 2p⁶ 3s²    Magnesium
Etc.
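The filling order in the table can be reproduced with a few lines of code. The Python sketch below is only an illustration: the names `SUBSHELLS` and `configuration` are our own, and it covers only the shells listed in the table (hydrogen to magnesium), filling each in turn with two electrons per orbital.

```python
# Subshells in the filling order used in the table, with their capacities
# (two electrons per orbital: an s-shell has 1 orbital, a p-shell has 3).
SUBSHELLS = [("1s", 2), ("2s", 2), ("2p", 6), ("3s", 2)]


def configuration(z):
    """Ground-state configuration for atomic number z (1 to 12 only)."""
    if not 1 <= z <= 12:
        raise ValueError("this sketch only covers hydrogen to magnesium")
    parts = []
    remaining = z
    for name, capacity in SUBSHELLS:
        if remaining == 0:
            break
        filled = min(remaining, capacity)   # fill the shell as far as possible
        parts.append(f"{name}{filled}")
        remaining -= filled
    return " ".join(parts)


for z, element in [(1, "Hydrogen"), (5, "Boron"), (11, "Sodium")]:
    print(element, configuration(z))
```

The output is written without superscripts, so boron appears as 1s2 2s2 2p1.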
This gives a very simple explanation of key chemical properties. More detailed
analysis of chemical bonding requires a solution of the Schrödinger equation
for electrons in the field of both atoms. The Schrödinger equation is the fun-
damental equation in chemistry.
27.5 INTERPRETATIONS OF QUANTUM THEORY

While relativity challenges our preconceived ideas about space and time,
quantum theory challenges our ideas about the nature of reality. Wave-particle
duality shows that it is impossible to capture all the features of a
quantum object in a visualizable classical model and any such attempt leads to
contradictions. However, the Schrödinger equation and other developments
in quantum theory provide a powerful set of mathematical equations that
allow us to make predictions about the physical world to an incredible degree
of precision. While some physicists think we should just accept and use the
equations and give up on trying to find any underlying meaning in them, others
have felt that it is important to interpret the wavefunction and seek a deeper
understanding of what is actually going on in a quantum process. This has
led to several different “interpretations” of quantum theory but there is still
a lack of consensus. Here we will give a brief description of three alternative
approaches to the theory:
The Copenhagen Interpretation was developed around Niels Bohr and
Werner Heisenberg in Copenhagen (where Bohr worked).
The sum-over-histories approach was developed by Richard Feynman.
The many-worlds theory was developed by Hugh Everett III.
27.5.1 The Copenhagen Interpretation

The Copenhagen Interpretation has been the “establishment” interpretation
of quantum theory for nearly a century despite having obvious shortcomings.
The central idea is the wavefunction (ψ, “psi”). This is a solution of the
Schrödinger equation that represents all it is possible to know about a physical
system. For example, the wavefunction for light leaving a source might be a
series of waves spreading out in all directions. But these waves are not directly
observable; they are related to the probability of observing a photon at each
point in space. The Schrödinger equation allows us to calculate how the waves
change as they spread out and interact with the apparatus. For example, when
the waves pass through a double-slit arrangement they diffract at each slit
and then superpose, producing an interference pattern consisting of regularly
spaced maxima and minima. However, this pattern is only related indirectly
to the light that is detected on the screen (or by any other kind of detector).
The “intensity” (|ψ|²) of the interference pattern represents a probability
distribution for the arrival of photons on the screen. High intensity corresponds
to high probability and low intensity corresponds to low probability.
The pattern applies to each individual photon, so if only one photon interacts
with the apparatus the same probability distribution is used to work out where
it is likely to be detected.
Now we reach the most controversial aspect of the Copenhagen Interpretation,
the act of observation or measurement. Up to this point the wavefunction
has evolved continuously according to the Schrödinger equation. At the
moment of observation or measurement the probability distribution, which
spreads across the entire screen in the double slit experiment, suddenly and
discontinuously changes. It becomes instantly zero everywhere except at the
point where the photon is observed. This is called the “collapse of the wave-
function.” This is not explained by the Schrödinger equation and there is no
agreement on any physical mechanism by which it occurs. This is called the
“measurement problem” and is the main reason many physicists think that the
Copenhagen Interpretation is unacceptable (even if it does give us a useful
way to describe quantum processes).
The diagram below gives a simplified explanation of the double slit experi-
ment using the Copenhagen Interpretation.
[Figure: three stages of the double slit experiment: (1) wavefunction for light approaching the slits; (2) wavefunction for light modified by the apparatus; (3) wavefunction for light collapses as observations are made and photons arrive at particular places.]
Prior to observation the photon could be anywhere within the probability dis-
tribution. It is said to be in a “superposition of states.” After the observation,
the photon has arrived at a particular place on the screen.
Here is a summary of the main features of the Copenhagen Interpretation.
The wavefunction ψ (a solution of the Schrödinger equation) gives the
most complete description of a quantum state (this is unobservable).
Schrödinger’s equation governs the behavior of the wavefunction and its
interaction with the apparatus.
The evolution of the wavefunction is continuous and deterministic (i.e.,
if we know its initial state we can calculate, with certainty, its future state).
|ψ|² gives the probability of finding a photon at each point in space (Born’s
statistical interpretation).
Before an observation or measurement photons are in a superposition of
states.
Observation or measurement results in the discontinuous collapse of the
wavefunction.
We can only make statistical predictions about the outcome of wavefunc-
tion collapse so the quantum theory is indeterministic.
When we use high-intensity light (large numbers of photons) the quantum
distribution is identical to that expected from classical physics. This is not
really surprising: if there is a large amount of energy the discrete nature
of individual quanta is hard to observe. The idea that quantum predictions
merge smoothly into classical predictions at high energies is called the
correspondence principle.
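The statistical character of this interpretation can be illustrated with a short simulation. The Python sketch below assumes an idealized two-slit pattern in which |ψ|² varies as cos²(πx/d), where d is the fringe spacing and the single-slit envelope is ignored. The function names, the fringe spacing, the screen range, and the random seed are all our own choices. Each photon is detected at one random position, but the positions of many photons build up the interference pattern.

```python
import math
import random


def intensity(x, fringe_spacing=1.0):
    """Relative |psi|^2 for an idealized two-slit pattern (maximum value 1)."""
    return math.cos(math.pi * x / fringe_spacing) ** 2


def detect_photons(count, x_min=-2.0, x_max=2.0, seed=0):
    """Sample detection positions one photon at a time from |psi|^2."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < count:
        x = rng.uniform(x_min, x_max)      # candidate position on the screen
        if rng.random() < intensity(x):    # accept with probability |psi|^2
            hits.append(x)
    return hits


hits = detect_photons(5000)
near_maximum = sum(1 for x in hits if abs(x) < 0.1)        # central bright fringe
near_minimum = sum(1 for x in hits if abs(x - 0.5) < 0.1)  # first dark fringe
print(near_maximum, near_minimum)
```

Far more photons are counted near the maximum than near the minimum, even though no individual arrival position can be predicted.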
27.5.2 Heisenberg’s Uncertainty (Indeterminacy) Principle

In classical physics, it is, in principle, possible to measure the position and
momentum of a particle as precisely as our instruments allow. It is assumed
that a particle possesses a definite position and momentum at all times even if
we do not choose to measure these quantities. The forces between the parti-
cles are also determined by other properties such as mass and charge so that
it should be possible, in principle, to predict future positions and momenta by
carrying out suitable calculations using the initial values. More importantly,
classical physics is “deterministic” – the future is entirely determined by the
present state whether or not we have this information. This is not the case
with quantum theory.
[Figure: the wavefunction for the electron diffracts at the hole; having passed through the hole, the electron gets a random momentum parallel to the x-axis.]
Consider trying to determine the exact position of an electron at some
moment. One way of doing this would be to direct single electrons at a narrow
hole. If an electron gets through the hole and reaches the other side then
it must have been localized in the region of the hole as it passed through, so
we know its location at that moment with an uncertainty about equal to the
diameter of the hole. However, we must take into account the wave nature
of the electron. As it passes through the hole its wavefunction diffracts and
spreads out in all directions. This is equivalent to giving the electron a random
sideways momentum as it passes through the hole. As the hole is made smaller
the diffraction effects increase and the random changes of momentum are
likely to be larger. Making a precise measurement of the position of the elec-
tron in the plane of the hole introduces an uncertainty in the momentum of
the electron in that plane.
If the hole diameter is Δx the angular spread of the first diffraction maximum
(which is where most electrons will be found after passing through the hole)
is given by:
$$\sin\theta = \pm\frac{1.22\lambda}{\Delta x}$$
For small angles this is approximately:
$$\theta \sim \pm\frac{\lambda}{\Delta x}$$
The de Broglie wavelength links wavelength to momentum so:
$$\lambda = \frac{h}{p}$$
$$\theta \sim \pm\frac{h}{p\,\Delta x}$$
where p is the original momentum of the electrons (perpendicular to x). After
the electron passes through the hole its momentum also has an x-component
Δp perpendicular to its original momentum:
$$\tan\theta = \frac{\Delta p}{p}$$
For small angles:
$$\theta \sim \frac{\Delta p}{p}$$
Combining the last two equations:
$$\frac{\Delta p}{p} \sim \pm\frac{h}{p\,\Delta x}$$
$$\Delta p \sim \frac{h}{\Delta x}$$
$$\Delta p\,\Delta x \sim h$$
While this is a rough derivation, it does suggest that the uncertainty in x-posi-
tion and the uncertainty in x-momentum are inversely proportional and that
their product is of the order of the Planck constant. The more precisely we
measure the position of the electron (smaller hole) the greater the uncer-
tainty in its momentum (greater spread). But this is not just about our ability
to make precise measurements. The wavefunction contains complete infor-
mation about the electron, so the properties of position and momentum are
indeterminate – the electron does not possess independent properties of posi-
tion and momentum at each moment.
This discovery, made by Werner Heisenberg in 1927, is one of the defining
features of quantum theory and is known as the uncertainty principle or inde-
terminacy principle. Its more formal statement is in terms of an inequality:
$$\Delta x\,\Delta p \geq \frac{h}{4\pi}$$
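To get a feel for the size of this limit, the inequality can be evaluated for an electron confined to a region about the size of an atom. The Python sketch below is illustrative only: the confinement width of 1.0 × 10⁻¹⁰ m is an assumed value, the constants are rounded standard values, and the function name is our own.

```python
import math

H = 6.626e-34      # Planck constant in J s (rounded)
M_E = 9.109e-31    # electron mass in kg (rounded)


def min_momentum_uncertainty(delta_x):
    """Smallest momentum uncertainty (kg m/s) allowed by dx * dp >= h / (4 pi)."""
    return H / (4 * math.pi * delta_x)


delta_x = 1.0e-10                    # assumed position uncertainty in m
delta_p = min_momentum_uncertainty(delta_x)
delta_v = delta_p / M_E              # corresponding velocity spread (non-relativistic)
print(f"dp >= {delta_p:.2e} kg m/s, dv >= {delta_v:.2e} m/s")
```

For this assumed width the minimum momentum uncertainty is about 5 × 10⁻²⁵ kg m/s, which for an electron corresponds to a velocity spread of several hundred kilometers per second.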
If particles do not possess defined properties such as position and momentum
then it is impossible to predict the future in detail based on the total informa-
tion about the state of the present. This makes quantum theory “indetermi-
nate” and leaves the future open. It also undermines our classical ideas about
the nature of reality – how can an electron be regarded as a real particle if it
does not possess definite values for momentum and position?
27.5.3 The Sum-Over-Histories Approach

In 1980, Freeman Dyson described the sum-over-histories approach:
Thirty-one years ago, Dick Feynman told me about his “sum over histories”
version of quantum mechanics. “The electron does anything it likes,” he said.
“It just goes in any direction at any speed, …. however it likes, and then you
add up the amplitudes and it gives you the wavefunction.” I said to him,
“You’re crazy.” But he wasn’t.
Feynman embraced the idea that photons (or electrons or any other quantum
object) are free to get from one point to another by any route at all, and
assumed that these alternative “histories” contribute in some way to what the
electron or photon actually does. This seems to be what is happening in the
double slit experiment because the slit through which the photon does not
pass does affect where it can hit the screen. To model this he associated a rotating
phasor with each possible route. Different routes have different lengths and take
different times to complete so the phasors from different routes will have a
range of phases (i.e., they point in a range of different directions) when they
arrive at any point on the screen. The phase can be worked out if we know
the wavelength associated with the photon (e.g., a path of length 72.5 wavelengths
introduces a phase difference π relative to a path that is a whole number of
wavelengths long). The next step is to add up the phasors
at each point on the screen. The square of the resultant phasor amplitude
is proportional to the probability that the photon is found at that point. The
diagram below shows three phasors for three of the many possible routes a
photon could take from one point, A, to another, B.
[Figure: three possible paths (1, 2, and 3) from point A to point B, the phasor for each path, and the sum of the phasors.]
This can be summed up in a series of rules.

Rule 1: Photons/electrons “explore all paths.”
Rule 2: Each path contributes a phasor.
Rule 3: The phasors add like vectors at the detector.
Rule 4: The square of the resultant phasor is proportional to the prob-
ability of finding the electron/photon at that point.
For example, double slits – the photon can reach the detector via either slit,
so each possible route contributes a phasor. The paths are different lengths so
there is a phase difference.
[Figure: a source, two slits A and B, and a detector, showing the path via A and the path via B.]
Path B is longer than path A so the phasor has rotated more times and there
is a phase difference between phasors via A and via B.
[Figure: the phasor via A and the phasor via B added to give the resultant.]
The length of the resultant will vary as the detector is moved to different
positions. If the two phasors arrive in phase they reinforce so the probability
at that point is a maximum value. At positions where the phasors arrive
in antiphase (π phase difference) they undergo destructive interference and
cancel out. The probability at that point is zero so no photons actually arrive
there.
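The four rules can be turned into a short calculation. The Python sketch below represents each phasor as a complex number of unit length whose phase is 2π times the path length measured in wavelengths, adds the phasors, and squares the length of the resultant. The function names and the example path lengths are our own, and giving every path a phasor of equal length is a simplifying assumption.

```python
import cmath
import math


def phasor(path_in_wavelengths):
    """Unit phasor for one path; one full rotation per wavelength traveled."""
    return cmath.exp(2j * math.pi * path_in_wavelengths)


def relative_probability(paths_in_wavelengths):
    """Square of the resultant phasor for all paths to one detector position."""
    resultant = sum(phasor(length) for length in paths_in_wavelengths)
    return abs(resultant) ** 2


# Paths via slit A and slit B differing by a whole number of wavelengths:
in_phase = relative_probability([72.0, 74.0])
# Paths differing by half a wavelength (phase difference pi):
antiphase = relative_probability([72.0, 72.5])
print(round(in_phase, 6), round(antiphase, 6))
```

With two equal phasors the result is close to 4 when they arrive in phase and close to 0 when they arrive in antiphase, apart from small floating-point rounding errors.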
This “sum-over-histories” approach is particularly helpful in particle physics
where all possible mechanisms for a particular interaction contribute a phasor
and the sum of all the phasor amplitudes (squared) gives the probability of the
process. As an interpretation of quantum theory, it suggests that beneath what
we regard as the real world of actual events there are potential events (possi-
ble paths) that may not actually be where the photon is found but nonetheless
contribute to what it can do.
27.5.4 The Many-Worlds Theory

The many-worlds theory was first suggested in 1957 by Hugh Everett III
and it was an attempt to solve the measurement problem. Everett thought
that the Schrödinger equation should give a complete explanation of what
happens in a quantum system and felt that the measurement problem only
arises if we treat the system we are measuring as a quantum system and the
apparatus we use to measure it (or even the observers who observe it) as
classical systems. He argued that all observations and measurements were
interactions between quantum systems so both the observed and the observer
(the measured system and the measuring apparatus) must be described by the
Schrödinger equation.
The wavefunction, a solution to the Schrödinger equation, represents a
“superposition of states.” According to the Copenhagen Interpretation this
superposition collapses into a particular state when an observation is made.
Everett argued that if we really take the Schrödinger equation seriously we
should assume that there is a wavefunction describing both the observed
system and the observer and that this wavefunction contains a superposition
of all possible states of both. For example, if an atom of a radioactive element
has a 50% chance of decaying in 1 hour then the wavefunction of the system
consists of two parts, one representing the undecayed atom and the other
representing the decayed atom and the emitted radiation. During 1 hour
the amplitude of the first part gets weaker and the second part gets stronger
until at the end of the hour both parts have equal strength. This represents
an equal chance of having decayed or not having decayed. Now consider an
observer who can detect whether the atom has decayed or not. According to
Everett, the observer is also described by a wavefunction and this contains
a superposition of two states, one in which the observer detects a decay and
the other in which he does not. As time goes on the amplitude of the first
part grows and the amplitude of the second part falls until after 1 hour they
have equal amplitudes. No collapse has occurred but the description of the
Universe now contains two observers, one having observed the decay, the
other having observed no decay. The single world has split into two.
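The changing strengths of the two parts of the wavefunction can be sketched numerically. The Python example below assumes that the decay follows the usual exponential law with a half-life of 1 hour, ignores the phases of the amplitudes, and uses the rule that the probability of each part is the square of its amplitude. The names `HALF_LIFE_HOURS` and `amplitudes` are our own.

```python
import math

HALF_LIFE_HOURS = 1.0   # the atom has a 50% chance of decaying in 1 hour


def amplitudes(t_hours):
    """Return (undecayed, decayed) amplitude magnitudes after t hours.

    Assumes exponential decay and ignores phases; the probability of each
    part is the square of its amplitude.
    """
    p_undecayed = 0.5 ** (t_hours / HALF_LIFE_HOURS)
    p_decayed = 1.0 - p_undecayed
    return math.sqrt(p_undecayed), math.sqrt(p_decayed)


for t in (0.0, 0.5, 1.0):
    a_undecayed, a_decayed = amplitudes(t)
    print(t, round(a_undecayed, 3), round(a_decayed, 3))
```

At the start only the undecayed part is present. After 1 hour the two amplitudes are equal, corresponding to a probability of ½ for each outcome.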
According to Everett’s model, the wavefunction for the whole Universe con-
tains a superposition of all possibilities. We only think that the wavefunction
collapses because we, as individuals, only experience one path through this
ever-splitting Universe even though copies of ourselves occur in many of the
other parallel worlds. The attraction of the many-worlds theory is that it solves
(or at least removes) the measurement problem. Its drawback is the weird
multiplicity of Universes it imagines.
Every measurement causes the world to split into multiple copies of itself,
each copy containing one possible outcome of the measurement AND a
copy of the measuring device giving that outcome.
This solves the “measurement problem” – No “collapse of the wavefunction.”
But it suggests that the world is continually splitting into multiple copies of
itself – “many-worlds.”
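[Figure: a measurement M made by an observer has three possible results, M1, M2, and M3; each result occurs in a separate world (World 1, World 2, or World 3) containing its own copy of the observer (Observer 1, Observer 2, or Observer 3).]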
M is a measurement and M1, M2, and M3 are three different possible results of
that measurement that exist as superpositions in the wavefunction.
In the many-worlds interpretation everything that is not impossible actually
occurs in some world – if life is highly improbable but not impossible it must
exist in part of the multiverse.
27.5.5 Schrödinger’s Cat

Erwin Schrödinger realized that quantum theory challenged our conventional
view of how the Universe functions. He tried to draw attention to its counterintuitive
nature using a thought experiment in which quantum effects on the
atomic scale are amplified up to affect objects on an everyday scale. This
thought experiment is known as Schrödinger’s Cat.
A cat is placed in a box along with a mechanism that is designed to break a vial
of poison if a radioactive atom decays. If this does happen the poison will kill
the cat. The box is closed and left for a time so that the probability that the
atom has decayed and the poison has been released to kill the cat is exactly ½.
The box is then opened. The cat is either dead or alive but what was its state
prior to opening the box?
Classically the answer to this question seems very obvious – the cat was either
alive or dead before we opened the box, we just do not know which state it
was in. Quantum theory gives a different and rather disturbing answer. The
atom is described by a wavefunction that has two parts – one describing its
undecayed state and the other describing its decayed state. Prior to opening
the box and making an observation the most we can deduce about the system
is that the atom is in a superposition of the decayed and undecayed states. But
there is also a wavefunction for the state of the detector and the vial of poison.
These too must be in a superposition of states. So must the cat. It is not dead
or alive prior to opening the box but in a superposition of those states. And we
can go further – the experimenter who opens the box can also be described by
a wavefunction. Until we interact with him and ask him what has happened to
the cat we must describe him, the cat, the detector, and the atom by a wave-
function that includes a superposition of both alternatives.
[Figure: the Schrödinger’s Cat apparatus: a radioactive atom, a detector, a hammer release, and a vial of poison inside the box with the cat.]
If we stick to the Copenhagen Interpretation each wavefunction is collapsed
by successive observations. This is where the many-worlds theory offers a way
out of these endless wavefunction collapses. The cat is both dead and alive
but not in the same world. In one world the atom decayed, the poison was
released and the cat died. In that world, the experimenter opened the box
to find the dead cat. But in another world the atom did not decay, the poison
remained trapped in the vial and when the experimenter opened the box the
cat was alive and well.
The arguments about how to interpret quantum theory go on.
27.6 EXERCISES
1. (a) Calculate the de Broglie wavelength of an electron that has been
accelerated through a potential difference of 500 V.
(b) (i) Calculate the energy of an X-ray photon of wavelength
2.0 × 10⁻¹⁰ m. Express your answer in both J and eV.
(ii) Calculate the de Broglie wavelength of an electron that has the
same energy as the photon in (b)(i).
(c) The first ionization energy of most atoms is around 10 electron volts.
Explain why visible light is not ionizing but X-rays are.
2. (a) Explain what is meant by each of the following:
(i) the photoelectric effect (ii) threshold frequency
(iii) work function
(b) Describe the dependence of the photoelectric effect on:

(i) frequency of radiation (ii) intensity of radiation
(c) Give one aspect of the photoelectric effect that cannot be satisfacto-
rily explained using a wave model of light.
(d) Describe Einstein’s photon theory and explain how it can account
for the frequency and intensity dependence of the photoelectric
effect.
3. (a) Why will weak red light emit no electrons from the surface of zinc?
(b) Why doesn’t increasing the intensity of red light lead to the emission
of electrons from the surface?
(c) Why can UV light of wavelength 200 nm emit electrons from zinc?
(d) What changes if the intensity of the UV light falling on the zinc is
increased? Why?
(e) Blue light of wavelength 450 nm can emit electrons from potassium.
Explain why this can occur for potassium but not zinc.
(f) How would electrons emitted from potassium by violet light differ
from those emitted by blue light? Explain.
Work functions: zinc: 4.3 eV, potassium: 2.3 eV
4. (a) Calculate the maximum kinetic energy for electrons emitted from
the surface of a metal of work function 2.0 eV by light of wavelength
450 nm.
(b) Calculate the maximum velocity of these electrons.
(c) Calculate the stopping voltage for these electrons.
(d) State and explain what happens if the intensity of the light is
increased.
5. The energy levels of the hydrogen atom are given by the equation:
$$E_n = \frac{-13.6\ \text{eV}}{n^2}$$
(a) How much energy is needed to ionize a hydrogen atom from its
excited n=2 state?
(b) What is the wavelength of the photon emitted by a hydrogen atom

when it decays from the n = 4 state to the n = 2 state? What part of
the EM spectrum does this radiation belong to?
(c) An electron with kinetic energy 11 eV scatters from a hydrogen atom
in its ground state. Describe the states of the hydrogen atom and
electron after the collision if:
(i) It is an elastic collision. (ii) It is an inelastic collision.
6. (a) Estimate the number of photons of visible light emitted by a 100 W
filament lamp per second. Assume the light efficiency is 10% and
take photons of visible light to have a wavelength around 500 nm.
(b) Why don’t we notice the quantum nature of light in everyday life?
7. (a) Derive an algebraic expression for the recoil force on a torch if it
emits a beam of light of optical power P.
(b) Solar sails could be used to reflect light from the Sun and gener-
ate a thrust to propel a spacecraft away from the Sun. Estimate the
minimum area of solar sail necessary to generate a thrust of 1.0 N at
a distance of 1.5 × 10¹¹ m from the Sun (1 AU). The Sun’s luminosity
is 3.8 × 10²⁶ W.
8. (a) Summarize the main features of the Copenhagen Interpretation.
(b) Use the Copenhagen Interpretation to explain the formation of an
interference pattern when monochromatic light is shone through
double slits.
(c) Explain what is meant by “the measurement problem.”
(d) Why is the “measurement problem” a problem for the Copenhagen
Interpretation?
(e) How does the many-worlds interpretation “solve” the measurement
problem?
9. (a) Describe the Schrödinger’s Cat thought experiment.
(b) At the end of the experiment a box is opened and the experimenter
finds a cat that is either dead or alive. How does the description of
the state of the cat inside the box, before it is opened, differ in clas-
sical physics and in quantum physics (according to the Copenhagen
interpretation)?
(c) What happens, according to the Copenhagen interpretation, when

the box is opened and the experimenter looks inside?
(d) Suggest a reason why this thought experiment is so shocking.
(e) How does the many-worlds theory account for what happens in this
thought experiment?
10. The energy levels of the hydrogen atom are given by the equation:
$$E_n = \frac{-13.6\ \text{eV}}{n^2}$$
(a) Derive a similar equation for the singly ionized atom of helium
(Z = 2).
(b) Describe how you would expect the line emission spectrum from
singly ionized helium to compare with that of hydrogen.
(c) Explain why we cannot use the same simple method to derive the
energy levels for the second electron in the neutral helium atom.
CHAPTER 28
Astrophysics
28.0 PHYSICS, ASTROPHYSICS, AND COSMOLOGY

Much of astrophysics differs from other branches of the subject because we
cannot set up experimental stars and vary parameters to see what happens.
However, there are many billions of stars in space and we can test our theo-
ries by observing their radiation using telescopes here on Earth. The first
telescopes used for this purpose were optical but in the second half of the
20th century other parts of the electromagnetic spectrum were also used; first
radio but now infra-red, ultra-violet, X-ray, and even gamma-ray telescopes
have been built and used. The Earth’s atmosphere is almost transparent to
visible light and some radio wavelengths but much of the rest of the spectrum
is strongly absorbed, so telescopes working at other wavelengths are usually
mounted on satellites orbiting beyond the atmosphere.
Different wavelengths provide information about different processes going
on in the source stars. It is usually the case that the higher the frequency
of the radiation the more energetic the underlying process that emitted it.
In recent years, the detection of gravitational waves (predicted in Einstein’s
general theory of relativity) has opened the possibility of gravitational wave
telescopes in the future. These would provide evidence about some of the
most dramatic astronomical events, for example, the collapse of giant stars
and collisions of black holes.
Here we will focus on some of the physics that is important in astrophysics
and include aspects of cosmology, the study of the entire Universe.
28.1 STARS
Before the start of the 20th century physicists were unable to explain how
a star like the Sun could continue to radiate energy for billions of years. No
known chemical or gravitational process could provide a large enough source
of energy. This was not a major problem until the theories of long-term geological
processes and evolution by natural selection became established. Both
required the Earth to have existed, and to have had a source of energy, for billions of
years.
The problem of the Sun’s energy source was solved by the discovery of
Einstein’s mass-energy equation and the process of nuclear fusion (see Section
26.3.4). Stars fuse light nuclei into heavy nuclei in their cores releasing a huge
amount of energy. Radiation from the core supports the star against gravita-
tional collapse while energy is transferred to its surface where it radiates out
into space.
28.1.1 Mass
The most important parameter when modeling a star is its mass. The greater
the mass of the star the higher the temperature at its core. This can allow
fusion reactions to proceed faster and nucleosynthesis (see Section 26.3.5) to
produce heavier elements. Increasing mass rapidly increases the reaction rate
in the core so that more massive stars use up their fuel relatively more quickly
than less massive stars and reach the end of their lives earlier.
Mass determines the fate of a star.
Low-mass stars are stable for a long time but their cores are not hot
enough to create nuclei beyond carbon. When they run out of fuel, they
swell to become red giant stars before shedding their outer layers of gas
and forming a planetary nebula. This exposes the white-hot core. This
final state is called a white dwarf star. Fusion reactions have now ceased
so the white dwarf cools down over a long period of time (of the order of
a billion years) eventually becoming a dense black dwarf star.
High-mass stars have shorter lives but the core becomes hot enough for
nuclear fusion to create iron. Iron is the most stable nuclide so at that
point fusion reactions stop suddenly and the star undergoes a violent col-
lapse and explosion called a supernova. This can increase the luminosity
of a star by a factor of around 10¹⁰ for a short period (days or weeks). The
supernova explosion has two effects – some of the energy creates heavier
nuclei than iron and the process blasts these out into space, where they
can become part of the raw material for second and third generation stars
to form. Our own Solar System must have formed from supernova remnants
because it contains significant amounts of the heavy nuclides (e.g.,
uranium in the Earth’s crust).
Mass also determines what happens to the core left after the supernova
explosion.
The core is so massive (typically 2-3 times the mass of the Sun) that the
forces that prevent the collapse of ordinary matter – that is, forces that
stop atoms being crushed (called electron degeneracy pressure) – are
overcome by gravity. This effectively forces orbital electrons and nuclear
protons to combine to form neutrons. When this happens the core radius
decreases enormously (to about 10 km!) and the density of the core,
which is now almost entirely made of neutrons, increases spectacularly
to around 10¹⁷ kg m⁻³. One cubic centimeter of this neutron star material
would have a mass of one hundred million metric tonnes! The core is now
called a neutron star. This collapse causes the rotation rate to increase too,
so neutron stars spin rapidly, sometimes completing a revolution in milliseconds.
Neutron stars form from the collapse of stars with initial masses
in the approximate range of 10–30 times the mass of the Sun.
Another effect of the collapse is to intensify the magnetic field of the star.
This has the effect of directing a beam of radio waves out along the magnetic
axis of the star. Since this axis need not be aligned with the rotation axis,
the beam sweeps around like the light from a lighthouse. If the
Earth happens to be struck by this beam we receive regular pulses. When
these pulses were first discovered they were so regular that astronomers
thought they might be alien radio signals. Neutron stars observed in this way are called “pulsars.”
If the mass of the core is greater than about 5 times the mass of the Sun
then the collapse to a neutron star is not the end of the story. The gravita-
tional forces are so strong that the neutrons themselves are crushed and
at the present time we know of no physical force that prevents collapse to
a point or singularity. A black hole is formed. The reason for the name is
that at a certain distance from the central singularity the escape velocity
is c, the speed of light. Since this is a universal speed limit no material or
information from points closer to the singularity can reach the outside
world. The surface at which this occurs is called the event horizon of the
black hole (see Section 23.2.5).
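The figures quoted above for neutron stars and black holes can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the constants are standard rounded values that do not appear in the text, and the event horizon radius is estimated from the Newtonian escape velocity formula v = √(2GM/r) with v set equal to c.

```python
# Rounded physical constants (assumed values, not taken from the text)
G = 6.674e-11      # gravitational constant / N m^2 kg^-2
C = 2.998e8        # speed of light / m s^-1
M_SUN = 1.989e30   # mass of the Sun / kg


def sample_mass(density, volume):
    """Mass (kg) of a sample of given density (kg m^-3) and volume (m^3)."""
    return density * volume


def event_horizon_radius(mass):
    """Radius (m) at which the escape velocity sqrt(2GM/r) equals c."""
    return 2 * G * mass / C**2


# One cubic centimeter (1e-6 m^3) of neutron star material at 1e17 kg m^-3
one_cm3_kg = sample_mass(1e17, 1e-6)
one_cm3_tonnes = one_cm3_kg / 1000.0   # about 1e8 tonnes

# Event horizon radius for a collapsed core of 5 solar masses
r_horizon = event_horizon_radius(5 * M_SUN)   # roughly 15 km
```

The first result reproduces the "one hundred million metric tonnes" quoted for a cubic centimeter of neutron star material; the second suggests an event horizon only a few kilometers larger than the neutron star radius quoted above.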

[Figure: stellar evolution. A stellar nebula (a cloud of gas and dust in which protostars form from gravitational collapse) produces either a low or medium mass star (nucleosynthesis up to carbon-12), which becomes a red giant, then a planetary nebula and a white dwarf; or a high mass star (nucleosynthesis up to iron-56), which becomes a red supergiant, then a supernova explosion (nucleosynthesis up to uranium-238), leaving a neutron star or a black hole.]

28.1.2 Stars as Black Bodies

Energy released by nuclear fusion reactions in the core of a star raises its sur-
face temperature so that it emits a spectrum of electromagnetic radiation into
space like a black body. The total power radiated by a star is called its lumi-
nosity L (measured in watts). However, while the black body model is fine for
the overall shape of the spectrum, there is a great deal of fine detail too. This
results in a complex pattern of dark absorption lines corresponding to ele-
ments in the outer layers of the star that have absorbed radiation at particular
frequencies. Analysis of stellar spectra can tell us a great deal about the nature
of the star, its surface temperature, and composition, as well as allowing us to
classify stars in a useful way. In addition to this, the motion of the star relative
to the Earth results in a Doppler shift of these absorption lines and by meas-
uring this shift we can determine the relative velocity of the star.
The peak wavelength in the black body radiation spectrum can be measured
using a telescope attached to a spectrometer. Wien’s law (see Section 8.5) can
then be used to find the surface temperature of the star.
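[Figure: stellar (black-body) radiation spectrum. Power radiated per unit wavelength plotted against wavelength, with the peak wavelength marked.]

$$\lambda_{\max} T = 2.90 \times 10^{-3}\ \mathrm{m\,K}$$

$$T = \frac{2.90 \times 10^{-3}\ \mathrm{m\,K}}{\lambda_{\max}}$$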

If the luminosity is also known it is possible to use Stefan’s law to calculate the
radius of the star:
$$L = \sigma A T^{4} = 4\pi r^{2} \sigma T^{4}$$

$$r = \sqrt{\frac{L}{4\pi\sigma T^{4}}}$$
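As a worked illustration of the two steps above, the following sketch applies Wien's law and then Stefan's law. The function names and the example star (peak wavelength 290 nm, luminosity 10²⁸ W) are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

WIEN_CONSTANT = 2.90e-3     # m K
STEFAN_CONSTANT = 5.67e-8   # W m^-2 K^-4


def surface_temperature(peak_wavelength):
    """Wien's law: T = (2.90e-3 m K) / lambda_max, wavelength in meters."""
    return WIEN_CONSTANT / peak_wavelength


def stellar_radius(luminosity, temperature):
    """Stefan's law rearranged: r = sqrt(L / (4 pi sigma T^4))."""
    return math.sqrt(
        luminosity / (4 * math.pi * STEFAN_CONSTANT * temperature**4)
    )


# Hypothetical star: spectrum peaks at 290 nm, luminosity 1e28 W
T_star = surface_temperature(290e-9)      # about 10 000 K
r_star = stellar_radius(1e28, T_star)     # about 1.2e9 m
```

Because the radius depends on T⁴, a small error in the measured peak wavelength produces a much larger error in the calculated radius.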
28.1.3 Stellar Spectra and the Hertzsprung–Russell Diagram

Astronomers classify stars according to their spectral type. This also corre-
sponds to their surface temperature because as the temperature rises differ-
ent types of spectral lines appear in the spectrum. The details of this do not
need to concern us here but what is interesting is that when luminosity is
plotted against temperature (spectral type) for all of the observable stars a
clear pattern emerges. This was first done by Hertzsprung and Russell and the
plot is called a Hertzsprung-Russell diagram. Note that the x-axis points in the
direction of decreasing temperature.
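[Figure: simplified Hertzsprung–Russell diagram. Vertical axis: luminosity (relative to the Sun); horizontal axis: spectral class O B A F G K M, running from 30 000 K on the left to 3000 K on the right. Labelled regions: supergiants, giants, and white dwarf stars. Labelled stars: Deneb, Rigel, Betelgeuse, Antares, Sirius, and the Sun. An arrow indicates the direction of increasing radius. Annotations mark short lifetimes (about 10⁷ years) and long lifetimes (about 10¹¹ years).]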

The diagram above shows (in a very simplified form) the
main regions of the HR diagram. The letters refer to spectral classes used
in astronomy. The Sun is in class G and has a surface temperature of about
5800 K.
A diagonal band runs from large luminous hot blue-white stars at the top left
to small dim red stars at bottom right. This is called the “main sequence” and
stars spend most of their lives on this band. At the end of their lives, when
nuclear fusion fuel in their core runs out, they move off the band as they
become red giants or supergiants and eventually white dwarf stars, neutron
stars, or black holes. These final two star types do not appear on the HR diagram
because their luminosity and spectrum are not measured directly (and
their luminosity is very low).
28.2 DISTANCES
One of the greatest challenges for astronomers and cosmologists is to find
ways to determine accurate distances to the objects they observe. Ancient
Greek astronomers managed to find ingenious methods to estimate the size of
the Earth and the distances to the Moon and Sun but modern space explora-
tion has provided accurate methods for surveying our immediate surround-
ings in space. Distances to objects beyond the solar system are determined
by several different overlapping methods and these regions of overlap can be
used to calibrate one technique against another.
28.2.1 Trigonometric Parallax

The Earth’s orbital motion causes the apparent positions of relatively nearby
stars to shift against the background of very distant (“fixed”) stars. This paral-
lax shift can be used to measure the distance to the nearby star.
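[Figure: parallax geometry, showing the radius R of the Earth’s orbit, the distance d to the nearby star, and the background of very distant stars.]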

Telescopes can be used to measure the parallax angle α. This is half of the
total angular shift in the star’s position during a 6-month period (as the Earth
completes half of an orbit).

$$\frac{R}{d} = \tan\alpha$$

Parallax angles are tiny so we can use the small angle approximation and
replace tan α with α (in radians).

$$d = \frac{R}{\alpha}$$
where R is the radius of the Earth’s orbit. This is known very accurately from
laser ranging within the solar system and trigonometry. If R is measured in
meters and α in radians then d will also be in meters. Astronomers often use
different (non-SI) units:
1 Astronomical unit = 1 AU = 149 597 870 700 m
They also measure the parallax angle in seconds of arc where:
1 second of arc = 1/3600 degree
When these units are substituted into the equation for distance above, the
result is in parsecs (pc) so that a star with a parallax angle of 0.1 seconds of arc
is at a distance of 10 pc.
1 parsec (pc) = 3.0857 × 10¹⁶ m = 3.26156 light years (ly)
The parallax method using Earth-based telescopes is limited to about 100 pc
because as distance increases the parallax angles soon become too small to be
resolved. However, space telescopes (e.g., the Hipparcos satellite) can extend
this method to about 1000 pc.
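The unit conversions in this section can be checked numerically. The sketch below is a minimal illustration, with hypothetical function names; the value used for the light year (9.4607 × 10¹⁵ m) is a standard figure that is not given in the text.

```python
import math

AU = 149_597_870_700.0             # astronomical unit / m
ARCSEC = math.radians(1 / 3600)    # one second of arc in radians
LIGHT_YEAR = 9.4607e15             # m (assumed standard value)

# One parsec: the distance at which R = 1 AU subtends 1 second of arc
PARSEC = AU / ARCSEC               # about 3.0857e16 m


def distance_m(parallax_arcsec):
    """d = R / alpha, with R = 1 AU and alpha converted to radians."""
    return AU / (parallax_arcsec * ARCSEC)


def distance_pc(parallax_arcsec):
    """With alpha in seconds of arc the distance in parsecs is 1 / alpha."""
    return 1.0 / parallax_arcsec


d_example_pc = distance_pc(0.1)    # 10 pc, as stated in the text
```

Dividing `PARSEC` by `LIGHT_YEAR` gives about 3.26, which is the conversion between parsecs and light years.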
28.2.2 The Inverse-Square Law and Cepheid Variables

Telescopes can be used to measure the intensity (or flux) I of stellar radiation
as it reaches the Earth. If we also know the luminosity L of the star being
observed then the inverse-square law can be used to calculate its distance d.
$$I = \frac{L}{4\pi d^{2}}$$

$$d = \sqrt{\frac{L}{4\pi I}}$$
In 1912, Henrietta Leavitt discovered a relationship between the period of
a certain type of variable star, a Cepheid variable, and its luminosity. This
period-luminosity rule provides a way to discover the luminosity of Cepheid
variables by measuring their period from the Earth. This is easily done simply
by monitoring the flux from the star over a period of time.
[Figure: two graphs. First: flux of radiation at Earth against time for a Cepheid variable, showing the mean flux at Earth and the period T. Second: luminosity against period T.]

Cepheid variables are very luminous so they can be detected out to a very
great distance. This gave astronomers a method to extend distance measure-
ments from our own galaxy to other quite distant galaxies.
The way to determine distance using Cepheid variables is summarized below:
1. Monitor the flux from a Cepheid variable.
2. Measure its period T and mean intensity I.
3. Use the period-luminosity relation to find the luminosity L.
4. Use the inverse-square law to find the distance d.
This method can be calibrated against the parallax method using nearby
Cepheids.
Hubble used this distance method with groups of Cepheids in nebulae and
showed that the nebulae were actually separate galaxies outside the Milky Way.
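The Cepheid method summarized above can be sketched in code. This is an illustration only: it assumes the star radiates uniformly in all directions, so that I = L/(4πd²), and the period-luminosity function `toy_period_luminosity` is a made-up placeholder, not a real calibration. In practice that function would come from Cepheids whose distances are known from parallax.

```python
import math


def distance_from_flux(luminosity, flux):
    """Inverse-square law: d = sqrt(L / (4 pi I)), in meters."""
    return math.sqrt(luminosity / (4 * math.pi * flux))


def toy_period_luminosity(period_days):
    """PLACEHOLDER relation (hypothetical numbers, NOT a real calibration)."""
    return 1.0e29 * period_days   # luminosity in watts


def cepheid_distance(period_days, mean_flux, period_luminosity):
    """Period -> luminosity -> distance, following the steps in the text."""
    luminosity = period_luminosity(period_days)
    return distance_from_flux(luminosity, mean_flux)


# Sanity check with the Sun: L of about 4e26 W and 1400 W m^-2 at the Earth
d_sun = distance_from_flux(4e26, 1400.0)   # about 1.5e11 m

# Hypothetical Cepheid: period 10 days, mean flux 1e-10 W m^-2
d_cepheid = cepheid_distance(10.0, 1e-10, toy_period_luminosity)
```

Passing the period-luminosity relation in as a function keeps the distance calculation separate from the calibration, which is the part that has to be established observationally.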
28.2.3 Hubble’s Law

In the 1920s, Edwin Hubble and Vesto Slipher measured the spectra and
distances for many observable galaxies. When a galaxy is in motion relative to
the Earth the spectrum measured on Earth is shifted relative to the spectrum
of the same elements from a stationary source. This is a Doppler effect (see
Section 14.4.1) and can be used to calculate the velocity of the source, in this
case, the velocity of the galaxy. The wavelength shift δλ is given by:

$$\delta\lambda = \lambda - \lambda_{0}$$

where λ₀ is the wavelength from a stationary source and λ is the wavelength
from the moving source.
Astronomers usually work with the fractional shift, z = δλ/λ₀. If this is positive
the wavelengths are increased and it is called a “red-shift.” If it is negative the
wavelengths are decreased and it is called a “blue-shift.”
For velocities small compared to the speed of light the relationship between
the fractional shift in wavelength δλ/λ₀ and the velocity v is given by:

$$z = \frac{\delta\lambda}{\lambda_{0}} = \frac{v}{c}$$

To their surprise they discovered that:
z = δλ/λ₀ is positive for all distant galaxies, that is, light from all distant
galaxies is red-shifted.
The red-shift is directly proportional to the distance of the galaxy:

$$z = \frac{\delta\lambda}{\lambda_{0}} = \frac{v}{c} \propto d$$
Hubble realized that this means that:

Distant galaxies are moving away from us – “the Universe is expanding.”
(A few nearby galaxies are actually moving toward us but that is because
their own proper motion is greater than the global motion due to expansion.)
The recession velocity of a distant galaxy is directly proportional to its
distance:

$$v \propto d$$

$$v = H_{0} d$$

where H₀ is the Hubble constant.

This final equation is the “Hubble law.”
[Figure: graph of red-shift (or recession velocity) against distance.]
Once Hubble’s law has been established, red-shifts can be used to determine the
distances to distant galaxies. However, for very distant objects the source must
be extremely bright otherwise the flux of radiation reaching the Earth is too
weak for measurements to be made. Fortunately, there are extremely bright
objects that can act as “standard candles” for these measurements. One such
object is a type Ia supernova. Astronomers understand the physics of these
stars and can predict their luminosity, which is great enough for them to be
seen in the most distant galaxies. At these distances, the recession velocities
are a significant fraction of the speed of light so relativistic effects must also
be taken into account.
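For small red-shifts the chain from a measured red-shift to a distance is short enough to sketch directly. The example below is illustrative: the function names are hypothetical, the Hubble constant is the value quoted in the exercises at the end of this chapter (2.2 × 10⁻¹⁸ s⁻¹), and the non-relativistic relation z = v/c is assumed, so it should not be used for the very distant objects mentioned above.

```python
C = 2.998e8      # speed of light / m s^-1
H0 = 2.2e-18     # Hubble constant / s^-1 (value used in the exercises)
MPC = 3.0857e22  # one megaparsec / m


def recession_velocity(z):
    """v = c z, valid only for velocities small compared with c."""
    return C * z


def hubble_distance(z):
    """Hubble's law rearranged: d = v / H0, in meters."""
    return recession_velocity(z) / H0


# Hypothetical galaxy with red-shift z = 0.01
v_galaxy = recession_velocity(0.01)        # about 3.0e6 m/s
d_galaxy_mpc = hubble_distance(0.01) / MPC  # about 44 Mpc
```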
28.3 COSMOLOGY
Cosmology is the science of the Universe as a whole, dealing with its origin,
nature, evolution, and end. All of the evidence that we have suggests that the
laws of physics we have discovered from our own planet operate throughout
the Universe, so we use these to try to understand it. While cosmology deals
with physics on the largest scale it is intimately linked to physics on the small-
est scales and discoveries in particle physics and cosmology are often linked.
The enormous energies present soon after the Big Bang are reproduced in
particle physics experiments such as the Large Hadron Collider at CERN.
28.3.1 The Origin and Age of the Universe

Hubble’s law shows that the Universe is expanding so it is tempting to think
that if all the distant galaxies are moving away from us (as they are) then we
must be at the center of this expansion. However, this is not how a cosmologist
would view the expansion. An observer on any of the galaxies would see all
the other galaxies moving away from them in the same way that we do. There
is no center to the expanding Universe; all points are equivalent. A way to
understand this is to imagine that the Universe is two-dimensional and shaped
like the surface of a balloon with the galaxies as spots on the surface. As the
Universe expands the balloon gets larger and all the spots move apart. Those
initially close together move apart slowly, those that started farther apart sepa-
rate more rapidly, exactly as the Hubble law describes. An observer placed
on any one of the spots could discover this law and would see the Universe
expanding in exactly the same way – there would be no center to this surface.
A section of the expanding surface is shown below as it doubles in size. The
black dots represent three galaxies, A, B, and C. Distance AC is double the
distance AB.
As the scale doubles AB also doubles. An observer on A sees B recede at some
velocity v. AC also doubles, so the same observer (on A) also sees C recede.
However, the recession velocity of C is double that of B because AC = 2AB
in both diagrams. This confirms the Hubble law: a galaxy at double the
distance has double the recession velocity, v ∝ d. The same argument would hold
equally well for observers at B or at C.
[Figure: galaxies A, B, and C on a section of the expanding surface, shown at two successive times after the Big Bang; all separations double as the scale doubles.]
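The argument can be checked with a few lines of code. In this sketch (hypothetical function name, arbitrary units) every separation is multiplied by the same scale factor, and the average recession velocity is the change in separation divided by the elapsed time.

```python
def recession_velocities(distances, scale_before, scale_after, elapsed_time):
    """Average recession velocity of each galaxy as the scale changes."""
    factor = scale_after / scale_before
    return [(d * factor - d) / elapsed_time for d in distances]


# Seen from A: B starts at distance 1 and C at distance 2 (arbitrary units).
# The scale doubles in one unit of time.
v_b, v_c = recession_velocities([1.0, 2.0], 1.0, 2.0, 1.0)

# Seen from B instead: A and C both start at distance 1, so they recede
# from B at equal speeds. The same rule applies from any galaxy.
v_a_from_b, v_c_from_b = recession_velocities([1.0, 1.0], 1.0, 2.0, 1.0)
```

In both cases the velocity is proportional to the initial distance, which is the content of the Hubble law.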
If the Universe is expanding now, it must have been much smaller in the past.
Stephen Hawking and Roger Penrose showed that it must have begun as a
point or singularity of infinite density that exploded and has been expanding
ever since. This initial explosion is called the Big Bang.
We can use Hubble’s law to estimate how much time has passed since the Big
Bang. This is an estimate of the age of the Universe. The method assumes that
the galaxies have always had their present relative velocities. We can calculate
how long they have been separating by dividing their current separation by
their current recession velocity:
$$T = \frac{d}{v} = \frac{1}{H_{0}}$$
This is the “Hubble time.” It does not take into account the variation in galac-
tic velocities caused by gravitational forces but does give a good order of
magnitude for the age of the Universe. More sophisticated methods using
evidence from the cosmic background radiation (measured by the Wilkinson
Microwave Anisotropy Probe, WMAP, in 2012) have given a much more precise
estimate of the age of the Universe:
Age of Universe = (13.772 ± 0.059) × 10⁹ years
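The Hubble time can be compared with this value in a few lines. The sketch below assumes the Hubble constant quoted in the exercises at the end of this chapter (2.2 × 10⁻¹⁸ s⁻¹) and a year of 365.25 days; the variable names are illustrative.

```python
H0 = 2.2e-18                          # Hubble constant / s^-1 (from the exercises)
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # about 3.16e7 s

hubble_time_s = 1.0 / H0                               # about 4.5e17 s
hubble_time_years = hubble_time_s / SECONDS_PER_YEAR   # about 1.4e10 years

# Ratio of the Hubble time to the WMAP age quoted in the text
ratio_to_wmap_age = hubble_time_years / 13.772e9
```

The two figures agree to within about five percent, which is why the Hubble time is described as a good order-of-magnitude estimate.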
So far we have described the red-shifts as if they are caused by the motions
of galaxies through space. However, we have already seen that the concept of
absolute space had to be abandoned and only relative motions are significant.
Einstein’s general theory of relativity goes further showing that space has geo-
metrical properties that can be changed by the presence of matter or energy
(space-time curvature). This theory provides a different interpretation of
red-shifts and the Big Bang and expansion. According to general relativity
space itself is expanding (rather like the surface of the balloon in the analogy
above). The red-shifts are therefore a result of the stretching of electromag-
netic waves as they cross the expanding space between galaxies. Hubble’s law
is consistent with this approach (as can be seen from the example above). One
new consequence of Einstein’s approach is that there will be “horizons” in the
Universe. While no object can move through space faster than the speed of
light it is possible for the space between two galaxies to expand so fast that
light cannot travel between them. When this happens the galaxies effectively
disappear over a cosmic horizon.
28.3.2 Evidence for the Big Bang

The Big Bang model of the origin of the Universe is supported by several
strong strands of evidence:
The red-shifts of distant galaxies are consistent with the Universe having
evolved from a tiny dense point.
The Universe is filled with cosmic microwave background radiation.
This has a black-body radiation spectrum corresponding to a tempera-
ture of about 2.7 K. This was predicted from the model and detected,
by accident, by Penzias and Wilson in 1964. Soon after the Big Bang the
Universe was filled with high intensity gamma-radiation that was in thermal
equilibrium at a very high temperature. As the Universe expanded it
cooled and the radiation was red-shifted to longer wavelengths forming
the microwave background that we detect today. The fact that the spec-
trum is a black-body spectrum confirms that the radiation was in (almost)
thermal equilibrium and the fact that it is uniform from all parts of the sky
confirms that it filled the Universe.
Tiny fluctuations in the microwave background radiation (detected and
measured by the COBE and WMAP satellites) are of the correct mag-
nitude to account for galaxy formation. If the current microwave back-
ground radiation had been perfectly uniform then the early Universe
would not have contained enough concentrations of matter for galaxies
to form.
The ratios of light nuclides (e.g., hydrogen, helium, and lithium) through-
out the Universe are consistent with the amounts expected to have been
formed by nuclear fusion reactions as the Universe expanded. There was
a brief period after the Big Bang when the Universe was hot and dense
enough for some nucleosynthesis to take place but this soon stopped as
the Universe expanded and cooled. All the heavier elements were synthe-
sized by nuclear fusion reactions in stars.
The Big Bang and expanding Universe model is consistent with Einstein’s
general theory of relativity. In fact his theory requires a Universe that
either expands or contracts.
28.4 EXERCISES
1. (a) The surface temperature of the Sun is about 5800 K. Sketch a graph
to show how the intensity of radiation varies with wavelength for EM
waves leaving the surface and calculate the wavelength at which the
peak of this distribution occurs.
(b) The luminous flux from the Sun is about 1400 W m⁻² at the radius
of the Earth’s orbit. Calculate the luminous flux at the orbit of Mars
(about 1.5 times further away than the Earth).
(c) The luminosity of the Sun is about 4 × 10²⁶ W with a surface temperature
of about 5800 K. Calculate its radius.
(d) Toward the end of its life the Sun will become a white dwarf star
with a surface temperature of about 10⁵ K. How will this affect the
spectrum of radiation it emits? Support your answer with a relevant
calculation.
Wien’s constant = 2.90 × 10⁻³ m K, Stefan’s constant = 5.67 × 10⁻⁸ W m⁻² K⁻⁴
2. The peak wavelength in the spectrum of light emitted by the super-giant
star Betelgeuse is 830 nm and its luminosity is 3.5 × 10³¹ W.
(a) Calculate its surface temperature.
(b) Calculate its radius.
Wien’s constant = 2.9 × 10⁻³ m K, Stefan’s constant = 5.67 × 10⁻⁸ W m⁻² K⁻⁴
3. A distant galaxy has a red-shift of 0.05. The Hubble constant is about
2.2 × 10⁻¹⁸ s⁻¹ in SI units.
(a) Calculate the velocity of the galaxy relative to the Earth and state the
direction in which it is moving.
(b) Calculate the distance of the galaxy.
(c) Use the Hubble constant to estimate the age of the Universe in years
and explain why the actual age is likely to differ from this value.
4. (a) Describe a parallax method for measuring the distance to a relatively
nearby star.
(b) Show that 1 parsec is about 3 × 10¹⁶ m and about 3.26 light years.
(c) Explain why stellar parallax cannot be used to measure the distance to
very distant stars.
1 astronomical unit (AU) = 1.496 × 10¹¹ m
(d) How far away (in pc) is a star with parallax 0.052 seconds of arc?
5. (a) Two similar galaxies are observed. The brightest blue stars in galaxy A
have an apparent brightness 10 000 times greater than those in galaxy B.
Galaxy A is 10⁷ light years away. How far away is galaxy B?
(b) Two type Ia supernovae are observed one week apart. The first is 100
Mpc away but its apparent brightness is only 0.070 of the apparent
brightness of the second. How far away is the second supernova?
6. (a) Explain what is meant by a “standard candle” in astronomy.
(b) Explain how a Cepheid variable can be used as a standard candle to
measure distances to galaxies beyond the Milky Way.
7. (a) A rocket is traveling away from the Earth at a velocity of 9.0 × 10⁶ m s⁻¹
when it transmits a signal to the Earth on a carrier frequency of
10.000 × 10¹⁰ Hz. To what frequency should the receiver on Earth be
tuned?
(b) A distant galaxy has a red shift of 0.030. Aliens on a planet in this galaxy
transmit a signal to Earth on a carrier frequency of 10.000 × 10¹⁰ Hz. To
what frequency should the receiver on Earth be tuned?
(c) Compare your answers to (a) and (b) and discuss whether or not the
shift in frequency has the same physical cause.
(d) Estimate the time taken for the signal in (b) to reach the Earth.
H₀ = 2.2 × 10⁻¹⁸ s⁻¹
8. (a) Explain how observations of galactic spectra led to the ideas of the
expanding universe, Hubble’s law and the Big Bang.
(b) State and explain two other pieces of evidence for the Big Bang as the
origin of the Universe.
CHAPTER 29
Medical Physics
29.1 ULTRASOUND
29.1.1 Overview of Ultrasound
Ultrasound (sonography) uses high-frequency sound waves to form images
of structures inside the human body and is particularly suited to imaging soft
tissues. Typical frequencies are in the range of 1–20 MHz with corresponding
wavelengths in tissue from about 1.6 mm down to about 0.08 mm. Distances to tissue boundaries are computed
using the time for reflected pulses to return:

$$\text{distance to boundary} = \tfrac{1}{2} \times \text{speed of ultrasound in tissue} \times \text{time for pulse to return}$$

The speed of ultrasound in the body is about 1550 m s⁻¹. The higher frequency
waves provide higher resolution but are absorbed more strongly so do not
penetrate so far into the body.
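The echo-timing calculation and the link between frequency and wavelength can be sketched as follows. The function names and the example values (a 40 µs echo and a 10 MHz probe) are illustrative assumptions; the speed is the 1550 m s⁻¹ quoted above.

```python
SPEED_IN_TISSUE = 1550.0   # speed of ultrasound in the body / m s^-1


def boundary_depth(echo_time, speed=SPEED_IN_TISSUE):
    """Depth of a reflecting boundary (m); the pulse travels there and back."""
    return 0.5 * speed * echo_time


def wavelength(frequency, speed=SPEED_IN_TISSUE):
    """Wavelength (m) from the wave equation v = f * lambda."""
    return speed / frequency


depth_m = boundary_depth(40e-6)    # echo returns after 40 microseconds
lambda_m = wavelength(10e6)        # 10 MHz probe
```

Here `depth_m` is 0.031 m (31 mm) and `lambda_m` is about 0.16 mm. The factor of one half matters: forgetting it doubles every measured depth.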
Ultrasound transducers use piezoelectric crystals to generate and detect the
waves. When an alternating voltage is applied across the crystal it vibrates and
when the crystal experiences alternating stress (as it absorbs an ultrasound
wave) it generates an alternating voltage.
There are several different types of ultrasound scan. An A-scan (“amplitude
mode”) is the simplest procedure, using a single transducer to detect echoes.
As the scanner is moved along a line the depth is computed and displayed
on a screen. A B-scan (“brightness mode”) uses a linear array of transducers
and creates a 2D image as the scanner is moved. It is also possible to use
ultrasound pulses to compute the velocities of tissue boundaries in order to create
a moving image; this is M-mode (“motion mode”). When ultrasound reflects
from a moving object it is Doppler shifted. Doppler mode ultrasound uses
these shifts to measure and display blood flow rates.
Studies into the safety of ultrasound techniques have shown that they present
very low risk with no confirmed evidence that normal ultrasound scans cause
any significant damage to humans. This makes them preferable to X-ray
techniques for many diagnostic uses, including prenatal scanning.
29.1.2 Ultrasound and the Eye

A-scans are used to measure the structure and dimensions of the human
eye to ensure that the correct lens implant is used following surgery for the
removal of a cataract. The diagram below shows how an A-scan result relates
to the eye. The speed of the ultrasound waves in the eye is 1550 m s⁻¹. The
dotted lines on the upper diagram correspond to tissue boundaries responsi-
ble for the peaks in the detected signal. The gel between the transducer and
the cornea reduces the amount of the incident signal that is reflected by the
outer surface and does not enter the eye.
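[Figure: A-scan of the eye. Upper diagram: ultrasound transducer, gel, cornea, lens, and retina, with distance from the probe. Lower diagram: signal strength against time / µs (0 to 40) for the ultrasound pulse.]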
B-scans of the eye are used to diagnose problems such as a detached retina,
glaucoma, or cataracts or to monitor the shape and size of a tumor. In the
B-scan, the incident ultrasound is moved back and forth to scan slices of the
eye. These can then be combined to form an image. The eye is quite small so
ultrasound does not have to penetrate far and higher frequencies can be used.
Recently frequencies up to 50 MHz have been used to provide extremely
high-resolution images of structures at the front of the eye. At these frequen-
cies, penetration is just a few millimeters.
B-scans in pre-natal scanning have to penetrate farther into the body so these
are limited to lower frequencies. This reduces their resolution but the struc-
tures being imaged tend to be larger so this is not a major problem.
29.1.3 Doppler Ultrasound for Blood Flow Measurements

Ultrasound is directed into a blood vessel and reflects from the moving
cells. The reflected waves are Doppler shifted and the frequency difference
between the incident and reflected waves is proportional to the speed of the
flow. The diagram below illustrates how this is carried out:
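[Figure: an ultrasound transducer and coupling gel directing ultrasound at an angle into a blood vessel, with the direction of blood flow shown.]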
As the blood flow pulses the frequency difference shows a series of peaks:
[Figure: graph of frequency difference against time, showing a series of peaks.]
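The proportionality between frequency difference and flow speed can be made concrete. The text does not give the formula, so the sketch below assumes the standard result for reflection from a slowly moving target, Δf = 2fv cos θ / c, where θ is the angle between the beam and the direction of flow. The function name and the example values are hypothetical.

```python
import math

SPEED_IN_TISSUE = 1550.0   # speed of ultrasound in the body / m s^-1


def doppler_shift(frequency, flow_speed, angle_deg, speed=SPEED_IN_TISSUE):
    """Frequency difference (Hz) for reflection from moving blood cells.

    Assumes flow_speed is much smaller than the speed of ultrasound.
    """
    return 2 * frequency * flow_speed * math.cos(math.radians(angle_deg)) / speed


# Hypothetical measurement: 5 MHz beam, flow at 0.5 m/s, beam at 60 degrees
shift_hz = doppler_shift(5e6, 0.5, 60.0)   # about 1.6 kHz
```

The shift is a tiny fraction of the transmitted frequency, and it falls to zero when the beam is perpendicular to the flow, which is why the transducer is angled along the vessel.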

29.1.4 Using Ultrasound to Break Kidney Stones

Kidney stones can be painful and might get stuck in the tubes connecting
your kidney to your bladder. Small stones are usually passed out of the body
in urine but larger ones can be located using ultrasound and then broken
into small pieces by high-intensity ultrasound pulses. This technique is called
Extracorporeal Shock Wave Lithotripsy (ESWL).
29.2 X-RAYS
29.2.1 Overview of Medical X-rays
X-rays were discovered in 1895 by Wilhelm Roentgen and he won the very
first Nobel Prize for physics in 1901. X-rays are a form of high frequency,
short wavelength electromagnetic radiation emitted when electrons moving
at high speed crash into a target and stop suddenly. The radiation is ionizing
and highly penetrating and Roentgen took the first X-ray photograph, of his
wife’s hand, soon after his discovery.
X-rays are now used routinely in medicine to create images of the inside of the
human body. In conventional X-ray radiography the X-rays pass through
the body and are absorbed to differing extents by the tissues through which
they pass, creating a shadow image that can be captured on film or by arrays
of detectors. A more sophisticated technique, called Computed Tomography
(CT-scanning) involves rotating the X-ray source and detectors around the
body to create a 3D image.
[Figure: bar chart of dose / mSv (axis from 0 to 7) against procedure, with bars ranging from 0.001 mSv to 6 mSv and the average annual radiation dose marked for comparison.]

X-rays are ionizing radiation so they can damage tissues and doctors must
always balance risk against benefit when deciding whether to use them. The
risks depend on the wavelength and intensity of the X-rays, the duration of
the procedure, and the tissues being exposed. To assess the risk the X-ray dose
is compared to the annual radiation dose from natural background sources.
The accompanying chart shows typical doses from different types of X-ray procedures.
29.2.2 Generating X-Rays

To generate X-rays an electron beam is accelerated through a potential difference
of between 30 and 150 kV and directed onto a target anode made of
tungsten. As the electrons stop some of their kinetic energy is transferred to
X-ray photons. The radiation is called “bremsstrahlung,” which means “brak-
ing radiation.” The rest of the incident energy (99% or more) is transferred
to heat in the target. Removal of heat from the target is a serious problem in
X-ray tube design. Modern CT machines are rated at up to 100 kW and the
focal spot on the anode can reach temperatures in excess of 2000°C. A high
melting point target, for example, tungsten, is mounted in a material with high
thermal conductivity, for example, copper, and a coolant is pumped through
the copper. The shape of the anode is designed to produce a narrow beam of
X-rays from a fairly broad beam of electrons.
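[Figure: X-ray tube. An accelerating voltage (e.g., 50 kV) is applied between the cathode and a rotating tungsten anode (target) inside a vacuum; the electron beam strikes the anode, which has a cooling system, and X-rays emerge through a gap in the lead shielding.]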

There are two main adjustments to the tube – the electron current and the
accelerating voltage. Increasing the current increases the number of electrons
per second striking the target and this increases the number of X-ray photons
emitted per second. Increasing the voltage increases the energy of the elec-
trons and the maximum frequency of the X-ray photons. It also increases the
number of photons emitted.
The spectrum of X-rays produced is continuous, but also contains some sharp
emission lines that are characteristic of the target element. These lines are
created when electrons strike atoms in the target and eject an electron from
an inner orbit (e.g., K-shell or L-shell). The vacancy is then filled by electrons
from higher orbits cascading down and emitting photons as they do so.
[Figure: X-ray spectrum. X-ray intensity against wavelength, showing a continuous spectrum with sharp K-lines and L-lines.]
The largest energy jumps are for electrons dropping into the innermost shell,
the K shell, and these correspond to the shortest wavelength spectral lines.
The energy jumps into the K and L shells correspond to X-ray photon energies.
The short wavelength cut-off corresponds to all the energy of one incident
electron being transferred to a single X-ray photon. The rest of the continuous
spectrum corresponds to more complex interactions and multiple collisions.
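The short wavelength cut-off follows from setting the energy of one electron, eV, equal to the energy of one photon, hc/λ. The sketch below illustrates this for the range of accelerating voltages quoted earlier; the function names are hypothetical and the constants are standard rounded values not given in the text.

```python
PLANCK = 6.626e-34            # Planck constant / J s
LIGHT_SPEED = 2.998e8         # speed of light / m s^-1
ELECTRON_CHARGE = 1.602e-19   # magnitude of the electron charge / C


def cutoff_wavelength(voltage):
    """Shortest X-ray wavelength (m): eV = h c / lambda_min."""
    return PLANCK * LIGHT_SPEED / (ELECTRON_CHARGE * voltage)


def max_photon_frequency(voltage):
    """Highest X-ray frequency (Hz): eV = h f_max."""
    return ELECTRON_CHARGE * voltage / PLANCK


lambda_min_50kv = cutoff_wavelength(50e3)     # about 2.5e-11 m (25 pm)
lambda_min_150kv = cutoff_wavelength(150e3)   # about 8.3e-12 m
```

Tripling the voltage reduces the cut-off wavelength to one third, which is why a higher tube voltage gives a "harder", more penetrating beam.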
The higher frequency, shorter wavelength X-rays are the most penetrating.
These are called “hard X-rays” and are the ones needed for image formation.
Longer wavelength, “soft X-rays” are usually filtered out because they do not
contribute to the image but do increase the radiation dose. The absorption of
X-rays by tissues attenuates the beam. The amount of attenuation increases
with the density of the tissue, so bones absorb X-rays more strongly than the
surrounding soft tissues. This is what creates the contrast in an X-ray image
and X-rays are particularly good for imaging bones and bone damage. Soft
tissues do not create much contrast so often a contrast medium is injected
prior to the X-ray. CT scans are better at imaging soft tissue than standard
X-rays.
29.2.3 Attenuation of X-Rays in Matter

There are two mechanisms by which X-rays are absorbed in matter:
The photoelectric effect – where an incident X-ray transfers all of its
energy to an electron and this electron undergoes multiple collisions. The
probability of photoelectric absorption drops rapidly with increasing X-ray energy.
The photoelectric effect is the mechanism that determines contrast in an
X-ray image.
Compton scattering – the incident X-ray transfers some energy to an
electron but scatters off with lower energy (longer wavelength) and inter-
acts with other electrons. The probability of Compton scattering falls only
slowly with increasing X-ray energy. Compton scattering is the mecha-
nism that determines the noise in an X-ray image.
The intensity of an X-ray beam falls off exponentially as it passes through
matter according to the Beer–Lambert law:
$$I_{\text{out}} = I_{\text{in}}\, e^{-\mu x}$$
where $\mu$ is the linear absorption coefficient of the material (in m$^{-1}$) and $x$ is the thickness of material traversed.

This has two important consequences. A certain minimum intensity is
required to produce an image so the exposure time will depend on the size
of the patient and larger patients will need longer exposures and will receive
a larger dose. Also, since the intensity falls exponentially through the patient,
tissues near the top surface, where the X-rays enter the body, receive a larger
dose than tissues near the X-ray detector. Most of the dose is absorbed close
to the skin at the top surface of the patient, and this is where the risk of
tissue damage is greatest.
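The exponential fall in intensity can be explored numerically. The following sketch assumes a single uniform material; the value of the absorption coefficient (20 m$^{-1}$) is an illustrative assumption rather than a figure from the text, and the function names are hypothetical.

```python
import math


def transmitted_fraction(mu, thickness):
    """Fraction I_out / I_in left after `thickness` metres of a material
    with linear absorption coefficient `mu` (in m^-1)."""
    return math.exp(-mu * thickness)


def half_value_thickness(mu):
    """Thickness (m) that halves the beam intensity: ln(2) / mu."""
    return math.log(2) / mu


MU_TISSUE = 20.0  # m^-1, assumed illustrative value for soft tissue

if __name__ == "__main__":
    for x in (0.10, 0.20, 0.30):  # patient thickness in metres
        fraction = transmitted_fraction(MU_TISSUE, x)
        print(f"{x:.2f} m: fraction of beam emerging = {fraction:.4f}")
    print(f"half-value thickness = {half_value_thickness(MU_TISSUE) * 100:.1f} cm")
```

Doubling the thickness squares the transmitted fraction, which is why a larger patient needs a disproportionately longer exposure to deliver the same intensity to the detector.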
Different materials have different absorption coefficients and these differences
are responsible for creating contrast in X-ray images. The differences become
larger at lower X-ray energies so these are preferable for high-contrast imaging.
The drawback is that the low-energy X-rays are absorbed more strongly than
high-energy X-rays so the dose increases.
Absorption coefficients for soft tissues do not vary much so contrast media
are often used to create X-ray images of blood vessels, the fallopian tube, the
urinary tract, the digestive system, etc. These are based on iodine or barium
compounds that absorb X-rays strongly.
29.2.4 Creating X-Ray Images

A simple medical X-ray imaging machine uses a film placed underneath the
patient to detect the X-rays.
[Figure: simple X-ray imaging arrangement. X-rays pass through the patient and then a lead grid before reaching the film.]
Some of the X-rays passing through the patient are scattered off axis so a lead
grid is slowly moved between the patient and the film during the exposure.
This eliminates off-axis X-rays and increases the signal-to-noise ratio for the
image.
An intensifying screen can be used to increase the number of light photons
created from each X-ray photon. This consists of thin layers of fluorescent
material placed in front of the film. The fluorescent layers absorb the X-rays
and emit visible photons. The arrangement is housed in a cassette that is
placed under the patient for exposure.
An X-ray filter (usually a metal plate) is placed between the X-ray source and
the patient to filter out the low energy (long wavelength) X-ray photons. This
reduces the total dose given to the patient but does not affect the intensity of
the image because low energy photons would have been absorbed in the body.
An X-ray image intensifier can also be used to increase the brightness of the
image. Incident X-rays strike a phosphor screen that converts the X-rays to
photons of visible light. These strike a photocathode that emits electrons.
The electrons are accelerated and focused in a “photomultiplier” tube and
converted back to photons when they strike another phosphor screen.
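[Figure: X-ray image intensifier. X-rays strike the input phosphor screen and photocathode; electrodes accelerate and focus the electrons inside the evacuated photomultiplier tube onto the output phosphor screen, which is linked by a fibre optic cable to the camera/viewer.]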
CT scans rotate an X-ray source and collimator around the patient to create
detailed images of slices of the body. An array of fixed detectors surrounds
the patient.
[Figure: CT scanner. The X-ray source and collimator are rotated around the patient, who is surrounded by a ring of fixed detectors.]
Whereas two-dimensional images are composed of two-dimensional picture
elements or “pixels,” three-dimensional images are composed of volume
elements or “voxels.” Each voxel is assigned a value that represents how
strongly it attenuates the X-rays that pass through it. The voxel values are
calculated by a computer algorithm based on the attenuation of X-rays passing
through from different directions (hence the need to rotate the X-ray source
around the patient).
[Figure: X-rays passing through an 8-voxel cube to the detectors. The X-ray attenuation across different directions is used to determine the value for each individual voxel.]
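The idea of recovering individual voxel values from attenuation measured along different directions can be illustrated with a toy model. The sketch below is not the algorithm used in a real scanner (clinical systems handle very large numbers of voxels with methods such as filtered back-projection); it assumes a hypothetical 2 × 2 grid of attenuation values and ideal, noise-free ray sums along rows, columns, and diagonals. The function names `project` and `reconstruct` are invented for the example.

```python
def project(grid):
    """Return the six ray sums through a 2 x 2 grid [[a, b], [c, d]]:
    two rows, two columns and two diagonals."""
    (a, b), (c, d) = grid
    return {
        "rows": (a + b, c + d),
        "cols": (a + c, b + d),
        "diags": (a + d, b + c),
    }


def reconstruct(sums):
    """Recover the four cell values from the ray sums.

    The row, column and diagonal through cell a add up to
    3a + b + c + d = 2a + total, which gives a directly.
    """
    row1, row2 = sums["rows"]
    col1, _ = sums["cols"]
    diag1, _ = sums["diags"]
    total = row1 + row2
    a = (row1 + col1 + diag1 - total) / 2
    b = row1 - a
    c = col1 - a
    d = diag1 - a
    return [[a, b], [c, d]]


if __name__ == "__main__":
    true_grid = [[1.0, 4.0], [2.0, 3.0]]  # made-up attenuation values
    print(reconstruct(project(true_grid)))
```

Rows and columns alone do not fix the four values uniquely in this model; the diagonal view is needed as well, which mirrors the need to rotate the source around the patient.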
29.3 MAGNETIC RESONANCE IMAGING (MRI)

29.3.1 Overview of MRI
Magnetic resonance imaging is particularly useful for imaging soft tissues
that have low contrast with X-ray techniques, for example, brain imaging. It
involves placing the patient in a strong magnetic field but does not involve the
use of ionizing radiation, so the risks to the patient are very low, and long
exposures are possible. However, the strong magnetic field must be created using
superconducting electromagnets, which makes MRI an expensive procedure.
29.3.2 The Physics of MRI

Nuclei are positively charged and when the nucleus of an atom spins it creates
a magnetic field that is like that of a bar magnet. Magnetic resonance tech-
niques manipulate the nuclear spin of hydrogen atoms to generate a signal
that can be used to form images. Since hydrogen atoms are spread throughout
the human body, mainly in water molecules, they are ideal for imaging all
parts of the body.
[Figure: a proton (hydrogen nucleus) with a magnetic dipole. The axis of the dipole precesses around the external magnetic field direction.]
When an external magnetic field is applied the spinning nuclei precess around
it at a fixed frequency called the Larmor frequency:
$$f = \frac{\gamma B_0}{2\pi}$$
where $\gamma$ is the gyromagnetic ratio and $B_0$ is the strength of the external magnetic field.

[Figure: MRI scanner, showing the superconducting coils in a liquid helium cryostat, layers of thermal insulation, gradient coils, radio frequency transmit coils, detection coils, the magnetic field direction, and the moveable scanner table.]

For a hydrogen nucleus (a proton), $\gamma/2\pi$ is 42.6 MHz T$^{-1}$.
Clinical MRI scanners use magnetic field strengths in the range of 0.2–3.0
T and research scanners use up to 11 T. These give Larmor frequencies that
correspond to radio waves. An additional weaker variable magnetic field is
applied along the axis of the patient so that the Larmor frequency varies with
position. Pulses of radio waves corresponding to the Larmor frequency across
a particular slice of the body are then transmitted through the body. This dis-
turbs the precessing nuclei in that slice so that they create a rotating magnetic
field at the same frequency. It is this field that is detected using coils placed
outside the body. The rate at which this field decays depends on the type of
tissue surrounding the nuclei, so it can be used to distinguish different types of soft
tissue and to achieve much higher contrast than X-ray techniques. By varying
the frequency of the radio waves, nuclei in different locations can be made to
resonate and an image can be built up.
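The Larmor frequencies for the field strengths quoted above, and the way a gradient field makes the frequency depend on position, can be checked with a short calculation. The sketch takes the quoted 42.6 MHz T$^{-1}$ as the proton value of $\gamma/2\pi$; the gradient strength of 0.01 T m$^{-1}$ and the function names are assumptions made for illustration only.

```python
GAMMA_BAR = 42.6e6  # Hz per tesla: gamma / (2 pi) for a proton, as quoted in the text


def larmor_frequency(b_field):
    """Larmor frequency (Hz) of a proton in a magnetic field of b_field tesla."""
    return GAMMA_BAR * b_field


def slice_frequency(b0, gradient, position):
    """Larmor frequency (Hz) at `position` metres along the patient axis when a
    gradient field (tesla per metre) is added to the main field b0 (tesla)."""
    return larmor_frequency(b0 + gradient * position)


if __name__ == "__main__":
    for b in (0.2, 3.0, 11.0):  # field strengths mentioned in the text
        print(f"B0 = {b} T -> f = {larmor_frequency(b) / 1e6:.1f} MHz")
    # Assumed gradient of 0.01 T/m on a 1.5 T main field
    for z in (-0.1, 0.0, 0.1):
        print(f"z = {z:+.1f} m -> f = {slice_frequency(1.5, 0.01, z) / 1e6:.4f} MHz")
```

All of these frequencies lie in the radio band, and a modest gradient shifts the resonance by tens of kilohertz between slices, which is what allows one slice to be selected by the frequency of the radio pulse.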
29.4 RADIOACTIVE TRACERS

29.4.1 Overview of the Use of Radioactive Tracers
A radioactive tracer is an element or compound containing a radioactive atom
that is introduced into the body so that the emitted radiation can be detected
outside the body. If the tracer is linked to an element or compound used in
a particular metabolic process, doctors can monitor the build-up and
discharge of this compound in particular organs. The thyroid gland in a human
uses iodine to produce hormones, so introducing some radioactive iodine as
a tracer can help a doctor to diagnose a faulty thyroid gland. The metastable
gamma-emitter technetium-99m can be added to a number of important
biological compounds and is used to diagnose a wide range of illnesses, including
kidney problems.
Radioactive tracers are usually gamma-emitters because gamma-rays are the
most penetrating and least ionizing form of radiation and can be detected,
using a gamma-camera, outside the body. Sources with a half-life of a few
hours are ideal because they remain in the body long enough for the proce-
dure to be carried out but fall to safe levels relatively soon afterward.
X-rays are used to image static structures inside the body but radioactive trac-
ers are used mainly to monitor functions.
29.4.2 The Gamma-Camera

Gamma-rays are detected when they fall onto a scintillator containing sodium
iodide crystals that emit photons of visible light when they absorb gamma-ray
photons. The scintillator is mounted in front of an array of photomultipliers
that emit electrons and amplify the signal. A computer processes their electrical
output to produce the final image. A lead grid is placed in front of the
scintillator to reject off-axis gamma-rays and create a sharper image.
[Figure: gamma-camera. Gamma-rays pass through a collimator (lead grid) to the scintillator (sodium iodide crystal), which is backed by photomultipliers connected to a computer and display.]
29.5 POSITRON EMISSION TOMOGRAPHY (PET SCANS)

Positron emission tomography detects gamma-rays emitted when a positron
annihilates with an electron inside the body. The positrons result from the
beta-plus decay of a radioactive tracer injected into the body prior to the scan.
This technique is particularly useful for investigating how different organs are
functioning or to monitor the extent, development, and response to the treat-
ment of cancers. It can be combined with CT scans or MRI scans to produce
detailed three-dimensional images of the body.
29.5.1 The Physics of PET Scans

The radioactive tracers used for PET scans are short half-life beta-plus
emitters such as fluorine-18 (half-life of about 110 minutes), carbon-11
(half-life of about 20 minutes), and oxygen-15 (half-life of about 2 minutes).
These are attached to compounds that the body uses for particular
biological pathways, for example, glucose. The uptake of the compound can
then be monitored from outside the body by detecting the pairs of 511 keV
gamma-rays emitted as the beta-plus particles, positrons, annihilate with
electrons in the surrounding tissue.
The equations below summarize the process:
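$${}^{18}_{9}\mathrm{F} \rightarrow {}^{18}_{8}\mathrm{O} + {}^{0}_{1}e + {}^{0}_{0}\nu$$
$${}^{0}_{1}e + {}^{0}_{-1}e \rightarrow 2\,{}^{0}_{0}\gamma$$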
The positron emitted in this decay only travels a short distance (less than
1 mm) through the tissue before meeting an electron and annihilating. The
annihilation of the electron-positron pair results in the emission of a pair of
gamma rays that travel in opposite directions (a pair must be emitted in order
to conserve linear momentum). The position of the annihilation along the
line defined by the two gamma rays is found from the difference in their
arrival times at detectors on either side of the patient.
The detectors only respond to near-simultaneous pairs of photon arrivals
(within about 10 ns of each other) and then measure the small additional time
delays for each pair.
[Figure: PET scanner ring of scintillators and photomultipliers. An annihilation occurs on the chord AB; the gamma-ray photon arrives at A a time δt before the corresponding gamma-ray photon arrives at B.]
For a time delay $\delta t$ the annihilation event must have been at a position
that is $c\,\delta t/2$ closer to A than the center of the chord AB, that is, a distance
$AB/2 - c\,\delta t/2$ from A and $AB/2 + c\,\delta t/2$ from B.
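The relationship between the time delay and the position of the annihilation can be turned into a short calculation. This is a sketch under simple assumptions: the photons travel at $c$ along a straight chord, the chord length of 0.80 m is an invented example value, and the function names are hypothetical.

```python
C = 3.00e8  # speed of light, m/s


def annihilation_position(chord_length, delay):
    """Distances (m) of the annihilation from detectors A and B.

    `delay` is the time (s) by which the photon reaches A before B,
    so the event lies c * delay / 2 closer to A than the chord's midpoint.
    """
    shift = C * delay / 2
    return chord_length / 2 - shift, chord_length / 2 + shift


def position_uncertainty(timing_resolution):
    """Uncertainty (m) in position along the chord for a timing resolution (s)."""
    return C * timing_resolution / 2


if __name__ == "__main__":
    from_a, from_b = annihilation_position(0.80, 1.0e-9)  # assumed chord and delay
    print(f"{from_a:.2f} m from A, {from_b:.2f} m from B")
    print(f"0.50 ns resolution -> {position_uncertainty(0.50e-9) * 100:.1f} cm")
```

Because light travels about 30 cm in a nanosecond, a timing resolution of 0.50 ns fixes a single event to within only a few centimeters along its chord.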
In practice, the two gamma rays emitted from an annihilation event are not
emitted at exactly 180° so this introduces an uncertainty into the position
of the chord. In addition to this, the detector can only resolve events that
are more than about 0.50 ns apart so this introduces an uncertainty into the
position along the chord. The image quality and resolution improve as more
events are detected (the signal-to-noise ratio rises) and resolutions of about 1–2
mm are possible with clinical scanners. This is not as good as a CT image but
PET scans can be used to investigate a very wide range of metabolic pathways
and when used alongside CT or MRI scans information about both structure
and function can be combined.
The injection of a radioactive tracer means that the patient remains
radioactive for a short time after the procedure. For a typical PET scan
involving fluorine-18 the total activity injected is about 370 MBq. The patient
will absorb a radiation dose equivalent to that of a full body CT scan (about
7 mSv). If the PET scan is combined with a CT scan the total dose will be the
sum of doses from the two procedures.
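How quickly the injected activity falls can be estimated from the half-life. The sketch below uses the quoted values for fluorine-18 (370 MBq injected, half-life of about 110 minutes) and considers radioactive decay only; it ignores any tracer removed from the body by excretion, so it is an upper estimate. The function name is an assumption made for the example.

```python
def remaining_activity(initial_activity, half_life, elapsed):
    """Activity left after `elapsed` time, in the same units as `initial_activity`.

    `half_life` and `elapsed` must share the same time unit.
    """
    return initial_activity * 0.5 ** (elapsed / half_life)


if __name__ == "__main__":
    for hours in (2, 6, 12, 24):
        activity = remaining_activity(370.0, 110.0, hours * 60.0)  # MBq, minutes
        print(f"after {hours:2d} h: {activity:.3g} MBq")
```

After one half-life the activity is 185 MBq and after two it is 92.5 MBq, so within a day the activity has fallen by a factor of several thousand.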
29.6 EXERCISES
1. (a) State and explain the risks associated with X-ray imaging of the
human body.
When an X-ray tube is used to produce a photographic image of a
patient an aluminum plate is placed between the patient and the
X-ray tube.
(b) Explain how this can reduce the radiation dose without affecting the
intensity of the X-ray image that is formed.
2. Some tissue injuries can be imaged equally effectively using ultrasound
or an MRI scan.
(a) Compare the physical principles used in the two processes.

(b) If you were a doctor, which would you recommend and why?
3. The diagram below shows a typical spectrum from a medical X-ray
machine.
[Figure: X-ray intensity plotted against wavelength, showing the K-lines and L-lines.]
(a) Explain how the continuous spectrum arises and why it has a short
wavelength cut-off.
(b) Calculate the cut-off wavelength for a 40 kV X-ray machine.
(c) Explain how the line spectra come about and account for the
difference between the K and L lines.
4. State and explain four factors that must be considered when selecting a
suitable radioisotope to use as a tracer in the human body.
5. (a) Explain why the annihilation of an electron and a positron during
a PET scan is likely to result in two gamma-rays of the same
wavelength emitted in opposite directions.
(b) Calculate the wavelength of these gamma rays.
APPENDIX
A
Estimations and Fermi Questions
A.0 FERMI AND THE TRINITY TEST

The first atomic bomb was tested at the Trinity site in New Mexico in 1945.
The physicists and engineers involved in its design and construction placed
bets on the energy that would be released by the explosion. Enrico Fermi,
the Italian physicist who had designed and tested the first controlled nuclear
chain reaction, observed the test from a safe distance of about 12 km. The
explosion created a shock wave in the atmosphere that spread out in all direc-
tions. Fermi dropped several pieces of paper as the shock wave reached him
and observed that they were displaced horizontally about 2.5 m by the distur-
bance. From this information (and his knowledge of physics), he estimated
the yield of the bomb to be about 10 kilotonnes (i.e., equivalent to the explo-
sion of 10 kT of TNT, a conventional chemical high explosive). Later, more
detailed calculations showed that the actual yield was closer to 20 kT.
It seems remarkable that Fermi was able to do this, based on such minimal
information and a very simple experiment, but a good understanding of the
underlying physics provides the tools to make useful estimates of a very wide
range of physical quantities and the skills used in making these estimations
can also be applied to non-physical problems. Such methods are invaluable
because real-world problems are rarely fully defined and it is important to
have a ballpark figure in mind before committing to a project or experiment.
It is also useful to estimate expected results so that you can tell whether the
actual results from an experiment or calculation are reasonable.
Fermi built up quite a reputation for being able to estimate values based on
minimal information and these types of problems are often called “Fermi
problems” for that reason.
It’s not clear what method Fermi used to make his estimation but there
are usually many ways to solve a Fermi problem and here is a (very simple)
approach to this one.
Let’s assume that the energy released by the explosion pushes the atmos-
phere back so that a hemisphere of air with a radius equal to Fermi’s distance
from ground zero (the position of the explosion) is displaced outwards by 2.5
m. Work must be done against the atmospheric pressure further out, so the
energy transfer can be calculated using force times distance:
Surface area of hemisphere $= 2\pi r^2$
Outward force $F = 2\pi r^2 \times p$, where $p$ is the atmospheric pressure (about $10^5$ Pa).
Work done $= F\,\delta x$, where $\delta x$ is the outward displacement of the air.
Work done $= 2\pi \times (12\,000)^2 \times 10^5 \times 2.5 \approx 2 \times 10^{14}$ J
1 kT $= 4.2 \times 10^{12}$ J
[Figure (not to scale): a hemisphere of air of radius 12 000 m centered on ground zero, with Fermi at its edge, displaced outward by 2.5 m.]
This gives a result of about 50 kT, much greater than Fermi’s estimate and
about 2.5 times the actual yield of the Trinity test. However, it IS the correct
order of magnitude, which is pleasing given the incredibly simple model used
to make the estimate. Fermi would have used a more sophisticated model,
taking into account the actual pressure differences in the shock wave and
the proportion of the input energy that went into it. His estimate of 10 kT
was impressive, but did not win the bet. Isidor Rabi was the winner with an
estimated yield of 18 kT. We do not know how he did this (or maybe he just
got lucky).
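The simple hemisphere estimate can be reproduced in a few lines, which also makes it easy to see how sensitive the answer is to the assumed inputs. The function name `blast_energy` is an assumption for this sketch; the numbers are those used in the estimate above.

```python
import math

KILOTONNE = 4.2e12  # joules per kilotonne of TNT


def blast_energy(radius, displacement, pressure=1e5):
    """Work done (J) pushing a hemisphere of air of the given radius (m)
    outward by `displacement` (m) against a constant pressure (Pa)."""
    area = 2 * math.pi * radius ** 2
    return pressure * area * displacement


energy = blast_energy(12_000, 2.5)
print(f"{energy:.1e} J, or about {energy / KILOTONNE:.0f} kT")
```

The unrounded result is a little over 50 kT. Because the energy scales with the square of the distance, an error of 20% in Fermi's distance from ground zero would change the estimate by about 40%.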

A.1 MAKING ESTIMATIONS

A good estimation is usually based on a simplification of the relevant physics
and reasonable estimates of the size of relevant parameters. In addition to this,
we usually need to make some assumptions to simplify the actual calculations.
In the example above, we reduced a complex process of energy transfer to
the expansion of a gas at constant pressure so that we could use the equation
$W = p\,\delta V = F\,\delta x$. We estimated the atmospheric pressure and we assumed
that the pressure and displacement were constant over the hemisphere. We
also assumed that all the energy released by the explosion was used to move
this layer of air back. You will also notice that we tended to round values off
quite severely, often to 1 significant figure (e.g., distance to ground zero and
atmospheric pressure). This makes sense because we are not dealing with
definite known values and we cannot justify great precision. Many estimates
will produce a value that can only be quoted to 1 significant figure and that
has an uncertainty of over 100%! In this section, we will illustrate how to make
estimates by using a few examples.
A.1.1 How Many Air Molecules in the Earth’s Atmosphere?

Method 1
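Atmospheric pressure is due to the weight of air in the atmosphere, so we can
use the equation for fluid pressure at depth, $p = \rho g h$, to work out the effective
depth of the atmosphere if it were of uniform density (taking $\rho \approx 1$ kg m$^{-3}$ and $g \approx 10$ N kg$^{-1}$):
$$\rho g h = 10^{5}\ \text{Pa}$$
$$h = \frac{10^{5}}{1 \times 10} = 10^{4}\ \text{m}$$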
This is much less than the radius of the Earth so the atmosphere can be
treated as a thin layer (like a carpet with an area equal to the surface area of
the Earth). The volume of this layer can then be used to find how many moles
of gas are present and this can be used to find the number of molecules. The
atmosphere is equivalent to a 10 km deep uniform layer. The fact that the
density of the atmosphere varies with height does not affect this estimate – it
would have the same weight and exert the same pressure at the surface for
any pattern of density variation because it is only the total weight of all the
molecules that is responsible for the surface pressure.

The surface area of the Earth is: $A = 4\pi \times (6.4 \times 10^{6})^{2} = 5.15 \times 10^{14}$ m$^2$
The volume of air at atmospheric temperature and pressure that would exert
the same pressure at sea level as the Earth’s atmosphere is:
$$V = 5.15 \times 10^{14} \times 10^{4}\ \text{m}^3$$
1 mole of an ideal gas at atmospheric temperature and pressure occupies a
volume of 24 liters or 0.024 m$^3$, so the number of moles of air molecules in
the atmosphere is:
$$n = \frac{5.15 \times 10^{14} \times 10^{4}}{0.024} = 2.1 \times 10^{20}\ \text{moles}$$
1 mole contains $6.02 \times 10^{23}$ molecules so the total number of air molecules in
the atmosphere is:
$$N = 2.1 \times 10^{20} \times 6.02 \times 10^{23} = 1.3 \times 10^{44}\ \text{molecules}$$
There are about $10^{44}$ air molecules in the Earth’s atmosphere.
Method 2
This method has some similarities to the first but uses the total mass of the
atmosphere and the mass of an “air molecule.” The total weight of the
atmosphere is equal to the atmospheric force exerted on the entire surface of the
Earth:
$$F = pA = 10^{5} \times 4\pi \times (6.4 \times 10^{6})^{2} = 5.15 \times 10^{19}\ \text{N}$$
The mass of the atmosphere is:
$$m = \frac{F}{g} = 5.25 \times 10^{18}\ \text{kg}$$
Air consists mainly of oxygen (molar mass = 32 g) and nitrogen
(molar mass = 28 g) so an “air molecule” is taken to have a molar mass of 30 g.
Number of moles of air molecules in the atmosphere:
$$n = \frac{5.25 \times 10^{18}}{0.030} = 1.75 \times 10^{20}\ \text{moles}$$
The number of air molecules in the atmosphere:
$$N = 1.75 \times 10^{20} \times 6.02 \times 10^{23} = 1.05 \times 10^{44}\ \text{molecules}$$
There are about $10^{44}$ air molecules in the Earth’s atmosphere.

This is (not surprisingly) consistent with our first method.
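Both methods are easy to compare side by side in code. The sketch below uses the same rounded inputs as the two calculations above; the function names are assumptions made for the example, and the default arguments can be changed to see how much each rounding matters.

```python
import math

P_ATM = 1e5          # atmospheric pressure, Pa
R_EARTH = 6.4e6      # radius of the Earth, m
AVOGADRO = 6.02e23   # molecules per mole
AREA = 4 * math.pi * R_EARTH ** 2  # surface area of the Earth, m^2


def molecules_by_volume(density=1.0, g=10.0, molar_volume=0.024):
    """Method 1: uniform layer of depth h = p / (rho g), then moles from the molar volume."""
    depth = P_ATM / (density * g)
    moles = AREA * depth / molar_volume
    return moles * AVOGADRO


def molecules_by_mass(g=9.8, molar_mass=0.030):
    """Method 2: mass of the atmosphere from its weight p * A, then moles from the molar mass."""
    mass = P_ATM * AREA / g
    return mass / molar_mass * AVOGADRO


print(f"Method 1: {molecules_by_volume():.2e} molecules")
print(f"Method 2: {molecules_by_mass():.2e} molecules")
```

The two results differ by about 20%, which is well within the uncertainty expected from inputs rounded to one significant figure.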
A.1.2 What Is the Minimum Area for a Parachute?

First, we must interpret the question. Why is there a minimum area? If the
parachute is too small the terminal velocity of the parachutist will be too high
and they will be injured or killed when they hit the ground. The first thing to
do therefore is to estimate a maximum safe landing speed. It might be
tempting to guess what this is but it is better to make a reasoned estimate based on
sensible assumptions. One way to do this would be to think about the highest
object you could jump from and still expect to land safely – perhaps this is
about 2 m high. This can then be used to calculate a maximum safe landing
speed:
$$v^2 = u^2 + 2gh = 0 + 2 \times 9.8 \times 2 = 39.2$$
$$v = 6.3\ \text{m s}^{-1}$$
So, a maximum safe landing speed is about 6 m s$^{-1}$.
Now we need a physical model of the falling parachutist. At terminal velocity,
the weight of the parachutist and parachute is equal and opposite to the total
drag force acting upwards on the parachute. Where does this drag force come from?
It comes from the collision of the parachute canopy with air molecules as it
moves downwards – the canopy exerts a downward force on the molecules
and the molecules, by Newton’s third law, exert an equal upward force on the
canopy. Using Newton’s second law, the magnitude of these forces must
equal the rate of change of momentum of the air molecules. This is what we
will estimate next.
[Figure: parachute canopy of radius r moving down through the air by a distance δs = vδt.]

During a time $\delta t$ the canopy moves down through a distance $v\,\delta t$ and sweeps
through a volume $\pi r^2 v\,\delta t$ of air. The mass of air in this volume is equal to
$\rho \pi r^2 v\,\delta t$ where $\rho$ is the density of the air. If we assume that all of this air must
be accelerated up to speed $v$ as the canopy passes we can work out the rate of
change of momentum:
$$\text{rate of change of momentum} = \frac{\text{mass of air swept out in } \delta t \times \text{speed}}{\delta t}$$
$$\text{rate of change of momentum} = \frac{\rho \pi r^2 v^2\,\delta t}{\delta t} = \rho \pi r^2 v^2$$
This is equal to the drag force on the parachute and, at terminal velocity, the
weight of the parachute and parachutist:
$$\rho \pi r^2 v^2 = mg$$
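This equation can be rearranged to give an expression for $r$, the radius of the
parachute:
$$r = \sqrt{\frac{mg}{\rho \pi v^2}}$$
Now we need to input some reasonable values:
mass of parachutist and equipment $m = 100$ kg
density of air $\rho = 1.2$ kg m$^{-3}$
maximum speed $v = 6$ m s$^{-1}$
This gives a minimum safe radius of:
$$r = \sqrt{\frac{100 \times 9.8}{1.2 \times \pi \times 6^2}} = \sqrt{7.2\ \text{m}^2} \approx 2.7\ \text{m}$$
This corresponds to a minimum area of $\pi r^2 \approx 23$ m$^2$. It is probably an
underestimate, because the model assumes that all of the air in the swept volume
is brought up to speed $v$, but it is certainly of the correct order of magnitude.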
The approach above was based on momentum but different estimates result
if we consider kinetic energy instead. To make the estimate more realistic
we should consider the drag coefficient for a parachute of a particular shape.
Nonetheless, our simple method has produced a useful equation to begin an
investigation of how parachute size and terminal velocity are related.
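The momentum model can be evaluated directly from the balance $\rho \pi r^2 v^2 = mg$. The sketch below uses the values suggested above (a 2 m jump, 100 kg, air density 1.2 kg m$^{-3}$); the function names are assumptions, and the model itself is the simple one described in the text, with no drag coefficient.

```python
import math


def safe_landing_speed(height, g=9.8):
    """Speed (m/s) reached after falling from rest through `height` metres."""
    return math.sqrt(2 * g * height)


def minimum_radius(mass, speed, air_density=1.2, g=9.8):
    """Canopy radius (m) obtained from rho * pi * r^2 * v^2 = m * g."""
    return math.sqrt(mass * g / (air_density * math.pi * speed ** 2))


if __name__ == "__main__":
    v = safe_landing_speed(2.0)
    r = minimum_radius(100.0, 6.0)
    print(f"landing speed from a 2 m jump: {v:.1f} m/s")
    print(f"minimum radius at 6 m/s: {r:.1f} m, area {math.pi * r ** 2:.0f} m^2")
```

Since $r$ is inversely proportional to $v$, halving the acceptable landing speed doubles the required radius and quadruples the canopy area.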

A.2 USEFUL VALUES

You will have noticed that values such as atmospheric pressure and the den-
sity of air at a.t.p. are featured in the estimations above. As a physicist, it is
useful to have some values at your fingertips because they are important in so
many different situations. Here are some numbers that are worth remember-
ing (they are not precise – these are for use in estimates):
Speed of light: $3.00 \times 10^{8}$ m s$^{-1}$
Speed of sound (in air at sea level): 330 m s$^{-1}$
Density of air at a.t.p.: 1.2 kg m$^{-3}$
Density of water: 1000 kg m$^{-3}$
Density of aluminum: 2700 kg m$^{-3}$
Density of steel: 8000 kg m$^{-3}$
Density of mercury: 13 600 kg m$^{-3}$
Specific heat capacity of water: 4200 J kg$^{-1}$ °C$^{-1}$
Atmospheric pressure: $10^{5}$ Pa
Molar volume of an ideal gas at a.t.p.: 0.024 m$^{3}$ (24 litres)
Wavelength of visible light: ~ 500 nm
Wavelength range of visible light: 400 nm – 700 nm
Audible frequency range for humans: 20 Hz – 20 kHz
Speed of ultrasound in water or tissue: ~ 1500 m s$^{-1}$
Size of an atom: $10^{-10}$ m
Size of a nucleus: $10^{-15}$ m
Avogadro number: $6 \times 10^{23}$ mol$^{-1}$
Mass of a proton: $1.7 \times 10^{-27}$ kg
Mass of an electron: $9.1 \times 10^{-31}$ kg
Charge on an electron: $-1.6 \times 10^{-19}$ C
Radius of Earth: 6400 km
Mass of Earth: $6 \times 10^{24}$ kg
Mass of Sun: $2 \times 10^{30}$ kg
Distance to Moon: 400 000 km (~1.3 light seconds)
Distance to Sun: $1.5 \times 10^{11}$ m (8 light minutes)
Age of the universe: 13.7 billion years
Number of stars in the Milky Way: 400 billion ($4 \times 10^{11}$)
You can add to this list. It is also helpful to know the values of several physical
constants and be able to recall the important equations.
A.3 FERMI QUESTIONS

While there is no clear distinction between a simple estimation and a Fermi
question, the latter tends, on first reading at least, to seem rather abstract.
Perhaps the most famous of these is:
“How many piano tuners in Chicago?”
This seems to have nothing to do with physics and yet questions like this
have been used in selection interviews for places to read the subject at top
Universities. Why? The reason is simple. A good physicist has to be able to
use her knowledge and skills to solve problems in unfamiliar contexts. She
must deconstruct the question and then tackle it logically, making reasonable
estimates of quantities she will need to feed into a calculation. Look at the
question above and ask yourself what information you might need to solve it.
Here are a few suggestions:
the population of Chicago (a)
the fraction of the population that has a piano (b)
the number of times per year on average each piano is tuned (c)
the time taken to tune a piano (d)
the average number of hours worked per day by a piano tuner (e)
the number of days per year that a piano tuner works (f)
There may well also be other things to consider or estimate, but nothing in this
list should cause too much of a problem or give too wild a result. Once these
values have been estimated they can be fed into a calculation like the one below:
Number of pianos to be tuned in one year: abc
Number of pianos tuned by one piano tuner per year: fe/d
Number of piano tuners in Chicago: abcd/fe

Try it. Which estimates are you most confident about and which have larg-
est uncertainty? How would varying these parameters affect the final esti-
mate? Can you put an upper and lower limit on the number of piano tuners in
Chicago? Is there any way to test your estimate?
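One way to explore these questions is to put the formula $abcd/fe$ into a short program and vary the inputs. Every number in the sketch below is a placeholder guess chosen only to show the mechanics, not data about Chicago, and the function name is hypothetical.

```python
def piano_tuners(population, piano_fraction, tunings_per_year,
                 hours_per_tuning, hours_per_day, days_per_year):
    """Estimate the number of piano tuners as abcd / (fe)."""
    pianos_tuned_per_year = population * piano_fraction * tunings_per_year  # abc
    tunings_per_tuner = days_per_year * hours_per_day / hours_per_tuning    # fe / d
    return pianos_tuned_per_year / tunings_per_tuner


# Placeholder guesses, not data: replace them with your own estimates.
guess = dict(population=3_000_000, piano_fraction=0.02, tunings_per_year=1,
             hours_per_tuning=2, hours_per_day=8, days_per_year=250)

print("central estimate:", round(piano_tuners(**guess)))

# Sensitivity: halve and double one uncertain input to bracket the answer.
for factor in (0.5, 2):
    varied = dict(guess, piano_fraction=guess["piano_fraction"] * factor)
    print(f"piano fraction x {factor}:", round(piano_tuners(**varied)))
```

Because the estimate is a simple product and quotient, each input changes the answer in direct (or inverse) proportion, so the least certain input dominates the overall uncertainty.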
A.4 THE DRAKE EQUATION

In 1961, Dr. Frank Drake posed a Fermi question:
“How many advanced civilizations exist in our galaxy?”
Drake approached this problem in the same way that we approached the prob-
lem of piano tuners in Chicago. He considered different factors that might
affect the number of advanced civilizations and then constructed a formula
to determine that number. This formula is known as the “Drake equation.”
However, the factors he identified are much harder to estimate than those in
our problem so the Drake equation is used to discuss possibilities rather than
to provide a ballpark answer. Adjusting the factors can lead to a prediction
that the galaxy is teeming with intelligent life or that we are alone in it.
As we discover more about the Universe we are gradually able to fine-tune the
values that we input into the equation. For example, one of the factors is the
number of stars in our galaxy that have planets orbiting them. In Drake’s time,
the only star known to have a planetary system was our Sun. Now we know
that a very large number of extrasolar planets exist and it might be the case
that most stars have planetary systems. We have also discovered some “Earth-
like” planets orbiting other stars. This increases another factor in the equation
making the likely number higher. However, the uncertainties in other factors,
especially in those linked to the emergence and evolution of life, are huge.
$$N = R\, f_p\, n_e\, f_l\, f_i\, f_c\, L$$
where
$N$ = the number of advanced civilizations whose EM communications are detectable
$R$ = the rate of formation of suitable stars (stars such as our Sun)
$f_p$ = the fraction of those stars with a planetary system
$n_e$ = the number of Earth-like worlds per planetary system
$f_l$ = the fraction of those Earth-like planets where life actually develops
$f_i$ = the fraction of planets where life develops that evolves intelligent life
$f_c$ = the fraction of planets where intelligent life develops that develops detectable communications
$L$ = the “lifetime” of communicating civilizations
Some of these factors are reasonably well known – for example, R is thought
to be about 10 new stars per year in the Milky Way and fp is probably between
0.5 and 0.8. L is usually assumed to be of the order of human written his-
tory, around 10 000 years, but that’s a rather arbitrary choice. The others are
harder to pin down. Make estimates and see what these imply for N. This
should give you a feel for how this equation can promote arguments between
astrobiologists.
A.5 TRY THESE: ESTIMATES AND FERMI QUESTIONS

Here are some problems to try. There are no definite correct answers but they
are not guesses either. Start from what you know or can reasonably estimate
and then use physical and logical principles to work your way to an answer.
Once you have done this ask yourself whether it seems reasonable.
1. How many atoms are in a golf ball?

2. How far can a car go before it rubs off a layer of rubber one molecule
thick?
3. How many molecules from Isaac Newton’s last breath do we inhale when
we breathe in?
4. What area of solar panels is needed to supply all of Britain’s energy
requirements?
5. How many trucks would be needed to cart away all the rocks in Mount
Everest?
6. How long would it take to fill an Olympic swimming pool from a kitchen
tap?
7. How many photons enter your eye per second on a sunny day?
8. How many times more expensive is electrical energy from an AA battery
than from the main electrical supply?
9. If all the atoms in an elephant were put in a line, how long would the line be?
10. How many photons are emitted by a camera flashgun?

11. What is the speed of the wing tip of a bee?

12. How many leaves are on a mature oak tree?
13. What is the drag force on a large truck traveling at speed on a motorway?
14. What is the power output of a racing cyclist?
15. What is the greatest distance at which the human eye can resolve car
headlamps?
16. Will relativistic effects be important for an electron in the ground state of
a hydrogen atom?
17. What is the spring constant of a car’s suspension system?
18. How many hairs are on your head?
19. How much energy can be supplied by a car battery?
20. What is the temperature at the center of the sun?

APPENDIX
B
Experimental Investigations
B.0 INTRODUCTION: THE NATURE OF SCIENCE

The defining characteristic of science is that it is based on evidence and that
evidence comes from experiment. You might have a wonderful theory
about the nature of the Universe, but if it is not testable it is not a scientific
theory. If it is a scientific theory then it can be used to make predictions that
can be tested by experiment. If repeated experiments agree with these pre-
dictions, then your theory is a good one. However, if the results of experi-
ments do not agree with your theory, and the experiments have been carried
out carefully and by reputable scientists and repeated by different groups,
then the theory might have to be abandoned or modified. One interesting
consequence of this definition of science is that even if we have a brilliant
theory that agrees with all the evidence we have ever collected, we still do not
know that it is the final theory – there might be other tests that we have not
thought about or conditions we could not create in our tests and the theory
might then fail. For example, the theory that atoms are the smallest units of
matter was a good one but failed when J.J. Thomson showed that all atoms
contain electrons. A good theory is a survivor but we cannot ever be absolutely
certain it will not fail and be replaced by a better or more comprehensive
theory at a later time.
The philosopher Karl Popper suggested that the criterion that defines science
is that it is in principle “falsifiable.” When Einstein’s general theory of relativ-
ity suggested that the presence of matter should curve space and deflect light,
the astronomer Arthur Eddington set out to detect and measure this deflec-
tion. His measurements, during a total eclipse of the Sun in 1919, showed
that light is deflected by matter and the measured deflection was consistent
with Einstein’s predictions. This propelled Einstein to international fame and

helped secure the general theory of relativity. Had Eddington failed to detect
the deflection, or shown it to be much larger or smaller than predicted, then
Einstein’s theory would have been thrown into doubt. Of course, the experi-
ment too would have been critically assessed to make sure it was capable
of detecting the predicted effects and it would have been repeated, but a
continued failure to detect the deflection of starlight by matter would have
undermined general relativity.
The success of general relativity was at the expense of the existing theory,
Newton’s theory of gravity, so in a sense, Eddington’s experiment falsi-
fied Newton’s theory. Does that mean that from 1919 on we have abandoned
Newtonian gravitation? If so at least one chapter of this book is redundant. No:
while we know that the Newtonian theory is not the most comprehensive
theory of gravity, it is much simpler to use than Einstein’s theory and it works
very well for calculating planetary orbits or spacecraft trajectories. However,
if we want to explain gravitational lenses or the behavior of space and time
close to a black hole, we must use Einstein’s full theory. On a fundamental
level, our best theory of gravity is Einstein’s but on a practical level, Newton’s
theory is precise enough for the majority of applications. The reason for this
is that Einstein’s equations reduce to Newton’s when the gravitational field is
weak and the curvature of space–time is small. So, continuing to use Newton’s
equations does not contradict our knowledge that general relativity gives a
more complete description of space and time.
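To give a feel for what "weak" means here, the sketch below evaluates the dimensionless quantity GM/(rc²) at the Earth's orbit around the Sun. This is only an illustration: the constants are the three-significant-figure values from Appendix C.7, the solar mass and orbital radius are assumed round values, and the function name is hypothetical.

```python
# Rough check of how weak gravity is in the Solar System.
# G and c are the 3 s.f. values from Appendix C.7; the solar mass and
# orbital radius are approximate values assumed for illustration.
G = 6.67e-11        # gravitational constant, N m^2 kg^-2
c = 3.00e8          # speed of light in a vacuum, m s^-1
M_sun = 1.99e30     # mass of the Sun, kg (assumed value)
r_orbit = 1.50e11   # Earth-Sun distance, m (assumed value)


def weak_field_parameter(mass, radius):
    """Return GM/(r c^2), a dimensionless measure of gravitational field strength."""
    return G * mass / (radius * c**2)


epsilon = weak_field_parameter(M_sun, r_orbit)
print(f"GM/(r c^2) at Earth's orbit: {epsilon:.1e}")
```

The result is of order one part in a hundred million, which suggests why Newton's equations are adequate for planetary orbits, whereas close to a black hole the same quantity approaches 1.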
B.1 CARRYING OUT AN EXPERIMENT

An experiment is an attempt to gather evidence that can be used to answer a
question. There are several reasons why you might do this:
to investigate the relationship between different parameters – for example,
how do mass and resultant force affect the acceleration of a dynamics
trolley?
to make a careful measurement of a physical quantity – for example, what
is the resistivity of nichrome?
to determine the parameters that affect an important quantity – for
example, how does changing the length, mass, and amplitude of a simple
pendulum affect its period?
to test a hypothesis – for example, does painting a coffee cup silver keep
coffee hot for a longer time?

To carry out and write up an experiment effectively you will need to:
Have a clear aim
Consider relevant underlying physical principles (this might involve research)
Identify relevant parameters and decide which will be varied and which
will be held constant
Select appropriate measuring instruments bearing in mind their range
and precision
Plan an experimental procedure
Carry out a risk assessment
Decide how to record and analyze the data including uncertainties
Draw a conclusion
Evaluate the experiment
List any references you have used
B.1.1 Variables
To make your experiment a “fair test,” the parameter you are investigating
must be affected only by the parameter you are varying. For example, if you
are investigating factors affecting the acceleration of a dynamics trolley you
might vary its mass by adding loads to the trolley. However, if you then pull
it with different forces you have varied two parameters, both of which affect
its acceleration, so you will not be able to separate the effect of mass from the
effect of the resultant force. To make this fair you need to keep the resultant
force constant while varying the mass and then carry out a separate experi-
ment in which the mass is kept constant and the resultant force is varied. The
three parameters, mass, resultant force, and acceleration are examples of the
three types of variable parameters in all experimental work:
Independent variable: the unique variable that we change (e.g., mass)
Dependent variable: the variable that we are investigating (e.g., acceleration)
Control variable: the variable we keep constant (e.g., resultant force) so that
it will not affect the dependent variable
In most experiments, there are many control variables that must be kept
constant.
B.1.2 Selecting Measuring Equipment

It is important to select measuring equipment that can be used to provide
good quality data and that is safe for the selected use. For example, measuring

the temperature of a Bunsen flame with a mercury-in-glass thermometer is
dangerous (the glass might melt and release the mercury) and impractical, as
the range of the thermometer does not go high enough. A better choice
would be a thermocouple or a pyrometer.
You need to consider the precision and range of each measuring instrument.
Precision: smallest scale division
Range: difference between smallest and largest value that can be measured
Even simple measurements of length require careful thought. Four common
pieces of laboratory apparatus to measure length are tape measures, rulers,
calipers, and micrometer screw gauges. The table below shows typical values
and uses for each of these.
Instrument               Range       Precision             Example of use
Tape measure             10.00 m     0.01 m (1 cm)         Displacement of a student in the lab
Ruler                    1.000 m     0.001 m (1 mm)        Length of a simple pendulum
Vernier caliper          0.0200 m    0.0001 m (0.1 mm)     The thickness of small-density blocks
Micrometer screw gauge   0.01000 m   0.00001 m (0.01 mm)   Wire diameter
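One way to compare the instruments in the table is to work out what percentage of the measured length a single scale division represents. The sketch below does this, assuming the range and precision values listed above; the function name and the 0.5 mm wire are hypothetical examples.

```python
# (name, range in m, precision in m), taken from the table above.
INSTRUMENTS = [
    ("tape measure", 10.00, 0.01),
    ("ruler", 1.000, 0.001),
    ("vernier caliper", 0.0200, 0.0001),
    ("micrometer screw gauge", 0.01000, 0.00001),
]


def percentage_precision(length):
    """For each instrument whose range covers `length` (in m),
    return its precision as a percentage of that length."""
    return {
        name: 100 * precision / length
        for name, max_range, precision in INSTRUMENTS
        if length <= max_range
    }


# Hypothetical example: a wire of diameter 0.5 mm.
for name, percent in percentage_precision(0.0005).items():
    print(f"{name}: {percent:.0f}%")
```

For the 0.5 mm wire only the micrometer gives a precision of a few percent, which is consistent with its suggested use in the table.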
Selecting appropriate instruments for electrical experiments is particularly

important. If you connect a sensitive ammeter to a circuit that carries a large
current you will damage the meter so it is important to consider the likely
currents before switching it on. It is also important to connect the apparatus
correctly. An ideal ammeter has zero resistance so if you accidentally connect
it in parallel with a component (as if it was a voltmeter) it will short the com-
ponent and might damage the circuit and meter.
When you are writing up an experiment it is important to list or label the
measuring equipment and include its range and precision.
You should also consider whether the measuring equipment itself will affect
the measurement. For example, a real ammeter has a small resistance that will
reduce the current in the circuit. If the ammeter resistance is comparable to
that of the circuit this will be a significant effect. An ideal voltmeter should
have infinite resistance but real voltmeters have a large finite resistance. This
will only be a problem if the voltmeter is connected across a component that

also has a comparably large resistance. A mechanical example is the use of

ticker tape – the tape drags as it moves through the ticker-timer and this
affects the motion that is being measured.
B.1.3 Planning a Procedure

Once you have decided what you are going to measure and how you are going
to measure it you must consider three things:
The range of measurements for each variable. Try to make this as
wide as possible because a limited range might not reveal the underlying
pattern of behavior – all curves look straight if you only consider a small
section.
The number and spacing of measurements within the range. Aim
for at least five well-spaced readings across the entire range of measure-
ments. A larger number is usually better and if the behavior is non-linear
it will be worth taking more readings where it is changing rapidly.
Repeating and averaging measurements. If possible, repeat each
measurement several times and use an average value. This helps to reduce
the effects of random errors and if one of your repeats is very different
from the others it can be ignored – this is called an “anomalous” result.
Recording and analyzing data, plotting graphs, and dealing with uncertainties
are dealt with in detail in Chapters 2 and 3.
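As a minimal sketch of repeating and averaging, the code below takes a set of repeated readings and returns their mean together with half the range of the readings, which is one simple estimate of the random uncertainty. The timings are hypothetical and the half-range rule is an assumption made for illustration; Chapters 2 and 3 discuss uncertainties properly.

```python
from statistics import mean


def summarize_repeats(readings):
    """Return (mean, half-range) for a list of repeated readings."""
    average = mean(readings)
    half_range = (max(readings) - min(readings)) / 2
    return average, half_range


# Hypothetical timings (in seconds) for 10 oscillations of a pendulum.
timings = [14.2, 14.4, 14.3, 14.1, 14.5]
average, uncertainty = summarize_repeats(timings)
print(f"{average:.1f} s ± {uncertainty:.1f} s")
```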
B.1.4 Risk Assessments

You should carry out a risk assessment before performing any experiment.
Many experiments have low risk but you will only know this if you think about
it! Some simple experiments can be dangerous, for example, overstretching a
steel spring can result in it flying up and hitting you in the eye. This risk should
be identified before attaching loads to the spring and suitable eye protection
should be worn. If you are carrying out an experiment that involves the use of
radioactive sources you will need to consider the likely total radiation dose you
will receive and ensure that this is low enough to justify doing the experiment.
There are three stages to a risk assessment:
Identify the risks to you and others
Adapt the experimental procedure to minimize the risk – that is, take
precautions
Decide whether the risk is low enough to justify carrying out the
experiment

There are guidelines that must be followed when you work with hazardous
chemicals, electricity, radioactive sources, vacuum containers, etc. Your
teacher or supervisor should be consulted about these and if you are ever in
doubt about the risk of an experimental procedure, do not proceed.
B.1.5 Writing Up an Experiment

An experimental account should address all the points noted in Section B.1
and could use these headings:
Aim: this can be a simple statement and should not run to more than two or
three sentences. State what you are hoping to achieve – for example, to deter-
mine how the current in a thermistor varies with temperature.
Theory: use what you know or have found out through research to propose
a hypothesis – for example, thermal energy can release additional charge
carriers and the number is expected to depend on the Boltzmann factor
$e^{-\Delta E/kT}$, so the current should be given by an equation of the form
$I = Ce^{-\Delta E/kT}$.
Variables: state the independent and dependent variables and all of the
important control variables, for example, the dependent variable is current,
the independent variable is temperature, and the control variable is the
potential difference across the thermistor.
Apparatus: you could give a list of instruments, or label them clearly on a
diagram showing the experimental setup. Either way, it is important to state
(and if necessary justify) their range and precision. For example, the thermistor
experiment needs a suitable thermometer and ammeter. A voltmeter will also be
needed to ensure that the potential difference is constant.
Procedure: a clearly labeled diagram showing the experimental arrange-
ment can save a lot of writing and is easier to understand than a block of text.
Take care over the diagram and if you draw it yourself use a ruler. Explain what
you will measure and why and state the range and number of readings you
intend to take, including any repeats. For an electrical experiment draw a
circuit diagram and make sure the symbols you use are correct.
Risk assessment: once you have planned a procedure look at it critically.
Assess the risks and plan precautions to minimize them. This might mean
modifying your experiment or seeking advice from your teacher or supervi-
sor. Do not carry out an experiment if you have any doubts about its safety.
Include a summary of the risk assessment as part of the experimental account.

Results: tabulate your results including the uncertainty in each measurement.
Take care over table headings (include units) and significant figures
(especially for any calculated values). If you repeat measurements and take an
average include all the raw data in your table.
Analysis: this will often involve plotting a graph and taking a gradient (see
Chapter 2). Take care over: axis labels; plotting points; drawing best fit and
worst acceptable lines. Make sure everything is clearly labeled. The analysis
should be based on the theory you have used to propose a hypothesis. For
example, for the current through a thermistor, a graph of ln (I) against (1/T)
should be a straight line with a negative gradient.
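The thermistor analysis can be sketched numerically. Taking logarithms of I = C e^(−ΔE/kT) gives ln I = ln C − (ΔE/k)(1/T), so the gradient of ln I against 1/T is −ΔE/k. The code below generates synthetic data from assumed (hypothetical) values of ΔE and C and recovers ΔE from a least-squares gradient; it illustrates the method and is not real experimental data.

```python
import math

k = 1.38e-23  # Boltzmann constant, J K^-1 (Appendix C.7)


def fit_line(xs, ys):
    """Least-squares gradient and intercept of y against x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    gradient = sxy / sxx
    return gradient, y_mean - gradient * x_mean


# Synthetic data generated from I = C exp(-dE/kT) with assumed values.
dE_assumed = 3.0e-20   # J (hypothetical)
C_assumed = 2.0        # A (hypothetical)
temperatures = [280.0, 300.0, 320.0, 340.0, 360.0]  # K
currents = [C_assumed * math.exp(-dE_assumed / (k * T)) for T in temperatures]

gradient, intercept = fit_line(
    [1 / T for T in temperatures],
    [math.log(I) for I in currents],
)
dE_fitted = -gradient * k  # gradient of ln(I) against 1/T is -dE/k
print(f"gradient = {gradient:.0f} K, dE = {dE_fitted:.2e} J")
```

With real data the points will scatter about the line, and the best fit and worst acceptable lines give the uncertainty in the gradient.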
Uncertainties: include an analysis of uncertainties so that any value
calculated from your data is quoted to an appropriate degree of precision
and is accompanied by its own uncertainty. For example, if you are carrying
out an experiment to determine the specific heat capacity of water and come
up with a value of 4000 J kg⁻¹ °C⁻¹, there is no way to know if your experiment
was a good one even though your value is within 5% of the expected value
(4200 J kg⁻¹ °C⁻¹). However, if your result is 4000 ± 400 J kg⁻¹ °C⁻¹ it
is clear that the expected value is within the range of your experimental
uncertainties and that your result has an uncertainty of 10%. Refining your methods
should reduce the uncertainties but keep the expected value within the range
of your results. If you end up with a result of 4000 ± 100 J kg⁻¹ °C⁻¹,
this does not include the expected value, so either you have underestimated
the uncertainties or there is a systematic error in your measurements.
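The comparison in the specific heat capacity example can be written as a short check. This is a sketch only, and the function names are hypothetical.

```python
def is_consistent(measured, uncertainty, expected):
    """True if the expected value lies within measured ± uncertainty."""
    return abs(measured - expected) <= uncertainty


def percentage_uncertainty(measured, uncertainty):
    """Uncertainty expressed as a percentage of the measured value."""
    return 100 * uncertainty / measured


expected = 4200  # specific heat capacity of water, J kg^-1 °C^-1

print(is_consistent(4000, 400, expected))   # wide uncertainty range
print(is_consistent(4000, 100, expected))   # narrow uncertainty range
print(percentage_uncertainty(4000, 400))
```

The first result includes the expected value and the second does not, matching the two cases described above.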
Conclusion
It is surprising how often students forget to state a conclusion to their experi-
ments. This should be a simple statement of what you have achieved in your
experiment and should relate back to the original aims.
Evaluation
Having carried out the experiment and drawn a conclusion you should consider:
How strongly the conclusion is supported.
How any calculated values compare with expected or known values.
The significance of the uncertainties.
Likely sources of error.
Suitability of the apparatus used.
How the experiment could be improved.

If possible make your evaluation quantitative by referring to the range of
uncertainties in measurements and their impact on your final result.
Glossary
Include a list of technical terms you have used in your account along with
brief explanations.
References
Include a list of references indicating where you have used each source.
Provide enough information so that someone reading your report can easily
find the information you used. This should include:
the title of the work or article,
the author or authors,
publisher and publication date,
the page or pages used, and
URL and date accessed (for websites).
B.2 INVESTIGATIONS
An investigation is a more open-ended project that involves a considerable
amount of preliminary research and pilot experiments (to try things out and
to explore the phenomena). These will help you to plan a sequence of experi-
ments, but as you proceed you should be prepared to modify your plans in the
light of results, or when you discover that a particular experiment does not work.
It takes time to get into an investigation, and you must not be easily deterred,
especially at the start when there is a lot of uncertainty about how to begin.
Researchers spend a lot of time getting nowhere, but without trying a range
of different approaches you are unlikely to hit upon the method that actu-
ally works. You might also discover interesting and unexpected aspects of the
problem along the way.
Science is a collaborative endeavor and carrying out an investigation usually
requires you to discuss ideas with your teacher or supervisor and to think
about the feedback they give you. You are also likely to need a fair amount of
apparatus, and this could involve discussing your needs with laboratory tech-
nicians or equipment suppliers (in advance). In both cases it is important to
be as clear as you can about what you need and what you need it for – asking

for a “block of wood” is pretty meaningless: what type of wood? What dimen-
sions? What is it for?
Writing up an investigation can be a daunting task, especially if you leave
it all until you have finished in the laboratory. It is worth using a laboratory
notebook and it is essential to write up and process data from every
experiment before moving on. A good investigation cannot be completely planned
in advance; it evolves as you discover more about the problem you are
tackling. There are many ways to write up an investigation but the written report
could take the following form:
Aim: statement of what is to be investigated
Background physics: summary of research about the problem identifying
aspects for investigation.
Pilot experiments: these should be used to explore the phenomena and test
methods or instruments to see if they are suitable. These pilot experiments
are not intended to produce precise data for analysis although they might
suggest relationships to be investigated in more detail later.
Plan of investigation: this should outline a sequence of related experiments
that can be used to collect good relevant data to move the investigation forward.
Experiments: each experiment should be written up fully (using the guide-
lines in Section B.1) including risk assessments and taking account of anything
learned in earlier experiments. The glossary and references can be left until
the end of the complete report.
Conclusions: while each individual experiment might lead to its own conclu-
sion this section takes an overview of the whole investigation and must be
written at the end of the work. It should relate back to the original aims.
Evaluation: this covers the same points as an individual experimental
evaluation but refers to the whole investigation. It might consider the relative merit
and contribution of different experiments.
Glossary: a list of technical terms you have used in your investigation along
with brief explanations.
References: The point of a reference is so that someone reading your work
can find the information easily without having to search through an entire
book or article, so give full details. There are several standard ways to provide
references; one of the most widely used is the Harvard system.

APPENDIX
C
Units, Constants, and Equations
C.1 SI UNITS
Base units
Quantity                     SI base unit (name)   Symbol
Length                       Meter                 m
Mass                         Kilogram              kg
Time                         Second                s
Electric current             Ampere                A
Thermodynamic temperature    Kelvin                K
Amount of substance          Mole                  mol
Luminous intensity           Candela               cd
Derived units
Quantity                   SI-derived unit   Symbol   SI base units
Force                      Newton            N        kg m s⁻²
Pressure                   Pascal            Pa       kg m⁻¹ s⁻²
Energy                     Joule             J        kg m² s⁻²
Power                      Watt              W        kg m² s⁻³
Charge                     Coulomb           C        A s
Resistance                 Ohm               Ω        kg m² s⁻³ A⁻²
Potential difference       Volt              V        kg m² s⁻³ A⁻¹
Capacitance                Farad             F        kg⁻¹ m⁻² s⁴ A²
Magnetic field strength    Tesla             T        kg s⁻² A⁻¹
Magnetic flux              Weber             Wb       kg m² s⁻² A⁻¹
Inductance                 Henry             H        kg m² s⁻² A⁻²
C.2 SIMPLE APPROXIMATE COMBINATIONS OF UNCERTAINTIES
Combination: rule; formula
Uncertainty in a sum, $y = a + b$: add absolute uncertainties; $\delta y = \delta a + \delta b$
Uncertainty in a product, $y = ab$: add fractional uncertainties; $\frac{\delta y}{y} = \frac{\delta a}{a} + \frac{\delta b}{b}$
Uncertainty in a quotient, $y = \frac{a}{b}$: add fractional uncertainties; $\frac{\delta y}{y} = \frac{\delta a}{a} + \frac{\delta b}{b}$
Uncertainty in a power, $y = a^n$: multiply fractional uncertainty by power; $\frac{\delta y}{y} = n\frac{\delta a}{a}$
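As an illustration of how these rules combine, the sketch below estimates the uncertainty in the density of a cube, ρ = m/l³, using the power rule for l³ and the quotient rule for m/l³. The measurements and function names are hypothetical.

```python
def power_fractional(a, da, n):
    """Fractional uncertainty in a**n: multiply fractional uncertainty by the power."""
    return abs(n) * da / a


def quotient_fractional(frac_numerator, frac_denominator):
    """Fractional uncertainty in a quotient: add the fractional uncertainties."""
    return frac_numerator + frac_denominator


# Hypothetical measurements of a cube.
m, dm = 0.250, 0.005   # mass in kg
l, dl = 0.050, 0.001   # side length in m

density = m / l**3
frac_volume = power_fractional(l, dl, 3)               # 3 x (0.001 / 0.050)
frac_density = quotient_fractional(dm / m, frac_volume)
d_density = density * frac_density

print(f"density = {density:.0f} ± {d_density:.0f} kg m^-3")
print(f"percentage uncertainty = {100 * frac_density:.0f}%")
```

Note that the side length dominates here because its fractional uncertainty is tripled by the power rule.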
C.3 USEFUL DERIVATIVES
Function $y$: derivative $\frac{dy}{dx}$
Constant value, for example, $y = 8$: $\frac{dy}{dx} = 0$
Power law, for example, $y = Ax^n$: $\frac{dy}{dx} = nAx^{n-1}$
Exponential function, $y = e^x$: $\frac{dy}{dx} = e^x$
Exponential relationship, for example, $y = Ae^{bx}$: $\frac{dy}{dx} = bAe^{bx}$
Sine function, $y = \sin x$: $\frac{dy}{dx} = \cos x$
Cosine function, $y = \cos x$: $\frac{dy}{dx} = -\sin x$
Sinusoidal variation, for example, $y = A\sin(bx)$: $\frac{dy}{dx} = bA\cos(bx)$
Cosinusoidal variation, $y = A\cos(bx)$: $\frac{dy}{dx} = -bA\sin(bx)$
C.4 DIFFERENTIAL EQUATIONS
Topic: differential equation; solution; conditions
Radioactive decay: $\frac{dN}{dt} = -\lambda N$; $N = N_0 e^{-\lambda t}$; $N = N_0$ at $t = 0$
Capacitor discharge: $\frac{dQ}{dt} = -\frac{Q}{RC}$; $Q = Q_0 e^{-t/RC}$; $Q = Q_0$ at $t = 0$
Newton’s second law (constant acceleration): $\frac{d^2x}{dt^2} = \frac{F}{m}$; $x = ut + \frac{1}{2}\left(\frac{F}{m}\right)t^2$; $F$, $m$, $u$ constants, $x = 0$ at $t = 0$
Simple Harmonic Motion: $\frac{d^2x}{dt^2} = -\omega^2 x$; $x = A\cos(\omega t + \phi)$; $A$, $\omega$, $\phi$ constants ($\omega = 2\pi f$)
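A solution to one of these equations can be checked numerically. The sketch below steps the capacitor discharge equation dQ/dt = −Q/RC forward in small time steps (Euler's method, chosen here only for simplicity) and compares the result with the analytic solution Q = Q₀e^(−t/RC). The component values are hypothetical.

```python
import math


def discharge_euler(q0, rc, t_end, steps):
    """Integrate dQ/dt = -Q/(RC) from t = 0 to t_end using Euler steps."""
    dt = t_end / steps
    q = q0
    for _ in range(steps):
        q += (-q / rc) * dt   # change in charge over one small time step
    return q


q0 = 1.0e-3     # initial charge, C (hypothetical)
rc = 5.0        # time constant RC, s (hypothetical)
t_end = 5.0     # one time constant

numeric = discharge_euler(q0, rc, t_end, steps=10_000)
analytic = q0 * math.exp(-t_end / rc)

print(f"numeric  Q = {numeric:.6e} C")
print(f"analytic Q = {analytic:.6e} C")
```

After one time constant both values are close to 37% of the initial charge, and the agreement improves as the number of steps increases.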
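C.5 DIFFERENTIALS AND INTEGRALS
Dynamics: $v = \frac{ds}{dt}$; $s = \int v\,dt$
Dynamics: $a = \frac{dv}{dt}$; $v = \int a\,dt$
Newton’s laws: $F = \frac{d(mv)}{dt}$; $mv - mu = \int F\,dt$
Electric circuits: $I = \frac{dQ}{dt}$; $Q = \int I\,dt$
Radioactivity: $\frac{dN}{dt} = -\lambda N$; $\int \frac{dN}{N} = -\int \lambda\,dt$
Capacitors: $\frac{dQ}{dt} = -\frac{Q}{RC}$; $\int \frac{dQ}{Q} = -\int \frac{dt}{RC}$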
C.6 EQUATIONS
Mechanics
Equations for constantly accelerated motion:
$v = u + at$    $s = \frac{(u + v)t}{2}$    $s = ut + \frac{1}{2}at^2$    $s = vt - \frac{1}{2}at^2$    $v^2 = u^2 + 2as$
Range of a projectile: $R = \frac{u^2 \sin 2\theta}{g}$
Coefficient of friction: $F_{\text{limit}} = \mu N$, $\mu_S = \tan\theta$ ($\theta$ is the limiting angle)
Weight: $W = mg$
Density: $\rho = \frac{m}{V}$
Pressure: $p = \frac{F}{A}$
Linear momentum: $p = mv$
Newton’s second law: $F = \frac{dp}{dt}$    $a = \frac{F}{m}$ (constant mass)
Impulse and change of momentum: $\int F\,dt = \int dp = \Delta p$    $Ft = (mv - mu)$ (constant mass)
Rockets: $v_f = v_0 + u \ln\left(\frac{m_0}{m_f}\right)$
Work done: $W = Fs\cos\theta$
Gravitational potential energy: $\Delta GPE = mgh$
Kinetic energy: $KE = \frac{1}{2}mv^2$
Efficiency: $\eta = \frac{\text{useful output energy}}{\text{total input energy}} \times 100\%$
Power: $P = \frac{dE}{dt}$    $P = F\frac{ds}{dt} = Fv$
Photon momentum: $p_{\text{photon}} = \frac{E}{c}$
Radiation pressure: $p = \frac{I}{c}$
Lagrangian: $L = T - V$
Euler-Lagrange equations: $\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) = \frac{\partial L}{\partial q}$
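Fluids
Fluid pressure: $p = \rho g h$
Reynolds number: $Re = \frac{v\rho L}{\eta}$
Equation of continuity: $\rho_1 A_1 v_1 = \rho_2 A_2 v_2$
Stokes’ law: $F = 6\pi\eta r v$
Turbulent drag: $F = \frac{1}{2} C_D \rho A v^2$
Bernoulli equation: $P_1 + \frac{1}{2}\rho v_1^2 + \rho g h_1 = P_2 + \frac{1}{2}\rho v_2^2 + \rho g h_2$
Poiseuille’s equation: $Q = \frac{\pi p a^4}{8\eta l}$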

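Materials
Hooke’s law: $F = ke$
Spring systems: $\frac{1}{k_{\text{series}}} = \frac{1}{k_1} + \frac{1}{k_2} + \frac{1}{k_3} + \dots$    $k_{\text{parallel}} = k_1 + k_2 + k_3 + \dots$
Strain energy (spring): $EPE = \frac{1}{2}Fe = \frac{1}{2}ke^2$
Strain energy (wire): $\frac{EPE}{V} = \frac{1}{2}\sigma\varepsilon$
Stress and strain: $\sigma = \frac{F}{A}$    $\varepsilon = \frac{e}{l_0}$
Young’s modulus: $E = \frac{\sigma}{\varepsilon}$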
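Thermodynamics
Temperature scales: $T = \theta + 273.15$    $\theta = T - 273.15$ ($\theta$ in °C, $T$ in K)
Thermal conduction: $\frac{dQ}{dt} = -kA\frac{d\theta}{dx}$
Wien’s displacement law: $\lambda T = \text{constant} = 2.9 \times 10^{-3}$ m K
Stefan-Boltzmann law: $P = e\sigma A T^4$
Specific heat capacity: $c = \frac{\Delta E}{m\Delta\theta}$
Specific latent heat: $E = mL$
Ideal gas equations:
Boyle’s law: $pV = \text{constant}$ (constant mass and temperature)
Charles’s law: $V/T = \text{constant}$ (constant mass and pressure)
Gay-Lussac’s law: $p/T = \text{constant}$ (constant mass and volume)
Ideal gas equation: $\frac{pV}{T} = \text{constant}$    $pV = nRT$
Adiabatic compression: $pV^{\gamma} = \text{constant}$
Kinetic theory equation: $pV = \frac{1}{3}Nm\langle v^2\rangle$    $p = \frac{1}{3}\rho\langle v^2\rangle$
Mean molecular KE: $\frac{1}{2}m\langle v^2\rangle = \frac{3}{2}\frac{R}{N_A}T = \frac{3}{2}kT$
RMS molecular speed: $v_{\text{rms}} = \sqrt{\langle v^2\rangle} = \sqrt{\frac{3kT}{m}}$
Internal energy: $U = \text{Total KE} = N_A \cdot \frac{1}{2}m\langle v^2\rangle = \frac{3}{2}RT$
Heat capacities (ideal gas): $c_V = \frac{3}{2}R$    $c_P = \frac{5}{2}R$
Adiabatic gas constant: $\gamma = \frac{c_P}{c_V}$
Speed of sound in gas: $c = \sqrt{\frac{\gamma p}{\rho}}$
Boltzmann factor: $f = e^{-\Delta E/kT}$
First law of thermodynamics: $\Delta U = Q - W$
Heat engine efficiency: $\frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1} \le 1 - \frac{T_2}{T_1}$
Reversible heat transfer: $\Delta S = \int_{\text{state A}}^{\text{state B}} \frac{dQ}{T}$
Entropy change: $S = k\ln(W)$
Thermodynamic temperature: $\frac{1}{T} = \frac{dS}{dQ}$
Refrigerator: $CoP_{\text{refrigerator}} = \frac{Q_{\text{cold}}}{W} \le \frac{1}{\frac{T_{\text{hot}}}{T_{\text{cold}}} - 1}$
Heat pump: $CoP_{\text{heat pump}} = \frac{Q_{\text{hot}}}{W} \le \frac{1}{1 - \frac{T_{\text{cold}}}{T_{\text{hot}}}}$
Oscillations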
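Frequency: $f = \frac{1}{T}$
Angular frequency: $\omega = 2\pi f$
SHM: $a = \frac{d^2x}{dt^2} = -\omega^2 x$
$x = A\cos(\omega t + \delta)$
$x = A\cos\omega t$
$v = \frac{dx}{dt} = -\omega A\sin\omega t$
$a = \frac{dv}{dt} = -\omega^2 A\cos\omega t = -\omega^2 x$
Mass-spring system: $f = \frac{1}{2\pi}\sqrt{\frac{k}{m}}$    $T = 2\pi\sqrt{\frac{m}{k}}$
Simple pendulum: $f = \frac{1}{2\pi}\sqrt{\frac{g}{l}}$    $T = 2\pi\sqrt{\frac{l}{g}}$
Total energy: $TE = \frac{1}{2}m\omega^2 A^2$
Damped SHM: $x = Ae^{-\gamma t}\cos(\omega t)$
Resonance condition: $f_d = f_0$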
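Rotational Dynamics
Angles in radians: $\theta = \frac{l}{r}$
Small angle approximations: as $\theta \to 0$, $\sin\theta \to \theta$, $\cos\theta \to 1$, $\tan\theta \to \theta$ ($\theta$ in radians)
Angular velocity: $\omega = \frac{v}{r}$
Tangential acceleration: $\alpha = \frac{a}{r}$
Centripetal acceleration: $a = \frac{v^2}{r} = r\omega^2$
Centripetal force: $F = ma = \frac{mv^2}{r} = mr\omega^2$
Equations of motion for constant angular acceleration (by analogy):
$v = u + at \;\to\; \omega_f = \omega_i + \alpha t$
$s = \frac{(u + v)t}{2} \;\to\; \theta = \frac{(\omega_i + \omega_f)t}{2}$
$s = ut + \frac{1}{2}at^2 \;\to\; \theta = \omega_i t + \frac{1}{2}\alpha t^2$
$s = vt - \frac{1}{2}at^2 \;\to\; \theta = \omega_f t - \frac{1}{2}\alpha t^2$
$v^2 = u^2 + 2as \;\to\; \omega_f^2 = \omega_i^2 + 2\alpha\theta$
Moment of inertia: $I = \sum_{i=1}^{N} m_i r_i^2$
Rotational KE: $RKE = \frac{1}{2}I\omega^2$
Angular momentum: $L = I\omega$
Torque: $\Gamma = \frac{dL}{dt} = I\frac{d\omega}{dt} = I\alpha$
Moments of inertia for uniform objects:
Point mass: $I = mr^2$
Rod: $I_{\text{end}} = \frac{1}{3}ml^2$    $I_{CM} = \frac{1}{12}ml^2$
Thin hoop: $I = mr^2$
Disc/cylinder: $I_{CM} = \frac{1}{2}ma^2$ (radius = $a$)
Sphere: $I_{CM} = \frac{2}{5}ma^2$ (radius = $a$)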
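Waves
Wave speed: $v = \frac{\lambda}{T} = f\lambda$
1D travelling wave: $y = A\cos(\omega t - kx)$
Intensity: $I \propto A^2$
Wave speed in medium: $v = \frac{c}{n}$
Refraction: $n_1\sin\theta_1 = n_2\sin\theta_2$
TIR: $\sin c = \frac{n_2}{n_1}$
Malus’s law (polarization): $I_{\text{trans}} = I_0\cos^2\theta$
Brewster’s law: $\tan\theta_B = \frac{n_2}{n_1}$
Speed of light: $c = \frac{1}{\sqrt{\varepsilon_0\mu_0}}$ (vacuum)    $v = \frac{1}{\sqrt{\varepsilon_0\varepsilon_r\mu_0\mu_r}}$ (medium)
Refractive index: $n = \sqrt{\varepsilon_r\mu_r}$
Power of a lens: $P = \frac{1}{f}$
Linear magnification: $m = \frac{h_i}{h_o} = \frac{v}{u}$
Lens equation: $\frac{1}{f} = \frac{1}{u} + \frac{1}{v}$
Astronomical telescope: $M = \frac{\beta}{\alpha} = \frac{f_o}{f_e}$
Doppler shift (light): $\Delta\lambda = \pm\frac{v}{c}\lambda_0$
Red shift: $z = \frac{\lambda' - \lambda_0}{\lambda_0} = \frac{v}{c}$
Hubble’s law: $v = H_0 d$
Phase difference: $\Delta\phi = \frac{2\pi\Delta x}{\lambda}$
Young’s double slit: $\lambda = \frac{sy}{d}$
Diffraction grating maxima: $n\lambda = d\sin\theta$
Single slit minima: $\sin\theta = \frac{1.22\lambda}{D}$
Rayleigh criterion: $\theta \ge \frac{1.22\lambda}{D}$ (for resolution)
Waves on a string: $v = \sqrt{\frac{T}{\mu}}$
Speed of sound: $v = \sqrt{\frac{\gamma RT}{M}}$ (ideal gas)    $v = \sqrt{\frac{E}{\rho}}$ (solid)
Decibel scale: intensity level (B) $= \log_{10}\left(\frac{I}{I_0}\right)$
Acoustic impedance: $Z$ = (speed of sound) × (density)
Reflection: $\frac{I_r}{I_0} = \left(\frac{Z_2 - Z_1}{Z_2 + Z_1}\right)^2$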
Electricity
Electric current: $I = \frac{dQ}{dt}$    $I = \frac{Q}{t}$ (constant current)
Coulomb’s law: $F = \frac{Q_1 Q_2}{4\pi\varepsilon_0 r^2}$
Electric field strength: $E = \frac{F}{q}$    $E = -\frac{dV}{dx}$
E-field strength (point charge): $E = \frac{F}{q} = \frac{Q}{4\pi\varepsilon_0 r^2}$
Electric potential: $V = \frac{EPE}{Q}$
Potential difference: $\Delta V = \frac{W}{Q}$
Electric potential (point charge): $V = \frac{Q}{4\pi\varepsilon_0 r}$
Gauss’s theorem: $\oint_{\text{surface}} \mathbf{E}\cdot d\mathbf{S} = \frac{\sum_{i=1}^{N} Q_i}{\varepsilon_0}$
Electric current: $I = nAve$
Kirchhoff’s second law: $\sum_{\text{closed loop}} \text{emfs} = \sum_{\text{closed loop}} \text{p.d.s}$
Resistance (Ohm’s law): $R = \frac{V}{I}$
Resistors in series: $R_{\text{series}} = \sum_{i=1}^{N} R_i$
Resistors in parallel: $\frac{1}{R_{\text{parallel}}} = \sum_{i=1}^{N} \frac{1}{R_i}$
Resistivity: $\rho = \frac{RA}{l}$
Electrical energy: $E = VIt$
Real cell: $V = E - Ir$
Capacitance: $C = \frac{Q}{V}$
Energy stored (capacitor): $E = \frac{Q^2}{2C} = \frac{CV^2}{2} = \frac{QV}{2}$
Parallel plate capacitance: $C = \frac{\varepsilon_0\varepsilon_r A}{d}$
Capacitor discharge: $Q(t) = Q_0 e^{-t/RC}$    $I(t) = I_0 e^{-t/RC}$    $V(t) = V_0 e^{-t/RC}$
Capacitor charging: $Q = Q_F\left(1 - e^{-t/RC}\right)$    $V = V_S\left(1 - e^{-t/RC}\right)$    $I(t) = I_0 e^{-t/RC}$
Time constant: $\tau = RC$
Capacitors in series: $\frac{1}{C_{\text{series}}} = \sum_{i=1}^{n} \frac{1}{C_i}$
Capacitors in parallel: $C_{\text{para}} = \sum_{i=1}^{n} C_i$
Capacitance (charged sphere): $C_{\text{sphere}} = \frac{Q}{V} = 4\pi\varepsilon_0 a$
Magnetism
Magnetic force: $F = BIl\sin\theta$ (on current)    $f = Bqv\sin\theta$ (on moving charge)
Lorentz force: $\mathbf{f} = q\mathbf{E} + q\mathbf{v}\wedge\mathbf{B}$
Radius of curvature in B-field: $r = \frac{mv}{Bq}$
Biot-Savart law: $\delta B = \frac{\mu_0 I\sin\theta}{4\pi x^2}\,\delta l$
Magnetic field strength: $B = \frac{\mu_0 NI}{2r}$ (narrow coil)
$B = \frac{\mu_0 I}{2\pi r}$ (long straight current-carrying wire)
$B = \frac{\mu_0\mu_r NI}{l}$ (long solenoid at center)
$B = \frac{\mu_0\mu_r NI}{2l}$ (long solenoid at end)
Ampère’s theorem: $\oint_{\text{closed loop}} \mathbf{B}\cdot d\mathbf{l} = \sum \mu_0 I$ (current enclosed by loop)
Torque on coil in B-field: $\Gamma = NBIA$
Magnetic flux: $\Phi = \int_{\text{surface}} B\cos\theta\,dA$ (flux-linkage = $N\Phi$)
Faraday’s law: $E = -\frac{d(N\Phi)}{dt}$
Self-inductance: $E = -L\frac{dI}{dt}$
Mutual inductance: $E_2 = M\frac{dI_1}{dt}$    $E_1 = M\frac{dI_2}{dt}$
Energy stored in inductor: $W = \frac{1}{2}LI^2$
Ideal transformer: $\frac{I_2}{I_1} = \frac{V_1}{V_2} = \frac{N_1}{N_2}$
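A.C. Circuits
Alternating current and voltage: $V = V_0\sin\omega t$    $I = I_0\sin\omega t$
RMS values: $I_{\text{rms}} = \frac{I_0}{\sqrt{2}}$    $V_{\text{rms}} = \frac{V_0}{\sqrt{2}}$
A.C. power: $P_{AC} = I_{\text{rms}} V_{\text{rms}}$
Reactance (capacitor): $X_C = \frac{V_0}{I_0} = \frac{1}{\omega C} = \frac{1}{2\pi fC}$
Reactance (inductor): $X_L = \frac{V_0}{I_0} = \omega L = 2\pi fL$
Impedance: $Z = \frac{V_0}{I_0}$
$Z = \frac{V_S}{I} = \sqrt{R^2 + (X_L - X_C)^2} = \sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2}$ (RCL series circuit)
Phase angle: $\tan\phi = \frac{V_L - V_C}{V_R} = \frac{X_L - X_C}{R}$ (RCL series circuit)
Resonant frequency (RCL circuit): $f_0 = \frac{1}{2\pi}\frac{1}{\sqrt{LC}}$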
Gravitational fields
Newton’s law of gravitation: $F = -\frac{Gm_1 m_2}{r^2}$
Gravitational field strength: $g = \frac{\text{gravitational force}}{\text{mass}} = \frac{F}{m}$
$g = \frac{F}{m} = -\frac{GM}{r^2}$ (point mass or uniform spherical mass)
Gravitational potential: $V_G = \frac{GPE}{m}$
$V_G(r) = -\frac{GM}{r}$ (point mass or uniform spherical mass)
$\Delta GPE = mgh$ ($h \ll r$)
Escape velocity: $v_{\text{esc}} = \sqrt{\frac{2GM}{r}}$
Schwarzschild radius: $R_S = \frac{2GM}{c^2}$
Kepler’s third law: $\frac{r^3}{T^2} = \frac{GM}{4\pi^2}$
Tidal forces: $\Delta F = \frac{-2GMmr}{R^3}$
Gravitational time dilation: $T' = T\left(1 + \frac{\Delta V}{c^2}\right)$ (weak fields)
Special relativity
Gamma-factor: $\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$
Time dilation: $T' = \gamma T$
Length contraction: $L' = \frac{L}{\gamma}$
Lorentz transformations: $x' = \gamma(x - vt)$    $y' = y$    $z' = z$    $t' = \gamma\left(t - \frac{vx}{c^2}\right)$
Inverse Lorentz transformations: $x = \gamma(x' + vt')$    $y = y'$    $z = z'$    $t = \gamma\left(t' + \frac{vx'}{c^2}\right)$
Velocity addition: $w = \frac{u + v}{1 + \frac{uv}{c^2}}$
Mass increase: $m = \gamma m_0$
$E^2 - p^2c^2 = (m_0c^2)^2$
Mass and energy: $\Delta E = c^2\Delta m$
Atomic and nuclear physics
Inverse-square law for intensity: $I = \frac{R}{4\pi r^2}$
Attenuation of gamma-rays: $I = I_0 e^{-\mu x}$
Half-thickness: $x_{1/2} = \frac{\ln 2}{\mu}$
Range of radiation: $R = \frac{\sigma}{\rho}$
Radioactive decay: $N = N_0 e^{-\lambda t}$
Activity: $A = -\frac{dN}{dt} = \lambda N_0 e^{-\lambda t} = \lambda N$
Half-life: $t_{1/2} = \frac{\ln 2}{\lambda}$
Nuclear binding energy: $\text{B.E.} = c^2\Delta m$
Binding energy per nucleon: $\text{B.E.}/A = c^2\Delta m/A$
Quantum physics
Photon energy: $E = hf$
Photoelectric effect: $KE_{\max} = hf - \Phi = hf - hf_0$    $eV_S = KE_{\max} = hf - \Phi$
de Broglie relation: $\lambda = \frac{h}{p}$
Compton effect: $(\lambda' - \lambda) = \frac{h}{m_e c}(1 - \cos\theta)$
Hydrogen atom energy levels: $E_n = -\frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{n^2}\right) = \frac{-13.6\ \text{eV}}{n^2}$
Balmer series: $\frac{1}{\lambda_n} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right)$
Rydberg formula: $\frac{1}{\lambda} = R\left(\frac{1}{m^2} - \frac{1}{n^2}\right)$
Rydberg constant: $R = \frac{me^4}{8c\varepsilon_0^2 h^3}$
Heisenberg indeterminacy relation: $\Delta x\Delta p \ge \frac{h}{4\pi}$
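The hydrogen energy-level and Rydberg-constant expressions can be checked against the quoted values of −13.6 eV and about 1.10 × 10⁷ m⁻¹. The sketch below evaluates them using the three-significant-figure constants from Section C.7, so small rounding differences are to be expected. The variable names are chosen for illustration.

```python
# Constants to 3 s.f. from Section C.7.
m_e = 9.11e-31    # electron mass, kg
e = 1.60e-19      # electronic charge, C
eps0 = 8.85e-12   # permittivity of free space, F m^-1
h = 6.63e-34      # Planck constant, J s
c = 3.00e8        # speed of light in a vacuum, m s^-1

# Ground-state energy (n = 1): E_1 = -m e^4 / (8 eps0^2 h^2)
E1_joules = -m_e * e**4 / (8 * eps0**2 * h**2)
E1_eV = E1_joules / e

# Rydberg constant: R = m e^4 / (8 c eps0^2 h^3)
R = m_e * e**4 / (8 * c * eps0**2 * h**3)

print(f"E_1 = {E1_eV:.1f} eV")
print(f"R   = {R:.2e} m^-1")
```

Both results agree with the quoted values to within about 1%, which is the level of agreement one would expect from rounded constants.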

Astrophysics
Radius of a star: $r = \sqrt{\frac{L}{4\pi\sigma T^4}}$
Hubble’s law: $v = H_0 d$
Hubble time: $T_H = \frac{1}{H_0}$
Medical Physics
Beer–Lambert law: $I_{\text{out}} = I_{\text{in}} e^{-\mu x}$
Larmor resonant frequency: $f = \frac{\gamma B_0}{2\pi}$
C.7 CONSTANTS
Quantity                            Symbol   Value (3 s.f.)              More precise value
Speed of light in a vacuum          c        3.00 × 10⁸ m s⁻¹            299 792 458 m s⁻¹
Electronic charge (magnitude)       e        1.60 × 10⁻¹⁹ C              1.60217662 × 10⁻¹⁹ C
Planck constant                     h        6.63 × 10⁻³⁴ J s            6.62607004 × 10⁻³⁴ J s
Gravitational constant              G        6.67 × 10⁻¹¹ N m² kg⁻²      6.67408 × 10⁻¹¹ N m² kg⁻²
Avogadro constant                   L        6.02 × 10²³ mol⁻¹           6.02214086 × 10²³ mol⁻¹
Boltzmann’s constant                k        1.38 × 10⁻²³ J K⁻¹          1.38064852 × 10⁻²³ J K⁻¹
Molar gas constant                  R        8.31 J mol⁻¹ K⁻¹            8.3144598 J mol⁻¹ K⁻¹
Unified atomic mass unit            u        1.66 × 10⁻²⁷ kg             1.660539040 × 10⁻²⁷ kg
Permeability of free space          μ₀       4π × 10⁻⁷ H m⁻¹             1.25663706 × 10⁻⁶ H m⁻¹
Permittivity of free space          ε₀       8.85 × 10⁻¹² F m⁻¹          8.85418782 × 10⁻¹² F m⁻¹
Stefan’s constant                   σ        5.67 × 10⁻⁸ W m⁻² K⁻⁴       5.670367 × 10⁻⁸ W m⁻² K⁻⁴
Wien’s constant                     W        2.90 × 10⁻³ m K             2.8977685 × 10⁻³ m K
Proton mass                         mp       1.67 × 10⁻²⁷ kg             1.6726219 × 10⁻²⁷ kg
Electron mass                       me       9.11 × 10⁻³¹ kg             9.10938356 × 10⁻³¹ kg
Neutron mass                        mn       1.67 × 10⁻²⁷ kg             1.674927471 × 10⁻²⁷ kg
Rydberg constant                    R        1.10 × 10⁷ m⁻¹              1.0973731568508 × 10⁷ m⁻¹
Standard acceleration of gravity    g        9.81 m s⁻²                  9.806 65 m s⁻² (defined)
Standard atmospheric pressure       atm      1.01 × 10⁵ Pa               101 325 Pa (defined)
Molar mass of carbon-12             M(¹²C)   1.2 × 10⁻² kg mol⁻¹         1.2 × 10⁻² kg mol⁻¹ (defined)

APPENDIX
D
Solutions to Exercises
Descriptive answers are not usually included and can be found by referring to
the relevant chapter.
CHAPTER 1: THE LANGUAGE OF PHYSICS

1. (a) 0.267 kg (b) 25 000 000 mm (c) 5 000 000 cm³ (d) 22.2 m s⁻¹
(e) 0.0045 m²
2. (a) 1.44 × 10⁸ km (b) 9.47 × 10¹⁵ m (c) 1.27 light seconds
3. (a) 2.00 (b) 0.00209 (c) 0.00950 (d) 3.14
4. (a) N m² kg⁻² or m³ kg⁻¹ s⁻² (in base units)
5. (a) 0.012, 0.016, 0.050 (b) 1.2%, 1.6%, 5.0% (c) 5300 ± 200 cm²
(d) 105 000 ± 8000 cm³
6. (a) 0.234 ± 0.051 kg (b) time period $\left(2\frac{\delta T}{T} > \frac{\delta k}{k}\right)$.
7. (a) Systematic errors affect all readings in the same way (e.g., constant
addition or subtraction) and can be corrected if the error is known
(e.g., subtracting a zero error). Random errors cannot be predicted
and affect each data point independently.
(b) Repeat the reading several times and use an average value (neglecting
obvious anomalies).
(c) 0.32 mm

8. (a) 5.5 × 10³ (b) 7 × 10⁻¹⁰ (c) 1.2 × 10²⁴

9. (a) 9.0 × 10¹⁵ (b) 9.0 × 10⁻³ (c) 2.0 × 10⁸ (d) 2.0 × 10¹⁶
10. (a) 3.3 × 10⁷ s or 1 year 20.6 days (b) 4.0 × 10⁹ s or 126 years 275.8 days
(c) 2.8 × 10¹³ s or 900 000 years (d) 9.5 × 10²⁰ s or 3.0 × 10¹³ years
(>> age of Universe!)
CHAPTER 2: REPRESENTING AND ANALYZING DATA

1. (a) 6 (b) − 3 (c) 1.748 (d) − 0.488 (e) 0
2. (a) 10 000 (b) 501.2 (c) 1.122 (d) 0.001995 (e) 0.8913
3. (a) 1/v on y-axis and 1/u on x-axis (b) y-intercept is 1/f
4. (a) Plot p against 1/V. If pV = constant graph will be a straight line through
the origin.
(b) −
5. (a) Plot log (T) against log (r). This should give a straight line with a
gradient equal to n and n = 3/2.
(b) 225 days.
6. Plot ln (I) against t to get a straight line with a gradient equal to − 1/T.
RC ~ 44 − 45 s

CHAPTER 3: CAPTURING, DISPLAYING, AND ANALYZING MOTION
1. [Figure: displacement–time, velocity–time, and acceleration–time graphs. Labeled values: times 0.41 s, 0.83 s, and 1.18 s; displacements 0.82 m and −1.2 m; velocities 4.0 m s⁻¹ and 7.5 m s⁻¹; acceleration −9.8 m s⁻².]
2. (a) 8.3 ms−1 (b) 1.67 ms−2

3. (a) 5.0 s (b) 2.4 ms−2
4. Car’s speed is 13.9 ms−1 (a) 27.8 m (b) 36.1 m
5. 23.7 m
6. 0.067 ms−2
7. Car B wins by 0.64 s
8. (a) 14.3 m (b) 61.1 m

CHAPTER 4: FORCES AND EQUILIBRIUM
1. All of them.
2. (a) 2.83 N at 45° to left of vertical. (b) 5.10 N at 32.1° left of vertical
3. 6.2 N
4. 0.75 N
5. (a) FA = 1225 N FB = 1715 N (b) 2940 N
6. (a) μS = 0.466 (b) μK = 0.443
(c) [Graph: force against time, with labeled values 2.69 N and 2.56 N.]

CHAPTER 5: NEWTONIAN MECHANICS
1. Tension T = mg + ma. If the stone is lifted gradually the acceleration is

small and T ~ mg. If the string is jerked the acceleration is large so ma
is large and the tension can exceed the breaking force for the thread.
2. (a) Passengers continue at constant velocity until acted upon by a result-
ant force (N1). When the bus slows down no additional force acts
on the passengers so they continue moving forward relative to the
decelerating bus and this makes them think they have been thrown
forwards even though no forward force acts on them.
(b) There is an interaction between the bullet and gun when the propel-
lant explodes. The forward force on the bullet is equal to the back-
ward force on the gun (N3).
(c) To accelerate upwards there must be a resultant upward force on you
(N2). This arises because the contact force from the floor of the lift is
greater than your weight: FC − mg = ma. You do not feel your weight but
you do feel the contact force and it is this that makes you feel “heavier.”

(d) These are opposite ends of the same interaction so they must be
equal (N3).
3. (a) 0.50 ms−2 to the right (b) a = 19.4 ms−2 to the right
4. (a) 30 000 kgms−1 (b) 0.16 kgms−1
5. 1.3 ms−1 to the right
6. (a) v1 = 0.594 ms−1 v2 = 0.316 ms−1
(b) Initial KE = 0.32 mJ and final KE = 0.23 mJ so the collision was
inelastic. 0.094 mJ has been transferred to other forms in the collision.
7. (a) 2500 N
(b) (i) Power needed to overcome drag is unchanged but additional
power must be supplied to increase the GPE of the car as it rises.
(ii) 111 kW
8. (a) $s = uT + \dfrac{mu^{2}}{2B}$
(b) [Graph: distance against speed, showing the thinking distance and the braking distance.]
(c) Braking distance depends on u², so halving u reduces this by a factor of 4.
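The scaling in answer 8(c) can be checked numerically. The sketch below assumes the symbols in 8(a) mean: u the initial speed, T the thinking (reaction) time, m the mass of the vehicle, and B the constant braking force. The numerical values are hypothetical and chosen only for illustration.

```python
def stopping_distance(u, T, m, B):
    """Return (thinking, braking, total) distances in metres.

    Uses s = u*T + m*u**2 / (2*B), with u in m/s, T in s, m in kg, B in N.
    """
    thinking = u * T                  # constant speed during the reaction time
    braking = m * u**2 / (2 * B)      # kinetic energy divided by braking force
    return thinking, braking, thinking + braking


# Hypothetical values: 1000 kg car, 5000 N braking force, 0.7 s reaction time.
MASS, BRAKING_FORCE, REACTION_TIME = 1000.0, 5000.0, 0.7

for speed in (30.0, 15.0):
    thinking, braking, total = stopping_distance(
        speed, REACTION_TIME, MASS, BRAKING_FORCE
    )
    print(f"u = {speed:4.1f} m/s: thinking {thinking:.1f} m, "
          f"braking {braking:.1f} m, total {total:.1f} m")
```

Halving the speed halves the thinking distance but reduces the braking distance to one quarter, as stated in part (c).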
CHAPTER 6: FLUIDS
1. (a) Δp (due to 10 m of seawater) = 101 kPa ~ 1 atmosphere.

(b) 111 MPa
(c) Actual pressure > 111 MPa because there is more mass in each unit
volume so the weight of water creating the pressure is greater than if
the water was incompressible.

2. (a) Δp = ρgx (b) F = ½ρglh²

(c) Integrate moments on a narrow strip of width dx and equate the result
to the moment of the total force at height a above the base. Show
that a = h/3.
3. (a) −
(b) The constant is dimensionless so changing its value does not affect the
balance of dimensions in the equation.
(c) The Reynolds number will be too high; the flow will change from
laminar to turbulent.
4. (a) Sensible estimates must be made for each quantity in the expression
for the Reynolds number. For example, v ~ 30 m s⁻¹, ρ ~ 1.2 kg m⁻³,
L ~ 2 m, η ~ 2 × 10⁻⁵ Pa s giving Re ~ 4 × 10⁶. This is very high and
implies turbulent flow.
(b) (i) Smooth surfaces, streamlined shape, reduced cross-sectional area.
(ii) 765 N (iii) ×8 (iv) 68 ms−1
(v) Engine efficiency is likely to fall and other frictional forces will
increase.
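The order-of-magnitude estimate in answer 4(a) can be reproduced in a few lines. This is a minimal sketch that assumes the standard definition Re = ρvL/η matches the expression used in the chapter; the function name is illustrative.

```python
def reynolds_number(density, speed, length, viscosity):
    """Reynolds number Re = density * speed * length / viscosity (dimensionless)."""
    return density * speed * length / viscosity


# Estimates quoted in answer 4(a): air flowing past a car-sized object.
re = reynolds_number(density=1.2, speed=30.0, length=2.0, viscosity=2e-5)
print(f"Re ~ {re:.1e}")
```

The result is a few million, consistent with the quoted Re ~ 4 × 10⁶ to one significant figure.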
5. (a) ~ 0.08 m3 (b) ~ 800 N (c) ~ 1 N
(d) About 0.1% so probably not noticeable.
6. (a) 6.83 × 10−5 N (b) 1.12 × 10−5 N
(c) Free body diagram has two upward forces: buoyancy B and drag D(v) and
one downward force, weight mg. The resultant force ΣF = mg − (B + D(v))
so the downward acceleration is a = g − (B + D(v))/m. The weight and
buoyancy forces are constant but D increases with speed v. When the
ball bearing is released D = 0 so the initial acceleration is a(0) = g − B/m.
As D increases the resultant force and acceleration decrease until
mg − (B + D(v)) = 0 and a = 0, so the ball bearing then falls at a constant
terminal velocity.
(d) 1.4 Pas assuming Stokes’ law is valid − that is, ball bearing falling
slowly in the center of a tube of diameter >> diameter of ball.
(e) Droplets were very small. Stokes’ law assumes that fluid is a con-
tinuum. For tiny droplets we need to take into account the particle
nature of the air − this affects average viscosity.

7. (a) ρair << ρwater (b) 27 m s⁻¹ (c) ~ 1600

(d) Assumption of laminar flow is dubious. If flow is turbulent result is
not valid.
8. $p_Y = \dfrac{16p_X + p_Z}{17}$
9. (a) Pressure at top of straw is reduced as you suck, so it is less than
atmospheric pressure. Pressure in straw at level of external fluid
is atmospheric. Pressure difference creates an upward force on fluid
inside straw.
(b) Maximum pressure difference is 1 atmosphere so ρghmax ~ 10⁵ Pa. For
water hmax ~ 10 m.
10. For example, by balancing units in Stokes’ law equation.
11. 183 ms−1
12. (a) 1370 Pa (b) 103370 Pa
(c) Mercury has a higher density so the displacements are smaller for the
same pressure difference. This increases range but reduces precision.
13. At higher altitudes the weight of air above each square meter of the sur-
face is less so pressure is lower.
14. (a)
Vessel | Area / cm² | Speed / cm s⁻¹ | Volume flow rate / cm³ s⁻¹
Aorta | 3.0 | 30 | 90
Arteries | 100 | 0.9 | 90
Capillaries | 900 | 0.10 | 90
Veins | 200 | 0.45 | 90
Vena cava | 18 | 5.0 | 90
(b) The largest Reynolds number is ~ 1000 in the aorta. This is close to
the limit for laminar flow so some turbulence could occur, but elsewhere
the flow will be laminar.
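The table in answer 14(a) follows from the continuity equation for an incompressible fluid: volume flow rate = cross-sectional area × speed. The sketch below simply re-checks each row of the table; the dictionary name is illustrative.

```python
# (area in cm^2, speed in cm/s) for each vessel type, as listed in answer 14(a).
vessels = {
    "Aorta": (3.0, 30.0),
    "Arteries": (100.0, 0.9),
    "Capillaries": (900.0, 0.10),
    "Veins": (200.0, 0.45),
    "Vena cava": (18.0, 5.0),
}

flow_rates = {name: area * speed for name, (area, speed) in vessels.items()}

for name, flow in flow_rates.items():
    print(f"{name:12s} {flow:6.1f} cm^3/s")
```

Every row gives the same flow rate, which is why the blood is slowest where the total cross-sectional area is largest (the capillaries).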
CHAPTER 7: MECHANICAL PROPERTIES

1. (a) 2000 kgm−3
(b) fractional uncertainty = 0.1, absolute uncertainty = ±320 kgm−3

2. (a) 167 Nm−1 (b) 2.7 J (c) 0.12 m

3. (a) − (b) 11 N (c) 0.30 Ncm−1 or 30 Nm−1 (d) 33.2 cm (e) 28.4 cm
(f) e ∝ F (Hooke’s law obeyed) (g) 98.8 cm
(h) It has passed its elastic limit and permanent plastic deformation has
occurred.
(i) Less stiff − increases in extension become greater for the same increases
in load.
4. (a) 1.1 mm (b) 8.3 J
5. Loading/unloading cycle shows hysteresis so stretching forces are always
greater than recompression forces at the same extension. More work is
done to stretch it than it does when it recompresses. The difference is
transferred to heat.
6. A: stiff, strong, brittle.
B: less stiff and strong than A, with a small amount of plastic deformation
before fracture.
C: less stiff and strong than B, with a significant amount of plastic defor-
mation before fracture.
D: Non-linear behavior. Not stiff. Tough and has a low yield stress.
CHAPTER 8: THERMAL PHYSICS

1. (a) ~ 6 kW
(b) Other heat losses − for example, through ground, draughts, etc.
2. 20.2 °C
3. Cu: 24.4 J°C−1 mol−1, Al: 24.2 J°C−1 mol−1, Fe: 25.0 J°C−1 mol−1, Hg: 28.1
J°C−1 mol−1
4. 5.3 MJ
5. Molten iron cooling/state change, liquid to solid while latent heat is
dissipated at constant temperature/solid iron cooling
CHAPTER 9: GASES
1. (a) 4.4 × 105 Pa, 15°C
(b) No change. Work is done on the gas but this is equal to the heat trans-
ferred to the surroundings.

(c) ΔU = 0, Q = −W (d) 1152 K (879°C)

(e) Same number of molecules occupy a larger volume so collision rates are
lower. To exert the same pressure the particles must move faster (have
more KE) so the gas must be heated and its temperature must rise.
2. (a) 0.018 moles
(b) External forces cause compression/expansion and do work on the tire
and air, heating it up.
(c) 32°C
(d) Assumes constant volume. In practice, volume might increase slightly.
3. (a), (b) [Graph: pressure against volume, with curves labeled (a) and (b).]
(c) Slow compression: temperature is constant so U is constant. Heat flow
out of the system equals the work done on the system. Fast compression:
heat flow is minimal so work done on the system increases its
internal energy, temperature, and pressure.
4. (a) 500 ms−1
(b) Same mean KE at the same temperature but molecular masses are
different: more massive molecules (oxygen) have lower rms speed.
5. (a) Molecular volume can no longer be ignored.
(b) Collisions become inelastic.
(c) Interactions can no longer be ignored (bonds will form).
6. −
7. (a) 0 (b) 519.5 ms−1 (c) 519.9 ms−1 (d) 7.3 × 10−21 J
8. Internal energy U depends only on absolute temperature T. In an iso-
thermal change U is constant and so is T. Work done to compress the gas
equals the heat that flows out of the gas to the surroundings.

9. (a) −
(b) (i) ΔU = 0, (ii) W = H
10. (a) −
(b) (i) exponential increase (ii) exponential decay (iii) exponential decay.
(c) Increasing T increases the B.F. so a larger fraction of collisions exceeds
the activation energy and the reaction rate increases.
11. 2.5×
12. 1.3 × 10−5
13. (a) As T increases B.F. increases so a larger fraction of the electrons have
enough energy to jump to the conduction band.
(b) $I = GV = \text{const.} \times V \times e^{-\Delta E/kT}$ so a graph of ln I against 1/T has a gradient −ΔE/k
14. (a) − (b) 0.67 (c) 0.174 gs−1
CHAPTER 10: STATISTICAL THERMODYNAMICS AND THE SECOND LAW
1. −
2. (a) −
(b) Newton’s laws are reversible.
(c) Entropy increases so the future is distinct from the past as long as the
universe started in a low entropy state.
(d) Cosmological expansion (but this is also dependent on the universe
starting in a low entropy state).
3. (a) It was very low.
(b) If all possible initial conditions are considered equally likely the one
in which the universe actually began was one of very low probability
(i.e., a microstate belonging to a very small number of microstates).
4. − 5. − 6. −

CHAPTER 11: OSCILLATIONS

1. (a) AC, (b) B, (c) B, (d) C, (e) B, (f) C.
2. (a) Position of zero resultant force. (b) F ∝ −x (c) 0.50 Hz
(d) 6.0 cm (e) y = 6.0 cos (πt) (+ve to right) (f) −
(g) vmax = 0.19 m s⁻¹, KEmax = 0.014 J (h) No change.
(i) Increasing mass reduces acceleration at each point, increasing T;
increasing k increases acceleration at each point, decreasing T.
(j) Work done against frictional forces. Total energy and amplitude decay.
(k) For example, look for a constant ratio: 5.4/6.0 ~ 4.9/5.4 ~ 4.4/4.9, etc.
This supports exponential decay.
(l) 2.1 cm
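The constant-ratio test in answer 2(k) is easy to automate. The sketch below uses the four amplitudes quoted there. The final line assumes, purely for illustration, that the amplitude is wanted after ten further cycles at a ratio of 0.9 per cycle; the question itself is not reproduced here, so that choice is a guess rather than the author's method.

```python
# Successive amplitudes in cm, as quoted in answer 2(k).
amplitudes = [6.0, 5.4, 4.9, 4.4]

# For exponential decay each amplitude is a fixed fraction of the previous one.
ratios = [later / earlier for earlier, later in zip(amplitudes, amplitudes[1:])]
print([round(r, 3) for r in ratios])   # all close to 0.9


def amplitude_after(a0, ratio, n_cycles):
    """Amplitude after n_cycles if every cycle multiplies it by `ratio`."""
    return a0 * ratio ** n_cycles


# Assumed illustration: ten cycles at a ratio of 0.9, starting from 6.0 cm.
print(round(amplitude_after(6.0, 0.9, 10), 2))
```

Under that assumption the amplitude falls to roughly 2.1 cm, the same size as the value quoted in part (l).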
3. (a) Differentiate twice.
(b) (i) 2.5 (ii) 0.20 s (iii) 78.5 ms−1 (iv) 2470 ms−2
(c) (i) 1540 sin² (10πt) (ii) − (iii) −
4. $f > \dfrac{1}{2\pi}\sqrt{\dfrac{g}{A}}$
5. (a) 9.93 × 1014 Hz (b) 3.02 × 10−7 m, UV
6. (a) 0.635 m (b) 0.5% increase.
(c) Mass-spring period unchanged, pendulum period about 3.9 s
CHAPTER 12: ROTATIONAL DYNAMICS

1. (a) − (b) Force and velocity are perpendicular so F·ds = 0.
2. (a) $\tfrac{5}{2}mrg$
(b) [Figure: free-body diagrams showing the contact force N and the weight mg.]
(c) No. Ball will have tangential acceleration as well as centripetal
acceleration and resultant force is parallel to resultant acceleration.
(d) $v = \sqrt{rg}$
3. Velocities shown on diagram. Accelerations all equal to v²/r and directed
toward center.
[Figure: velocities at points A to D: A, v = 0; B, √2 v at 45°; C, 2v; D, √2 v at 45°.]
4. (a) 1.82 s (b) 0.55 Hz (c) 3.5 rad s−1 (d) 7.8 × 10−3 Js (e) 0.054 Nm
(f) 0.013 J
5. −
6. (a) 4.7 × 109 kgm2
(b) Astronaut experiences an inward contact force that maintains his cir-
cular motion about the center of rotation. This feels like the reaction
to a gravitational field of strength g = rω².
(c) 0.22 rad s−1 (d) 94 MJ
7. (a) 0.63 Js (b) 26.2 s
8. No external resultant torque. Angular momentum is conserved. L = Iω. I
is reduced so ω increases. Work must be done by the children as they use
forces to move inwards. This increases RKE.
9. 8.3 × 1028 Nm

CHAPTER 13: WAVES

1. (a) amplitude = 25 m frequency = 4.8 Hz, wavelength = 3.14 m
(b) 15 ms−1 (c) In the negative x-direction.
2. (a) One cycle with period 20s and amplitude 1.2 m.
(b) f = 0.10 Hz, λ = 12 m.
(c) 90° (π/2)
3. −
4. (a) Row 1: air or vacuum / 2.0 × 108 ms−1 / 25°
Row 2: 3.0 × 108 ms−1 / 3.0 × 108 ms−1 / 75°
Row 3: 2.3 × 108 ms−1 / 1.2 × 108 ms−1 / 38°
Row 4: diamond / glass / 1.2 × 108 ms−1
(b) −
(c) Lower critical angle, so light reflects internally to a greater extent and
returns to the observer by many paths.
5. HINT: Exploit the symmetry.
6. (a) $I_T = \tfrac{1}{2} I_0 \cos\theta \cos(90^\circ - \theta) = \tfrac{1}{4} I_0 \sin 2\theta$, maximum when 2θ = 90°, that is, θ = 45°.
(b) 0.22
CHAPTER 14: LIGHT

1. (a) 2.57 s (b) To about 0.1 ns (c) Diffraction.
2. (a) − (b) 4 times
(c) Image distance and magnification increase as u approaches f. When
u = f the image is at infinity. For u > f a real image is formed on the far
side of the lens, inverted, in the space between the eye and the lens (if
the eye is far enough away).

3. First row: 20 cm, 2, virtual erect; second row: infinity, infinity, not
defined; third row: 60 cm, 2, real inverted; fourth row: 40 cm, 1, real
inverted; fifth row: 20 cm, 0, not defined.
4. First row: 6.7 cm, 0.67, virtual erect; second row: 10 cm, 0.50, virtual
erect; third row: 12 cm, 0.40, virtual erect; fourth row: 20 cm, 0, not
defined.
5. (a) 30 times (b) 20 times and length increased by 2.0 cm.
6. (a) 12.5 ms−1 (b) 3.4 × 1024 m
7. For example, by measuring the Doppler shift of 21 cm hydrogen lines
from spiral arms and calculating their velocities relative to earth. From
these (and some geometry) the angular velocity of the galaxy can be
calculated.
CHAPTER 15: SUPERPOSITION

1. (a) 0 (b) π (c) λ/2 (d) Amplitude × 2, intensity × 4 (e) 0
(f) Amplitude × 1/2, intensity × ¼
(g) 0.026 m, 13.1 kHz (h) doubles (i) halves
(j) the phase difference is between π/2 and π
2. (a) − (b) 9.8 mm
(c)(i) maxima closer together (ii) minima not completely dark
(iii) no effect (iv) maxima farther apart
(v) maxima farther apart (vi) clarity of fringes varies − maximum at 0
and π, but no fringes at π/2 and 3π/2 when polarizations are perpendicular.
3. Same fringe separation but maxima 9/4 times brighter.
4. (a) 8.28°, 8.97° (b) 1.44° (c) 6 orders
(d) First-order diffraction minimum at 25.6° and third order grating max-
ima at 25.6° and 27.9° so maxima absent or very dim here.
5. −
6. (a) 0.15° (b) − (c) − (d) Minima closer to the center of pattern.
7. −
8. (a) 1.2 × 10−4 rad (b) about 8 m (c) 2.4 × 10−4 light-years

(d) Limited by other factors − for example, retinal cell density/sensitivity;

optical quality of cornea and lens.
(e) Minimum angle resolved is inversely proportional to diameter and
dscope >> deye
9. (a) Assume: separation ~ 2 m, dmax ~ 20 km (in practice less than this
because of other factors).
(b) Needs objective diameter ~ 30 m, so no.
10. (a) 8.82 × 10−5 kgm−1 (b) 543 ms−1 (c) 362 Hz
(d) 362 Hz, 724 Hz, 1086 Hz
CHAPTER 16: SOUND

1. −
2. (a) 30 dB (b) doubles to 60 dB
3. (a) 0.283 m (b) − (c) h1 = 37.9 cm, h2 = 23.8 cm, h3 = 9.6 cm
4. (a) Otherwise most of the incident ultrasound would reflect off the surface
of the skin. Gel matches the impedance of tissue so most is transmitted.
(b) 31%
CHAPTER 17: ELECTRIC CHARGE AND ELECTRIC FIELDS

1. (a) 14 400 C (b) 3.0 × 10−3 C (c) 0.20 A
2. −
3. (a) Field strengths: A: 9.0 × 1015 Vm−1 to left, B: 4.0 × 1015 Vm−1
to right, C: 8.4 × 1014 Vm−1 to left D: 4.0 × 1015 Vm−1 to right,
E: 8.38 × 1014 Vm−1 to left
Electric potentials: A: 0V, B: 9.0 × 109 V, C: 0V, D: - 9.0 × 109 V, E: 0V
4. (a) 8.19 × 10−8 N (b) 2.2 × 106 ms−1, 2.2 × 10−18 J (c) - 4.3 × 10−18 J
(d) - 2.2 × 10−18 J (e) + 2.2 × 10−18 J
5. (a) -
(b) E = 62500 Vm−1 at 4.0 cm and 8300 Vm−1 at 8.0 cm; V = 2500 V at
4.0 cm and 910 V at 8.0 cm.

6. field lines originate on charge and meet the surface at 90°. Equipotentials
perpendicular to field lines.
7. (a) Gaussian surface inside the sphere encloses zero charges so net flux is
also zero. By symmetry zero everywhere inside.
(b) Use the surface of a small flat cylinder embedded in a conductor with
ends inside and outside the surface of the conductor. All field lines
must pass through the external surface and must be perpendicular to
the conductor so EA = Q/ε₀ and E = σ/ε₀ where A is the surface of the
conductor inside the cylinder.
(c) Gaussian surface enclosing empty space encloses zero charge so net
flux through surface is zero and flux entering volume must equal flux
leaving it.
8. (a) -
(b) -
(c) 250 000 Vm−1 (d) 8.0 × 10−14 N (e) No change (uniform field)
(f) ae = 60 000 aion
9. (a) 2.3 × 107 ms−1 (b) parabolic downward (c) 2.2 × 10−9 s
(d) 7.7 × 106 ms−1 (e) 18.5° (f) 2.7 × 10−7 J
CHAPTER 18: DC CIRCUITS
1. (a) − (b) −
2. 7.0 × 10−5 ms−1
3. (a) 0.34 A (b) 4.3 × 10³ A m⁻² (c) 2.2 Ω⁻¹ m⁻¹ (d) 0.17 Ω⁻¹
(e) 2700 ms−1 (f) e− , h+ . e−
4. (a) B lights normally, A does not light.
(b) A lights almost normally and B lights just below normal brightness.
5. (a) − (b) $V = \dfrac{ER}{R + r}$ (c) Plot 1/V against 1/R: gradient = r/E and
intercept = 1/E (d) −
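The graphical method in answer 5(c) works because V = ER/(R + r) rearranges to 1/V = 1/E + (r/E)(1/R), a straight line in 1/R. The sketch below generates readings for a hypothetical cell (E = 6.0 V and r = 1.5 Ω are assumed values, not taken from the exercise) and recovers both quantities from a least-squares line.

```python
def fit_line(xs, ys):
    """Least-squares gradient and intercept of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    return gradient, mean_y - gradient * mean_x


# Hypothetical cell used only to generate readings (assumed values).
E_TRUE, R_INTERNAL_TRUE = 6.0, 1.5        # volts, ohms
loads = [2.0, 5.0, 10.0, 20.0, 50.0]      # load resistances R in ohms
readings = [E_TRUE * R / (R + R_INTERNAL_TRUE) for R in loads]

gradient, intercept = fit_line([1 / R for R in loads], [1 / V for V in readings])
emf = 1 / intercept              # intercept = 1/E
r_internal = gradient * emf      # gradient = r/E
print(f"E = {emf:.2f} V, r = {r_internal:.2f} ohm")
```

With real measurements the points scatter about the line, and the same fit gives best estimates of E and r.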
6. (a) copper (b) 0.62 W, 0.39 W, 2.22 W (c) 44%, 54%, 91%
(d) 0.43 W (e) 4.6 A

7. (a) 4.0 V, 0.020 A (b) 3.87 V, 0.019 A

8. (a) in series: R, 2R, 3R; in parallel: R/2, R/3; series+parallel: 2R/3, 3R/2
9. (HINT: redraw equivalent circuit as 3 + 6 + 3 groups of parallel resis-
tors) 6R/5
10. (a) Dark 5.9 V, Light 0.55 V
(b) For high values the range of voltages becomes smaller. This makes the
device less sensitive but more linear.
11. ammeter 1: 0.060 A, ammeter 2: 0.135 A, ammeter 3: 0.075 A; voltme-
ter: 3.0 V
CHAPTER 19: CAPACITORS

1. (a) 1.32 mC (b) 0.00396 J (c) charge doubles, energy quadruples
2. (a) 0.103 s (b) ~ 0.5 s (c) no change, time constant is the same
(d) 30.7 kW, 330 mA
3. (a) exponential decay from I0 = V/R (b) V1 = constant, V2 exponential
decay, V3 growing from zero at decaying rate toward V.
4. (a) 0.00040 C, 0.0016 J (b) Q1 = 0.00013 C, Q2 = 0.00027 C
(c) 0.00053 J Energy dissipated as heat in connecting wires as charge is
redistributed.
(d) Q1 = 0.00040 C, Q2 = 0.00080 C, Qtot = 0.00120 C (e) 0.0048 J, cell
does extra work to charge both capacitors to same voltage
5. series: C, C/2, C/3; parallel: 2C, 3C; series+parallel: 2C/3, 3C/2
6. charge is halved; current forced back through cell as plates separate;
energy halved; voltage constant.
7. 0.0011 C, 2.34 V
CHAPTER 20: MAGNETIC FIELDS

1. Temperature in core is above Curie temperature so permanent magneti-
zation is not possible.

2. (a) Circular field lines around each wire: direction from RH rule. Field of
either wire intersects another current at 90° creating a “motor effect”
force: directions from FLHR. Attraction.
(b) 1.78 × 10−6 Nm−1
(c) Currents on opposite sides of coil have a repulsive interaction pro-
ducing an outward (explosive) force. This will be large if current and
number of turns are large.
3. (a) − (b) $r_1 = \dfrac{m_1 v}{Bq}$, $r_2 = \dfrac{m_2 v}{Bq}$ (c) separation $= \dfrac{2(m_1 - m_2)v}{Bq}$
4. (a) − (b) Distance of line of action of force from axis varies as coil turns.
Max. at 0°, min. at 90°.
(c) 0.0312 Nm, 0.027 Nm, 0.022 Nm, 0.016 Nm, 0 Nm
CHAPTER 21: ELECTROMAGNETIC INDUCTION

1. (a) Flux in coil is changing. By Faraday’s law this results in an induced emf.
(b) (i) Larger deflection − greater $\dfrac{d(N\Phi)}{dt}$ (ii) negative deflection − sign
of $\dfrac{d(N\Phi)}{dt}$ has changed (iii) No deflection − $\dfrac{d(N\Phi)}{dt} = 0$
2. (a) 2.0 × 10−6 Wb (b) 4.0 × 10−4 Wb (c) - (d) 1.3 × 10−4 A
(e) 6.7 × 10−5 C (f) current increased × 5 but charge is the same
3. Field of falling magnet cuts through conducting copper walls. Rate of
change of flux-linkage in copper creates eddy currents. These create a
magnetic field that acts back on the falling magnet opposing its motion
(Lenz’s law). Magnet accelerates until the upward magnetic force balances
weight and then falls at a terminal velocity.
4. (a) Wings are conductors cutting through the magnetic field. The rate of
cutting flux is equal to induced emf.
(b) 0.40 V
(c) No. The return path would have the same emf so the sum of emfs
around the circuit would be zero. OR: closed circuit encloses constant
flux so no change and no emf around loop.
5. (a) 100 turns (b) − (c) core forms a magnetic circuit so that the chang-
ing flux in the primary coil also links the secondary coil

(d) laminations interrupt eddy currents reducing the heat dissipated in

the core. (e) eddy currents / ohmic heating in coils / magnetic losses
caused by constant magnetization and demagnetization.
6. (a) 0.0017 H, 0.50 W (b) 0.034 J
(c) When the switch is opened I falls to zero very rapidly so $\dfrac{d(N\Phi)}{dt}$ is large
and there is a large induced emf. This appears across separating switch
contacts. Air breaks down and a spark is formed.
7. (a) 90.5 V (b) − (c) 1.81 A (d) 164 W (e) answer (b) is unchanged,
(c) and (d) are reduced
CHAPTER 22: AC
1. (a) D.C. batteries, compact, portable, no need to rectify but expensive,

limited life, disposal issues;
A.C. supply: can be stepped up or down with transformers, cheap at
point of use, easy to generate
(b) Can be stepped up at generator and transmitted at high voltage and
low current reducing heat loss from transmission lines.
2. (a) 28.3 V (b) 20 V (c) May be shorter with A.C. because of continual
heating/cooling stress
3. Resistor: 100 Ω at all frequencies.
Capacitor: infinity / 16 Ω / 1.6 Ω / 0.16 Ω
Inductor: 0.063 Ω / 0.63 Ω / 6.3 Ω
4. (a) 46.4 Ω (b) 0.35 A (c) 14 V (d) 2.6 W (e) current and voltage
have a 90° phase difference so the integral, over one cycle, of P = IV
is zero. (f) voltage leads current by 62°
5. (a) infinity / 12.1 Ω / 10.0 Ω / 10.3 Ω / 32.9 Ω (b) 4800 Hz
(c) current rises to peak of 1.2 A at f = 4800 Hz
(d) From low-frequency limit with voltage lagging current by 90° through
0 at 4800 Hz to voltage leading current by 90° at high-frequency limit.
(e) Circuit is purely resistive at resonance so power dissipated is maxi-
mum. Energy is dissipated as heat in the resistor but the capacitor and
inductor alternately store and release energy back to the circuit, they
do not dissipate energy.

CHAPTER 23: GRAVITATIONAL FIELDS
1. (a) 9.8 N (b) 2.72 × 10−3 N (c) 3600 and 3602 − that is, the ratios are
the same so it is consistent. (d) − (e) − (f) −
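2. (a) $V_P = -\dfrac{GM}{R}$ (b) $V_Q = -\dfrac{GM}{4R}$ (c) $W = \dfrac{3GMm}{4R}$
(d) $v = \sqrt{\dfrac{GM}{2R}}$ (e) $v_{\text{esc}} = \sqrt{\dfrac{2GM}{R}}$ (f) $v_{\text{esc}} = \sqrt{\dfrac{GM}{2R}}$
(g) $\dfrac{GMm}{8R}$
3. (a) $g = \dfrac{4}{3}G\pi\rho r$ (b) g is greater at poles than equator (c) about 3%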
4. 3.44 × 108 m from Earth
5. (a) − (b) − (c) $\dfrac{r^{3}}{T^{2}} = \dfrac{GM}{4\pi^{2}}$ so constant depends on central mass
(d) 4.32 × 107 m
(e) orbit is centered on center of Earth so anything other than an equato-
rial orbit would vary in latitude and not be stationary with respect to
the surface. (f) 1.4 h
6. 1.9 × 1027 kg
7. more than six times as far
8. −
9. −
CHAPTER 24: SPECIAL RELATIVITY

1. (a) − (b) − (c) −
2. (a) 87.2 m (b) 5680 m (c)(i) 22.2 y (ii) 9.69 y (d) 8.72 ly
(e) Earth twin 65.4, traveler 40.4
3. (a) − (b) 0.14 c / 0.42 c / 0.75 c / 0.98 c (c) − (d) if v << c then γ ~ 1
and relativistic effects are negligible (e) − (f) 1.2 GV

4. (a) Travelling clock slows − shows less elapsed time. (b) 18 ns

5. (a) − (b) − (c) − (d) l₀/γ
(e) B sees a synchronization error on A’s clocks with X started before Y
so that A’s measurement of time is incorrect. B also sees both of A’s
clocks running slowly. A sees B’s clock running slowly. Both disagree
with the way measurements have been carried out in the other refer-
ence frame. However, both are in inertial reference frames so their
measurements are valid within their own frame.
CHAPTER 25: ATOMIC STRUCTURE AND RADIOACTIVITY

1. (a) − (b) − (c) − (d) − (e) 4.6 × 10−14 m
(f) scattering is purely electrostatic / Coulomb’s law is valid
2. Can control energy using an accelerator, Electrons do not feel the strong
nuclear force but alpha particles do so for close approach electrons are
only scattered by electrostatic forces.
3. (a)
Element | Symbol | Atomic number | Nucleon number | Protons | Electrons | Neutrons
Hydrogen | ¹₁H | 1 | 1 | 1 | 1 | 0
Carbon-12 | ¹²₆C | 6 | 12 | 6 | 6 | 6
Carbon-13 | ¹³₆C | 6 | 13 | 6 | 6 | 7
Carbon-14 | ¹⁴₆C | 6 | 14 | 6 | 6 | 8
Oxygen-16 | ¹⁶₈O | 8 | 16 | 8 | 8 | 8
Iron-56 | ⁵⁶₂₆Fe | 26 | 56 | 26 | 26 | 30
Gold-197 | ¹⁹⁷₇₉Au | 79 | 197 | 79 | 79 | 118
Uranium-235 | ²³⁵₉₂U | 92 | 235 | 92 | 92 | 143
Uranium-238 | ²³⁸₉₂U | 92 | 238 | 92 | 92 | 146

(b) For example, carbon isotopes: same atomic number but different
mass numbers
(c) Electrostatic repulsion between protons is a cumulative long-range
repulsion. Strong nuclear interaction between neutrons and protons is
short-range. For a large nucleus, more neutrons are needed to stabi-
lize the nucleus against the coulomb force.
4. (a) 20 cpm (b) radioactive decay is a random process so it fluctuates
(c) 8 h (d) alpha and gamma
5. (a) 1.22 × 10−4 y−1 or 3.85 × 10−12 s−1
(b) 22 800 y assuming no contamination with “younger” carbon.
(c) 19 000 years
6. 2
7.
Time / years | P | Q
0 | N | 0
1000 | N/2 | N/2
2000 | N/4 | 3N/4
3000 | N/8 | 7N/8
5000 | N/32 | 31N/32
Q rises from zero to asymptote at N, P falls exponentially from N to zero.
P + Q = N at all points so graph lines cross at 1000 years.
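The entries in the table for answer 7 can be generated from the half-life rule. The sketch below assumes a half-life of 1000 years, which is inferred from P halving between 0 and 1000 years, and expresses P and Q as fractions of the initial number N.

```python
HALF_LIFE = 1000  # years, inferred from the table (P halves every 1000 years)


def remaining_fraction(t, half_life):
    """Fraction of the parent nuclei P still present after time t."""
    return 0.5 ** (t / half_life)


for t in (0, 1000, 2000, 3000, 5000):
    p = remaining_fraction(t, HALF_LIFE)   # parent, as a fraction of N
    q = 1 - p                              # daughter, since P + Q = N
    print(f"t = {t:4d} y   P = {p:.5f} N   Q = {q:.5f} N")
```

P and Q are equal when each is N/2, which happens after exactly one half-life, so the two curves cross at 1000 years.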
8. (a) − (b) 3.43 mm (c) 23 cm
9. −
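10. (a) 8 alpha and 6 beta
(b) ${}^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\alpha$, ${}^{234}_{90}\mathrm{Th} \rightarrow {}^{234}_{91}\mathrm{Pa} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$, ${}^{234}_{91}\mathrm{Pa} \rightarrow {}^{234}_{92}\mathrm{U} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$
(c) abundances in approximate ratio of half-lives
(d) ${}^{214}_{83}\mathrm{Bi} \rightarrow {}^{210}_{81}\mathrm{Tl} + {}^{4}_{2}\alpha$ then ${}^{210}_{81}\mathrm{Tl} \rightarrow {}^{210}_{82}\mathrm{Pb} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$
OR ${}^{214}_{83}\mathrm{Bi} \rightarrow {}^{214}_{84}\mathrm{Po} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$ then ${}^{214}_{84}\mathrm{Po} \rightarrow {}^{210}_{82}\mathrm{Pb} + {}^{4}_{2}\alpha$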
11. half-life approx. 185 s, decay constant approx. 3.7 × 10−3 s−1
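The two values in answer 11 are linked by λ = ln 2 / t½. The short sketch below checks that a half-life of about 185 s corresponds to the quoted decay constant; the function name is illustrative.

```python
import math


def decay_constant(half_life):
    """Decay constant (per unit time) for a given half-life: lambda = ln 2 / t_half."""
    return math.log(2) / half_life


lam = decay_constant(185.0)   # half-life in seconds, from answer 11
print(f"decay constant ~ {lam:.2e} per second")
```

The result agrees with the quoted 3.7 × 10⁻³ s⁻¹ to within the precision of a half-life read from a graph.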

CHAPTER 26: NUCLEAR PHYSICS

1. −
2. B.E. = 470 MeV, B.E./A = 8.3 MeV/nucleon
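3. (a) ${}^{235}_{92}\mathrm{U} \rightarrow {}^{231}_{90}\mathrm{Th} + {}^{4}_{2}\alpha$ (b) Use conservation of momentum: KETh = 81 keV
(c) 0.0052 u
4. (a) ${}^{16}_{6}\mathrm{C} \rightarrow {}^{16}_{7}\mathrm{N} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$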
(b) 8.0 MeV (c) Energy is shared randomly with anti-neutrino
(d) Neutron-rich nucleus. Large mass defect for decay. High probability
of decay and therefore short half-life.
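5. (a) ${}^{13}_{7}\mathrm{N} \rightarrow {}^{13}_{6}\mathrm{C} + {}^{0}_{+1}\beta + {}^{0}_{0}\nu$ (1) ${}^{13}_{7}\mathrm{N} \rightarrow {}^{13}_{8}\mathrm{O} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$ (2)
(b) For (1) Δm = 0.0024 u, for (2) Δm = −0.019 u, so mass of products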
in (2) is greater than mass of original nucleus so that reaction cannot
proceed spontaneously.
(c) 7.5 MeV / nucleon, 7.7 MeV / nucleon − both are stable and nitrogen-15
might be expected to be more abundant. In fact, nitrogen-14 is much
more abundant because the odd-odd configuration is much more sta-
ble than an odd-even configuration of nucleons, allowing protons and
neutrons to pair up.
(d) t1/2 (C-16) > t1/2 (C-17) > t1/2 (C-18). Neutron excess destabilizes the
nucleus because neutrons are themselves unstable.
6. (a) −
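(b)(i) ${}^{14}_{6}\mathrm{C} \rightarrow {}^{14}_{7}\mathrm{N} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$ (ii) ${}^{1}_{0}\mathrm{n} \rightarrow {}^{1}_{1}\mathrm{p} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$
(iii) ${}^{1/3}_{-1/3}\mathrm{d} \rightarrow {}^{1/3}_{+2/3}\mathrm{u} + {}^{0}_{-1}\beta + {}^{0}_{0}\bar{\nu}$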
(c) Yes − proton (uud) changes to a neutron (udd) so an up quark changes
to a down quark.
CHAPTER 27: QUANTUM THEORY

1. (a) 5.5 × 10⁻¹¹ m (b)(i) 6.2 keV (ii) 1.6 × 10⁻¹¹ m
(c) Visible photon energy < 10 eV, X-ray photon energy > 10 eV
2. −

3. (a) photon energy < work function

(b) photons only transfer energy to single electrons (at the intensities
used)
(c) UV photon energy > 4.3 eV
(d) more photons per second arrive so more electrons per second emitted
(e) work function zinc > blue photon energy > work function of potassium
(f) Max. KE increases (= hf − Φ)
4. (a) 0.76 eV (b) 5.2 × 105 ms−1 (c) 0.76 eV
(d) No change in max. KE or stopping voltage but more electrons are
emitted per second so photocurrent increases.
5. (a) 3.4 eV (b) 4.9 × 10−7 m, visible
(c) (i) Elastic − scattered electron has 11 eV of KE.
(ii) Inelastic: electron excites atom to n = 2 level. Atom absorbs 10.2
eV and electron is scattered with 0.8 eV of KE.
6. (a) 2.5 × 1019 s−1 (b) Photon energy is so small that changes in intensity
are effectively continuous.
7. (a) $F = \dfrac{IA}{c}$ (b) Approx. 100 000 m²
8. −
9. −
10. $E_n = -\dfrac{54\ \text{eV}}{n^{2}}$
CHAPTER 28: ASTROPHYSICS

1. (a) 5.0 × 10−7 m (b) 620 Wm−2 (c) 7.0 × 108 m
(d) Peak of curve shifts to shorter wavelengths. 2.90 × 10−8 m (UV)
2. (a) 3500 K (b) 5.7 × 1011 m
3. (a) 0.05 c moving away (b) 7.2 light-years
4. (a) − (b) 3.09 × 1016 m, 3.26 light-years
(c) parallax angle becomes too small to measure (d) 19.2 parsecs
5. (a) 10700 light-years (b) 26 Mpc

6. −
7. (a) 9.7 × 1010 Hz (b) 9.7 × 1010 Hz
(c) Shift for rocket is caused by Doppler effect due to relative motion,
shift for galaxy is caused by cosmological expansion of space.
(d) 4.3 × 108 years
8. −
CHAPTER 29: MEDICAL PHYSICS

1. −
2. −
3. (a) − (b) 3.1 × 10−11 m (c) −
4. −
5. (a) Photons have momentum. In the CM frame, the electron and positron
have net zero momentum so if only a single photon was emitted this
would violate the law of conservation of momentum. In the CM frame
a pair of identical photons are emitted in opposite directions.
(b) 2.4 × 10−12 m

Glossary
Absolute space: proposed by Newton as a background against which all

physical processes take place. All observers, regardless of their own motion,
would agree on the separation of events in absolute space. Einstein’s special
theory of relativity showed that this cannot be the case.
Absolute time: proposed by Newton as a universal time in which all observ-
ers, regardless of their own motion, would agree on time intervals and the rate
of flow of time. Einstein’s special theory of relativity showed that this cannot
be the case.
Absolute uncertainty: likely range of values represented by a measure-
ment - for example, a measurement of length might be 4.565 ± 0.002 m: an
absolute uncertainty of ± 0.002 m (± 2 mm). The smaller the uncertainty the
more precise the measurement.
Absolute zero: lower fixed point of the thermodynamic temperature scale.
According to kinetic theory, molecular motion would stop at this temperature.
Absorption lines: dark lines in a spectrum corresponding to wavelengths
that are strongly absorbed by the medium through which radiation has passed.
Acoustic impedance: product of the speed of sound in a medium and the
density of the medium.
Action-at-a-distance: Newton’s original idea that gravitational forces act on
distant objects instantaneously and with nothing acting as an intermediary.
Activation process: process that depends, on a molecular scale, on the
molecules gaining an additional energy ΔE above the average energy at that
temperature.
Activity: number of decays (disintegrations) per second inside a radioactive

source.
Adiabatic change: change that takes place with no transfer of heat in or out
of the system (Q = 0 in the first law of thermodynamics).
Alpha particle: helium nucleus. Emitted from certain unstable nuclei in
radioactive decay.
Alternating current (AC): current alternates its direction of flow in the cir-
cuit and the polarity of the supply alternates in a periodic way.
Ampère’s theorem: theorem that relates the line integral of the magnetic
field strength around a closed loop to the current passing through the area
enclosed by the loop.
Amplitude: maximum disturbance from equilibrium in a wave or oscillation.
Angular momentum: sum of moments of linear momentum about an axis of
rotation for all points in a rotating body, given by L = Iω and measured in J s.
Anode: positive electrical terminal.
Aphelion: point on an elliptical planetary orbit when the planet is farthest
from the Sun. The equivalent point in the orbit of an Earth satellite is called
apogee.
Archimedes’ principle: the buoyancy force on an object placed in a fluid is
equal to the weight of the fluid it displaces.
Arrow of time: distinction between the past and the future linked to an irre-
versible physical change, such as increasing entropy.
Atomic mass number (A): number of nucleons (protons + neutrons) in the
nucleus. Also called nucleon number or just mass number.
Atomic number (Z): number of protons in the nucleus. Corresponds to the
position of the element in the Periodic Table and the number of orbiting elec-
trons in the neutral atom.
Attenuation: reduction of intensity with distance as a result of absorption in
a medium.
Balmer’s formula: numerical relationship between visible wavelengths in
the hydrogen atom line emission spectrum. This relationship could not be
explained using classical physics.
Barometer: instrument used to measure atmospheric pressure.
Base unit: one of the seven agreed independent units from which the S.I.
system is constructed.

Battery: several cells connected together in series or parallel to form a power
supply.
Beer–Lambert law: law of attenuation when radiation passes through an
absorbing medium.
Bernoulli equation: equation describing the conservation of energy in a
flowing fluid. Can be used to relate pressures and flow rates in one part of the
system to those in another.
Beta radiation: high energy electron (β⁻) or positron (β⁺) emitted when an
unstable nucleus decays.
Big Bang: origin of the Universe exploding from a point about 13.7 billion
years ago.
Biot–Savart law: equation used to find the magnetic field due to electric
currents.
Black body radiation: spectrum of radiation emitted by an ideal radiator.
Black hole: object whose gravitational field is so strong that, at a certain dis-
tance from its center, the escape velocity is equal to the speed of light.
Bohr model: model of the hydrogen atom in which electrons move in circular
orbits around the central nucleus but can only occupy orbits in which their
angular momentum is an integer multiple of h/2π. This results in a discrete
set of allowed energy levels.
Boltzmann factor: ratio of the number of particles expected to have energy
E + ΔE to the number having energy E in a system. This is equal to the
probability that a particle can gain the extra energy ΔE for an activation
process: $f = e^{-\Delta E / kT}$.
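As a rough numerical illustration of the Boltzmann factor, the sketch below evaluates f = exp(−ΔE/kT) in Python. The function name `boltzmann_factor` and the sample activation energy of 0.5 eV are assumptions chosen for illustration; they do not come from the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K (SI defined value)
EV = 1.602176634e-19    # one electronvolt in joules

def boltzmann_factor(delta_e, temperature):
    """Return f = exp(-delta_e / (k T)) for an energy gap in joules and T in kelvin."""
    if temperature <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return math.exp(-delta_e / (K_B * temperature))

# Assumed example: an activation energy of 0.5 eV at two nearby temperatures.
f_300 = boltzmann_factor(0.5 * EV, 300.0)
f_310 = boltzmann_factor(0.5 * EV, 310.0)
print(f_300, f_310)
```

The factor is very sensitive to temperature: a rise of only 10 K increases it noticeably, which is why the rates of activation processes grow so quickly on heating.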
Bourdon gauge: instrument used to measure pressure.

Boyle’s law: relationship between pressure and volume for a constant mass of
an ideal gas at constant temperature: pV = constant.
Bremsstrahlung: “braking radiation” – continuous spectrum of X-ray radiation
emitted when an electron beam strikes a metal target and the electrons
decelerate as they collide with atoms in the target.
Brewster’s law: light incident on the surface of a transparent material at the
Brewster angle, given by $\tan\theta_B = n_2 / n_1$, splits into refracted and
reflected beams that are polarized perpendicular to one another.

Brittle: material property when little or no plastic deformation occurs before
fracture and fracture mechanism is by crack propagation.
Buoyancy: upward force on a body in a fluid caused by the pressure differ-
ence between the top and bottom of the body. Also called “upthrust.” The size
of the buoyancy force is given by Archimedes’ principle.
Capacitor: pair of conductors separated by an insulator. When there is a
potential difference between the two conductors they store opposite charges
of magnitude Q = CV, where C is the capacitance of the capacitor, measured
in farads (F).
Cathode: negative electrical terminal.
Center of gravity: point where the resultant gravitational force on a body
acts.
Centripetal acceleration: an object in uniform circular motion has an accel-
eration directed toward the center of the circle in which it moves.
Centripetal force: resultant force acting toward the center of the circle
when an object is in a uniform circular motion.
Cepheid variable: type of variable star with a well-understood period-
luminosity relationship so that its absolute luminosity can be inferred from
its period of variation. Used as a standard candle for distance measurements.
Ceramics: solid non-metallic materials formed by high-temperature firing,
usually strong, hard, and brittle, for example, pottery or brick.
Charge carriers: particles within an electrical conductor that are responsi-
ble for the electric current (e.g., electrons in most metals).
Charles’s law: relationship between volume and temperature for a constant
mass of an ideal gas at constant pressure: $V/T = \text{constant}$.
Classical physics: usually refers to Newtonian mechanics (including the law
of gravitation) and Maxwell’s laws of electromagnetism. It corresponds to the
known physics prior to 1900 and excludes relativity, quantum theory, and par-
ticle physics.
Closed system: in mechanics, a system with no external forces acting
upon it. In thermodynamics, a system that does not exchange heat with its
surroundings.
Coefficients of friction: ratio of frictional force to normal contact force
when the surfaces just begin to slip (static coefficient of friction) or when they
are sliding over one another (dynamic coefficient of friction).

Coherent sources: monochromatic sources that remain in phase or maintain
a constant phase difference.
Commutator: rotating switch in a D.C. motor that ensures the current direc-
tion changes every half rotation so that the turning effect on the coil is always
in the same direction.
Composite materials: combinations of two or more different materials
designed to take advantage of the desirable properties of each individual
component.
Compound pendulum: pendulum in which the mass is spread out rather
than concentrated at one point (i.e., in the pendulum bob).
Compton effect: scattering of X-rays from atomic electrons, which can be
analyzed as if it were a collision between particles. The scattered X-ray has a
change of wavelength related to the angle of scatter.
Conservative field: if a particle is moved around a closed loop in a conserva-
tive field there is no net change in energy. The electric and gravitational fields
are both conservative fields.
Continuity (equation of): equation stating that the mass flow rate of a fluid
at one point in a pipe is equal to the mass flow rate at another:
$\rho_1 A_1 v_1 = \rho_2 A_2 v_2$. If the fluid is regarded as incompressible
this reduces to: $A_1 v_1 = A_2 v_2$.
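The continuity equation can be tried out numerically. The sketch below is a minimal illustration in Python; the function names and the pipe dimensions are assumptions made for the example, not values from the text.

```python
def mass_flow_rate(density, area, speed):
    """Mass flow rate rho * A * v in kg/s (SI units assumed throughout)."""
    return density * area * speed

def outlet_speed_incompressible(area_in, speed_in, area_out):
    """Solve A1 v1 = A2 v2 for v2, valid when the density does not change."""
    if area_out <= 0:
        raise ValueError("outlet area must be positive")
    return area_in * speed_in / area_out

# Assumed example: water (1000 kg/m^3) flowing from a 4.0 cm^2 section
# at 2.0 m/s into a 1.0 cm^2 section.
v_out = outlet_speed_incompressible(4.0e-4, 2.0, 1.0e-4)
print(v_out)  # the narrower section carries the faster flow
```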
Control variables: variables that must be kept constant during an experi-
ment so that any change in the dependent variable is only caused by a
change in the independent variable. All control variables must be kept
constant for the experiment to be a fair test of the relationship between the
independent and dependent variables.
Convection: transfer of heat in a fluid as a result of bulk movement of par-
ticles in convection currents. These arise as a result of density changes in the
fluid when it is heated.
Conventional current: direction of current flow defined to be from the
positive terminal of the supply to the negative terminal of the supply in the
external circuit (i.e., the direction in which a positive charge carrier would
move) regardless of the sign(s) of the charge carriers that make up the current.
Copenhagen Interpretation: mainstream interpretation of quantum
mechanics in which the state of a system is described by a wave-function and
there is a collapse of the wave-function when an observation or measurement
is made.

Copernican model: heliocentric (Sun at the center) model of the solar
system.
Cosmic microwave background radiation: black body radiation corre-
sponding to a temperature of about 2.7 K present throughout the Universe.
One of the key pieces of evidence supporting the idea of the Big Bang and the
expanding Universe.
Cosmology: the study of the origin, nature, and end of the entire Universe.
Coulomb’s law: force law between electric charges:
$F = \dfrac{Q_1 Q_2}{4\pi\varepsilon_0 r^2}$.
Couple: moment of a pair of forces of equal magnitude, acting in opposite
directions through different points in the same body: couple = magnitude of
one force × distance between lines of action of the two forces.
Creep: gradually increasing strain with constant stress.
Crystalline materials: having long-range geometric microstructure, for exam-
ple, metals.
CT (computed tomography): medical imaging technique using X-rays to
create detailed images of slices of the body.
Curie temperature: temperature above which thermal motions prevent the
permanent magnetization of a ferromagnetic material.
Damping force: force opposing motion, causing the oscillator to do work as
it oscillates and resulting in a decay of amplitude and energy.
De Broglie relation: fundamental relationship between the wave and particle
models of radiation and matter:
$\text{wavelength} = \dfrac{\text{Planck constant}}{\text{momentum}}$.
Decibel scale: logarithmic scale of relative intensity used to compare
intensity levels with the level at the threshold of human hearing ($I_0$):
$\text{intensity level (dB)} = 10 \log_{10}(I / I_0)$.
Degeneracy pressure: the Pauli Exclusion Principle prevents fermions
(half-integer spin particles) from existing in the same set of quantum states,
so that when a gas of fermions is compressed the lower energy states become
filled and it exerts an outward pressure.
Degree of freedom: in kinetic theory, the different modes in which a parti-
cle can absorb energy, for example, translation, rotation, and vibration.

Dependent variable: the variable you are investigating to find out how it
depends on an independent variable.
Derived unit: unit built up from more than one base unit, for example, m s⁻¹
for velocity. Some derived units have their own name, for example, the joule,
for energy. The joule is equivalent to kg m² s⁻² in base units.
Deterministic: where having complete knowledge of the present state
of a system is sufficient to make a complete prediction of its future state.
Newtonian mechanics is a deterministic system.
Dielectric material: insulating material that is polarized by an external
electric field. When a dielectric material is placed between the plates of a
capacitor it increases the capacitance.
Diesel cycle: idealized cycle for a diesel internal combustion engine.
Diffraction grating: an optical component consisting of a large number of
narrow parallel slits – used in spectroscopy to analyze light.
Diffraction: spreading of a wave into a region of geometric shadow after
passing through an aperture or past an object or edge.
Dimension: the nature of a physical parameter independent of its quantity,
for example, the dimension of time but not the actual duration of a particular
time.
Dipole (electric): two charges of opposite sign and equal magnitude sepa-
rated by a distance – for example, in a polar molecule.
Dipole (magnetic): object (e.g., a bar magnet) with a south magnetic pole at
one end and a north magnetic pole at the other end.
Direct current (D.C.): current that always flows in the same direction
around a circuit. The polarity of a d.c. supply is constant.
Dispersion: when refractive index depends on wavelength (e.g., for different
wavelengths of visible light in glass) the amount of refraction will also be
wavelength dependent (e.g., when a triangular prism spreads white light into
a spectrum of colors).
Doppler effect: shift in observed wavelength and frequency of a wave as a
result of relative motion between the source and the observer.
Drag coefficient: constant in the drag equation related to the shape and
nature of the surface of a body moving through a fluid.

Drift velocity: mean velocity of charge carriers in a current-carrying
conductor. Usually much lower than the random thermal velocities of the charge
carriers.
Ductile: property of a material that can be drawn out into wires – depends
on plastic deformation.
Dynamic (kinetic) friction: frictional force between two surfaces sliding
over one another.
Earth (electrical): connection to the planet Earth taken to be at a constant
potential of 0 V.
Eddy currents: current loops inside a conductor when a changing magnetic
field passes through it. These cause energy losses in the ferromagnetic core of
a transformer and are important in electromagnetic damping systems.
Elastic limit: up to the elastic limit extensions/strains are reversible when
the force/stress is removed. Beyond the elastic limit, they are irreversible and
permanent plastic deformation occurs.
Electric field strength: force per unit charge acting on a charged particle at
a point in an electric field. Also equal to the potential gradient at each point.
Electric potential energy: energy a charged object has as a result of its
position within an electric field. A property of the object placed at that point
in the field.
Electric potential gradient: equal to the electric field strength at a point in
the field.
Electric potential: electric potential energy per unit charge at a point in an
electric field. Property of the field at that point. The zero of electric potential
is defined to be at infinity.
Electromagnet: usually a coil that creates a magnetic field through its center
when current flows.
Electromagnetic induction: process where changing flux-linkage through a
coil or conductor induces an emf in the coil or conductor. The essential prin-
ciple behind transformers and generators.
Electron-capture: process in some proton-rich nuclei whereby an inner elec-
tron combines with a nuclear proton to create a neutron and emit a neutrino.
Electrostatics: study of electric fields and forces from stationary arrange-
ments of charges.
EMF: energy transferred from other forms to electrical energy per unit
charge passing through a power supply, measured in volts (V).

Energy availability: extent to which energy within a system can be
harnessed to do useful work.
Equilibrium: situation in which the resultant force and resultant moment on
a body are both zero.
Equipartition of energy: hypothesis that when energy is supplied to a
thermodynamic system each degree of freedom gets an average energy of ½kT.
Equipotential surfaces: surfaces perpendicular to electric field lines. No
work is done on or by a charged particle when it moves from one point on an
equipotential surface to another.
Equivalence principle: Einstein’s postulate that the laws of physics in a
uniformly accelerating reference frame are indistinguishable from the laws of
physics in a region of a uniform gravitational field.
Error bars: drawn as a vertical or horizontal bar on either side of each plot-
ted point to indicate the range of uncertainty in that point. Error bars can
then be used to find the worst acceptable lines and the range in gradient and
intercept.
Escape velocity: minimum velocity that will allow an object to escape from a
point in a gravitational field to infinity (neglecting effects of non-gravitational
forces such as atmospheric friction).
Event horizon: surface surrounding a black hole such that no matter or radi-
ation can escape from within this surface and no event occurring inside this
surface can have an effect on an observer in the outside Universe.
Exponential change: growth or decay that changes by a constant proportion
in a constant time – for example, the activity of a radioactive source has
a constant half-life. Described mathematically by an equation of the form:
$y = Ae^{\pm\alpha x}$.
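The link between exponential decay and a constant half-life can be checked with a short calculation. The sketch below, in Python, assumes the decay form A = A0 exp(−λt) with λ = ln 2 / half-life; the function names and the sample numbers are illustrative assumptions.

```python
import math

def decay_constant(half_life):
    """Decay constant lambda = ln(2) / half-life (same time units as half_life)."""
    if half_life <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2.0) / half_life

def activity(initial_activity, half_life, elapsed_time):
    """Activity after elapsed_time, from A = A0 * exp(-lambda * t)."""
    return initial_activity * math.exp(-decay_constant(half_life) * elapsed_time)

# Assumed example: a source of 800 Bq with a half-life of 5 days.
for days in (0, 5, 10, 15):
    print(days, activity(800.0, 5.0, days))
```

Each interval of one half-life reduces the activity by the same factor of two, whatever the starting value.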
Extension: difference between unstretched length and stretched length, for
example, of a wire.
Faraday cage: a conducting box (often a metallic mesh box) enclosing a
region of space and preventing the transmission of electromagnetic waves into
or out of the box.
Faraday’s law: fundamental law of electromagnetic induction equating the
induced emf to the rate of change of flux-linkage.

Ferromagnetic material: material (e.g., iron or nickel) containing atoms
which are themselves magnetic dipoles and which can be aligned with an
external field and remain aligned when the field is removed.
Fleming’s left-hand rule: used to work out the direction of the motor effect
force.
Forced oscillator: an oscillator coupled to and driven by an external oscil-
lating force.
Fractional uncertainty: ratio of absolute uncertainty δx to the measured
value of x: $\dfrac{\delta x}{x}$.
Free fall: motion of an object when it falls solely under the influence of gravi-
tational forces.
Free-body diagram: diagram representing a body as a single object and
showing all of the forces that act on it.
Frequency: number of complete oscillations per second given by $f = 1/T$,
measured in Hz (1 Hz = 1 cycle per second).
Galilean relativity: idea that the laws of mechanics are the same in all uni-
formly moving (inertial) reference frames.
Galvanometer: meter used to detect and measure small currents.
Gamma-factor: relativistic factor, $\gamma = \dfrac{1}{\sqrt{1 - v^2/c^2}}$,
that occurs in the calculation of many relativistic effects (e.g., time
dilation, length contraction, mass increase with velocity) and that can be
used to gauge the significance of relativistic effects.
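To gauge when relativistic effects start to matter, the gamma-factor can be evaluated at a few speeds. The Python sketch below is illustrative only; the function name `gamma_factor` and the chosen speeds are assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI defined value)

def gamma_factor(speed):
    """Return gamma = 1 / sqrt(1 - v^2 / c^2) for a speed in m/s."""
    if abs(speed) >= C:
        raise ValueError("speed must be less than the speed of light")
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

# Assumed sample speeds, expressed as fractions of c.
for fraction in (0.01, 0.1, 0.6, 0.9, 0.99):
    print(fraction, gamma_factor(fraction * C))
```

At 1% of the speed of light gamma differs from 1 by only about 5 parts in 100,000, so Newtonian mechanics remains a very good approximation there; the departure only becomes large close to c.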
Gauss’s law: mathematical equation connecting the flux of a field through a
closed surface to the number of sources enclosed by the surface, for example,
for an electric field: the flux of electric field through the surface is equal to the
total charge enclosed divided by the permittivity of free space.
Gay-Lussac’s law: relationship between pressure and temperature for a
constant mass of an ideal gas at constant volume: $p/T = \text{constant}$.
Geiger counter: instrument used to detect and measure ionizing radiation
from radioactive sources.

General relativity: Einstein’s theory of gravity as a distortion of space-time
geometry rather than a field of force.
Geodesic: shortest path through curved space-time. The path is followed by
a freely moving body or a light ray. (A geodesic on the surface of the Earth
would be a great circle route).
Geostationary satellite: satellite in an equatorial orbit placed at such a dis-
tance that its period of orbit is equal to the Earth’s period of rotation on its
axis (23 hours and 56 minutes). Geostationary (or geosynchronous) satellites
remain above the same point on the Earth’s surface and are used for global
communications.
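The distance at which a satellite is geostationary follows from setting the gravitational force equal to the required centripetal force, which gives r³ = GMT²/4π². The Python sketch below applies this result; the constants for the Earth (GM and mean radius) and the function name are assumptions brought in for the illustration, not values from the text.

```python
import math

GM_EARTH = 3.986004418e14            # Earth's G*M in m^3 s^-2 (assumed standard value)
EARTH_RADIUS = 6.371e6               # mean radius of the Earth in m (assumed)
ROTATION_PERIOD = 23 * 3600 + 56 * 60  # 23 hours 56 minutes, in seconds

def orbital_radius(period, gm=GM_EARTH):
    """Radius of a circular orbit with the given period, from r^3 = GM T^2 / (4 pi^2)."""
    return (gm * period ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)

r_geo = orbital_radius(ROTATION_PERIOD)
altitude = r_geo - EARTH_RADIUS
print(r_geo / 1000.0, "km from the center of the Earth")
print(altitude / 1000.0, "km above the surface")
```

Under these assumptions the orbit lies roughly 42,000 km from the center of the Earth, or about 36,000 km above the surface.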
Glasses: similar to ceramics but they have a completely amorphous micro-
structure and result from a rapidly cooled melt.
Graham’s law of diffusion: rate of diffusion in gas is inversely proportional
to the square root of the molecular mass.
Gravitational field strength: gravitational force per unit mass at a point in
the field. A property of the field.
Gravitational potential energy: energy an object has because of its position
in a gravitational field. A property of the body placed in the field.
Gravitational potential: gravitational potential energy per unit mass. A
property of the field at each point in space. The zero of potential is defined to
be at infinity.
Gravitational time dilation: effect of gravitational fields on the rate at which
time passes – a clock placed in a stronger gravitational field ticks more slowly
than one in a weaker field.
Gravitational waves: periodic disturbances of space-time geometry that
travel outwards from their source (e.g., a binary star system or colliding black
holes) at the speed of light. The gravitational equivalent of electromagnetic
waves.
Ground state: lowest allowed energy state for an electron in an atom (n = 1
state). Corresponds, in the Bohr model, to the state in which the circumfer-
ence of the orbit is exactly one wavelength.
Half-life: time taken for the activity of a radioactive source to halve or for the
number of unstable nuclei in the source to halve.
Hard/soft magnetic material: difficult/easy to magnetize and demagnetize.
Hard: resists indentation and scratching.

Heat capacity: energy required to raise the temperature of an object by 1 K.

Heat death: idea that, as entropy continues to increase, the far future of the
Universe will be characterized by an equilibrium state in which energy avail-
ability has fallen to zero.
Heat engine: engine designed to extract useful work from a heat reservoir.
Heat pumps: system in which work is used to pump heat from a heat reser-
voir at lower temperature to a heat reservoir at a higher temperature.
Heating: energy transfer as a result of a temperature difference.
Heisenberg’s Uncertainty Principle: principle that sets a limit on how
much can be known about certain pairs of variables, for example, position and
momentum or energy and time. For example, the more precisely we determine
the location of an electron, the smaller the uncertainty in its position but
the larger the uncertainty in its momentum.
Hertzsprung-Russell diagram: chart displaying luminosity against spectral
type (or inverse surface temperature) of stars, revealing several distinct groups
or bands including the main sequence, red giants, and white dwarf stars.
Homogeneous equation: an equation in which quantities, units, and dimen-
sions balance.
Hubble law: relationship between speed of recession and distance for galaxies:
$v = H_0 d$.
Hubble time: reciprocal of the Hubble constant, a rough approximation of
the age of the Universe.
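The Hubble time can be estimated by converting the Hubble constant into units of s⁻¹ and taking its reciprocal. The Python sketch below does this; the value H0 = 70 km s⁻¹ Mpc⁻¹ is an assumed round figure for illustration (measured values differ slightly between methods), and the function name is hypothetical.

```python
KM_PER_MPC = 3.0857e19                   # kilometres in one megaparsec (approximate)
SECONDS_PER_YEAR = 365.25 * 24 * 3600    # seconds in a year of 365.25 days

def hubble_time_years(h0_km_s_mpc):
    """Return 1 / H0 in years for H0 given in km/s per Mpc."""
    if h0_km_s_mpc <= 0:
        raise ValueError("Hubble constant must be positive")
    h0_per_second = h0_km_s_mpc / KM_PER_MPC   # convert to s^-1
    return 1.0 / h0_per_second / SECONDS_PER_YEAR

t_hubble = hubble_time_years(70.0)  # assumed H0
print(t_hubble / 1e9, "billion years")
```

With this assumed H0 the result is close to 14 billion years, the same order as the age quoted in the Big Bang entry.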
Hydrostatic pressure: pressure due to a stationary fluid.
Hysteresis: For example, when the force to load a sample is different from
the force as it is unloaded, a cycle of loading and unloading results in a closed
loop on a graph of force against the extension. This is a hysteresis loop. The
area of the loop is related to the energy dissipated by the sample during the
process.
Ideal fluid: incompressible inviscid fluid.
Ideal gas equation: equation of state for an ideal gas, incorporating all three
gas laws: pV = nRT.
Ideal gas: theoretical model of a gas whose equation of state is pV = nRT.

Impedance: ratio of the peak voltage to the peak current in an A.C. system
even though these values occur at different times. Resistance and reactance
are special cases of impedance when the phase difference between voltage
and current is 0 or π/2, respectively. Measured in ohms.
Impulse: integral of force with respect to time, equal to the change of
momentum. When force is constant, impulse is Ft = mv − mu.
Independent variable: the variable you vary to find its effect on the depend-
ent variable. Sometimes called the “manipulated variable.”
Indeterministic: where having complete knowledge of the present state
of a system is insufficient to make a complete prediction of its future state.
Quantum theory is an indeterministic system.
Indicator diagrams: plot of pressure against volume for a cyclic process in
a heat engine.
Inertial force: an apparent force that is a result of applying Newton’s laws to
an accelerating reference frame as if it is not accelerating. For example, think-
ing that a force throws you forward when the train in which you are traveling
suddenly slows down. No physical force pushes you forward.
Inertial reference frame: an unaccelerated reference frame, one that is at
rest or moving at a constant velocity.
Intensity: energy per unit area per second in a wavefront, measured in W m⁻²
and proportional to the amplitude-squared.
Interaction: all forces arise from interactions. In nature, there are four fun-
damental interactions: gravitational, electromagnetic, and strong and weak
nuclear forces.
Interferometer: instrument in which light rays traveling along two per-
pendicular paths are brought back together and then allowed to superpose
and interfere in order to measure small differences in the optical paths (e.g.,
to detect the distortions caused by gravitational waves passing through the
apparatus).
Internal energy: sum of random thermal kinetic energies and potential
energies of all particles in the body.
Internal resistance: resistance inside a cell or battery that dissipates energy
when current is drawn from the cell and results in a lost voltage so that the
terminal voltage is less than the emf of the supply.
Invariant: a quantity that is the same for all inertial observers (e.g., the 4D
interval between events).

Inviscid fluid: fluid with zero viscosity.

Ionizing radiation: radiation capable of ionizing atoms: for example, in the
EM spectrum: short wavelength UV, X-ray, and gamma-rays. Alpha and beta
emissions from unstable nuclei are also forms of ionizing radiation.
Isobaric changes: changes that take place at constant pressure.
Isochoric changes: changes that take place at constant volume.
Isothermal changes: changes that take place at constant temperature.
Kelvin scale: thermodynamic scale based on the absolute zero of tempera-
ture and the triple point of water.
Kepler’s laws: three laws of planetary motion based on elliptical orbits with
the Sun at the center of the system.
Kinetic theory: particle model of matter used with Newton’s laws to derive
the ideal gas equation.
Kirchhoff’s second law: statement of energy conservation for an electric cir-
cuit stating that the sum of emfs must equal the sum of potential differences
around any closed loop in an electric circuit.
Larmor frequency: For example, precession frequency for the axis of a
magnetic dipole rotating about the direction of an applied external magnetic
field during an MRI scan.
Law of Dulong and Petit: postulate that the molar heat capacity of all metals
is 3R.
Length contraction: the observed reduction of lengths in a reference frame
that is moving relative to the observer.
Lenz’s law: states that the direction of an induced emf is such as to oppose
the change that caused it. This ensures that energy is conserved.
Lepton: fundamental particle related to the electron. There are six different
leptons, each with its own antilepton.
Linear absorption coefficient: constant that relates the attenuation of light
in an absorbing medium to the properties of the medium.
Linear momentum: the product of mass and velocity, a vector quantity with
units kg m s⁻¹.
Local force: idea that forces do not come from a distance but are the result of
a local field that acts directly on a particle. This also explains why effects take
time to propagate from one place to another as changes spread out through
the field at a fixed speed (e.g., electromagnetic waves).

Logarithm: power of some base number that represents a quantity – for
example, the logarithm of 1000 to base 10 is 3 because 10 to the power 3 is
equal to 1000.
Logarithmic scale: a scale that increases by a constant multiple, for exam-
ple, 1, 10, 100, 1000…. Useful for displaying data with a very wide range of
values onto a single graph or chart.
Longitudinal wave: wave in which the vibration direction is parallel to the
direction in which the wave transfers energy (e.g., sound/ultrasound) creating
regions of compression and rarefaction in the medium.
Lorentz transformation: series of equations that transform space and time
coordinates of an event from one inertial reference frame to another (in agree-
ment with the principle of relativity).
Luminiferous aether: hypothetical medium supporting electromagnetic
fields and through which electromagnetic waves travel at the speed of light.
Einstein’s special theory of relativity and the Michelson–Morley experiment
both showed that this cannot be the case.
Luminosity: total power radiated from a star.
Magnetic field strength: equivalent to magnetic flux density. Measured in
tesla (T) or webers per square meter (Wb m⁻²).
Magnetic field: field created by moving charges (e.g., in an electric current)
that exerts forces on other moving charges (or currents).
Magnetic flux linkage: the magnetic flux through a coil multiplied by the
number of turns in the coil. Measured in webers (Wb).
Magnetic flux: integral of the perpendicular component of magnetic field
strength over area. Measured in webers (Wb).
Magnetic resonance imaging (MRI): medical imaging technique that
detects the radio waves emitted when nuclear magnetic dipoles in hydrogen
atoms inside the body align with an external field.
Main sequence: diagonal band on the HR diagram, running from top left
(high luminosity and high temperature) to bottom right (low luminosity and
low temperature). Most stars will spend most of their lives in this band.
Malleable: property of a material that can be beaten out into sheets – depends
on plastic deformation.
Manometer: U-tube containing a liquid, used to measure pressure
differences.

Many-worlds theory: interpretation of quantum theory put forward by
Hugh Everett III in which the wavefunction never collapses and each
separate possibility is realized in a separate world.
Mass spectrometer: instrument used to find the relative masses and abun-
dances of isotopes in a sample.
Mass: fundamental property of matter that determines its inertia (response to
a resultant force) and its effect on and response to gravitational fields.
Maxwell distribution: probability distribution for molecular speeds or
kinetic energies within a gas.
Maxwell’s equations: fundamental equations of electromagnetism describ-
ing how electric and magnetic fields are related to each other and to charges
and how electromagnetic waves propagate.
Michelson–Morley experiment: an attempt to measure the effect of the
Earth’s motion through the postulated luminiferous aether on the speed of
light relative to the Earth and thereby infer the velocity of the Earth relative
to the aether. No effect was detected.
Moment of inertia: property of a body that resists changes in angular
velocity, given by $\sum_{i=1}^{N} m_i r_i^2$ and measured in kg m². Its value
depends on the axis about which the body rotates.
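For a body made up of a few point masses, the sum in the definition of moment of inertia can be evaluated directly. The Python sketch below is a minimal illustration; the function name and the two-mass example are assumptions, and the distances must be measured perpendicular to the chosen axis.

```python
def moment_of_inertia(masses, distances):
    """Sum m_i * r_i^2 over point masses (kg) at perpendicular distances r_i (m) from the axis."""
    if len(masses) != len(distances):
        raise ValueError("need one distance for each mass")
    return sum(m * r * r for m, r in zip(masses, distances))

# Assumed example: two 1 kg masses on a light rod 1 m long.
about_center = moment_of_inertia([1.0, 1.0], [0.5, 0.5])  # axis through the midpoint
about_end = moment_of_inertia([1.0, 1.0], [0.0, 1.0])     # axis through one mass
print(about_center, about_end)
```

The same pair of masses gives a larger value about the end of the rod than about its midpoint, illustrating that the moment of inertia depends on the axis.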
Moment: turning effect of a force in Nm. Also called a torque.
Monatomic gas: gas consisting of individual atoms acting as particles with no
internal degrees of freedom.
Monochromatic: light consisting of a single wavelength (single ‘color’).
Monoenergetic: particles having a single energy.
Motor effect: force on a current-carrying conductor when placed into a mag-
netic field such that there is a component of the field perpendicular to the
current.
Mutual inductance: when two coils are close together, a changing current
in either coil creates a changing magnetic field that affects the other coil and
induces an emf in it. The strength of this effect is measured by the mutual
inductance of the system of two coils in henries (H).
Natural frequency: frequency of free oscillations when the oscillator is dis-
placed and released.

Neutral point: point in space where the fields caused by two or more sources
cancel out.
Neutron star: fate of the core of a heavy star after a supernova explosion.
The core continues to collapse beyond the white dwarf stage until it is
prevented from further collapse by neutron degeneracy pressure.
Newton’s law of gravitation: the gravitational force between two point
masses is proportional to the product of the masses and the inverse-square of
their separation: $F = \dfrac{G m_1 m_2}{r^2}$.
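A quick calculation shows the scale of the forces given by Newton's law of gravitation. The Python sketch below is illustrative; the function name and the values used for the Earth's mass and radius are assumptions, not figures from the text.

```python
G = 6.67430e-11  # gravitational constant in N m^2 kg^-2 (CODATA value)

def gravitational_force(mass_1, mass_2, separation):
    """Return F = G m1 m2 / r^2 for point masses in kg separated by r in m."""
    if separation <= 0:
        raise ValueError("separation must be positive")
    return G * mass_1 * mass_2 / separation ** 2

# Two 1 kg masses 1 m apart: the force is tiny.
print(gravitational_force(1.0, 1.0, 1.0))

# Assumed values: a 1 kg mass at the Earth's surface
# (Earth's mass 5.972e24 kg, radius 6.371e6 m).
weight = gravitational_force(5.972e24, 1.0, 6.371e6)
print(weight)
```

The second result is close to the familiar weight of a 1 kg mass at the Earth's surface, a little under 10 N.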
Newton’s laws of motion: three fundamental laws of mechanics related to
the effects of resultant forces and the nature of interactions.
Nucleon: particle found in the nucleus – a proton or a neutron.
Null result: when an expected effect is absent even though the method and
precision should have detected it. The Michelson-Morley experiment is the
most famous example of a null result.
Optical Fiber: narrow transparent fiber along which light or infrared radia-
tion can be transmitted because it repeatedly undergoes total internal reflec-
tion at the boundary. Used to transmit information, for example, for computer
networks, telephone systems, and cable TV.
Oscillation: periodic motion about an equilibrium position, such as the vibra-
tion of a mass on a spring or the swing of a simple pendulum.
Otto Cycle: idealized cycle for a petrol internal combustion engine.
Pair annihilation: conversion of the mass of a matter particle and its corre-
sponding anti-particle into energy in the form of gamma rays.
Parsec: unit of distance in astronomy equal to about 3.26 light years.
Percentage uncertainty: ratio of absolute uncertainty δx to measured value
x expressed as a percentage: $\frac{\delta x}{x} \times 100\%$.
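The definition of percentage uncertainty can be sketched in a few lines of Python. The function name and the sample reading (2.50 with an absolute uncertainty of 0.05) are hypothetical values chosen only to illustrate the arithmetic.

```python
def percentage_uncertainty(value, absolute_uncertainty):
    """Return the absolute uncertainty as a percentage of the measured value."""
    if value == 0:
        raise ValueError("percentage uncertainty is undefined for a zero value")
    return abs(absolute_uncertainty / value) * 100.0


# Hypothetical reading: 2.50 units with an absolute uncertainty of 0.05 units.
print(percentage_uncertainty(2.50, 0.05))
```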
Perihelion: point on an elliptical planetary orbit when the planet is closest
to the Sun. The equivalent point in the orbit of an Earth satellite is called
perigee.
Permanent magnet: magnetic material in which the atoms are themselves
magnetic dipoles. If these are aligned (e.g., in a ferromagnetic material such
as iron) the sample becomes a magnetic dipole.
Permeability: property of a medium related to its ability to support a
magnetic field.

Permittivity: property of a medium related to its ability to support an electric
field.
Phase velocity: velocity at which a point of constant phase in a wave moves
– for example, the velocity of a wave crest.
Phase: position within a cycle of oscillation on a scale of 0 to 360° or 0 to 2π
radians.
Phasor: rotating vector used to represent an oscillation or point on a wave.
Length corresponds to amplitude, angle corresponds to phase, and its rotation
frequency is the same as the oscillation or wave it represents.
Photoelectric effect: ability of light, above a certain threshold frequency, to
eject electrons from a metal surface.
Photomultiplier tube: very sensitive light detector that can respond to sin-
gle photons by massively amplifying the number of electrons emitted by each
photon that is absorbed.
Photon: quantum of electromagnetic radiation.
Piezoelectric crystal: type of crystal used in ultrasound transmitters and
detectors. When the crystal is stressed it generates a voltage and when a volt-
age is applied to it there is a corresponding strain.
Planck constant (h): fundamental constant in quantum theory.
Poincaré recurrence: idea that given enough time a system will return to all
of its possible macroscopic states an unlimited number of times.
Polar orbit: satellite orbit passing over the poles of the Earth. These orbits
are usually relatively low and with periods of a few hours, so the satellite will
pass over every part of the Earth’s surface as it completes several orbits.
Polarization: selection of a particular vibration direction for a transverse
wave – for example, vertical plane-polarized waves or horizontally plane-
polarized waves.
Polymers: materials consisting of long-chain hydrocarbon molecules which
tend to align with applied stress, for example, rubber or polythene.
Positron Emission Tomography (PET scans): medical imaging technique
using a beta-plus emitter as a tracer. Positrons emitted from the tracer annihi-
late with electrons in the body sending out a pair of gamma rays of the same
energy moving in opposite directions.
Potential difference: difference in electric potential between two points in
space (or in an electric circuit).

Potential divider: arrangement of two resistors across a power supply so that
the voltage across one of them is a fraction of the supply voltage determined
by the resistance ratio.
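A minimal sketch of the potential divider calculation follows. It assumes two resistors R1 and R2 in series across the supply, with the output taken across R2 and negligible current drawn from the output; under those assumptions the standard result is V_out = V_supply × R2 / (R1 + R2). The function name and component values are illustrative only.

```python
def divider_output(v_supply, r1, r2):
    """Voltage across r2 when r1 and r2 are in series across v_supply.

    Assumes no load current is drawn from the output.
    """
    if r1 < 0 or r2 < 0 or (r1 + r2) == 0:
        raise ValueError("resistances must be non-negative and not both zero")
    return v_supply * r2 / (r1 + r2)


# Hypothetical values: 12 V supply, 1 kΩ and 2 kΩ resistors.
print(divider_output(12.0, 1000.0, 2000.0))
```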
Power: rate of transfer of energy, measured in watts (W). $1\ \text{W} = 1\ \text{J s}^{-1}$.
Precession: rotation of the axis of a spinning object about a fixed direction,
for example, when the magnetic dipole axis of a nucleus rotates about the
direction of an applied magnetic field.
Principle of moments: condition for an equilibrium of moments acting
on the same body; sum of clockwise moments must be equal to the sum of
counterclockwise moments about any point.
Progressive/traveling wave: wave that transmits energy from a source to an
absorber.
Pulsar: rapidly rotating neutron star whose intense magnetic field results in
two jets of radiation emitted in opposite directions. If the Earth lies in the
path of one of these jets then regular pulses of radiation will be detected.
Quantum of energy: smallest discrete unit of energy transfer – for example,
for the emission or absorption of EM radiation at frequency f the minimum
transfer is one photon with energy E = hf.
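The relation E = hf can be illustrated with a short Python sketch. The constant name `PLANCK_H`, the function name, and the example frequency are assumptions for illustration; the value of h is the one fixed by the S.I. definition.

```python
# Planck constant in J s (exact by the S.I. definition).
PLANCK_H = 6.62607015e-34


def photon_energy(frequency_hz):
    """Return the energy (J) of one photon of the given frequency (Hz)."""
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    return PLANCK_H * frequency_hz


# Hypothetical example: visible light with a frequency of 5.0e14 Hz.
energy = photon_energy(5.0e14)
print(energy)
```

Because energy is transferred in whole photons, any emission or absorption at this frequency involves an integer multiple of `energy`.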
Quark: fundamental particle found in all hadrons (baryons and mesons), for
example, protons and neutrons. There are six different types of quark, each
with its own anti-quark.
Radian: unit for measurements of angle based on the geometry of the circle.
One radian is the angle subtended by an arc of length equal to one radius of
the circle. There are 2π radians in a complete circle so 2π radians = 360°.
Radioactive emission: ionizing radiation emitted when an unstable nucleus
decays.
Radiological dating: use of known radioactive half-lives to work out the age
of archaeological or geological samples.
Random error: error that makes measured values larger or smaller than true
values by an unpredictable amount – for example, in repeated measurements
of the time period of a pendulum. The significance of random errors can be
reduced by repeating measurements and using an average value.
Ray: line perpendicular to a wavefront in the direction of energy transfer.
Rayleigh criterion: rule for comparing the diffraction limit to the resolving
power of optical instruments.

Reactance: A.C. impedance of a component (capacitor or inductor) where
the current and voltage vary with a phase difference of π/2. Reactance is equal
to the ratio of the peak voltage to the peak current in the component even
though these occur at different times. Measured in ohms.
Red giant star: star approaching the end of its life. As fuel for nuclear fusion
reactions in its core begins to run out it swells up and its surface cools until it
is a huge red giant star.
Red-shift: ratio of increase in wavelength to original wavelength,
$z = \frac{\lambda - \lambda_0}{\lambda_0}$,
for waves reaching an observer from a source that is moving away from the
observer.
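The red-shift definition can be sketched as a one-line calculation. The function name and the wavelengths used below are hypothetical and serve only to show how z follows from the observed and original wavelengths.

```python
def red_shift(observed_wavelength, original_wavelength):
    """Return z = (observed - original) / original; both in the same units."""
    if original_wavelength <= 0:
        raise ValueError("original wavelength must be positive")
    return (observed_wavelength - original_wavelength) / original_wavelength


# Hypothetical spectral line: emitted at 500.0 nm, observed at 510.0 nm.
z = red_shift(510.0, 500.0)
print(z)
```

A positive z corresponds to an increase in wavelength; the same function returns a negative value for a blue-shifted line.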
Refraction: change in direction of a wave when it crosses a boundary between
two media in which the wave speed is different.
Refractive index: the absolute refractive index of a transparent material is
the ratio of the speed of light in a vacuum (c) to the speed of light in the
medium (v): $n = \frac{c}{v}$.
Refrigerator: system in which work is used to extract heat from an object at
low temperature and dump it into a heat reservoir at a higher temperature
(e.g., the environment).
Relativity of simultaneity: two separate events that are simultaneous for
one observer can occur at different times for an observer in a different inertial
reference frame.
Resistance: ratio of potential difference across an electrical conductor to
the current in it: $R = \frac{V}{I}$. Resistance is a property of a particular component.
Measured in ohms (Ω).
Resistivity: property of a material equal to the resistance across opposite
faces of a uniform cube of the material with sides of 1 m. Measured in
ohm-meters (Ω m).
Resolving power: minimum angular separation of object points that results
in separate image points in an optical instrument.
Resonance: strong response of an oscillatory system when it is driven at its
natural frequency.
Resultant force: vector sum of all forces acting on the same body.

Reynolds number Re: dimensionless number that relates inertial forces to
viscous forces. For large Reynolds number the flow is turbulent; for small
Reynolds number the flow is laminar.
RMS values: root mean square value of a quantity, particularly important in
A.C. The rms value for quantities that vary sinusoidally is $\frac{1}{\sqrt{2}}$ times the peak
value. For example, $V_{\text{rms}} = \frac{V_{\text{peak}}}{\sqrt{2}}$.
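The factor of 1/√2 for sinusoidal quantities can be checked numerically. The sketch below samples one full cycle of a sine wave and compares the computed root mean square with peak/√2; the helper name `rms`, the peak value, and the number of samples are assumptions chosen for illustration.

```python
import math


def rms(samples):
    """Root mean square: square each sample, take the mean, then the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


peak = 10.0  # hypothetical peak voltage in volts
n = 1000     # number of equally spaced samples over one complete cycle
samples = [peak * math.sin(2 * math.pi * k / n) for k in range(n)]

numeric = rms(samples)
expected = peak / math.sqrt(2)
print(numeric, expected)
```

The two printed values should agree closely, since the samples cover exactly one cycle.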
Rotational Kinetic Energy: energy of a rotating body given by $\text{RKE} = \tfrac{1}{2} I \omega^2$.
Rutherford scattering experiment: in which alpha particles were fired at
thin gold foil. Analysis of the scattering data led to the discovery of the atomic
nucleus.
S.I.: système international d’unités, an agreed international system of units
including m, kg, s.
Scalar field: region of space in which each point is associated with a quan-
tity having magnitude only – for example, gravitational potential or electrical
potential.
Scalar: physical quantity with magnitude only, for example, mass, tempera-
ture, energy.
Schrödinger’s cat: a thought experiment in which a microscopic quantum
effect, the decay of a radioactive atom, is linked to a macroscopic event, the
death of a cat. Prior to opening the box to see if the cat has survived it exists
in a superposition of states: both dead and alive.
Schrödinger’s equation: fundamental equation in quantum theory. Solutions
to the Schrödinger equation are wave functions.
Schwarzschild radius $R_S$: radius of the surface surrounding a spherical black
hole at which the escape velocity is equal to the speed of light.
Scientific notation: method of expressing quantities in terms of a power of
10 multiplied by a number between 1 and 10, for example, $c = 3.00 \times 10^{8}\ \text{m s}^{-1}$.
Scintillator: material that emits light when it absorbs X-rays.
Self-inductance: when the current in a coil changes the magnetic field also
changes and affects the coil, inducing a (back) emf in the coil, opposing the
change. The strength of this effect is measured by the inductance of the coil
in henries (H).
Semiconductors: materials such as silicon and germanium that have a small
density of charge carriers at room temperature (compared to a metal) but
whose conductivity increases with temperature as more charge carriers are
freed.
Simple harmonic motion: oscillatory motion in which the acceleration is
directly proportional to displacement from equilibrium and directed back
toward equilibrium.
Simple pendulum: point mass suspended from a light inextensible string.
Snell’s law: law of refraction stating that the ratio of the sine of the angle
of incidence to the sine of the angle of refraction (both measured from the
normal) at the boundary between two media is constant.
Solenoid: long coil, usually used to create an electromagnet.
Sonography: use of ultrasound to scan the body.
Space-time curvature: disturbance of the geometry of space and time as a
result of the presence of matter or energy.
Specific heat capacity: energy required to raise the temperature of 1 kg of
a material by 1 K.
Specific latent heat: energy required to change the state of 1 kg of a material
from solid to liquid or from liquid to gas with no change in temperature (i.e.,
at its melting or boiling point).
Spectrometer: an instrument used to analyze the spectrum of a light source.
Spectroscopy: analysis of light which involves spreading the light into a spec-
trum and determining the wavelengths present and their relative intensities.
Spectrum: range of wavelengths that might be continuous (e.g., the colors
of the rainbow) or might consist of lines or bands (e.g., atomic or molecular
emission spectra).
Speed of light (c): speed of electromagnetic waves in a vacuum. Fundamental
maximum speed at which matter and information can be transmitted from
one place to another.
Spring constant: ratio of force to extension in Hooke’s law, a measure of the
stiffness of a spring.
Stability: the extent to which a system can return to equilibrium after being
displaced from it.
Standard candle: astronomical object whose absolute luminosity is known
(e.g., Cepheid variable, type 1a supernova) so that it can be used to determine
distances.

Standard model: current best model of all particles and forces in the
Universe.
Standing/stationary wave: localized wave disturbance consisting of a
stationary pattern of nodes (positions of minimum disturbance) and antinodes
(positions of maximum disturbance).
Static friction: frictional force between two surfaces in contact with each
other and at rest.
Stefan-Boltzmann law: relationship between the power per unit area emit-
ted by a radiator and its temperature.
Stiff: material that has a large stress to strain ratio (large Young modulus) –
that is, hard to stretch.
Strain energy: energy stored because of deformation, for example, in a
stretched spring.
Strong: large breaking force (for a sample) or large breaking stress (UTS) for
a material.
Sum-over-histories: approach to quantum theory suggested by Richard
Feynman in which all possible paths contribute a phasor and the square of
the resultant phasor at each point represents the probability of the process
taking place.
Supernova: explosion of a massive star at the end of its life.
Superposition: when two or more waves are present at the same point in
space the resultant disturbance is the vector sum of the disturbances due to
each wave.
Symmetry principle: when an operation carried out on a system leaves it
unchanged.
Systematic error: measurement error that affects all measurements in the
same way – for example, making them all too large or too small by the same
quantity or proportion. If the error is known it can be corrected (e.g., by sub-
tracting a constant value from each measurement).
Tensile strain: ratio of extension to the original length. Dimensionless.
Tensile stress: ratio of axial force applied to the cross-sectional area of
sample perpendicular to the force, measured in N m$^{-2}$.
Thermal conduction: transfer of heat as a result of particle-to-particle
interactions.

Thermal equilibrium: when two objects are at the same temperature and,
if placed in thermal contact, there is no net transfer of heat between them.
Thermal radiation: emission of electromagnetic radiation with a spectrum
that depends on the temperature of the emitting body.
Thought experiment: an imagined experiment used to explore the implica-
tions of the theory, for example, Schrödinger’s cat or the twin paradox.
Tidal forces: differential forces arising because of the difference in gravita-
tional force across the diameter of an orbiting body. Tidal forces tend to dis-
tort the body along and perpendicular to the line joining it to the body around
which it orbits.
Time dilation: the observed slowing of time in a reference frame that is mov-
ing relative to the observer.
Time period: time for one complete cycle of oscillation.
Torricelli vacuum: space above the mercury inside a mercury barometer
containing very low-pressure mercury vapor.
Total internal reflection: when a wave strikes the boundary between a
medium of higher refractive index and one of lower refractive index above a
certain critical angle and all of the wave energy is reflected back into the first
medium.
Tough: undergoes a considerable amount of plastic deformation and absorbs
a lot of energy before fracture.
Transformer: electrical device usually consisting of a primary coil and a sec-
ondary coil wound onto the same ferromagnetic core. An A.C. voltage in the
primary can be stepped up or down according to the transformer equation:
$\frac{V_2}{V_1} = \frac{N_2}{N_1}$.
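The transformer equation can be applied as in the sketch below, which assumes an ideal transformer (no flux, copper, eddy current, or hysteresis losses). The function name and the example turns counts are hypothetical.

```python
def secondary_voltage(v_primary, n_primary, n_secondary):
    """Return V2 from V2 / V1 = N2 / N1 for an ideal transformer."""
    if n_primary <= 0 or n_secondary <= 0:
        raise ValueError("turns counts must be positive")
    return v_primary * n_secondary / n_primary


# Hypothetical step-down transformer: 230 V across 1000 turns, 50-turn secondary.
print(secondary_voltage(230.0, 1000, 50))
```

With more turns on the secondary than on the primary, the same function describes a step-up transformer.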
Transverse wave: wave in which the vibration direction is perpendicular to
the direction in which the wave transfers energy (e.g., all EM waves).
Trigonometric parallax: method for determining the distance to a star by
measuring the change in its apparent direction (parallax) angle as the Earth
orbits the Sun. The more distant the star, the smaller the parallax angle. A
method of triangulation with the Earth’s orbital diameter as baseline.
Triple point of water: unique temperature at which ice, water, and water
vapor are in equilibrium.

Twin paradox: thought experiment in which two twins separate on a particular
date with one taking a high-speed return journey to a star and the other
remaining behind on Earth. According to special relativity the twins should
have different ages when they reunite – but which twin should be younger?
Ultrasound: sound waves at a higher frequency than the upper limit of
human hearing, usually taken to be f > 20 kHz.
Ultraviolet catastrophe: the failure of classical theory, at short wavelengths,
to derive an expression for the black-body radiation spectrum. The best
attempts predicted an ever-increasing amount of radiation at the UV end of
the spectrum.
Uniform gravitational field: field of constant strength and direction
throughout a region of space. The gravitational field near the surface of the
Earth is approximately uniform over distances that are small compared to the
Earth’s radius.
Universal constant of gravitation: “big G,” the constant in Newton’s law
of gravitation.
Vector field: region of space in which each point is associated with a quantity
having magnitude and direction – for example, gravitational field strength or
electric field strength.
Vector: physical quantity with magnitude and direction, for example, force,
momentum, velocity.
Velocity-selector: region of space in which a magnetic field and an electric
field act at right angles to one another and when charged particles are fired
through this region only those with a unique velocity remain undeflected.
Used in a mass spectrometer to ensure that all of the ions enter the device at
the same velocity.
Viscosity: measure of a fluid’s resistance to flow. Units of viscosity are Pa s.
Voxel: a volume element in a 3D image that is assigned a value determined by
how strongly it absorbs X-rays during a CT scan.
Wave function (psi, Ψ): mathematical expression that describes the state of a
physical particle or system. The “intensity” of the wavefunction at a point in
space ($|\Psi|^2$) is proportional to probability, for example, for a photon
$|\Psi|^2 \, \delta V$ represents the probability of finding the photon in a
volume of space $\delta V$.
Wavefronts: lines of constant phase (usually crests and/or troughs) repre-
senting wave motion. Wavefronts are perpendicular to rays.

Wavelength: shortest distance between two points oscillating in phase in a
wave.
Wave-particle duality: loose description given to the observation that a
particle model and a wave model are needed to explain different aspects of
the behavior of matter and radiation, but neither model can give a complete
explanation of all aspects of behavior.
Weight: gravitational force acting on a body.
White dwarf star: fate of a medium mass star (like our Sun) after it has swol-
len to become a red giant and its outer layers have drifted off into space. A
white-hot core remains, and electron degeneracy pressure prevents further
collapse.
Wien’s displacement law: relationship between wavelength at the peak of
the black body radiation spectrum and the temperature of the radiating body:
$\lambda T = \text{constant}$.
Work: energy transfer as a result of movement in the direction of an applied
force.
Worldline: path through space-time (line on a space-time diagram).
Yield stress: stress at which a material starts to deform plastically.
Young modulus: measure of the stiffness of a material, equal to the ratio of
stress to strain when the material obeys Hooke’s law (linear part of graph of
stress against strain).
Young’s double slit experiment: famous experiment in which monochro-
matic light is passed through a pair of narrow parallel slits and forms an
interference pattern. This provided the first measurement of the
wavelength of light (thereby supporting a wave model for light) and is now
often used to explore ideas about the interpretation of quantum theory.
Zero error: non-zero reading on an instrument when it should read zero
(e.g., when a micrometer screw gauge is closed but the scale reads 0.02 mm).
Must be subtracted from readings when using the instrument.
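Correcting for a zero error is a simple subtraction, sketched below. The function name and the raw reading are hypothetical; the zero error of 0.02 mm matches the micrometer example in the definition.

```python
def corrected_reading(raw_reading, zero_error):
    """Subtract the instrument's zero error from a raw reading (same units)."""
    return raw_reading - zero_error


# Hypothetical micrometer reading of 5.37 mm with a zero error of 0.02 mm.
print(corrected_reading(5.37, 0.02))
```

A negative zero error (an instrument that reads below zero when closed) is handled by the same subtraction, which then increases the corrected value.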

Index
A Annihilation, 104, 529, 668–670, 753

Absolute space, 513, 534, 737 Antilogarithms, 34
Absolute time, 517 Anti-neutrino, 554
Absolute zero, 157, 210, 396, 398, 750 Antinode, 331
Accelerometer, 51 Archimedes’ Principle, 118
Acoustic impedance, 345 Aristotle, 82, 83, 495
A.C. parallel circuits, 475 Arrow of time, 200, 205, 209, 214–216
A.C. power, 464 Artificial satellites, 498
A.C. series circuits, 471 A-scan, 655
Activation energy, 183 Astrophysics, 639
Activity, 552 Atom bomb, 576, 584
Adiabatic change, 187, 192 Atomic bomb, 576
Adiabatic gas constant, 179 Atomic mass number, 541
Alpha decay, 554 Atomic mass units (a.m.u.), 569
calculating energy released, 569 Atomic number, 541
Alpha particles, 543, 554 Attenuation coefficient, 544
absorption, 547
Alternating current, 463 B
Ammeter, 376 Background radiation, 548
Ampère’s theorem, 431 Balmer’s formula, 618
Amplitude, 269 Barometer, 115
of a wave, 269 Batteries and cells, 377
Ancient Greeks, 495 Becquerel, 542
Angular magnification Becquerel (S.I. unit), 553
of a telescope, 303 Beer–Lambert law, 661
Angular momentum, 249 Bernoulli effect, 126
conservation of, 252 Bernoulli equation, 125

Beta emission, 543 Capacitor charging, 411

Beta-minus decay, 554 Capacitor discharge, 409
Beta-plus emission, 556 Capacitors, 403
Beta radiation energy stored, 406
absorption, 546 in series and parallel, 412
Big Bang, 307, 581, 650–654, 739 Carbon-14, 561
Binding energy per nucleon, 568 Carnot, Sadi, 188, 204
Binomial theorem, 528 Celsius scale, 156
Biot–Savart law, 427 Center of gravity, 59
Black body radiation, 161, 739 Center of mass reference frame, 104
Black-body radiation spectrum, 597, Centrifugal forces, 241
652, 761 Centripetal acceleration, 238
Black dwarf, 640 Centripetal force, 239
Black holes, 493, 494, 508, 512, 641, Cepheid variables, 646–648
684, 745, 757 Ceramics, 151
Blue shift, 306 CERN, 6, 8, 25, 398, 424, 581, 650
Bohr, 616–621, 626, 739, 747 Chain reaction, 575–579, 583, 671
Boltzmann, 114, 162, 174, 177, 183, Charged particle
184, 196, 208, 210, 688, 698, 699, 710, path in magnetic field, 423
739, 759 Charge (charge conjugation) symmetry, 21
Boltzmann factor, 183, 184, 196, 688, Charging by friction, 350
699, 739 Charging by induction, 352
Boltzmann’s constant, 177 Charles’s law, 172
Born, Max, 613, 627 Chromatic aberration, 298
Boundary conditions Circular motion
standing waves, 331 uniform, 237
Bourdon gauge, 116 Circular orbits, 496
Boyle’s law, 170 Classical physics, 513, 597, 616, 740
Brahe, Tycho, 495 Closest approach
Brewster’s law, 281 Rutherford scattering, 539
Brittle, 150 Cloud chamber, 547
Brushes, 433 COBE, 652
B-scan, 655 Coefficient of friction
Buoyancy, 118 static and dynamic, 72
Coherence, 312
Collapse of the wavefunction, 613, 626
C Commutator, 433
Candela, 1 Compass needle
Capacitance, 405 behavior, 418
Capacitance of a charged sphere, 413 Composite materials, 151

Compound microscope, 304 Dark energy, 592, 593

Compound pendulum, 256 Dark matter, 592
Compression ratio, 192 Datalogger, 49
Compton, Arthur, 608 Davisson and Germer, 606
Compton effect, 608 de Broglie, Louis, 605
Conduction de Broglie equation, 541, 609
thermal, 158 de Broglie relation, 535, 605, 621, 709
Conductors, 350 Decibel scale, 340
Conservation of baryon number, 590 Density, 137
Conservation of lepton number, 590 Dependent variable, 685
Constant head apparatus, 129 Deterministic theory, 628
Continuity Deuterium, 580, 584, 586
equation of, 122 Diesel cycle, 193
Control variable, 685 Differential calculus, 9
Convection, 160 Differential equation, 13
Conventional current, 373 Differentiation, 10
Coolant, 579 Diffraction by slits and holes, 326
Copenhagen Interpretation, 613, 615, Diffraction grating equation, 321
626, 627, 632, 635, 637, 741 Diffraction gratings, 318
Copernicus, 495 Diffraction through a circular
Copper losses hole, 329
transformers, 452 Dimensionless number, 4
Cosmic background radiation, 651 Diode, 385
Cosmic microwave background Direct current, 463
radiation, 307, 652 Dispersion, 276
Cosmology, 216, 323, 592, 593, 639, 650 Doppler effect, 304
Coulomb meter, 353 Doppler mode ultrasound, 656
Coulomb’s law, 354 Doppler shifts, 305
Couple, 65 Double slit experiment, 311, 315, 321,
Creep, 150 610, 626, 627, 630, 762
Critical angle, 274 Double slit pattern, 314
Critical mass, 577 Drake equation, 679
Crystalline materials, 150 Dulong and Petit’s law, 180
CT scan, 550, 669, 761 Dynamics
Curie, Marie, 542 defined, 41
Curie temperature, 419
E
D Earth’s magnetic field, 418
Damping, 229 Earth’s ocean tides, 501
Dams, 116 Eddington, Arthur, 506, 683, 684

Eddy current losses Energy level, 615, 618, 619, 623

transformers, 453 Energy resources, 98
Efficiency, 97 types, 98
Einstein, 20, 58, 81, 84, 97, 102, 106, Enriched uranium, 576
174, 287, 289, 291, 307, 503–506, 508, Entropy, 155, 202, 203, 205, 208–217,
516, 517, 519, 529, 567, 574, 588, 590, 720, 738, 748
600–602, 605, 613, 636, 639, 640, 652, Equation of motion, 48
653, 683, 684, 737, 745, 747, 751 Equilibrium, 55
theory of gravity, 503 of coplanar forces, 61
Elastic, 149 types of, 68
Elastic limit, 141 Equipartition of energy, 179, 745
Elastic potential energy, 142 Equipotential surfaces, 364
Electrical characteristics, 382 Equivalence principle, 505
Electrical potential difference, 363 Equivalent dose, 549
Electric charge, 349 Error bars, 32
Electric current, 349 Escape velocity, 492
Electric field strength, 356 Evaluation, 689
Electric flux, 359 Event horizon, 494
Electric motors, 432 Everett, Hugh, 626, 632, 633, 752
Electric oscillators, 476 Expanding Universe, 649, 650
Electric potential, 363 Exponential
absolute, 367 relationship, 37
Electric potential energy, 362 Exponential decay, 37, 546, 552, 561,
Electric potential gradient, 363 720, 721, 727
Electromagnetic damping, 454, 456
Electromagnetic field, 286 F
Electromagnetic induction, 437–441, Fair test, 685
443, 745 Faraday, Michael, 285, 286, 356, 361,
Electromagnetic waves, 286 362, 417, 437, 439, 441, 443, 444, 446,
Electron-capture, 557 450, 459, 460, 706, 728, 745
Electron diffraction, 540, 606, 610 Faraday cage, 361
Electron waves in atoms, 620 Faraday’s law, 441
Electroscope (gold leaf), 351 Fermi, 671, 672, 678–680
Electrostatics, 350 Ferromagnetic material, 419
Elements, 542 Feynman, Richard, 22
Elliptical orbits, 496 Feynman diagrams, 589
EMF, 390 Fine structure constant, 4
Energy First law of thermodynamics, 179, 185,
law of conservation, 94 187, 190, 194, 738
Energy availability, 215 Fizeau, Armand, 288

Fleming’s left hand rule, 422 General theory of relativity, 58, 84, 307,
Flotation, 119 503, 505, 508, 530, 588, 590, 639, 652,
Fluids, 111 653, 683
Flux losses Generations
transformers, 453 leptons, 587
Forced (or driven) oscillator, 230 quarks, 587
Foucault, 289 Generator, 453
Fourier analysis, 346 Geocentric model, 495
Fourier’s equation, 159 Geodesics, 504
Fourier synthesis, 346 Geometry
Frames of reference, 102 and special relativity, 529
Free-body diagrams, 55, 56, 85, 134, 226 Geostationary satellites, 499
Freely falling reference frames, 505 Geosynchronous orbit, 499
Frequency, 265 Glasses, 151
Friction GPS satellites, 499
static, 72 Gradient
Frictional forces, 71 how to calculate, 31
Fundamental, 332 Graham’s law of diffusion, 181
Fundamental interactions, 588 Gran Telescopio Canarias, 303
Fusion reactors, 581, 584 Graphs of motion, 42
Gravitational fieldlines and
G equipotentials, 490
Galilean relativity, 80, 81, 513, 516, Gravitational field strength, 484
534, 746 Gravitational field strength of the
Galilean transformation, 105 Earth, 486
Galileo, 79–83, 103, 496, 513 Gravitational mass, 83
Gamma-camera, 666 Gravitational potential, 487
Gamma emission, 555 Gravitational potential energy, 93, 487
Gamma-factor, 519, 520, 523, 524, Gravitational time dilation, 508
533, 534 Gravitational waves, 508
Gamma-rays Gray (S.I. unit), 549
inverse-square law, 543 Giant Magellan Telescope, 303
Gauss’s law, 359 Ground state, 617
Gauss’s theorem, 360, 361, 371, Gyromagnetic ratio, 665
484, 704
gravitational field, 484 H
Geiger, 537 Hadrons and quarks, 587
Geiger counter, 558 Hahn, Otto, 575
General relativity, 493, 504, 506, 590, Half-life, 552
652, 684, 747 Half-thickness, 545

Harmonics, 332 Impulse, 87

Heat capacity, 163 Incoherent light, 314
Heat death of the Universe, 214 Independent variable, 685
Heat engine, 188–191, 201, 203, 204, Indeterminacy, 628, 630, 709
212, 213, 216, 217, 749 Indeterministic theory, 614
Heat pumps, 214 Indicator diagram, 189
Heisenberg, Werner, 626, 628, 630, Induced emf, 437
709, 748 Induced fission, 575
Heisenberg’s Uncertainty Inductance, 446
(indeterminacy) Principle, 628 Induction motors, 456
Henry (S.I. unit), 446 Inductor
Hertz, 598 energy stored, 448
Hertzsprung–Russell diagram, 644, 748 Inertia, 58
Higgs, 590 Inertial confinement, 586
Higgs field, 590 Inertial forces, 241
Hiroshima and Nagasaki, 577 Inertial frames, 513
Hooke’s law, 140 Inertial mass, 83
Hubble, Edwin, 307, 309, 487, 648–654, Inertial reference frame, 104
703, 710, 748 Insulators, 350
Hubble constant, 307, 309, 653, Integration, 15
654, 748 Intensifying screen, 662
Hubble’s law, 307, 648–652, 654, Intensity
703, 710 of a wave, 269
Huygens, 285 Inter-atomic forces, 138
Hydrogen line spectrum, 618 Interferometer, 514
Hysteresis to detect gravitational waves, 509
rubber, 148 Internal energy, 185
Hysteresis losses Internal resistance of a real cell, 390
transformers, 453 Invariant, 20, 106, 530, 532, 533
Inverse-square law, 36
I Investigations, 690
Ideal fluid, 112 Inviscid fluid, 112
Ideal gas equation, 174 Ionizing radiation, 542, 543, 547, 550,
Image, 294 551, 553, 557, 558, 561, 659, 664, 746,
formation, 294 750, 755
in plane mirror, 299 biological effects, 547
real and virtual, 294 Irreversible process, 199
Impact parameter, 539 Isobaric change, 187
Impedance, 469, 470, 707, 749 Isochoric change, 187

Isothermal, 170, 185, 186, 194, 196, 719
Isotopes, 542
J
Jet propulsion, 99
K
Kelvin scale, 157
Kepler, Johannes, 38, 495–497, 511, 512, 707, 750
Kepler’s laws
    of planetary motion, 496
Kinematics
    defined, 41
Kinetic energy, 93
Kinetic theory, 169, 174, 176, 177, 181, 737, 742
Kinetic theory equation, 177
Kirchhoff’s first law, 376
Kirchhoff’s second law, 378
L
Lagrange, Joseph-Louis, 106
Lagrangian, 107
Lagrangian method, 106
Laminar flow, 121
Large Hadron Collider (LHC), 581
Larmor frequency, 665, 666, 750
Lasers, 314
Leavitt, Henrietta, 647
Length contraction, 523
Lens
    convex and concave, 293
Lens equation, 298
Lenz’s Law, 444
Leptons, 587
Light
    as an EM wave, 285
Light gates, 49
LIGO, 509
Limit of proportionality, 141
Limit of resolution, 330
Linear magnification
    of lens, 295
Local force, 484
Logarithms, 34
Longitudinal waves, 266
Lorentz, 424
Lorentz force law, 424
Lorentz transformation, 524
Lorentz transformation equations, 525
Luminiferous aether, 287, 291, 514, 534, 752
M
Macro-state, 205
Magnetic drag force, 455
Magnetic field, 417
    center of narrow coil, 427
    current-carrying wire, 429
    rotating, 457
    solenoid, 430
Magnetic flux, 442
Magnetic flux-linkage, 443
Magnetic force
    on electric current, 420
    on moving charge, 423
Magnetic quantum number, 622
Magnetic resonance imaging (MRI), 664
Main sequence, 645
Manometer, 114
Many-worlds, 626, 632–635, 637, 638
Marsden, 537
Mass
    relativistic, 527

Mass defect, 567 Morley, 515

Mass, energy and momentum Motion sensor, 49
in relativity, 533 Multimeter, 380
Mass spectrometer, 425 Mutual inductance, 448
Mass-spring oscillator, 224
Matter waves, 605
Maxwell, 14, 22, 181–183, 285–287, N
290, 291, 513, 514, 516, 588, 597, 614, Neap tides, 502
615, 740, 752 Neodymium magnet, 455
Maxwell distribution, 181, 182 Neumann, 441
Maxwell’s equations, 14, 22, 286, 287, Neutral point, 359
290, 291, 514, 516, 588, 614, 615, 752 Neutron number, 541
Mean-squared speed, 176 Neutrons, 588
Measurement problem, 613, 626, 632, Neutron star, 641, 755
633, 637 Newton, 9, 14, 16, 20, 22, 41, 79–84,
Medical physics, 655 86–88, 99, 100, 102, 106, 109, 111,
Meitner, Lise, 575 122, 126, 127, 174, 175, 200, 216,
Melde’s experiment, 331 238–240, 251, 252, 257, 258, 285, 303,
Mesons, 588 421, 456, 483, 493, 494, 496, 503, 507,
Method of dimensions, 4 510, 517, 675, 680, 684, 695, 696, 707,
Michelson, 289, 514–517, 526, 534, 720, 737, 749, 750, 753, 761
752, 753 definition, 82
Michelson–Morley experiment, 289, Newtonian mechanics, 20, 79, 106, 107,
514, 516, 517, 526, 534, 752, 753 504, 513, 597, 615, 714, 740, 743
Micrometer screw gauge, 146 Newton’s first law, 80
Micro-states, 205 Newton’s law of gravitation, 483
Minkowski, Hermann, 20, 529, 531 Newton’s second law, 82
Moderator, 578 Newton’s third law, 84
Molar heat capacity, 163 Node, 331
Mole, 1, 163, 174, 177, 178, 180, 195, Noether, Emmy, 20
674, 693 Non-renewable energy sources, 99
Molecular kinetic energy, 177 N-type semiconductor, 373
Moment of a force, 63 Nuclear atom, 541
Moment of inertia, 252 Nuclear binding energy, 567
cylinder, 254 Nuclear fission, 98, 572, 574, 575, 585
rod, 253 Nuclear fusion, 98, 99, 567, 574, 575,
several point masses, 252 577, 580–582, 584, 585, 587, 593, 640,
uniform sphere, 255 643, 645, 652, 756
Momentum Nuclear stability, 571
law of conservation, 87 Nuclear transformations, 554
linear, 86 Nucleon number, 541

Nucleons, 567
Nucleosynthesis, 581
Nucleus, 538
Null result, 516
Number of ways
  entropy, 206

O
Oersted, Hans Christian, 420
Ohm, 383
Ohm’s Law, 383
Optical Fibres, 275
Optical infinity, 297
Orbital motion, 494
Orbital quantum number, 622
Orbitals, 622
Otto Cycle, 190

P
Parallel plate capacitor, 408
Parity symmetry, 21
Particle physics, 586
Past hypothesis, 215
Pauli Exclusion Principle, 623, 742
Penzias, 307
Penzias, Arno, 652
Period-luminosity relation
  for Cepheids, 648
Perlmutter, 593
Permanent magnets, 419
Permeability of free space, 290, 427, 614
Permittivity of free space, 4, 290, 354, 360, 614, 746
Phase and phase difference, 223
Phase velocity, 265, 268
Phasors, 235, 244, 311, 316, 321, 322, 457, 469, 470, 610, 630–632
Philosophiæ Naturalis Principia Mathematica, 79
Photocell, 602
Photoelectric effect, 598, 600–602, 635, 636, 661
Photomultiplier tube, 663
Photon theory, 102, 601, 605, 636
Pitot tube, 131
Planck, Max, 599
Planck constant
  measuring, 602
Planetary nebula, 640, 753
Plastic, 149
Plutonium, 580
Plutonium-239, 577, 580
Poincaré recurrence, 209, 754
Poiseuille’s equation, 128
Polarisation, 277
Polarisation by reflection and scattering, 280
Polarising filters, 279
Polymers, 150
Popper, Karl, 683
Positron, 104, 105, 529, 556, 583, 587, 667, 668, 670, 735, 739
Positron Emission Tomography (PET scans), 667, 754
Potential dividers, 394
Power, 97
Power transfer
  electrical, 393
Precision, 686
Predictable rays for thin lenses, 293
Prefixes, 5
Pressure
  atmospheric, 113
  hydrostatic, 112
  variation with depth, 113
Pressure law (Gay Lussac’s law), 172
Pressurised water reactor, 579
Principal quantum number, 617
Principle of moments, 66

Principle of superposition
  electric fields, 357
Projectile motion, 46
Proton–proton cycle, 583
Protons, 588
Ptolemaic system, 495
Ptolemy, 495
P-type semiconductor, 373
Pulsar, 508

Q
Quantization of angular momentum, 620
Quantization of energy, 600
Quantum atom, 615
Quantum chromodynamics, 588
Quantum electrodynamics, 4, 588, 589
Quantum of energy, 599
Quark flavours, 587

R
Radian, 235
Radiation
  thermal, 160
Radiation detectors, 557
Radiation dose, 549
  effect on humans, 550
Radiation pressure, 102
Radioactive tracers, 666, 667
Radioactivity, 542
Radiocarbon dating, 561
Radiological dating, 561
Radiological dating of rocks, 562
Radio telescopes, 303
Random process, 551
Range, 686
Ray, 265
Ray diagrams
  how to construct, 293
Rayleigh criterion, 330
RCL series circuit, 473
RC series circuit, 471
Reactance, 465
Real and apparent depth, 300
Red giant, 640, 756, 762
Red-shifts, 306, 307, 593, 649, 651, 652
Reflection, 269
Refraction, 270
Refractive index, 272
Refrigerator, 212
Resistance, 379
  temperature coefficient, 397
  variation with temperature, 397
Resistivity, 388
Resistors
  in parallel, 386
  in series, 386
Resolving power, 330
Resolving vectors, 17
Resonance, 230
Resultant force, 55
Reversible process, 199
Reynolds number, 121
Right hand grip rule, 420
Risk assessments, 687
RL series circuit, 472
RMS values, 464
Rockets, 100
Roentgen, Wilhelm, 542, 549, 658
Römer, Olaf, 288
Root-mean-squared speeds, 177
Rotational dynamics, 235
Rotational Kinematics, 246
Rotational kinetic energy, 247
Rutherford, Ernest, 537, 538, 540, 541, 563, 615, 616, 757
Rutherford’s scattering experiment, 537
Rydberg constant, 618
Rydberg formula, 619

S
Scalar product, 18
Scalars, 16
Schrödinger atom, 621
Schrödinger equation, 614, 615, 621, 622, 624–627, 632, 757
Schrödinger’s cat, 634
Schwarzschild radius, 493
Scientific notation, 5
Second derivatives, 11
Second law of motion for rotation, 250
Second law of thermodynamics, 188, 199–204, 212–217
Self-inductance, 446
Semiconductors, 350, 396
Shear stress, 111
Sievert (S.I. unit), 549
Simple harmonic motion, 223
Simple pendulum, 225
Simultaneity
  relativity of, 522
Single photon interference, 612
Single slit diffraction equation, 329
Singularity, 641, 651
Slipher, Vesto, 307, 648
Small angle approximations, 236
Snell’s law of refraction, 271
Solenoid, 430
Sound waves, 339
Space-time, 504, 531, 758
Space-time curvature, 504
Spark counter, 558
Special relativity, 513, 730
Specific heat capacity, 162
  measuring, 164
Specific latent heat, 165
Spectrometer, 325
Spectroscopy, 323
Spectrum
  absorption, 324
  line, band and continuous, 323
Speed of light
  constancy, 291
  measuring, 288
Speed of sound, 181, 195, 336, 339, 340, 343–347, 703, 737
  in a gas, 181
Spherical aberration, 300
Spin quantum number, 622
Spontaneous process, 551
Spreadsheet
  using, 31
Spring constant, 140
Spring tides, 502
Sputnik 1, 498
Stability, 68, 69
Standard candles, 650
Standard Model, 587, 590, 591
Standing (stationary) waves, 331
Standing waves in air columns, 341
Standing waves on a string, 331
Stars, 640
Stars as black bodies, 643
Statistical interpretation
  of quantum theory, 613
Statistical thermodynamics, 199
Stefan–Boltzmann law, 162
Stellar spectra, 644
Stokes’ law, 3, 123
Stopping voltage, 604
Strain, 143
Strain energy, 142
Streamlines
  in a fluid, 121
Stress
  tensile, 143
Strong nuclear force (color force), 588
Sum-over-histories, 626, 630, 632

Superconductor, 398
Supernova, 582, 640, 641, 650, 654, 758
Superposition, 311
Superposition of harmonic waves, 316
Surface density, 546
Suvat
  equations derived, 44
Symmetry, 19
Systematic error, 9
Szilard, Leo, 575

T
Telescope
  astronomical reflecting, 302
  astronomical refracting, 301
Teller–Ulam design, 584
Temperature
  defined, 209
Temperature scales, 156
Tesla, 421
Thermal conductivities, 159
Thermal energy, 155
Thermal neutrons, 578
Thermistor, 384
Thermometer
  types, 156
Thermonuclear weapons, 581, 583
Thin lenses, 291
Thomson, J.J., 537
Thomson, G.P., 606
Thought experiment, 79–82, 103, 505, 517, 634, 637, 638, 757, 761
Threshold frequency, 601
Time constant
  charging or discharging capacitors, 410
Time dilation
  special relativity, 517
Time period, 265
Time reversal symmetry, 21
Tokamak, 585
Torque, 250
  meaning, 64
Total internal reflection, 274
Toughness, 149
Transducer, 97
Transformer equation, 451
Transformers, 452
Transverse waves, 265
Travelling or progressive wave, 264
Triangle of forces, 61
Trigonometric parallax, 645
Trinity Test, 671
Triple point of water, 157
Tritium, 580, 584, 586
Turbulent flow, 121
Twin paradox, 521

U
Ultimate tensile strength, 147
Ultrasound, 344
Ultrasound (sonography), 655
Ultraviolet catastrophe, 598, 600
Uncertainty
  combining, 7
  types, 7
Upthrust. See buoyancy
Uranium-235, 562, 575, 577, 578
Uranium-238, 554, 562, 565, 566, 569, 570, 575, 576, 580

V
Variables, 685
  types, 25
Vector product, 19
Vectors, 16
Velocity addition equation, 526
Velocity-selector, 424

Venturi meter, 130
Video capture, 51
Virtual particles, 589
Viscosity, 3, 71, 112, 120–125, 127, 129, 132, 134–136, 716, 750, 761
  coefficient of, 129
Visible light, 287
Voltmeter, 377
Voxels, 664

W
Wave front, 265
Wave function, 613–615, 623, 625–628, 630, 632–635, 752, 761
Wavelength, 264
Wavelength of light
  from double slit experiment, 315
Wave-particle duality, 285, 605, 609, 613, 762
Weak nuclear force, 588
Weber, 443
Wheeler, John, 22, 504
White dwarf, 640, 645, 653, 748, 753
Wien’s displacement law, 162
Wilson, Robert, 307, 652
WMAP, 651, 652
Work done, 91
Work function, 600

X
X-ray absorption coefficient, 661
X-ray filter, 662
X-ray image intensifier, 662
X-ray images, 662
X-ray penetration, 545
X-ray radiography, 658
X-rays, 658
X-rays attenuation in matter, 661
X-ray spectrum, 660
X-ray tube, 660

Y
Yield point, 147
Young, 5, 144–147, 149, 153, 311, 315, 340, 610, 698, 703, 759, 762
Young’s modulus, 144
  measurement of, 145

Z
Zero error, 9
Zeroth law of thermodynamics, 156


