15
$\begingroup$

I am mostly aware of the Aharonov-Bohm effect's (AB effect) physical interpretation, as well as the corresponding mathematical/differential geometric interpretation.

What does confuse me slightly however is the physical part of the derivation leading to it. Namely, in a "heuristic" description, one usually brings up trajectories, namely that either "an electron going one way and another the other way around the cylinder will pick up a phase shift" or an "electron going around the cylinder will pick up a phase shift compared to its original value".

In QM there are no trajectories, however, though of course there is the path integral point of view, and I know the AB effect can be approached from this perspective too (Sakurai, for example). But in a Hungarian textbook I have seen a particularly simple way to derive the phase shift.

Let $C$ be the (solid) cylinder in $\mathbb R^3$ and let $M=\mathbb R^3\setminus C$. The manifold $M$ is not contractible, so Poincaré's lemma does not apply. In particular, if $\mathbf A$ is the vector potential, $$ \boldsymbol{\nabla}\times\mathbf A=\mathbf B=0 $$ does not imply that there exists a globally defined scalar field $\chi$ such that $\mathbf A=\boldsymbol{\nabla}\chi$.
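As a quick numerical illustration of this point (a sketch, not part of the textbook's argument), one can take the usual exterior potential of an ideal solenoid along the $z$-axis, $\mathbf A = \frac{\Phi}{2\pi r}\hat{\boldsymbol\varphi}$, which is curl-free for $r>0$, and integrate it around two loops: one that encloses the cylinder and one that does not. The names `FLUX`, `vector_potential` and `loop_integral` are made up for this example, and the flux value is arbitrary.

```python
import math

FLUX = 1.0  # assumed total flux through the solenoid (arbitrary units)


def vector_potential(x, y):
    """Exterior potential A = FLUX / (2 pi r) * phi_hat, in Cartesian components."""
    r2 = x * x + y * y
    return (-FLUX * y / (2 * math.pi * r2), FLUX * x / (2 * math.pi * r2))


def loop_integral(cx, cy, radius, n=2000):
    """Midpoint-rule approximation of the counterclockwise loop integral of A
    around a circle of the given radius centred at (cx, cy)."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        ax, ay = vector_potential(x, y)
        # Tangent displacement along the circle for this step.
        dx = -radius * math.sin(t) * dt
        dy = radius * math.cos(t) * dt
        total += ax * dx + ay * dy
    return total


# Loop around the cylinder (not contractible in M).
enclosing = loop_integral(0.0, 0.0, 2.0)
# Loop off to the side (contractible in M).
not_enclosing = loop_integral(5.0, 0.0, 1.0)
print(enclosing, not_enclosing)
```

The first loop should return (approximately) the enclosed flux and the second should return (approximately) zero, which is exactly the obstruction to writing $\mathbf A=\boldsymbol\nabla\chi$ with a single global $\chi$.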

Let $\psi$ be the wave function satisfying the Schrödinger equation $$ i\hbar\partial_t\psi=-\frac{\hbar^2}{2m}D^2\psi $$ with $$ \mathbf D=\boldsymbol{\nabla}+iq\mathbf A $$ the covariant derivative. I assume the proper interpretation should be that $\psi$ simply describes the state of an electron that is diffracted on the cylinder. Let $\psi_0$ be the wave function corresponding to the case of $\mathbf A=0$.

Now, let us partition $M$ into two halves, $M^+$ and $M^-$ such that both domains are contractible. Since they are contractible, with $\mathbf B=0$, one can choose gauge transformations $\chi^+$ and $\chi^-$ to "turn off" $\mathbf A$. Dropping the $\pm$ signs, this gauge function is given by $$ \chi(\mathbf x)=-\int_{\mathbf {x}_0}^{\mathbf{x}}\mathbf A(\mathbf y)\cdot d\mathbf y $$ where the integral is performed over any curve connecting the arbitrary initial point $\mathbf x_0$ with the target point $\mathbf x$ (as the integral is path-independent).

Since $\chi$ turns off $\mathbf A$ (in one of the $M^\pm$ domains), we have $$ \psi_0(\mathbf x)=e^{-\int^\mathbf x \mathbf A(\mathbf y)\cdot d\mathbf y}\psi(\mathbf x). $$

Reversing, we have $$ \psi(\mathbf x)=e^{\int^\mathbf x \mathbf A(\mathbf y)\cdot d\mathbf y}\psi_0(\mathbf x). $$

Now we perform this procedure on both trivializations and compare them: $$ \psi^+(\mathbf x_1)/\psi^-(\mathbf x_1)=e^{\int_{\gamma^+}^{\mathbf x_1}\mathbf A (\mathbf y)\cdot d\mathbf y}e^{-\int_{\gamma^-}^{\mathbf x_1}\mathbf A (\mathbf y)\cdot d\mathbf y}=\exp\left(\oint\mathbf A(\mathbf y)\cdot d\mathbf y\right). $$ (I have probably dropped some factors of $i$, $q$ and $\hbar$ somewhere, but that doesn't affect the basic method.)

Question:

I am imagining that if the diffraction on the cylinder actually happens, then the diffracted electron is described by a wave function $\psi$, in particular, a wave function that is single-valued.

If the wave function is single-valued, then we should have a well-defined $\psi(\mathbf x_1)$, and we cannot have differing $\psi^+(\mathbf x_1)$ and $\psi^-(\mathbf x_1)$ wave functions.

However, despite what the connection-theoretic background would suggest, we did not actually calculate a parallel transport, but a gauge transformation. So the two wave functions need not agree, as they are in different gauges. However, then, why do we compare them? Comparing them and saying they differ would be akin to comparing a vector to itself in two different coordinate systems and saying they differ because the components don't agree.

So

  • If this derivation is "correct", then why do we compare wave functions in different gauges? In particular, why do we expect to get physically meaningful results from that?

  • If the derivation is incorrect, then what is a simple way to show that the phase shift is given by $\oint A$, that does not rely on path integrals?

$\endgroup$
3
  • 1
    $\begingroup$ This derivation is an example of excess mathematical formalism confusing the physics. It can be made right but at the moment it is physically completely wrong, because it doesn’t use the fact that the charge must move slowly. It’s just doing familiar-looking mathematical operations until the answer falls out by accident. $\endgroup$
    – knzhou
    Commented Aug 3, 2018 at 19:41
  • $\begingroup$ @knzhou Since asking this question I have been able to consult other sources (for example Ballentine's book), which contain the same derivation - set up two overlapping trivializations and gauge transform a "free" wave function separately. No source on the AB effect I have read ever said anything about charges needing to move slowly. Please elaborate? $\endgroup$ Commented Aug 3, 2018 at 20:01
  • 1
    $\begingroup$ Yeah, will do once I get to a computer. $\endgroup$
    – knzhou
    Commented Aug 3, 2018 at 20:14

3 Answers

16
$\begingroup$

This "derivation" hits a pet peeve of mine, which is that mathematical treatments of topological phases persistently confuse the phase shift resulting from a physical process with abstract, physically meaningless phases computed by blindly plugging equations into each other.

Physical and Formal Phases

The Aharonov-Bohm effect isn't even the worst example; that award goes to anyons. Anyons pick up a phase $e^{i \phi}$ when their positions are physically exchanged, i.e. when two of them are picked up and swapped by an experimentalist, assuming that there are no extra external fields, the anyons are moved slowly, and so on. However, this is persistently confused with the phase that results from formally swapping two variables in the many-body wavefunction, $$\psi(x_1, x_2, \ldots) = e^{i \theta} \psi(x_2, x_1, \ldots).$$ It is trivial to prove this formal phase is always $\pm 1$ in any dimension, leading even very capable mathematicians to assert that anyons cannot exist. The majority of introductory quantum mechanics books that attempt to treat anyons make precisely this mistake, then mumble something incorrect about topology allowing the formal phase to differ from $\pm 1$. It's a mess. (For a good treatment, see this.)

Similarly, the Aharonov-Bohm phase is the fact that a particle picks up an extra phase $e^{i \theta}$ upon being transported around a flux. It is easy to see where both the Aharonov-Bohm and anyon phase come from if you use the path integral. Mathematically minded students often dismiss this argument, based on trajectories, as "heuristic", but this misses the point, because the physics of the situation is explicitly about trajectories. You can't easily see the phase shift between two trajectories if you use the time-independent Schrödinger equation.

If you don't like the path integral, you can also derive these phases with the adiabatic theorem: trap a particle in a box at location $\mathbf{R}$ and transport the box around the flux. The gauge connection $\mathbf{A}$ functions precisely as the Berry connection on the states $|\mathbf{R} \rangle$, and the derivation then proceeds exactly the same way as the formal fiber bundle derivation below. Note that both the path integral and the adiabatic theorem derivations explicitly require the transport to be slow. In the former case, it's to avoid picking up extra $\int \mathbf{p} \cdot d \mathbf{x}$ phases, and in the latter case it's a condition of the adiabatic theorem.
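For concreteness, here is a sketch of how the adiabatic version goes, following the standard textbook treatment. It assumes the convention $H = (\mathbf p - q\mathbf A)^2/2m + V(\mathbf r - \mathbf R)$, which differs from the $\mathbf D = \boldsymbol\nabla + iq\mathbf A$ convention in the question by the sign of $q$, and it assumes $\psi_0$ is a normalized bound state of the box with $\mathbf A = 0$. Inside the (simply connected) box one can write $$\langle \mathbf r | \mathbf R \rangle = \exp\left(\frac{iq}{\hbar}\int_{\mathbf R}^{\mathbf r} \mathbf A \cdot d\mathbf r'\right)\psi_0(\mathbf r - \mathbf R),$$ so that $$\langle \mathbf R | \boldsymbol\nabla_{\mathbf R} | \mathbf R \rangle = -\frac{iq}{\hbar}\mathbf A(\mathbf R) - \int d^3 r\, \psi_0^*(\mathbf r - \mathbf R)\, \boldsymbol\nabla_{\mathbf r}\psi_0(\mathbf r - \mathbf R) = -\frac{iq}{\hbar}\mathbf A(\mathbf R),$$ where the second term vanishes because a bound state has $\langle \mathbf p \rangle = 0$. The Berry phase for carrying the box once around the flux is then $$\gamma = i\oint \langle \mathbf R | \boldsymbol\nabla_{\mathbf R} | \mathbf R \rangle \cdot d\mathbf R = \frac{q}{\hbar}\oint \mathbf A \cdot d\mathbf R = \frac{q\Phi}{\hbar}.$$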

A Correct Fiber Bundle Derivation

The argument you gave rests on comparing wavefunctions in two different gauges, which is physically meaningless. Here is a correct derivation.

As you know, we may describe the gauge field in terms of a $U(1)$-bundle over $M$. All such bundles are trivial, which is why most courses don't talk about them; it just makes things more complicated. However, suppose we chose to use bundles anyway and covered $M$ with two patches. Then we may compute the phase picked up by transporting a particle around the flux as follows.

  • Within the first patch, integrate $\int \mathbf{A} \cdot d\mathbf{x}$.
  • When the particle passes from the first patch to the second, add a phase to account for the transition function between the patches.
  • Within the second patch, integrate $\int \mathbf{A} \cdot d \mathbf{x}$.
  • When the particle passes from the second patch back to the first, add another transition function phase.

Since the bundle is trivial, the transition functions can be chosen to be trivial, reducing to the non-bundle formalism. However, we could also choose to gauge away the connection within each patch. Then the particle picks up no phase at all as it is parallel transported through the patches (again, assuming it is moving slowly, with no extra external fields, ignoring dynamical phases, etc.) but does pick up phases from nontrivial transition functions. Of course, since the answer is a physical quantity, it will be the same calculated either way. Your text just showed this explicitly.
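The two ways of doing the bookkeeping can be compared in a toy calculation. This is only a sketch under stated assumptions: $\hbar = 1$, the convention $\mathbf D = \boldsymbol\nabla + iq\mathbf A$ from the question (so parallel transport multiplies by $e^{-iq\int \mathbf A\cdot d\mathbf x}$ and the overall sign of the phase is convention dependent), the exterior potential $\mathbf A\cdot d\mathbf x = \frac{\Phi}{2\pi}d\theta$, and two patches meeting at $\theta = \pi$ and $\theta = 0 \equiv 2\pi$. All names and numerical values (`Q`, `FLUX`, `OFFSET`, `chi_plus`, `chi_minus`, ...) are made up for illustration.

```python
import cmath
import math

Q = 1.0       # assumed charge (hbar = 1)
FLUX = 0.7    # assumed enclosed flux
OFFSET = 0.3  # arbitrary constant in the second patch's gauge function


def chi_plus(theta):
    """Gauge function that turns A off on the patch covering 0 <= theta <= pi."""
    return -(FLUX / (2 * math.pi)) * theta


def chi_minus(theta):
    """Gauge function that turns A off on the patch covering pi <= theta <= 2 pi."""
    return -(FLUX / (2 * math.pi)) * theta + OFFSET


def holonomy_single_gauge(n_steps=1000):
    """Transport once around in the original gauge: exp(-i Q * loop integral of A)."""
    dtheta = 2 * math.pi / n_steps
    integral = sum((FLUX / (2 * math.pi)) * dtheta for _ in range(n_steps))
    return cmath.exp(-1j * Q * integral)


def holonomy_two_patches():
    """Transport once around with A gauged away in each patch.

    Local representatives are psi_pm = exp(-i Q chi_pm) psi, so inside a patch
    nothing changes and all of the phase comes from switching patches.
    """
    psi = 1.0 + 0.0j  # start at theta = 0 in the '+' patch
    # Switch '+' -> '-' at theta = pi.
    psi *= cmath.exp(-1j * Q * (chi_minus(math.pi) - chi_plus(math.pi)))
    # Switch '-' -> '+' at theta = 2 pi, which is the same point as theta = 0.
    psi *= cmath.exp(1j * Q * (chi_minus(2 * math.pi) - chi_plus(0.0)))
    return psi


print(holonomy_single_gauge(), holonomy_two_patches())
```

Both functions should return the same complex number, and the result does not depend on `OFFSET`, which is the statement that the physical phase is the same whichever way it is computed.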

Using the Fake Derivation

The comparison of wavefunctions in two different gauges has nothing to do with the physical process in the Aharonov-Bohm effect, but your text gets the right answer basically by accident; there's only one answer you could possibly get in this simple situation. Luckily, your text's setup is useful for a different thing: finding the spectrum of particles on a ring.

Suppose a particle is constrained to a ring, through which a flux passes. If there were no flux, the energy eigenstates would be $$\psi_n(\theta) \propto e^{i n \theta}, \quad E_n \propto n^2.$$ Now suppose the flux is turned on, giving an Aharonov-Bohm phase $e^{i \phi}$. Usually, to get the spectrum you have to solve the Schrödinger equation with a vector potential, but using the fiber bundle setup we can just set it to zero on each patch. Supposing we set one of the transition functions to be trivial too, and letting the other patch intersection be at $\theta = 0$, we have $$\lim_{\theta \to 0^+} \psi_n(\theta) = e^{i \phi} \lim_{\theta \to 2\pi^-} \psi_n(\theta)$$ where $\psi_n(\theta)$ satisfies the Schrödinger equation for zero vector potential. (Of course the wavefunction remains single-valued, as long as we remember it only makes sense to compare it to itself within one patch.) Then we have $$\psi_n(\theta) \propto e^{i (n - \phi/2\pi) \theta}, \quad E_n \propto (n - \phi/ 2 \pi)^2$$ which gives a measurable change in the spectrum. This is a case where you do want the time-independent Schrödinger equation, not path integral trajectories, but that is because the physics is completely different.
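A small check of the ring result, as a sketch: the code below verifies that $\psi_n(\theta) \propto e^{i(n-\phi/2\pi)\theta}$ satisfies the twisted matching condition above, and that shifting $\phi$ by $2\pi$ just relabels the levels. The function names and the sample value of $\phi$ are made up for this example, and energies are in units where the proportionality constant is 1.

```python
import cmath
import math


def ring_state(n, phi, theta):
    """Unnormalized eigenfunction exp(i (n - phi / 2 pi) theta) on the ring."""
    return cmath.exp(1j * (n - phi / (2 * math.pi)) * theta)


def ring_energy(n, phi):
    """Energy in units where E_n = (n - phi / 2 pi)^2."""
    return (n - phi / (2 * math.pi)) ** 2


def satisfies_twisted_condition(n, phi, tol=1e-12):
    """Check psi(0+) = exp(i phi) * psi(2 pi -) for the state labelled n."""
    lhs = ring_state(n, phi, 0.0)
    rhs = cmath.exp(1j * phi) * ring_state(n, phi, 2 * math.pi)
    return abs(lhs - rhs) < tol


phi = 0.9  # sample Aharonov-Bohm phase
for n in range(-2, 3):
    print(n, satisfies_twisted_condition(n, phi), ring_energy(n, phi))
```

Since `ring_energy(n, phi + 2 * math.pi)` equals `ring_energy(n - 1, phi)` up to rounding, the spectrum as a set is periodic in $\phi$ with period $2\pi$, as one would expect for a quantity that only depends on the phase $e^{i\phi}$.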

$\endgroup$
9
  • $\begingroup$ Thanks for the answer. My only issue is that in the "fiber bundle derivation" you give, I find it nontrivial to motivate that the quantity of interest is $\oint A$. Maybe it is actually trivial, but I cannot see it now. Could you provide a source that treats this matter rigorously, but preferably without the path integral formalism? To make it more understandable what I am looking for, I am gonna give a backstory. I often help out my supervisor with the practice seminar of a GR course he holds (I do research in GR). His course is not very mathematical, so I often cover some (cont'd) $\endgroup$ Commented Aug 4, 2018 at 21:15
  • $\begingroup$ of the differential geometric background in the practice seminar. I know a nice and rigorous proof that the vanishing of the curvature tensor (in an open region) implies trivial holonomy, if the loop is null homotopic. I do not know any simple-to-show GR example of flat holonomy, so to show the consequence of topological obstructions, I intend to show the AB effect as an example. However I prefer an all-or-nothing approach, and I find that perhaps the most stringent point in all of this is physically motivating that the phase shift is related to $\oint A$. $\endgroup$ Commented Aug 4, 2018 at 21:18
  • 1
    $\begingroup$ @Uldreth Well, $\oint A$ falls out of the classical action (where you account for electromagnetism by adding an $\mathbf{A} \cdot d\mathbf{x}$ term), and as you know the path integral directly uses the classical action in $e^{iS}$. I get why you might not want to use it, but the link between quantum phases and classical action is really really fundamental, so not using it makes everything much harder. However, you can get by, by using the Berry phase instead. $\endgroup$
    – knzhou
    Commented Aug 4, 2018 at 21:22
  • 1
    $\begingroup$ @Uldreth I alluded to this briefly above, but the Aharonov-Bohm phase can also be viewed as a Berry phase associated with slowly transporting the particle position $\mathbf{R}$ in a loop, where the Berry connection is the gauge connection. You can look up derivations of this by just googling those keywords. So that might work. $\endgroup$
    – knzhou
    Commented Aug 4, 2018 at 21:24
  • 1
    $\begingroup$ @Uldreth In fact, here's possibly an even better way: first argue that the Berry phase is exactly zero when $\mathbf{A} = 0$. (If you're sneaky, you could even tacitly assume this without saying it; most wouldn't notice.) Then the only phase you pick up by transport in a loop is from the transition function. Finally, use the argument from your book in reverse to show, by undoing the gauge transformations they did (which should not change the physical phase), that the transition function phase is precisely $\oint A$. $\endgroup$
    – knzhou
    Commented Aug 4, 2018 at 21:25
1
$\begingroup$

The original Bohm-Aharonov problem [Y. Aharonov and D. Bohm, Phys. Rev. 115, 485 (1959)] is about electrons scattering from a solenoid. They give a nice solution to the Schrödinger equation as a sum of Bessel functions that is fun to plot:

[Image: real part of the wave function with 1/4 unit of flux through the solenoid]

The image is the real part of the wave function for the case of 1/4 unit of flux through the solenoid. The wave is incoming from the right. The B-A phase shift results in the downstream wavecrests above and below the solenoid being offset by 1/4 of a wavelength. The wavefunction itself is everywhere single-valued though.

$\endgroup$
0
$\begingroup$

If the derivation is incorrect, then what is a simple way to show that the phase shift is given by $\oint A$, that does not rely on path integrals?

While the BA shift has been experimentally confirmed many times, I believe all those derivations of the physical effect (the shift) based on mathematical arguments about $\mathbf A$ outside the solenoid are unconvincing, perhaps entirely invalid.

The major problem is that all those derivations assume that the loop integral of $\mathbf A$ along the loop $\partial S$ around the solenoid equals the magnetic flux through the surface $S$ that is bounded by the loop:

$$ \oint_{\partial S} \mathbf A \cdot d\mathbf{l} = \iint_S \mathbf B\cdot d\mathbf S~~~(1) $$

While this is true for the vector potentials considered in most situations discussed in physics textbooks, it is not a necessary property of a vector potential. The only conditions that restrict the vector potential in EM theory are $$ \mathbf B = \nabla \times \mathbf A, $$ $$ \mathbf E + \nabla \varphi = -\partial_t \mathbf A. $$

From the first condition, the formula (1) can be derived, but only if $\mathbf A$ is well behaved at all points of the surface $S$ (including the surface and the inside of the solenoid). If it is not (if there is a discontinuity or singularity), the derivation fails. Consequently, there are valid functions $\mathbf A(\mathbf x)$ which, when integrated outside the solenoid, do not obey (1).

For example, there is a function $\mathbf A_0(\mathbf x)$ that vanishes outside the solenoid (so it gives $\nabla \times \mathbf A = 0$ trivially) and is only non-zero inside the solenoid. It also has, necessarily, a discontinuity on the surface of the solenoid (or, there is another function that is continuous across the surface, but then has a singularity inside the solenoid). So the relation $\mathbf B=\nabla \times \mathbf A$ fails on the surface of the solenoid, but that is true for all functions, including the standard one, if the current distribution on the surface of the solenoid is infinitely thin.

These details should not influence the solution to the Schrödinger equation if the solenoid is modeled as an infinite potential barrier (admittedly, this is not very clear and perhaps there is an effect of the discontinuity or singularity even through the infinite potential wall...).

It is only when we restrict the vector potential to the family that has non-zero loop integral that we can get any effect of magnetic flux on the $\psi$ function.

For these reasons, I think it is good to either 1) seek some argument for why only certain vector potentials are allowed (which does not seem likely to be very fruitful, considering the vector potential is just an auxiliary tool to get the physical field) or 2) seek other explanations, preferably those that do not rely on a special property of the vector potential.

There has been some intriguing work on the possibility of a classical explanation of the BA shift; see, for example, papers by Timothy Boyer, who argues that there is a classical EM interaction between the electrons and the metallic solenoid, which suggests that the explanation could be much more classical and not require special properties of the vector potential:

https://philpapers.org/rec/BOYCEA

https://link.springer.com/article/10.1023%2FA%3A1003602524894

$\endgroup$
18
  • $\begingroup$ This sounds like excess mathematical formalism clouding a very simple physical point. The vector potential in an ideal solenoid is not singular. It is often written down explicitly in freshman-level electromagnetism classes. There is no reason it should have to be singular, unlike for the magnetic monopole, for the fiber bundle here is trivial. $\endgroup$
    – knzhou
    Commented Aug 4, 2018 at 11:16
  • $\begingroup$ You are free to confuse yourself by forcing yourself to work with artificially singular gauge potentials, but that doesn't invalidate the actual derivation, which is perfectly straightforward. $\endgroup$
    – knzhou
    Commented Aug 4, 2018 at 11:17
  • $\begingroup$ Re "artificially singular gauge potentials": do you believe some vector potentials are more correct outside the solenoid than others, even if they all satisfy the definition $\nabla \times \mathbf A = \mathbf B$ in that region? I don't. Vector potential does not have to be differentiable everywhere. $\endgroup$ Commented Aug 5, 2018 at 0:41
  • $\begingroup$ I believe everything real is not only differentiable, but infinitely differentiable. So I think going against this principle, even for something unobservable like the gauge potential, is physically unacceptable. $\endgroup$
    – knzhou
    Commented Aug 5, 2018 at 9:09
  • $\begingroup$ Differentiable or discontinuous, these are just properties of physics models. Each is useful in some situations. There is a lot of occurrence of non-differentiable functions in physics and it is generally not a problem. We can differentiate even such functions, with help of delta distributions. $\endgroup$ Commented Aug 5, 2018 at 13:13
