15
$\begingroup$

What is the intuition behind the chain rule in mathematics? In particular, why is there a multiplication between the two derivatives?

$\endgroup$
2

5 Answers

32
$\begingroup$

The best way to think about the derivative is: if $f$ is differentiable at $x$, then \begin{equation*} f(x + \Delta x) \approx f(x) + f'(x) \Delta x. \end{equation*} The approximation is good when $\Delta x$ is small. This is practically the definition of $f'(x)$.

Now suppose $f(x) = g(h(x))$, and $h$ is differentiable at $x$, and $g$ is differentiable at $h(x)$. Then \begin{align*} f(x + \Delta x) & = g(h(x+\Delta x)) \\ &\approx g(h(x) + h'(x) \Delta x) \\ &\approx g(h(x)) + g'(h(x)) h'(x) \Delta x. \end{align*} Comparing this with the equation above suggests that \begin{align*} f'(x) = g'(h(x)) h'(x). \end{align*}

Many other rules about derivatives can be derived easily in this way.
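A quick numerical sanity check of the approximation above, offered only as a sketch: the choices $g = \sin$, $h(x) = x^2$ and the point $x = 1.3$ are arbitrary illustrations, not part of the argument. If the chain rule gives the right slope, the gap between $f(x + \Delta x)$ and $f(x) + g'(h(x)) h'(x) \Delta x$ should shrink roughly like $\Delta x^2$.

```python
import math

def h(x):
    # Inner function (illustrative choice): h(x) = x^2, so h'(x) = 2x.
    return x * x

def g(u):
    # Outer function (illustrative choice): g(u) = sin(u), so g'(u) = cos(u).
    return math.sin(u)

def f(x):
    # The composite f = g o h.
    return g(h(x))

def chain_rule_derivative(x):
    # g'(h(x)) * h'(x) for the two functions chosen above.
    return math.cos(h(x)) * 2 * x

x = 1.3
predicted = chain_rule_derivative(x)

for dx in (1e-1, 1e-2, 1e-3):
    actual = f(x + dx)
    linear = f(x) + predicted * dx   # first-order approximation
    print(f"dx = {dx:g}   |actual - linear| = {abs(actual - linear):.3e}")
```

Each time $\Delta x$ shrinks by a factor of 10, the printed error should drop by roughly a factor of 100, which is what an $o(\Delta x)$ remainder looks like for a smooth function.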

$\endgroup$
4
  • 8
    $\begingroup$ You can even make that rigorous by writing $f(x + \Delta x) = f(x) + f'(x) \Delta x + o(\Delta x)$ $\endgroup$ Commented Mar 25, 2014 at 10:26
  • 3
    $\begingroup$ +1 to @nik's comment: we can define the derivative $f'(x)$ as the unique number $c$ satisfying $f(x + \delta) = f(x) + c\delta + o(\delta)$ (for $\delta \to 0$). Then for $f = g\circ h$ we have $$f(x+\delta) = g(h(x+\delta)) = g(h(x) + h'(x)\delta + o(\delta)) = g(h(x)) + g'(h(x)) (h'(x)\delta + o(\delta)) + o(\delta) = f(x) + g'(h(x))h'(x)\delta + o(\delta),$$ and thus this is a proof of the chain rule. As an aside, see Knuth's letter "Calculus via O notation" (PDF) for ideas of teaching calculus along these lines. :-) $\endgroup$ Commented Mar 25, 2014 at 11:55
  • 1
    $\begingroup$ df/dx = df/dh * dh/dx, the so-called physicist's expansion ;-) $\endgroup$ Commented Mar 25, 2014 at 15:25
  • 1
    $\begingroup$ I found the top comment funny looking at the poster's username. $\endgroup$
    – user370967
    Commented Jan 5, 2018 at 23:17
9
$\begingroup$

For a function $g(x)$, imagine walking at constant (unit) speed along one number line, and seeing a red dot mark the function value of your current position on another number line. That is, imagine your position to be $x$, and the red dot to appear at $g(x)$. $g'(x)$ would be the speed of the red dot. Now, assume we chain this red dot to trigger a blue dot on a third number line, representing $f(x)$, i.e. if you yourself were to walk at unit speed along the $g$ line, then the blue dot on the $f$ line would light up at $f(x)$ and move with the speed $f'(x)$.

As you move along your original number line, the red dot appears at $g(x)$, so the blue dot appears at $f(g(x))$ and moves with speed $[f(g(x))]'$.

The red dot on the $g$ line moves with speed $g'(x)$. While the red dot is at $g(x)$, the blue dot moves $f'(g(x))$ times as fast as the red dot. Thus the resulting speed of the blue dot must be $f'(g(x))\cdot g'(x)$.
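
As a small worked example of this picture, with functions chosen purely for illustration, take $g(x) = 3x$ and $f(y) = y^2$ and stand at $x = 1$. The red dot is at $g(1) = 3$ and moves with speed $g'(1) = 3$. The blue dot is at $f(3) = 9$, and near $y = 3$ it moves $f'(3) = 6$ times as fast as the red dot, so \begin{align*} f'(g(1)) \cdot g'(1) = 6 \cdot 3 = 18. \end{align*} Differentiating the composite directly gives the same speed: $f(g(x)) = 9x^2$, so $[f(g(x))]' = 18x$, which equals $18$ at $x = 1$.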

$\endgroup$
4
  • $\begingroup$ Very nice visual intuition! $\endgroup$ Commented Oct 17, 2015 at 18:46
  • $\begingroup$ It would be nicer if you could draw a graph to demonstrate the concept, but still, thank you very much for the explanation. It was helpful to my understanding. $\endgroup$
    – Thor
    Commented Jul 15, 2018 at 7:56
  • 1
    $\begingroup$ @Thor Since I made this answer, 3blue1brown has made a video showing this visualisation better than I ever could: youtu.be/CfW845LNObM It may not go into chain rule, but it does do derivatives, and with that as a help, I think visualising the chain rule the way I describe here isn't too difficult. $\endgroup$
    – Arthur
    Commented Jul 15, 2018 at 8:33
  • $\begingroup$ @Arthur That video is amazing! I would never have come across it without your recommendation. Thanks again for your help. $\endgroup$
    – Thor
    Commented Jul 15, 2018 at 13:13
5
$\begingroup$

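This is true even in several variables. A differentiable function is locally approximately linear, so a composition of differentiable functions is locally $\approx$ the composition of their linear approximations. Composing linear maps corresponds to multiplying their matrices, and that matrix product is the multiplication in the chain rule.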
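
A minimal sketch of the several-variable statement, under assumptions of my own: the maps `h` and `g` below, the helper names, and the test point are all hypothetical illustrations. It compares the matrix product of the two Jacobians with a finite-difference Jacobian of the composite $g \circ h$.

```python
def h(p):
    # Inner map R^2 -> R^2 (illustrative choice).
    x, y = p
    return (x * y, x + y)

def g(q):
    # Outer map R^2 -> R^2 (illustrative choice).
    u, v = q
    return (u * u, u * v)

def jac_h(p):
    # Jacobian of h, computed by hand.
    x, y = p
    return [[y, x], [1.0, 1.0]]

def jac_g(q):
    # Jacobian of g, computed by hand.
    u, v = q
    return [[2 * u, 0.0], [v, u]]

def matmul(a, b):
    # Plain matrix product of two nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def numerical_jacobian(func, p, eps=1e-6):
    # Central differences, one input coordinate at a time.
    n = len(p)
    cols = []
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += eps
        minus[j] -= eps
        cols.append([(a - b) / (2 * eps)
                     for a, b in zip(func(plus), func(minus))])
    # cols[j][i] holds d f_i / d x_j, so transpose to rows = outputs.
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

p = (0.7, -1.2)
chain = matmul(jac_g(h(p)), jac_h(p))                # J_g(h(p)) * J_h(p)
numeric = numerical_jacobian(lambda q: g(h(q)), p)   # Jacobian of g o h

for row_chain, row_numeric in zip(chain, numeric):
    print(row_chain, row_numeric)
```

The two matrices should agree up to finite-difference error. Note the order of the product: the Jacobian of the outer map, evaluated at $h(p)$, goes on the left.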

$\endgroup$
3
$\begingroup$

In terms of differentials, we know that if the variables $x,y,z$ are related by $y = f(x)$ and $z = g(y)$, then

  • $dy = f'(x) dx$
  • $dz = g'(y) dy$

If differentials are even the slightest bit reasonable to think about, then we should be able to substitute, and get

  • $dz = g'(y) f'(x) dx$

or, if you prefer,

  • $dz = g'(f(x)) f'(x) dx$
$\endgroup$
2
$\begingroup$

Let $h(x)=f(g(x))$

$$\begin{align}h'(x) &= \lim_{t\to0}\frac{h(x+t)-h(x)}{t}\\&=\lim_{t\to0}\frac{f(g(x+t))-f(g(x))}{t}\end{align}$$

Now there are two possible cases,

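  • $g(x+t)=g(x)$ for all sufficiently small $t$

    In this case $h(x+t)=h(x)$, so the difference quotient of $h$ is $0$ and $h'(x)=0$. The difference quotient of $g$ is also $0$, so $g'(x)=0$, and $h'(x)=f'(g(x))\cdot g'(x)$ is satisfied with both sides equal to $0$.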

  • $g(x+t)\neq g(x)$ for small $t\neq0$, and $g(x+t)\to g(x)$ as $t\to0$

    In this case we can write the limit as

$$\lim_{t\to0}\frac{f(g(x+t))-f(g(x))}{t} = \lim_{t\to0}\left[\frac{f(g(x+t))-f(g(x))}{g(x+t)-g(x)}\cdot \frac{g(x+t)-g(x)}{t}\right] = f'(g(x))\cdot g'(x)$$

We do not consider the case where $g(x+t) \not\to g(x)$ as $t\to0$, since $g$ is assumed differentiable at $x$ and is therefore continuous there.

$\endgroup$
0
