Maths Harvard

Math 1A: introduction to functions and calculus Oliver Knill, 2011
Lecture 1: What is Calculus?

Calculus is a powerful tool to describe our world. It formalizes the process of taking differences
and taking sums. Both are natural operations. Differences measure change, sums measure how
things accumulate. We are interested for example in the total amount of precipitation in Boston
over a year, but we are also interested in how the temperature changes over time. The process
of taking differences is in the limit called the derivative. The process of taking sums is in the limit called
the integral. These two processes are related in an intimate way. In this first lecture, we want to look
at these two processes in a discrete setup first, where functions are evaluated only on integers. We
will call the process of taking differences a derivative and the process of taking sums an integral.
Start with the sequence of integers
1, 2, 3, 4, ... .
We say f(1) = 1, f(2) = 2, f(3) = 3 etc and call f a function. It assigns to a number a number.
It assigns for example to the number 100 the result f(100) = 100. Now we add these numbers up.
The sum of the first n numbers is called
Sf(n) = f(1) + f(2) + f(3) + ... + f(n) .
In our case we get
1, 3, 6, 10, 15, ...
It defines a new function g which satisfies g(1) = 1, g(2) = 3, g(3) = 6 etc. The new numbers are
known as the triangular numbers. From the function g we can get f back by taking differences:
Dg(n) = g(n) − g(n − 1) = f(n) .
For example Dg(5) = g(5) − g(4) = 15 − 10 = 5 and this is indeed f(5).
Finding a formula for the sum Sf is not so easy. The young mathematician Carl Friedrich
Gauss realized as a 7 year old kid, when given the task to sum up the first 100 numbers, that
it is the same as adding up 50 times 101, which is 5050. Gauss found g(n) = n(n + 1)/2 . He
did that by pairing things up. To add up 1 + 2 + 3 + . . . + 10 for example we can write this as
(1 + 10) + (2 + 9) + (3 + 8) + (4 + 7) + (5 + 6), leading to n/2 terms of n + 1 if n is even. Taking
differences again is easier: Dg(n) = (n + 1)n/2 − n(n − 1)/2 = n = f(n).
Let's add up the new sequence again and compute h = Sg. We get the sequence
1, 4, 10, 20, 35, ...
These numbers are called the tetrahedral numbers because one can use h(n) marbles to build a
tetrahedron of side length n. For example, we need h(4) = 20 golf balls to build a
tetrahedron of side length 4. The formula which holds for h is h(n) = n(n + 1)(n + 2)/6 . We
see that summing the differences gives the function in the same way as differencing the sum:
SDf(n) = f(n) − f(0), DSf(n) = f(n)
Don't worry yet if this is too abstract. We will come back to it again and again. But this is
an arithmetic version of the fundamental theorem of calculus which we will explore in this
course. The process of adding up numbers will lead to the integral ∫_0^x f(x) dx . The process of
taking differences will lead to the derivative d/dx f(x) . One of the highlights of this course is to
understand the fundamental theorem of calculus:
∫_0^x d/dt f(t) dt = f(x) − f(0),    d/dx ∫_0^x f(t) dt = f(x)
and see why it is such a fantastic result. You see formally that it fits the result for difference and
sum. A major goal of this course will be to understand the fundamental theorem and see
its use. But we have packed the essence of the theorem in the above version with S and D. It is
a version which will lead us.
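The arithmetic identities DSf(n) = f(n) and SDf(n) = f(n) − f(0) can be checked on a computer. The following is only a minimal sketch: the helper names `S` and `D` and the cutoff `N` are choices made for this illustration, and the value at n = 0 is passed in explicitly as `start`.

```python
from itertools import accumulate

def S(values):
    """Running sums: entry k is values[0] + ... + values[k]."""
    return list(accumulate(values))

def D(values, start=0):
    """Differences of successive entries; `start` plays the role of the value at n = 0."""
    previous = [start] + list(values[:-1])
    return [v - p for v, p in zip(values, previous)]

N = 10                             # illustrative cutoff
f = list(range(1, N + 1))          # f(n) = n for n = 1..N
g = S(f)                           # triangular numbers 1, 3, 6, 10, ...
h = S(g)                           # tetrahedral numbers 1, 4, 10, 20, ...

# Compare with the closed formulas from the text.
triangular_ok = all(g[n - 1] == n * (n + 1) // 2 for n in range(1, N + 1))
tetrahedral_ok = all(h[n - 1] == n * (n + 1) * (n + 2) // 6 for n in range(1, N + 1))

# Arithmetic version of the fundamental theorem (here f(0) = 0).
ds_ok = D(S(f)) == f               # DSf(n) = f(n)
sd_ok = S(D(f, start=0)) == f      # SDf(n) = f(n) - f(0)

print(g, h, triangular_ok, tetrahedral_ok, ds_ok, sd_ok)
```

Replacing `f` by any other list of numbers leaves `ds_ok` true, which is the point: differencing undoes summing.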
1 Problem: The sequence 1, 1, 2, 3, 5, 8, 13, 21, . . . satisfies the rule f(x) = f(x − 1) + f(x − 2).
It defines a function on the positive integers. For example, f(6) = 8. What
is the function g = Df, if we assume f(0) = 0? Solution: We take the difference between
successive numbers and get the sequence of numbers
1, 0, 1, 1, 2, 3, 5, 8, ...
After 2 entries, the same sequence appears again. We can also deduce directly from the above
recursion that f has the property that Df(x) = f(x − 2) . It is called the Fibonacci
sequence, a sequence of great fame.
2 Problem: Take the same function f given by the sequence 1, 1, 2, 3, 5, 8, 13, 21, ... but now
compute the function h(n) = Sf(n) obtained by summing up the first n numbers. It gives
the sequence 1, 2, 4, 7, 12, 20, 33, .... What sequence is that?
Solution: Because Df(x) = f(x − 2) for x ≥ 3, summing these differences from 3 to x gives
f(x) − f(2) = Sf(x − 2), so that Sf(x) = f(x + 2) − f(2). Summing the Fibonacci sequence
produces the Fibonacci sequence shifted to the left, with f(2) = 1 subtracted. It has been
relatively easy to find the sum, because we knew what the difference operation did. This example shows:
We can study differences to understand sums.
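As a quick numerical check of the Fibonacci sum rule Sf(n) = f(n + 2) − f(2), here is a small sketch. The function name `fibonacci` and the number of terms are assumptions made for this illustration only.

```python
def fibonacci(count):
    """Return [f(1), ..., f(count)] with f(1) = f(2) = 1."""
    values = [1, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]

fib = fibonacci(12)                # fib[k] holds f(k + 1)

# Partial sums Sf(1), ..., Sf(10).
partial_sums = []
total = 0
for value in fib[:10]:
    total += value
    partial_sums.append(total)

# The claimed formula f(n + 2) - f(2) with f(2) = 1, for n = 1..10.
shifted = [fib[n + 1] - 1 for n in range(1, 11)]

print(partial_sums)
print(shifted)
```

Both printed lists should agree entry by entry.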
The next problem illustrates this too:
3 Problem: Find the next term in the sequence
2 6 12 20 30 42 56 72 90 110 132. Solution: Take differences
2 6 12 20 30 42 56 72 90 110 132
2 4 6 8 10 12 14 16 18 20 22
2 2 2 2 2 2 2 2 2 2 2
0 0 0 0 0 0 0 0 0 0 0
Now we can add an additional number, starting from the bottom and working our way up.
2 6 12 20 30 42 56 72 90 110 132 156
2 4 6 8 10 12 14 16 18 20 22 24
2 2 2 2 2 2 2 2 2 2 2 2
0 0 0 0 0 0 0 0 0 0 0 0
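The procedure of taking differences until a row of zeros appears and then working back up can be sketched in a few lines. This is only an illustration under stated assumptions: the names `differences` and `next_term` are hypothetical, and unlike the table above the sketch takes differences of neighbouring entries only (it does not assume a value f(0) = 0 in front of the sequence), which gives the same next term.

```python
def differences(values):
    """Differences of neighbouring entries; the result is one entry shorter."""
    return [b - a for a, b in zip(values, values[1:])]

def next_term(values):
    """Extend a sequence by one term using its table of successive differences."""
    rows = [list(values)]
    # Keep differencing until a row of zeros (or a single entry) is reached.
    while any(rows[-1]) and len(rows[-1]) > 1:
        rows.append(differences(rows[-1]))
    # Work from the bottom row upward: each new entry is the last entry
    # of the row plus the new entry of the row below it.
    extra = 0
    for row in reversed(rows):
        extra = row[-1] + extra
    return extra

sequence = [2, 6, 12, 20, 30, 42, 56, 72, 90, 110, 132]
print(next_term(sequence))
```

The method only gives a meaningful prediction when some row of differences becomes constant, as it does for polynomial sequences.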
In the rest of this hour, we talk about some applied and not so applied problems which involve
calculus.
Homework
1 We have defined Sf(n) = f(1) + f(2) + ... + f(n) and Df(n) = f(n) − f(n − 1) and seen:
for f(n) = 1 we have g(n) = Sf(n) = n.
for f(n) = n we have g(n) = Sf(n) = n(n + 1)/2.
for f(n) = n(n + 1)/2 we have g(n) = Sf(n) = n(n + 1)(n + 2)/6.
Guess a formula g(n) = Sf(n) for f(n) = n(n + 1)(n + 2)/6 and verify using algebraic
manipulation that it satisfies Dg(n) = f(n). Can you see a pattern?
2 Find the next term in the sequence 3, 12, 33, 72, 135, 228, 357, 528, 747, 1020, 1353.... To do
so, compute successive derivatives g = Df of f, then h = Dg until you see a pattern.
3 The function f(x) = 2^x can first be defined on integers, then on rational numbers like
2^(3/2) = √(2^3). We have for example f(0) = 1, f(1) = 2, f(2) = 4, . . .
a) Verify that f satisfies the equation Df(x) = f(x − 1), where Df(x) = f(x) − f(x − 1).
b) The function f(x) = 5^x satisfies a similar rule. Which one?
4 Find g(n) = Sf(n) for the function f(n) = n^2. This means we want to find a formula such
that g(1) = 1, g(2) = 5, g(3) = 14, leading to the sequence of numbers 1, 5, 14, 30, 55, 91, 140, 204, 285, ...
Note that we have already computed Sf for g(n) = n(n+1)/2 as well as for h(n) = n.
Try to write f as a combination of g and h and use the rule D(f + g) = Df + Dg.
5 Find a formula g(n) = Sf(n) for the function f(n) = 7^n. First compute the derivative
Df of f and go from there.
General remarks about homework
Make sure to think about the problem yourself first before discussing it with others.
The time you spend on homework is valuable, especially the exploration time before
you know how to solve it.
If you do not know how to get started, don't hesitate to ask.
Lecture 1: Worksheet
In this first lecture, we want to see that the essence of calculus is
already in basic arithmetic.
Triangular numbers
We stack disks onto each other building n layers and count the number
of disks. The numbers we get are called triangular numbers.
1 3 6 10 15 21 28 36 45 ...
[Figure: stacks of disks for n = 1, 2, 3, 4.]
This sequence defines a function on the natural numbers. For example,
f(4) = 10.
1 Can you find f(100)? The task to find this number was given to Carl
Friedrich Gauss in elementary school. The 7 year old came up quickly with
an answer. How?
[Portrait: Carl Friedrich Gauss, 1777-1855]
Tetrahedral numbers
We stack spheres onto each other building n layers and count
the number of spheres. The numbers we get are called
tetrahedral numbers.
1 4 10 20 35 56 84 120 ...
This sequence also defines a function. For example, g(3) = 10.
But what is g(100)? Can we find a formula for g(n)?
[Figure: stacks of spheres for n = 1, 2, 3, 4.]
2 Once you know the formula g(n) = n(n + 1)(n + 2)/6,
verify that it is the right one by checking
g(n) − g(n − 1) = n(n + 1)/2.
Lecture 2: Functions
A function is a rule which assigns to a real number a new real number. An example
is f(x) = x^2 − x. For example, it assigns to the number x = 3 the value 3^2 − 3 = 6. A
function is given with a domain A, the points where f is defined, and a codomain
B, a set of numbers in which f takes values.
Typically, the codomain is the set of real numbers and the domain is the set of all numbers
where the function is defined. The function f(x) = 1/x for example is not defined at x = 0 so that
we choose the domain A = R \ {0}, all numbers except 0. The function f(x) = 1/x takes values in
the codomain R. If we choose A = B, then f(x) = 1/x reaches every point in B and is invertible.
It is its own inverse. Here are a few examples of functions. We will look at them in more detail
during the lecture, especially the polynomials, trigonometric functions and exponential function.
identity f(x) = x
constant f(x) = 1
linear f(x) = 3x + 1
quadratic f(x) = x^2
cosine f(x) = cos(x)
sine f(x) = sin(x)
exponentials f(x) = exp_h(x) = (1 + h)^(x/h)
logarithms f(x) = log_h(x) = exp_h^(−1)(x)
power f(x) = 2^x
exponential f(x) = e^x = exp(x)
logarithm f(x) = log(x) = exp^(−1)(x)
absolute value f(x) = |x|
devil comb f(x) = sin(1/x)
bell function f(x) = e^(−x^2)
witch of Agnesi f(x) = 1/(1 + x^2)
sinc f(x) = sin(x)/x
We can build new functions by:
add functions f(x) + g(x)
scale functions 2f(x)
translate f(x + 1)
compose f(g(x))
invert f^(−1)(x)
difference f(x + 1) − f(x)
sum up f(x) + f(x + 1) + . . .
Here are important functions:
polynomials x^2 + 3x + 5
rational functions (x + 1)/(x^4 + 1)
exponential e^x
logarithm log(x)
trig functions sin(x), tan(x)
inverse trig functions arcsin(x), arctan(x)
roots √x, x^(1/3)
We will look at these functions a lot during this course. The logarithm, exponential and
trigonometric functions are especially important.
For some functions, we need to restrict the domain where the function is defined. For the square
root function √x or the logarithm log(x) for example, we have to assume that the number is
positive. We write that the domain is (0, ∞) = R^+. For the function f(x) = 1/x, we have to
assume that x is different from zero. Keep these three examples in mind.
The graph of a function is the set of points {(x, y) = (x, f(x)) } in the plane, where
x runs over the domain A of f. Graphs allow us to visualize functions. We can
see them, when we draw the graph.
[Figure: graphs of several example functions, including exp(x), log(x), e^(−x^2), x sin(1/x) and x^3 − 3x.]
Homework
1 Draw the function f(x) = x + sin(x). Its graph goes through the origin (0, 0).
a) A function is called odd if f(−x) = −f(x). Is f odd?
b) A function is called even if f(−x) = f(x). Is f even?
c) A function is called monotone increasing if f(y) > f(x) whenever y > x. Is f monotone
increasing? You do not have to decide this analytically yet. Just draw (*) the function and
make up your mind.
2 A function f : A → B is called invertible or one to one if there is another function
g such that g(f(x)) = x for all x in A and f(g(y)) = y for all y in B. For example, the
function g(x) = √x is the inverse of f(x) = x^2 as a function from A = [0, ∞) to B = [0, ∞).
Determine for the following functions whether they are invertible. If they are invertible,
find the inverse.
a) f(x) = sin(x) from A = [0, π/2] to B = [0, 1]
b) f(x) = x^3 from A = R to B = R
c) f(x) = x^6 from A = R to B = R
d) f(x) = exp(5x) from A = R to B = R^+ = (0, ∞).
e) f(x) = 1/(1 + x^2) from A = [0, ∞) to B = [0, ∞).
3 Look at the function f1(x) = sin(x), f2(x) = sin(sin(x)), f3(x) = sin(sin(sin(x))).
a) Draw the graphs of the functions f1, f2, f3 on the interval [0, 4].
b) Can you imagine what f100000(x) looks like? You might want to make more experiments
here to see the answer. Of course you are allowed to plot the functions with a calculator or
with an online grapher like Wolfram alpha. (The weblink can be found below).
4 Let's call a function f(x) a composition square root of a function g if f(f(x)) = g(x). For
example, the function f(x) = x^2 + 1 is a composition square root of g(x) = x^4 + 2x^2 + 2
because f(f(x)) = (x^2 + 1)^2 + 1 = g(x). Find composition square roots of the following
functions:
a) f(x) = sin(sin(x)).
b) f(x) = x^4
c) f(x) = x
d) f(x) = x^4 + 2x^2 + 2
e) f(x) = e^(e^x).
Note that it can be difficult in general to find the square root function. Already
for basic functions like exp(x) or sin(x), we are speechless.
5 A function f(x) has a root at x = a if f(a) = 0. Roots are places, where the function is
zero. Find one root for each of the following functions or state that there is none.
a) f(x) = sin(x)
b) f(x) = exp(x)
c) f(x) = x^3 − x
d) f(x) = sin(x)/x − 1
e) f(x) = csc(x) = 1/ sin(x)
(*) Here is how you can use the Web to plot a function. The example given is sin(x).
http://www.wolframalpha.com/input/?i=Plot+sin(x)

In this lecture, we want to learn what a function is and get acquainted
with the most important examples.
Trigonometric functions
The cosine and sine functions can be defined geometrically by the coordinates
(cos(x), sin(x)) of a point on the unit circle. The tangent
function is defined as tan(x) = sin(x)/cos(x).
cos(x) = adjacent side/hypotenuse
sin(x) = opposite side/hypotenuse
tan(x) = opposite side/adjacent side
The Pythagorean theorem gives us the important identity
cos^2(x) + sin^2(x) = 1
Define also cot(x) = 1/tan(x). Less important but sometimes used
are sec(x) = 1/cos(x), csc(x) = 1/sin(x).
1 Find cos(π/3), sin(π/3).
2 Where do cos and sin have roots, places where the function
is zero?
3 Find tan(3π/2) and cot(3π/2).
4 Find cos(3π/2) and sin(3π/2).
5 Find tan(π/4) and cot(π/4).
[Figure: the unit circle with cos and sin marked, and graphs of cos(x), sin(x) and tan(x).]
The exponential function
The function 2^x is first of all defined for all integers, like 2^10 = 1024.
By taking roots, we can define it for rational numbers, like 2^(3/2) =
8^(1/2) = √8 = 2.828.... Since the function 2^x is monotone on the set
of rationals, we can fill the gaps and define the function 2^x for any x.
By taking square roots again and again, we see that 2^(1/2), 2^(1/4), 2^(1/8), ...
approach 2^0 = 1.
[Figure: graph of 2^x.]
There is nothing special about 2 and we can take any positive base a
and define the exponential a^x. It satisfies a^0 = 1 and the remarkable
rule:
a^(x+y) = a^x a^y
It is spectacular because it provides a link between addition and
multiplication.
We will especially consider the exponential exp_h(x) = (1 + h)^(x/h),
where h is a positive parameter. This is a supercool exponential because
it satisfies exp_h(x + h) = (1 + h) exp_h(x) so that
[exp_h(x + h) − exp_h(x)]/h = exp_h(x) .
Hold on to that. We will look at this later again. In modern language,
we would say that the quantum derivative of the quantum exponential
is the function itself for any Planck constant h.
For h = 1, we have the function 2^x we have started with. In the
limit h → 0, we get the important exponential function exp(x) which
we also call e^x. For x = 1, we get the Euler number e = e^1 =
2.71828....
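The rule [exp_h(x + h) − exp_h(x)]/h = exp_h(x) and the limit h → 0 can be explored numerically. The sketch below is only an illustration: the function names `exp_h` and `D` and the sample values of x and h are choices made here, and floating point arithmetic only confirms the identity up to rounding.

```python
import math

def exp_h(x, h):
    """The exponential (1 + h)^(x/h) for a positive parameter h."""
    return (1.0 + h) ** (x / h)

def D(f, x, h):
    """Difference quotient [f(x + h) - f(x)] / h for a fixed step h."""
    return (f(x + h) - f(x)) / h

h = 0.5                            # illustrative step size
x = 1.25                           # illustrative point
lhs = D(lambda t: exp_h(t, h), x, h)
rhs = exp_h(x, h)
print(lhs, rhs)                    # the two values agree up to rounding

# exp_h(1) for h = 0.1, 0.01, ..., 0.000001 approaches the Euler number e.
approximations = [exp_h(1.0, 10.0 ** -k) for k in range(1, 7)]
for value in approximations:
    print(value, math.e - value)
```

For h = 1 the same function gives `exp_h(3, 1) = 2**3`, the power function 2^x from the start of the section.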
1 What is 2^5?
2 Find 2^(1/2).
3 Find 27^(1/3).
4 Why is A = 2^(3/4) smaller than B = 2^(4/5)? Take the 20th power
of both numbers.
5 Assume h = 2. Find exp_h(4).
Lecture 3: Limits
Sometimes, functions look as if they are not defined at some point. They often allow a continuation
to non-allowed places however. Let's look at some examples:
1 The function f(x) = (x^3 − 1)/(x − 1) is at first not defined at x = 1. However, for x close
to 1, nothing really bad happens. We can evaluate the function at points closer and closer
to 1 and get closer and closer to 3. We will say lim_{x→1} f(x) = 3. Indeed, as you might
have seen already, we have f(x) = x^2 + x + 1 by factoring out the term x − 1. While the
function was initially not defined at x = 1, we can assign a natural value 3 at the point
x = 1 and keep a nice function. The graph will continue nicely through that point.
Definition. We write x → a if we mean that the number x approaches a from either
side. A function f(x) has a limit at a point a if there exists b such that f(x) → b
for x → a. We write lim_{x→a} f(x) = b. It should not matter whether we approach a
from the left or from the right. In both cases, we should get the same limiting value
b.
2 The function f(x) = sin(x)/x is called sinc(x). It converges to 1 as x → 0. We can see this
geometrically by comparing the side a = sin(x) of a right-angled triangle with a small angle
x and hypotenuse 1 with the length of the arc between B, C of the unit circle centered
at A. The arc has length x which is close to sin(x) for small x. Keep this example in mind.
It is a good one. Remark. It is possible to see this analytically. A computer for example
approximates the function sin(x) with a polynomial like x − x^3/3! + x^5/5! − ... + x^101/101!
and if we divide this by x, we get 1 − x^2/3! + x^4/5! − ... + x^100/101! which converges to 1
as x approaches 0.
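The convergence of sin(x)/x to 1 can also be watched numerically. This is just a small experiment, with the sample points chosen arbitrarily for this illustration; it compares sinc(x) with the first two terms 1 − x^2/6 of the polynomial above.

```python
import math

def sinc(x):
    """sin(x)/x for x different from 0."""
    return math.sin(x) / x

# Evaluate at points approaching 0 from the right and from the left.
for x in (1.0, 0.1, 0.01, 0.001, -0.001):
    print(x, sinc(x), 1 - x**2 / 6)
```

The values approach 1 from both sides, and the gap to 1 shrinks like x^2/6.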
3 The quadratic function f(x) = x^2 has the property that f(x) approaches 4 if x approaches 2.
This is a very typical case. To evaluate functions at a point, we do not have to take a
limit. The function is already defined there. This is important: given a typical function,
most points are healthy. We do not have to worry about limits there. In most cases we
see in real applications, we only have to worry about limits when the function divides by
0. For example f(x) = (x^4 + x^2 + 1)/x needs to be investigated carefully only at x = 0.
You see for example that for x = 1/1000, the function is slightly larger than 1000, for
x = 1/1000000 it is larger than one million. There is no rescue here. The limit does not
exist at 0.
4 More generally, for all polynomials, the limit lim_{x→a} f(x) = f(a) is defined. We do not
have to worry about limits if we deal with polynomials.
5 For all trigonometric polynomials involving sin and cos, the limit lim_{x→a} f(x) = f(a) is
defined. We do not have to worry about limits if we deal with trigonometric polynomials
like sin(3x) + cos(5x). The function tan(x) however has no limit at x = π/2. There is no
value b we can find so that tan(π/2 + h) → b for h → 0. This is due to the fact that cos(x)
is zero at π/2. We have tan(x) going to +∞ for x ↗ π/2 and tan(x) going
to −∞ for x ↘ π/2. In the first case, we approach π/2 from the left and in the second
case from the right.
6 The cube root function f(x) = x^(1/3) converges to 0 as x → 0. For x = 1/1000 for example,
we have f(x) = 1/10; for x = 1/n^3 the value f(x) is 1/n. The cube root function is defined
everywhere on the real line, like f(8) = 2, and is continuous everywhere.
Why do we worry about limits at all? One of the main reasons is that we will define
the derivative and integral using limits. But we will also use limits to get numbers like
π = 3.1415926.... In the next lecture, we will look at the important concept of continuity,
which involves limits too.
Figure: We can test whether a function has the limit b at a point a if for every vertical
interval I containing b there exists a horizontal interval J containing a such that if x is
in J, then f(x) is in I. If the function stays bounded, does not oscillate at the point like
sin(1/x) or jump, then the limit exists.
Figure: We see here the function f(x) = arctan(tan(x) + 1), where arctan is the inverse
of tan giving the angle from the slope. In this case, the limit does not exist for a = π/2. If
we approach this point a from the right, we are always far below the limiting value. The
limit exists from the left if we postulate f(π/2) = π/2. Note that f has a priori no value
at x = π/2 because tan(x) becomes infinite there.
7 Problem: Determine for the following functions whether the limits lim_{x→0} f(x) exist.
If the limit exists, find it.
a) f(x) = cos(x)/cos(2x)
b) f(x) = tan(x)/x
c) f(x) = (x^2 − x)/(x − 1)
d) f(x) = (x^4 − 1)/(x^2 − 1)
e) f(x) = (x + 1)/(x − 1)
f) f(x) = x/sin(x)
g) f(x) = sin(x)/x^2
h) f(x) = sin(x)/sin(2x)
Solution: a) There is no problem at x = 0. Both the numerator and denominator
converge to 1. The limit is 1.
b) This is sinc(x)/cos(x). There is no problem at x = 0 for sinc nor for 1/cos(x). The
limit is 1.
c) We can heal this function. It is the same as x. The limit is 0.
d) We can heal this function. It is the same as x^2 + 1. The limit is 1.
e) There is no problem at x = 0. There is mischief at x = 1, but that is far, far
away. At x = 0, we get −1.
f) This is the prototype. We know that the limit is 1.
g) This limit does not exist. The function is sinc(x)/x. Because sinc(x) converges to 1, we
are in trouble when dividing again by x. There is no limit.
h) We know sin(x)/x → 1 so that also sin(2x)/(2x) has the limit 1. If we divide them, we see
sin(x)/sin(2x) → 1/2. The result is 1/2.
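Answers like these can be sanity-checked by evaluating a function at points closer and closer to 0 from both sides. The helper `probe` and the step sizes below are assumptions made for this sketch; a numerical experiment suggests a limit but does not prove it.

```python
import math

def probe(f, a=0.0, steps=(1e-2, 1e-4, 1e-6)):
    """Evaluate f at a - s and a + s for shrinking steps s."""
    return [(f(a - s), f(a + s)) for s in steps]

candidates = {
    "tan(x)/x": lambda x: math.tan(x) / x,
    "sin(x)/sin(2x)": lambda x: math.sin(x) / math.sin(2 * x),
    "sin(x)/x^2": lambda x: math.sin(x) / x**2,
}

for name, f in candidates.items():
    left, right = probe(f)[-1]     # values at the smallest step
    print(f"{name}: from the left {left:.6f}, from the right {right:.6f}")
```

For the first two functions the values from the left and from the right settle near a common number. For sin(x)/x^2 they run off in opposite directions, which matches the statement that no limit exists.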
Homework
1 a) Draw the graph of the function
f(x) = (1 − cos(x))/x^2.
b) Where is the function f defined? Can you find the limit at the places where it is
not defined?
c) A function is even if f(−x) = f(x), odd if f(−x) = −f(x). Is f even or odd, or
neither?
d) What happens with the function f in the limit x → +∞ and x → −∞?
2 Find the limits of each of the following functions for x → 0:
a) f(x) = (x^4 − 1)/(x − 1)
b) f(x) = sin(3x)/x
c) f(x) = sin(5x)/x
d) f(x) = sin(3x)/sin(5x)
3 a) Can you see the limit of g(h) = [f(x + h) − f(x)]/h as a function of h at the point
x = 0 for the function f(x) = sin(x)?
b) Verify that the function f(x) = exp_h(x) = (1 + h)^(x/h) satisfies [f(x + h) − f(x)]/h =
f(x). We define e^x = exp(x) = lim_{h→0} exp_h(x).
4 Find the limits for x → 0:
a) f(x) = (x^2 − 2x + 1)/(x − 1).
b) f(x) = 2^x.
c) f(x) = 2^(2^x).
d) f(x) = sin(sin(x))/sin(x).
5 We explore in this problem the limit of the function f(x) = x^x if x → 0. Can we find
a limit? Take a calculator or use Wolfram and experiment. What do you see when
x → 0? Optional: can you find an explanation for your experiments?
We study a few limits.
The Sinc function
A prototype function for studying limits is the sinc function
f(x) = sin(x)/x.
It is an important function and appears in many applications, like in
the study of waves or signal processing (it is used in low pass filters).
The name sinc comes from its original Latin name sinus cardinalis.
[Figure: graph of sinc(x).]
1 Does the function cos(x)/x have a limit at x → 0?
2 Does the function sin(x^2)/x^2 have a limit for x → 0?
3 Does the function sin(x^2)/x have a limit for x → 0?
4 Does the function sin^2(x)/x^2 have a limit for x → 0?
5 Does the limit of (1 − cos^2(x))/x^2 exist for x → 0?
6 Does the function x/sin(x) have a limit for x → 0?
7 Does the function sin(x)/|x| have a limit for x → 0?
8 Does the function sin(x)/|x| have a limit for x → 0?
Lecture 4: Continuity
A function f is called continuous at a point p if a value f(p) can be found such
that f(x) → f(p) for x → p. A function f is called continuous on [a, b] if it is
continuous for every point x in the interval [a, b].
In the interior (a, b), the limit needs to exist both from the right and from the left. At the
boundary a only the right limit needs to exist and at b only the left limit. Intuitively, a function is
continuous if you can draw the graph of the function without lifting the pencil. Continuity
means that small changes in x result in small changes of f(x).
1 Any polynomial is continuous everywhere. To see this, note that the sum of two continuous
functions is continuous and that a multiple of a continuous function is continuous. Since
x^n is continuous for all n, and every polynomial is a sum of multiples of such functions, we
have continuity in general.
2 The function f(x) = 1/x is continuous everywhere except at x = 0. It is a prototype of a
function which is not continuous due to a pole. The source for the trouble is the division
by zero which would happen if we would try to evaluate the function at x = 0.
3 The function csc(x) = 1/sin(x) is not continuous at x = 0, x = π, x = 2π and any multiple
of π. It has poles there because sin(x) is zero there and because we would divide by zero at
such points.
4 The function f(x) = sin(π/x) is continuous everywhere except at x = 0. It is a prototype of
a function which is not continuous due to oscillation. We can approach x = 0 in ways that
f(x_n) = 1 and such that f(z_n) = −1. Just choose x_n = 2/(4n + 1) and z_n = 2/(4n − 1).
5 The signum function f(x) = sign(x), defined as 1 for x > 0, −1 for x < 0 and 0 for x = 0,
is not continuous at 0. It is a prototype of a function which has a jump discontinuity at 0.
We can refine the notion of continuity and say that a function is continuous from the
right at a, if the limit from the right lim_{x→a+} f(x) exists and equals f(a). Similarly a function f can be
continuous from the left only. Most of the time, when we say continuous we mean continuous
on the real line.
Rules:
a) If f and g are continuous, then f + g is continuous.
b) If f and g are continuous, then the product f · g is continuous.
c) If f and g are continuous and if g > 0 then f/g is continuous.
d) If f and g are continuous, then the composition f ∘ g is continuous.
6 √(x^2 + 1) is continuous everywhere on the real line.
7 cos(x) + sin(x) is continuous everywhere.
8 The function f(x) = log(|x|) is continuous everywhere except at 0. Indeed, since for every
integer n we have f(e^(−n)) = −n, the value can become arbitrarily large in absolute value even
though e^(−n) converges to 0 for n running to infinity.
9 While log(|x|) is not continuous at x = 0, the function 1/ log |x| is continuous at x = 0. Is
it continuous everywhere?
10 The function f(x) = [sin(x + h) − sin(x)]/h is continuous for every h > 0. We will see next
week that nothing bad happens when h becomes smaller and smaller and that the continuity
will not deteriorate. Indeed, we will see that we get closer and closer to the cos function.
There are three major reasons why a function is not continuous at a point: it can jump,
oscillate or escape to infinity. Here are the prototype examples. We will look at more
during the lecture.
[Figure: graphs of the prototype examples, including sign(x) and 1/x.]
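The three mechanisms can be seen in a short numerical experiment. The sketch below uses the prototypes named in the text, sign(x) for the jump, sin(π/x) for the oscillation and 1/x for the escape to infinity; the helper `sign` and the sample points are choices made for this illustration.

```python
import math

def sign(x):
    """Signum function: 1 for x > 0, -1 for x < 0, 0 for x = 0."""
    return (x > 0) - (x < 0)

# Jump: values just left of 0, at 0 and just right of 0.
jump = (sign(-1e-9), sign(0.0), sign(1e-9))

# Oscillation: f(x) = sin(pi/x) along two sequences converging to 0.
def f(x):
    return math.sin(math.pi / x)

x_n = [2.0 / (4 * n + 1) for n in range(1, 6)]
z_n = [2.0 / (4 * n - 1) for n in range(1, 6)]
high = [f(x) for x in x_n]         # values close to 1
low = [f(z) for z in z_n]          # values close to -1

# Escape to infinity: 1/x at points approaching 0 from the right.
pole = [1.0 / x for x in (1e-1, 1e-3, 1e-6)]

print(jump, high, low, pole)
```

In each case there is no single value which the function approaches at 0, but for a different reason.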
Why do we like continuity? We will see many reasons during this course but for now let's
just say that:
Continuity tames a function. It can be pretty wild, but not too crazy.
[Figure: A wild continuous function. This Weierstrass function is believed to be a fractal.]
[Figure: A crazy discontinuous function. It is discontinuous at every point and known to be a fractal.]
Continuity will be useful later for extremization. A continuous function on an interval [a, b]
has a maximum and minimum. And if a continuous function is negative at some place and
positive at another, there is a point between where it is zero. These are all useful properties
to have and they do not hold if a function is not continuous.
11 Problem: Determine for each of the following functions where discontinuities appear and
give a short reason.
a) f(x) = log(|x^2 − 1|)
b) f(x) = sin(cos(π/x))
c) f(x) = cot(x) + tan(x) + x^4
d) f(x) = x^4 + 5x^2 − 3x + 4
e) f(x) = (x^2 − x)/x
Solution.
a) log(|x|) is continuous everywhere except at x = 0. Since x^2 − 1 = 0 for x = 1 or x = −1,
the function f(x) is continuous everywhere except at x = 1 and x = −1.
b) The function π/x is continuous everywhere except at x = 0. Therefore sin(cos(π/x)) is
continuous everywhere except possibly at x = 0. We have still to investigate the point x = 0
but there, the function cos(π/x) takes values between −1 and 1 for points arbitrarily close
to x = 0. The function f(x) takes values between sin(−1) and sin(1) arbitrarily close to
x = 0. It is not continuous there.
c) The function x^4 is continuous everywhere. We do not have to consider it. The function
tan(x) is continuous everywhere except at the points π/2 + kπ. The
function cot(x) is continuous everywhere except at the points kπ, integer multiples of π. The function f is
therefore continuous everywhere except at the points x = kπ/2, multiples of π/2.
d) The function is a polynomial. We know that polynomials are continuous everywhere.
e) The function is continuous everywhere except at x = 0, where we have to look at the
function more closely. But we can heal the function by dividing numerator and denominator
by x, which is possible for x different from 0. We get x − 1.
Homework
1 On which intervals is the following function continuous?
[Figure: graph of the function; the horizontal axis is marked from −1 to 6 and the vertical axis from −4 to 6.]
2 For the following functions, determine the points, where f is not continuous.
a) f(x) = tan(1 x)
b) x cos(1/x)
c) sign(x)/x
d) sinc(x) + sin(x) + x^8 + log(x)
e) (x^2 + 5x + x^4)/(x − 1)
State which kind of discontinuity appears.
3 Construct a function which has a jump discontinuity, an oscillatory one as well as an escape
to infinity. Can you construct an example where two of these flaws happen at the same
point? Can you even construct an example where all three happen at the same point?
4 Heal the following functions:
a) (x^5 − 32)/(x − 2)
b) (x^5 − x^3)/(x^2 − 1)
c) ((sin(x))^3 − sin(x))/sin(x).
d) (x^3 + 3x^2 + 3x + 1)/(x^2 + 2x + 1)
e) (x^1000 − 1)/(x^100 − 1)
5 Is the following function continuous?
cos(cos(cos(cos(cos(x))))
sin(sin(sin(e^(e^(e^(e^(e^(e^(e^(e^(e^(e^(e^(e^x))))))))))))))
log(2x+1)+2+cos((x))
What's good and what's bad?
We have seen that oscillation, poles and jumps are the
perils for continuity. In general, we do not have to worry
about continuity. There are very few mechanisms which
bring you in peril. A function can either start to oscillate
like mad, rush to infinity or jump. All cases are usually
due to division by zero somewhere.
Good Guys              Bad Guys
x^2 + 4x + 6           1/x at 0
sin(x), cos(x)         tan(x) at π/2
exp(x)                 log |x| at 0
sinc(x) = sin(x)/x     sec(x) = 1/cos(x) at π/2
Which functions are continuous?
Which of the following functions are continuous?
1 Is f(x) = |x| continuous at x = 0?
2 Is f(x) = 1/|x| continuous at x = 0?
3 Is 1/log|x^2| continuous at x = 0?
4 Is log(log |x|) continuous at x = 0?
5 Is 1/(1 + 1/(x^4 + 1)) continuous everywhere?
6 Is sin(sec(x)) continuous everywhere?
Enemy of continuity
Oscillations, escape to infinity and jumps are reasons for
discontinuity.
[Figure: graphs of prototype discontinuities, including 1/x and sign(x).]
Lecture 5: Intermediate Value Theorem
If f(a) = 0, then the value a is called a root of f. For example, f(x) = cos(x) has
the root x = π/2.
1 f(x) = 4x + 6. Find the roots of f. Answer: set the function equal to 0 and solve for x.
We get 4x + 6 = 0, so x = −3/2.
2 f(x) = x^2 + 2x + 1. Find the roots of f. Answer: we can write f(x) = (x + 1)^2. The function
has the root x = −1.
3 f(x) = (x − 2)(x + 6)(x + 3). Find the roots of f.
4 f(x) = 12 + x − 13x^2 − x^3 + x^4. Find the roots of f. We do not have a formula for this, but
we can try. Indeed, we see that for x = 1, x = −3, x = 4, x = −1 we have roots.
5 f(x) = exp(x). This function does not have any root.
6 f(x) = 2^x − 16 has the root x = 4.
Intermediate value theorem of Bolzano. If f is continuous
on [a, b] and f(a), f(b) have different signs, there is a root
of f in (a, b).
Proof. We can assume f(a) < 0 and f(b) > 0. The other case is similar. Look at the point
c = (a + b)/2. If f(c) < 0, then take [c, b] as your new interval, otherwise take [a, c].
We get a new root problem on a smaller interval. Repeat the procedure. After n steps,
the search is narrowed to an interval [u_n, v_n] of size 2^(−n)(b − a). Continuity assures that
f(u_n) − f(v_n) → 0 and f(u_n), f(v_n) have different signs. Both u_n, v_n converge to a root of
f.
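The halving procedure in the proof is also a practical way to locate a root. Here is a minimal sketch of it; the function name `bisect`, the tolerance and the test function cos on [0, 3] are assumptions made for this illustration, not part of the lecture.

```python
import math

def bisect(f, a, b, tolerance=1e-12):
    """Halve [a, b] repeatedly, always keeping an interval where f changes sign."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have different signs")
    while b - a > tolerance:
        c = (a + b) / 2
        fc = f(c)
        if fc == 0:
            return c
        if fa * fc < 0:            # sign change in [a, c]
            b, fb = c, fc
        else:                      # sign change in [c, b]
            a, fa = c, fc
    return (a + b) / 2

# cos(0) > 0 and cos(3) < 0, so there is a root in (0, 3).
root = bisect(math.cos, 0.0, 3.0)
print(root, math.pi / 2)
```

Each step halves the interval, so about 40 steps already narrow an interval of length 3 down to the chosen tolerance.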
7 The function f(x) = x^17 − x^3 + x^5 + 5x^7 + sin(x) has a root. Solution. The function
goes to +∞ for x → ∞ and to −∞ for x → −∞. We have for example f(10000) > 0
and f(−1000000) < 0. The intermediate value theorem assures there is a point where
f(x) = 0.
8 There is a solution to the equation x^x = 10. Solution: for x = 1 we have x^x = 1, for x = 10
we have x^x = 10^10 > 10. Apply the intermediate value theorem.
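The theorem only guarantees that a solution of x^x = 10 exists, but repeated halving of the interval [1, 10] also approximates it. The following is a small sketch under the assumption that 60 halving steps are enough for double precision; the names `g`, `low` and `high` are chosen for this illustration.

```python
def g(x):
    """g has a root exactly where x^x = 10."""
    return x ** x - 10.0

low, high = 1.0, 10.0              # g(1) = -9 < 0 and g(10) = 10^10 - 10 > 0
for _ in range(60):
    mid = (low + high) / 2
    if g(mid) < 0:
        low = mid                  # the sign change is in [mid, high]
    else:
        high = mid                 # the sign change is in [low, mid]

solution = (low + high) / 2
print(solution, solution ** solution)
```

The printed number lies between 2.5 and 2.51, and raising it to its own power gives 10 up to rounding.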
9 There exists a point on the earth where the temperature is the same as the temperature on
its antipode. Solution: Let's draw a meridian through the north and south pole and let f(x)
be the temperature on that circle. Define g(x) = f(x) − f(x + π). If this function is zero on
the north pole, we have found our point. If not, g(x) has different signs on the north and south
pole. There exists therefore a point where the temperature is the same.
10 Wobbly Table Theorem. On an arbitrary floor, a square table
can be turned so that it does not wobble any more.
Why? The 4 legs ABCD are on a square. Let x be the angle of the line AC with some
coordinate axis if we look from above. Given the angle x, we can position the table uniquely
as follows: the center of ABCD is on the z-axis, the legs ABC are on the floor and AC
points in the direction x. Let f(x) denote the height of the fourth leg D from the ground.
If we find an angle x such that f(x) = 0, we have a position where all four legs are on
the ground. Assume f(0) is positive. (If it is negative, the argument is similar.) Tilt the
table around the line AC so that the two legs B, D have the same vertical distance h from
the ground. Now translate the table down by h. This does not change the angle x nor the
center of the table. The two previously hovering legs B, D now touch the ground and the
two others A, C are below. Now rotate around BD so that the third leg C is on the ground.
The rotations and lowering procedures have not changed the location of the center of the
table nor the direction. This position is the same as if we had turned the table by π/2.
Therefore f(π/2) < 0. The intermediate value theorem assures that f has a root between 0
and π/2.
Define $Df(x) = (f(x+h) - f(x))/h$. Let's call it the derivative of $f$ for the constant
$h$. We will study it more in the next lecture. But you have verified for example
$D\exp_h(x) = \exp_h(x)$ in a homework.
Let's call a point $p$ where $Df(p) = 0$ a critical point for $h$. Let's call a point $a$ a
local maximum if $f(a) \ge f(x)$ in an open interval containing $a$. Define similarly
a local minimum as a point where $f(a) \le f(x)$.
11 The function $f(x) = x(x - h)(x - 2h)$ has the derivative $Df(x) = 3x(x - h)$ as you have
verified in the case $h = 1$ in the first lecture of this course in a worksheet. We will
write $[x]^3 = x(x - h)(x - 2h)$ and $[x]^2 = x(x - h)$. The computation just done tells that
$D[x]^3 = 3[x]^2$. Since $[x]^2$ has exactly two roots $0, h$, the function $[x]^3$ has exactly 2 critical
points.
12 More generally for $[x]^{n+1} = x(x - h)(x - 2h)\cdots(x - nh)$ we have $D[x]^{n+1} = (n + 1)[x]^n$.
Because $[x]^n$ has exactly $n$ roots, the function $[x]^{n+1}$ has exactly $n$ critical points. Keep the
formula
$D[x]^n = n[x]^{n-1}$
in mind!
13 The function $\exp_h(x) = (1 + h)^{x/h}$ satisfies $D\exp_h(x) = \exp_h(x)$. Because this function has
no roots and the derivative is the function itself, the function has no critical points. Indeed,
this function is monotone.
Figure: We see the function $[x]^4 = x(x - h)(x - 2h)(x - 3h)$ with $h = 0.5$. This function
has 3 critical points because $D[x]^4 = 4[x]^3$ and $[x]^3$ has roots at $0, h, 2h$. There are three
local maxima or minima according to the theorem.
Later in the course, we will look at the derivative $Df$ in the limit when $h \to 0$. And then the
critical points are places where the tangent is horizontal. In our case now, a critical point is
a point so that if we walk by a step $h$ to the right, the function does not change. For now,
just remember the formula $D[x]^n = n[x]^{n-1}$. It will be the same formula later on when we
go to the limit $h \to 0$.
Critical points lead to extrema as we will see later in the course. In our discrete setting we
can say:
Fermat's maximum theorem: If $f$ is continuous and has a critical point $a$ for
$h$, then $f$ has either a local maximum or local minimum inside the open interval
$(a, a + h)$.
Look at the range of the function $f$ restricted to $[a, a + h]$. It is a bounded interval $[c, d]$ by
the intermediate value theorem. There exists especially a point $u$ for which $f(u) = c$ and
a point $v$ for which $f(v) = d$. These points are different if $f$ is not constant on $[a, a + h]$.
There is therefore one point where the value is different than $f(a)$. Because $Df(a) = 0$ means
$f(a + h) = f(a)$, this point lies inside the open interval. If the value is larger, we have
a local maximum. If it is smaller we have a local minimum.
14 Problem. Verify that a cubic polynomial has maximally 2 critical points. Solution: $f(x) =
ax^3 + bx^2 + cx + d$. Because the $x^3$ terms cancel in $f(x + h) - f(x)$, this is a quadratic
polynomial. It has maximally 2 roots.
Homework
1 Find the roots for $f(x) = -30 + 49x - 19x^2 - x^3 + x^4$.
2 Use the intermediate value theorem to find a root of $f(x) = x^2 - 6x + 8$ on $[0, 3]$. Are
all roots in this interval?
3 a) Argue why there was a time when Lady Gaga's height was exactly 1 meter and not
one mm more or less.
b) And that there was a time when she weighed 50 kg and not a milligram more or
less.
c) Was there a time when she owned exactly 1000000 dollars and not one dime more
or less?
4 Argue why there is a solution to
a) $\cos(x) = x$.
b) $\exp(-x) = x$.
c) $\mathrm{sinc}(x) = x^4$.
5 a) Draw the graph of $f(x) = x^3 - x$.
b) Locate the local maxima and minima.
c) Find the critical points of $f$ to the constant $h = 1$. That means, find the places
where $f(x + 1) - f(x) = 0$.
d) For every point $a$ you have found in c), verify that there is a local maximum or
minimum in $[a, a + 1]$.
It's groundhog day and a blizzard is coming. We study extrema and
the intermediate value theorem.
The intermediate value theorem
1 Today on groundhog day, the average temperature is 33° Fahrenheit.
Last summer, there was an average temperature of 77.2°.
Was there a time between July 1, 2010 and Feb 2, 2011, when the
temperature was exactly 50°?
2 We have got 38 inches of snow this month already. Does this
mean there was a time that we had 20 inches of snow on the
ground?
3 Is there a point $x$ where $1/\sin(x) = 1/2$? Why does the
intermediate value theorem not give such a point? We have
$1/\sin(\pi/2) = 1$ and $1/\sin(3\pi/2) = -1$.
4 Is there a point where $\mathrm{sign}(x) = 1/2$? Remember the signum
function. It is $1$ for positive numbers, $0$ for $0$ and $-1$ for negative
numbers.
5 Let's call the function $f(x) = x - \mathrm{floor}(x)$ the ground hog
function. If you know the movie with Bill Murray, you know
why. Can you find an interval on which the intermediate value
theorem fails?
[Figure: graph of the ground hog function, with the points 0, 1, 2 on the axis each labeled Feb 2.]
The derivative and extrema
6 Find a concrete function which has only one local maximum,
and no local minimum.
7 We have seen a remarkable theorem assuring the existence of
maxima and minima. In the classical sense this is not true. We
will define critical points as points where $f'(x) = 0$ and see that
for $f(x) = x^3$, the derivative is $3x^2$ which is zero at $x = 0$. Does
$f(x)$ have a local maximum or minimum at $x = 0$?
Lecture 6: Some examples
Here are some worked out examples, similar to what we expect you to do for the homework of
lecture 6. The homework should be straightforward, except that when finding $Sf(x)$, we want to add
a constant such that $Sf(0) = 0$. In general, you will not need to evaluate functions and can leave
terms like $\sin(5 \cdot x)$ as they are. If you have seen calculus already, then you could do this exercise
by writing $\frac{d}{dx} f(x)$ instead of $Df(x)$ and by writing $\int_0^x f(x)\,dx$ instead of $Sf(x)$.
We did not introduce the derivative $df/dx$ nor the integral $\int_0^x$ yet. For now, just
use the differentiation rules and integration rules in the box to the right to solve the problems.
1 Problem: Find the derivative $Df(x)$ of the function $f(x) = \sin(5 \cdot x) + x^7 + 3$.
Answer: From the differentiation rules, we know $Df(x) = 5\cos(5 \cdot x) + 7x^6$.
2 Problem: Find the derivative $Df(0)$ of the function $f(x) = \sin(5 \cdot x) + 5x^7 + 3$.
Answer: We know $Df(x) = 5\cos(5 \cdot x) + 35x^6$. Plugging in $x = 0$ gives $5$.
3 Problem: Find the integral $Sf(x)$ of the function $f(x) = \sin(5 \cdot x) + 5x^7 + 3$.
Answer: From the integration rules, we know $Sf(x) = -\cos(5 \cdot x)/5 + 1/5 + 5x^8/8 + 3x$.
The constant $1/5$ has been added such that $Sf(0) = 0$.
4 Problem: Find the integral $Sf(1)$ of the function $f(x) = x^2 + 1$.
Answer: From the integration rules, we know $Sf(x) = x^3/3 + x$. Plugging in $x = 1$ gives
$1/3 + 1$ if we use the functions in the limit $h \to 0$. For positive $h$, we have to evaluate
$x(x - h)(x - 2h)/3 + x$ for $x = 1$ which is $(1 - h)(1 - 2h)/3 + 1$.
5 Problem: Find the integral $Sf(1)$ of the function $f(x) = \exp(4 \cdot x)$.
Answer: From the integration rules, we know $Sf(x) = \exp(4 \cdot x)/4 - 1/4$. We have added
a constant such that $Sf(0) = 0$. Plugging in $x = 1$ gives $\exp(4)/4 - 1/4$.
6 Problem: Assume $h = 1/1000$. Determine the value of
$\frac{1}{1000}\left[f\left(\frac{0}{1000}\right) + f\left(\frac{1}{1000}\right) + \dots + f\left(\frac{999}{1000}\right)\right]$
for the function $f(x) = \sin(7x) + \exp(3x)$.
Answer: The problem asks for $Sf(1)$. We first compute $Sf(x)$ taking care that $Sf(0) = 0$.
$Sf(x) = -\cos(7x)/7 + \exp(3x)/3 + 1/7 - 1/3$.
Now plug in $x = 1$ to get $-\cos(7)/7 + \exp(3)/3 + 1/7 - 1/3$.
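The sum in Problem 6 can also be added up by brute force. The sketch below uses the classical `math.sin` and `math.exp` as stand-ins for the functions of the lecture (an assumption: the rules in the box are exact only for the $h$-versions of these functions), so the two numbers agree only up to an error of the size of $h$.

```python
import math

def f(x):
    return math.sin(7 * x) + math.exp(3 * x)

n = 1000
h = 1 / n

# h [ f(0) + f(h) + ... + f(999 h) ], the sum asked for in the problem
riemann_sum = h * sum(f(k * h) for k in range(n))

# Value of the answer formula at x = 1
closed_form = -math.cos(7) / 7 + math.exp(3) / 3 + 1 / 7 - 1 / 3

print(riemann_sum, closed_form, closed_form - riemann_sum)
```

The remaining gap shrinks in proportion to $h$ when $n$ is increased.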
Lecture 6: Fundamental theorem
Calculus is the theory of differentiation and integration. We fix here a positive constant $h$
and take differences and sums. Without taking limits, we prove a version of the fundamental
theorem of calculus and differentiate and integrate polynomials, exponentials and trigonometric
functions.
Given a function, define the differential quotient
$Df(x) = (f(x + h) - f(x))\,\frac{1}{h}$
If $f$ is continuous then $Df$ is a continuous function too. We call it also derivative.
1 Let's take the constant function $f(x) = 5$. We get $Df(x) = (f(x+h) - f(x))/h = (5 - 5)/h =
0$ everywhere. We see that in general if $f$ is a constant function, then $Df(x) = 0$.
2 $f(x) = 3x$. We have $Df(x) = (f(x + h) - f(x))/h = (3(x + h) - 3x)/h$ which is $3$.
3 If $f(x) = ax + b$, then $Df(x) = a$.
For constant functions, the derivative is zero. For linear functions, it is the slope.
4 For $f(x) = x^2$ we compute $Df(x) = ((x + h)^2 - x^2)/h = (2hx + h^2)/h$ which is $2x + h$.
Given a function $f$ define a new function $Sf(x)$ by summing up all values of $f(hj)$
where $0 \le jh < x$. That is, if $k$ is such that $kh$ is the largest below $x$, then
$Sf(x) = h[ f(0) + f(h) + f(2h) + \dots + f(kh) ]$
We call $Sf$ also the integral or antiderivative of $f$.
5 Compute $Sf(x)$ for $f(x) = 1$. Solution. We have $Sf(x) = 0$ for $x \le 0$, and $Sf(x) = h$
for $0 < x \le h$ and $Sf(x) = 2h$ for $h < x \le 2h$. In general $S1(jh) = jh$ and $S1(x) = (k+1)h$
where $k$ is the largest integer such that $kh < x$. The function grows linearly but in quantized
steps.
The difference $Df(x)$ will become the derivative $f'(x)$.
The sum $Sf(x)$ will become the integral $\int_0^x f(t)\,dt$.
$Df$ means rise over run and is close to the slope of the graph of $f$.
$Sf$ means areas of rectangles and is close to the area under the graph of $f$.
[Figures: two graphs over the points $0, h, \dots, kh$ on the x-axis, one marked with $f(kh) - f(0)$, the other with $f(kh)$.]
Theorem: Sum the differences and get
$SDf(kh) = f(kh) - f(0)$
Theorem: Difference the sum and get
$DSf(kh) = f(kh)$
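Both statements can be checked on a computer for a concrete function. This is a small sketch in which `D` and `S` are hypothetical helper names implementing the definitions above on the grid points $x = kh$; the test function and the value of $h$ are arbitrary choices.

```python
def D(f, h):
    """Difference quotient Df(x) = (f(x + h) - f(x)) / h."""
    return lambda x: (f(x + h) - f(x)) / h

def S(f, h):
    """Sum Sf(kh) = h [f(0) + f(h) + ... + f((k - 1) h)] at grid points x = k h."""
    def Sf(x):
        k = round(x / h)
        return h * sum(f(j * h) for j in range(k))
    return Sf

h = 0.25
f = lambda x: x**3 - 2 * x + 1
x = 8 * h                      # the grid point kh with k = 8

sdf = S(D(f, h), h)(x)         # should equal f(kh) - f(0)
dsf = D(S(f, h), h)(x)         # should equal f(kh)
print(sdf, f(x) - f(0))
print(dsf, f(x))
```

The first identity is a telescoping sum, the second holds because $Sf(x+h) - Sf(x)$ consists of the single term $h f(x)$.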
6 For $f(x) = [x]^m_h = x(x - h)(x - 2h)\cdots(x - mh + h)$ we have
$f(x+h) - f(x) = (x(x - h)(x - 2h)\cdots(x - mh + 2h))\,((x + h) - (x - mh + h)) = [x]^{m-1}_h\, hm$
and so $D[x]^m_h = m[x]^{m-1}_h$. Let's leave the $h$ away to get the important formula $D[x]^m = m[x]^{m-1}$.
We can establish from this differentiation formulas for polynomials.
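The formula $D[x]^m = m[x]^{m-1}$ is an exact identity for every positive $h$, not an approximation. The sketch below checks it with exact rational arithmetic; the helper name `falling` and the sample values of $x$, $h$ and $m$ are arbitrary illustrative choices.

```python
from fractions import Fraction

def falling(x, m, h):
    """[x]^m_h = x (x - h) (x - 2h) ... (x - (m - 1) h); the empty product is 1."""
    result = Fraction(1)
    for j in range(m):
        result *= x - j * h
    return result

h = Fraction(1, 2)
x = Fraction(7, 3)
m = 4

lhs = (falling(x + h, m, h) - falling(x, m, h)) / h   # D [x]^m
rhs = m * falling(x, m - 1, h)                        # m [x]^(m-1)
print(lhs, rhs)
```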
7 If $f(x) = [x] + [x]^3 + 3[x]^5$ then $Df(x) = 1 + 3[x]^2 + 15[x]^4$.
The fundamental theorem allows us to integrate and get the right values at the points k/n:
8 Find $Sf$ for the same function. The answer is $Sf(x) = [x]^2/2 + [x]^4/4 + 3[x]^6/6$.
Define $\exp_h(x) = (1+h)^{x/h}$. It is equal to $2^x$ for $h = 1$ and morphs into the function
$e^x$ when $h$ goes to zero. As a rescaled exponential, it is continuous and monotone.
9 The function $\exp_h(x) = (1 + h)^{x/h}$ satisfies $D\exp_h(x) = \exp_h(x)$.
Solution: $\exp_h(x + h) = (1 + h)\exp_h(x)$ shows that $D\exp_h(x) = \exp_h(x)$.
10 Define $\exp^a_h(x) = (1 + ah)^{x/h}$. Now $D\exp^a_h(x) = a\exp^a_h(x)$. Since $\exp_h(ax)$ is not equal to
$\exp^a_h(x)$, we write also $e^{a \cdot x}_h = \exp_h(a \cdot x) = \exp^a_h(x)$, with a dot in the argument. Now: $D\exp_h(a \cdot x) = a\exp_h(a \cdot x)$.
11 We can also replace $a$ with the complex $ai$ and consider $\exp^{ai}_h(x) = (1 + aih)^{x/h}$. Now,
$D\exp^{ai}_h(x) = ai\exp^{ai}_h(x)$. Real and imaginary parts define new functions $\exp^{ai}_h(x) = \cos_h(a \cdot x) + i\sin_h(a \cdot x)$.
We have $D\sin_h(a \cdot x) = a\cos_h(a \cdot x)$ and $D\cos_h(a \cdot x) = -a\sin_h(a \cdot x)$.
These functions morph into the familiar cos and sin functions for $h \to 0$. But in general, for
any $h$ and any $a$, we have $D\cos_h(a \cdot x) = -a\sin_h(a \cdot x)$ and $D\sin_h(a \cdot x) = a\cos_h(a \cdot x)$.
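The functions $\cos_h$ and $\sin_h$ can be computed with complex numbers. The following sketch takes the real and imaginary parts of $(1 + aih)^{x/h}$ using Python's built-in complex power; the names `exp_ai`, `cos_h`, `sin_h` and the sample values of $h$, $a$, $x$ are illustrative assumptions.

```python
h = 0.1
a = 2.0

def exp_ai(x):
    """(1 + a i h)^(x/h), the complex exponential for step size h."""
    return (1 + 1j * a * h) ** (x / h)

def cos_h(x):
    return exp_ai(x).real

def sin_h(x):
    return exp_ai(x).imag

def D(f):
    """Difference quotient with the fixed step size h."""
    return lambda x: (f(x + h) - f(x)) / h

x = 0.73
print(D(cos_h)(x), -a * sin_h(x))   # D cos_h(a.x) = -a sin_h(a.x)
print(D(sin_h)(x), a * cos_h(x))    # D sin_h(a.x) =  a cos_h(a.x)
```

Both pairs agree up to rounding errors, although $h = 0.1$ is far from zero.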
Homework
We leave the $h$ away in this homework. To have more fun, also define $\log_h$ as the inverse of $\exp_h$
and define $1/[x]_h = D\log_h(x)$ for $x > 0$. If we start integrating from 1 instead of 0 as usual we
have $S_1 1/[x]_h = \log_h(x)$.¹ We also write here $x^n$ for $[x]^n_h$ and write $\exp(a \cdot x) = e^{a \cdot x}$ instead of
$\exp^a_h(x)$ and $\log(x)$ instead of $\log_h(x)$ because we are among friends. Use the differentiation and
integration rules on the right to find derivatives and integrals of the following functions:
1 Find the derivatives $Df(x)$ of the following functions:
a) $f(x) = x^2 + 6x^7 + x$
b) $f(x) = x^4 + \log(x)$
c) $f(x) = 3x^3 + 17x^2 - 5x$. What is $Df(0)$?
2 Find the integrals $Sf(x)$ of the following functions:
a) $f(x) = x^4$.
b) $f(x) = x^2 + 6x^7 + x$
c) $f(x) = 3x^3 + 17x^2 - 5x$. What is $Sf(1)$?
3 Find the derivatives $Df(x)$ of the following functions
a) $f(x) = \exp(3 \cdot x) + x^6$
b) $f(x) = 4\exp(3 \cdot x) + 9x^6$
c) $f(x) = \exp(5 \cdot x) + x^6$
4 Find the integrals $Sf(x)$ of the following functions
a) $f(x) = \exp(6 \cdot x) - 3x^6$
b) $f(x) = \exp(8 \cdot x) + x^6$
c) $f(x) = \exp(5 \cdot x) + x^6$
5 Define $f(x) = \sin(4 \cdot x) - \exp(2 \cdot x) + x^4$ and assume $h = 1/100$ in part c).
a) Find $Df(x)$
b) Find $Sf(x)$
c) Determine the value of
$\frac{1}{100}\left[f\left(\frac{0}{100}\right) + f\left(\frac{1}{100}\right) + \dots + f\left(\frac{99}{100}\right)\right]$.
¹ We do not see $h$ in daily lives, or do we? An allegory: in our universe, where $h = 1.616 \cdot 10^{-35}$ m, the
difference between $\sin_h$ and $\sin$ is so small that an x-ray oscillating with $\omega = 10^{17}$ Hertz traveling for 13 billion
years $t = 4 \cdot 10^{17}$ s would only start to deviate noticeably from the classical $\sin(x)$ wave when it reaches us at
$t = 4 \cdot 10^{34}$ oscillations. Since $\sin_h(x) - \sin(x)$ only starts to grow at around $x = 1/h \approx 10^{35}$ oscillations, the
x-ray would look the same when using the trig functions $\sin_h, \cos_h$. If $\omega$ is in the Gamma ray spectrum $10^{19}$ Hz,
the functions $\sin_h, \cos_h$ start to grow in amplitude earlier. A wave emitted 1 billion years ago would be observed
as a Gamma ray burst.
All calculus on 1/3 page
Fundamental theorem of Calculus: $DSf(x) = f(x)$ and $SDf(x) = f(x) - f(0)$.
Differentiation rules
$Dx^n = nx^{n-1}$
$De^{a \cdot x} = ae^{a \cdot x}$
$D\cos(a \cdot x) = -a\sin(a \cdot x)$
$D\sin(a \cdot x) = a\cos(a \cdot x)$
$D\log(x) = 1/x$
Integration rules (for $x = kh$)
$Sx^n = x^{n+1}/(n + 1)$
$Se^{a \cdot x} = (e^{a \cdot x} - 1)/a$
$S\cos(a \cdot x) = \sin(a \cdot x)/a$
$S\sin(a \cdot x) = -\cos(a \cdot x)/a$
$S\frac{1}{x} = \log(x)$
Fermat's extreme value theorem: If $Df(x) = 0$ and $f$ is continuous, then $f$ has a local
maximum or minimum in the open interval $(x, x + h)$.
Pictures
[Figures: $[x]^3_h$ for $h = 0.1$; $\exp_h(x)$ for $h = 0.1$; $\sin_h(x)$ for $h = 0.1$; $\log_h(x)$ for $h = 0.1$.]
The exponential function
We illuminate the fundamental theorem for the exponential function
$\exp(x) = (1+h)^{x/h}$. While the discussion could be done for any $h > 0$
we look at the special case where $h = 1$ in which case $\exp(x) = 2^x$
maps positive integers to positive integers. You have verified in a
homework that
$D\exp(x) = \exp(x)$.
From the fundamental theorem, we get $SD\exp(x) = S\exp(x) =
\exp(x) - \exp(0)$ for integers $x$. That is
$S\exp(x) = \exp(x) - 1$.
In other words, for the exponential function, we know both the derivative
and the integral.¹
1 The formula $S\exp(x) = \exp(x) - 1$ tells for $x = 5$ that $1 +
2 + 4 + 8 + 16 = 32 - 1$. Verify it for $x = 7$.
2 Because $S\exp(x) = \exp(x) - 1$ we can interpret $\exp(x) - 1$ as
an area of a union of rectangles. In the picture below, shade an
area $\exp(3) - 1$.
3 In the right of the two pictures, there is a vertical line segment
which has length $\exp(3)$. Which one?
4 We know $D\exp(x) = \exp(x)$. Why is also the following formula
true?
$D(\exp(x) - 1) = \exp(x)$
5 The just verified formula can be interpreted as a difference
between areas and so an area. Which one for $x = 4$?
¹ Later in this course, we will look at these two formulas in the limit $h \to 0$, where
$\frac{d}{dx}\exp(x) = \exp(x)$, $\int_0^x \exp(t)\,dt = \exp(x) - 1$.
[Figure: two pictures of $\exp(x) = 2^x$ over $x = 0, 1, 2, 3, 4$, with the heights 1, 2, 4, 8, 16 marked.]
Lecture 7: Rate of change
Last week, we defined
$Df(x) = \frac{f(x + h) - f(x)}{h}$.
It is the rate of change of the function with step size $h$: when changing $x$ to $x + h$, we get
a response change from $f(x)$ to $f(x + h)$. In this lecture, we take the limit $h \to 0$ and derive the
important formulas
$\frac{d}{dx}x^n = nx^{n-1}$, $\frac{d}{dx}\exp(x) = \exp(x)$, $\frac{d}{dx}\sin(x) = \cos(x)$, $\frac{d}{dx}\cos(x) = -\sin(x)$
which we have seen already in a discrete setting.
1 You walk up a snow hill of height $f(x) = 30 - x^2$ meters. You walk with a step size of $h = 0.5$
meters. You are at position $x = 3$. How much do you climb or descend when making
another step? We have $f(3) = 21$ and $f(3.5) = 17.75$. We have walked down 3.25 meters.
How steep was the snow hill at this point? We have to divide the height difference by the
walking distance: $-3.25/0.5 = -6.5$.
Today, we take the limit $h \to 0$:
If the limit $\frac{d}{dx}f(x) = \lim_{h \to 0}\frac{f(x+h) - f(x)}{h}$ exists, we say $f$ is differentiable at the
point $x$. The value is called the derivative or instantaneous rate of change of
the function $f$ at $x$. We denote the limit also with $f'(x)$.
2 In the previous problem, $f(x) = 30 - x^2$, we have
$f(x + h) - f(x) = [30 - (x + h)^2] - [30 - x^2] = -2xh - h^2$
Dividing this by $h$ gives $-2x - h$. The limit $h \to 0$ gives $-2x$. We have just seen that for
$f(x) = x^2$, we get $f'(x) = 2x$. For $x = 3$, the slope of the hill is $-6$. The actual slope of the snow hill is a
bit smaller in size than the estimate done by walking. The reason is that the hill gets steeper.
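To watch the walking estimate approach the actual slope, one can shrink the step size. A minimal sketch for the snow hill $f(x) = 30 - x^2$ at $x = 3$ follows; the list of step sizes is an arbitrary choice.

```python
def f(x):
    return 30 - x**2

def D(f, x, h):
    """Rate of change of f at x with step size h."""
    return (f(x + h) - f(x)) / h

# The difference quotient equals -2x - h = -6 - h at x = 3.
slopes = {h: D(f, 3.0, h) for h in (0.5, 0.1, 0.01, 0.001)}
for h, slope in slopes.items():
    print(h, slope)
```

The values move from $-6.5$ towards the limit $-6$ as $h$ gets smaller.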
The derivative $f'(x)$ has a geometric meaning. It is the
slope of the tangent at $x$. This
is an important geometric interpretation.
It is useful to
think about $x$ as time and
the derivative as the rate of
change of the quantity $f(x)$ in
time.
[Figure: graph with the points $x$ and $x+h$, the values $f(x)$ and $f(x+h)$, and the step $h$ marked.]
For $f(x) = x^n$, we have $f'(x) = nx^{n-1}$.
Proof: $f(x+h) - f(x) = (x+h)^n - x^n = (x^n + nx^{n-1}h + a_2h^2 + \dots + h^n) - x^n = nx^{n-1}h + a_2h^2 +
\dots + h^n$. If we divide by $h$, we get $nx^{n-1} + h(a_2 + \dots + h^{n-2})$ for which the limit $h \to 0$
exists: it is $nx^{n-1}$. This is an important result because most functions can be approximated
very well with polynomials.
3 For $f(x) = \sin(x)$ we have $f'(0) = 1$ because the differential
quotient is $[f(0 + h) - f(0)]/h = \sin(h)/h =
\mathrm{sinc}(h)$. We have already seen before that the limit is 1.
Let's look at it again geometrically. For all $0 < x < \pi/2$
we have
$\sin(x) \le x \le \tan(x)$.
[Dividing by 2 squeezes the area of the sector between the areas
of two triangles.] Because $\tan(x)/\sin(x) = 1/\cos(x) \to 1$ for
$x \to 0$, the value of $\mathrm{sinc}(x) = \sin(x)/x$ must go to 1 as
$x \to 0$. Renaming the variable $x$ with the variable $h$, we
see the fundamental theorem of trigonometry
$\lim_{h \to 0}\frac{\sin(h)}{h} = 1$
[Figure: sector of angle $x$ in the unit circle, with the lengths $\cos(x)$, $\sin(x)$, $\tan(x)$, the arc $x$ and the radius 1 marked.]
4 For $f(x) = \cos(x)$ we have $f'(0) = 0$. To see this, look at $f(0 + h) - f(0) = \cos(h) - 1$.
Geometrically, we can use Pythagoras $\sin^2(h) + (1 - \cos(h))^2 \le h^2$ to see that $2 - 2\cos(h) \le h^2$
or $1 - \cos(h) \le h^2/2$ so that $(1 - \cos(h))/h \le h/2$ and this goes to 0 for $h \to 0$. We have
just nailed down another important identity
$\lim_{h \to 0}\frac{1 - \cos(h)}{h} = 0$.
The interpretation is that the tangent is horizontal for the cos function at $x = 0$. We will
call this a critical point later on.
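The two limits just found can be observed numerically. The sketch below tabulates $\sin(h)/h$ and $(1 - \cos(h))/h$ for a few small values of $h$, chosen arbitrarily, and compares the second quotient with the bound $h/2$ from the Pythagoras argument.

```python
import math

steps = (0.1, 0.01, 0.001)
sinc_values = [math.sin(h) / h for h in steps]         # tends to 1
cos_quotients = [(1 - math.cos(h)) / h for h in steps]  # tends to 0

for h, s, c in zip(steps, sinc_values, cos_quotients):
    print(h, s, c, h / 2)
```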
5 From the previous two examples, we get
$\cos(x+h) - \cos(x) = \cos(x)\cos(h) - \sin(x)\sin(h) - \cos(x) = \cos(x)(\cos(h) - 1) - \sin(x)\sin(h)$.
Because $(\cos(h) - 1)/h \to 0$ and $\sin(h)/h \to 1$, we see that $[\cos(x + h) - \cos(x)]/h \to
-\sin(x)$.
For $f(x) = \cos(ax)$ we have $f'(x) = -a\sin(ax)$.
6 Similarly,
$\sin(x+h) - \sin(x) = \cos(x)\sin(h) + \sin(x)\cos(h) - \sin(x) = \sin(x)(\cos(h) - 1) + \cos(x)\sin(h)$.
Because $(\cos(h) - 1)/h \to 0$ and $\sin(h)/h \to 1$, we see that $[\sin(x+h) - \sin(x)]/h \to \cos(x)$.
For $f(x) = \sin(ax)$, we have $f'(x) = a\cos(ax)$.
$e = \lim_{n \to \infty}\left(1 + \frac{1}{n}\right)^n$
Like $\pi$, the Euler number $e$ is irrational. Here are the first digits: 2.7182818284590452354.
If you want to find an approximation, just pick a large $n$, like $n = 100$ and compute
$(1 + 1/n)^n$. For $n = 100$ for example, we see $101^{100}/100^{100}$. We only need $101^{100}$ and then
put a decimal point after the first digit to get an approximation. If you are interested why the limit exists:
verify that the fractions $A_n = (1 + 1/n)^n$ increase and $B_n = (1 + 1/n)^{n+1}$ decrease. Since
$B_n/A_n = (1 + 1/n)$ which goes to 1 for $n \to \infty$, the limit exists. The same argument
shows that $(1+1/n)^{xn} = \exp_{1/n}(x)$ increases and $\exp_{1/n}(x)(1+1/n)$ decreases. The limiting
function $\exp(x) = e^x$ is called the exponential function. Remember that if we write
$h = 1/n$, then $(1 + 1/n)^{nx} = \exp_h(x)$ is the function considered earlier in the course. We can sandwich the
exponential function between $\exp_h(x)$ and $(1 + h)\exp_h(x)$:
$\exp_h(x) \le \exp(x) \le \exp_h(x)(1 + h)$, $x \ge 0$.
For $x < 0$, the inequalities are reversed.
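The two sequences $A_n$ and $B_n$ are easy to tabulate. In this sketch the function names `A` and `B` follow the text, while the sample values of $n$ are arbitrary; the last line uses Python's exact integer arithmetic to read off the leading digits of $101^{100}$.

```python
import math

def A(n):
    return (1 + 1 / n) ** n          # increases towards e

def B(n):
    return (1 + 1 / n) ** (n + 1)    # decreases towards e

for n in (10, 100, 1000):
    print(n, A(n), math.e, B(n))

# Leading digits of 101^100: placing a decimal point after the first digit gives A(100).
leading_digits = str(101**100)[:5]
print(leading_digits)
```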
7 Let's compute the derivative of $f(x) = e^x$ at $x = 0$. Answer. We have
$\left(\left(1 + \frac{1}{n}\right)^n - 1\right)n \le (e^h - 1)/h \le \left(\left(1 + \frac{1}{n}\right)^{n+1} - 1\right)n$
Use the binomial formula to see that both the left and right hand side go to 1 if $n \to \infty$.
Therefore $f'(0) = 1$. The exponential function has a graph which has slope 1 at $x = 0$.
8 Now, we can get the general case. It follows from $e^{x+h} - e^x = e^x(e^h - 1)$ that the derivative
of $\exp(x)$ is $\exp(x)$.
For $f(x) = \exp(ax)$, we have $f'(x) = a\exp(ax)$.
It follows from the properties of taking limits that $(f(x) + g(x))' = f'(x) + g'(x)$. We also
have $(af(x))' = af'(x)$. From this, we can now compute many derivatives.
9 Find the slope of the tangent of $f(x) = \sin(3x) + 5\cos(10x) + e^{5x}$ at the point $x = 0$.
Solution: $f'(x) = 3\cos(3x) - 50\sin(10x) + 5e^{5x}$. Now evaluate it at $x = 0$ which is
$3 + 0 + 5 = 8$.
Finally, let's mention an example of a function which is not everywhere differentiable.
10 The function $f(x) = |x|$ has the properties that $f'(x) = 1$ for $x > 0$ and $f'(x) = -1$ for
$x < 0$. The derivative does not exist at $x = 0$ even though the function is continuous there. You
see that the slope of the graph jumps discontinuously at the point $x = 0$.
For a function which is discontinuous at some point, we don't even attempt to differentiate it
there. For example, we would not even try to differentiate $\sin(4/x)$ at $x = 0$ nor $f(x) = 1/x^3$
at $x = 0$ nor $\sin(x)/|x|$ at $x = 0$. Remember these bad guys?
To the end, you might have noticed that in the boxes, more general results have appeared,
where $x$ is replaced by $ax$. We will look at this again but in general, the relation $\frac{d}{dx}f(ax) =
af'(ax)$ holds (if you drive twice as fast, you climb twice as fast).
Homework
1 For which of the following functions does the derivative exist for all points?
a) $|\sin(x)|$
b) $|\exp(x)|$
c) $\exp(x) + \sin(15x)$
d) $|\cos(x)|$
e) $\sin(1/x)$
f) $|\exp(x)| + |1 + \sin(15x)|$
2 a) A circle of radius $r$ has the area $f(r) = \pi r^2$. Find $\frac{d}{dr}f(r)$. Can you visualize why
this is the same as the circumference of the circle?
b) The sphere of radius $r$ has the volume $f(r) = 4\pi r^3/3$. Find $\frac{d}{dr}f(r)$ and compare it
with the surface area of the sphere.
c) A hypersphere of radius $r$ has the hyper volume $f(r) = \pi^2 r^4/2$. Find $\frac{d}{dr}f(r)$,
the volume of the boundary sphere.
3 Find the derivatives of the following functions at the point $x = 2$.
a) $f(x) = \exp(x) + \sin(x) + x + x^2 + x^3 + x^4 + x^5$.
b) $f(x) = (x^5 - 1)/(x - 1) + \cos(2x)$. First heal the function.
c) $f(x) = \frac{1 + 4x + 6x^2 + 4x^3 + x^4}{x^2 + 2x + 1}$. Ditto, first heal!
4 In this problem we compute the derivative of $\sqrt{x}$ for $x > 0$. To do so, we have to find
the limit
$\lim_{h \to 0}\frac{\sqrt{x + h} - \sqrt{x}}{h}$.
Hint: multiply the top and the bottom with $(\sqrt{x + h} + \sqrt{x})$ and simplify.
5 A rocket lifts off from Cape Canaveral. The height at
time $t$ is $h(t) = e^t - 1 + \sqrt{t}$, at least for the first few
seconds. Find the rate of change of the height at time
$t = 1$. Use the previous problem to get the derivative of
$\sqrt{t}$.
[Figure: graph of the height for $0 \le t \le 3$, with values up to 20.]
Rate of change
We compute the derivative of $f(x) = 1/x$ by taking limits.
a) Simplify $\frac{1}{x+h} - \frac{1}{x}$.
b) Now take the limit of $\frac{1}{h}\left[\frac{1}{x+h} - \frac{1}{x}\right]$ when $h \to 0$.
c) Is there any point where $f'(x) > 0$?
Derivatives
Differentiation rules
$\frac{d}{dx}x^n = nx^{n-1}$
$\frac{d}{dx}e^{ax} = ae^{ax}$
$\frac{d}{dx}\cos(ax) = -a\sin(ax)$
$\frac{d}{dx}\sin(ax) = a\cos(ax)$
1 Find the derivative of the function $f(x) = \sin(3x) + x^5$.
2 Find the derivative of $f(x) = \cos(7x) - 8x^4$.
3 Find the derivative of $f(x) = e^{5x} + \cos(2x)$.
Lecture 8: The derivative function
In the last lecture, we have introduced the derivative $f'(x) = \frac{d}{dx}f(x)$ as a limit of $Df(x)$ for
$h \to 0$. We have seen that $\frac{d}{dx}x^n = nx^{n-1}$ holds for integer $n$. We also know already that
$\sin' = \cos$, $\cos' = -\sin$ and $\exp' = \exp$. We can already differentiate a lot of functions and
evaluate the derivative $f'(x)$ at some point $x$. This is the slope of the curve at $x$.
1 Find the derivative $f'(x)$ of $f(x) = \sin(x) + \cos(x) - \sqrt{x} + 1/x + x^4$ and evaluate it at
$x = 1$. Solution: $f'(x) = \cos(x) - \sin(x) - 1/(2\sqrt{x}) - 1/x^2 + 4x^3$. Plugging in $x = 1$
gives $\cos(1) - \sin(1) - 1/2 - 1 + 4$.
Taking the derivative at every point defines a new function, the derivative function. For
example, for $f(x) = \sin(x)$, we get $f'(x) = \cos(x)$. In this lecture, we want to understand
the new function and its relation with $f$. What does it mean if $f'(x) > 0$? What does it
mean that $f'(x) < 0$? Do the roots of $f$ tell something about $f'$ or do the roots of $f'$ tell
something about $f$?
Here is an example of a function and its derivative. Can you see the relation?
To understand the relation, it is good to distinguish intervals where $f(x)$ is increasing or
decreasing. These are the intervals where $f'(x)$ is positive or negative.
A function is called monotonically increasing on an interval $I = (a, b)$ if $f'(x) > 0$
for all $x \in (a, b)$. It is monotonically decreasing if $f'(x) < 0$ for all $x \in (a, b)$.
Let's look at the previous example again.
Here is an interesting inverse problem called the bottle calibration problem. We fill a circular
bottle or glass with a constant stream of fluid. Plot the height of the fluid in the bottle at
time $t$. Assume the radius of the bottle is $f(z)$ at height $z$. Can you find a formula for
the height $g(t)$ of the water? This is not so easy. But we can find the rate of change $g'(t)$.
Assume for example that $f$ is constant, then the rate of change is constant and the height
of the water increases linearly like $g(t) = t$. If the bottle gets wider, then the height of the
water increases slower. There is definitely a relation between the rate of change of $g$ and $f$.
Before we look at this more closely, let's try to match the following cases of bottles with the
graphs of the functions $g$ qualitatively.
2 In each of the bottles, we call $g$ the height of the water level at time $t$, when filling the bottle
with a constant stream of water. Can you match each bottle with the right height function?
[Figure: four bottles a), b), c), d) and four graphs 1), 2), 3), 4) of height functions for $0 \le t \le 1$.]
The key is to look at $g'(t)$, the rate of change of the height function. Because $[g(t + h) - g(t)]$
times the area $\pi f^2$ is a constant times the time difference $h = dt$, we have, taking this constant to be 1,
$g' = \frac{1}{\pi f^2}$.
This formula relates the derivative function of $g$ with the thickness $f$ of the bottle at
height $g$. It tells that if $f$ is large, then $g'$ is small and if $f$ is small, then $g'$ is large. Finding
$g$ from $f$ is possible but we are not doing this now.
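Although the text does not solve for $g$, the relation $g' = 1/(\pi f^2)$ can be followed numerically in small time steps. The sketch below is one possible way to do this, assuming a unit inflow rate and two made-up bottle shapes; the function name `fill_height` and the step count are illustrative choices.

```python
import math

def fill_height(radius, t_end, steps=10000):
    """Follow g'(t) = 1 / (pi * radius(g)^2) in small time steps, starting with g(0) = 0."""
    dt = t_end / steps
    g = 0.0
    for _ in range(steps):
        g += dt / (math.pi * radius(g) ** 2)
    return g

# Hypothetical bottles: a cylinder of radius 1 and a bottle that widens with height.
cylinder = fill_height(lambda z: 1.0, t_end=2.0)
widening = fill_height(lambda z: 1.0 + z, t_end=2.0)
print(cylinder, widening)
```

For the cylinder the height grows linearly in time, while in the widening bottle the water level rises more slowly, as described above.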
3 Can you find a function $f$ which is bounded $|f(x)| \le 1$ and such that $f'(x)$ is unbounded?
Given the function $f(x)$, we can define $g(x) = f'(x)$ and then take the derivative $g'$
of $g$. This second derivative $f''(x)$ is called the acceleration. It measures the rate
of change of the tangent slope. For $f(x) = x^4$, for example we have $f''(x) = 12x^2$.
If $f''(x) > 0$ on some interval the function is called concave up, if $f''(x) < 0$, it is
concave down.
4 Find a function f which has the property that its acceleration is constant equal to 10.
5 Can you nd a function f which is bounded |f(x)| 1 and such that f
(x) is positive
everywhere?
Homework
1 For the following functions, determine on which intervals the function is monotonically
increasing or decreasing.
a) $f(x) = x^3 - x$.
b) $f(x) = \sin(x)$.
c) $f(x) = e^{2x} - 2e^x$
2 Match the following functions with their derivatives. Give short explanations for each match.
a) b) c) d)
1) 2) 3) 4)
3 Match also the following functions with their derivatives. Give short explanations
documenting your reasoning in each case.
a) b) c) d)
1) 2) 3) 4)
4 Draw for the following functions the graph of the function $f(x)$ as well as the graph of its
derivative $f'(x)$. You do not have to compute the derivative analytically as a formula here
since we do not have all tools yet to compute the derivatives. The derivative function you
draw needs to have the right qualitative shape however.
a) The Gaussian bell curve or the "To whom the bell tolls" function
$f(x) = e^{-x^2}$
b) The witch of Maria Agnesi.
$f(x) = \frac{1}{1 + x^2}$
c) The three gorges function
$f(x) = \frac{1}{x} + \frac{1}{x - 1} + \frac{1}{x + 1}$.
5 Below you see the graphs of three derivative functions $f'(x)$. In each case you are told that
$f(0) = 1$. Your task is to draw the function $f(x)$ in each of the cases a), b), c). Your picture
does not have to be up to scale, but your drawing should display the right features.
a) b) c)
Matching functions with their derivative
[Figure: three example graphs labeled 0), o), O).]
In this worksheet we want to match the
graphs of functions with their derivatives
and second derivatives. This is
tougher than you might think. Here is
an example:
The first graph shows the function,
which is here the quadratic function.
The slope on the right hand side is positive
and increasing, on the left hand
side the slope is negative and the function
is decreasing. The middle graph shows the
derivative function which is linear. The
final graph shows the derivative function
of the derivative function. It is constant
in this case.
1 Match the following functions with their derivatives and then
with the derivatives of the derivatives.
a) b) c) d)
1) 2) 3) 4)
A) B) C) D)
Lecture 9: The product rule
In this lecture, we look at the derivative of a product of functions.
The product rule is also called Leibniz rule named
after Gottfried Leibniz, who found it in 1684. It is a very
important rule because it allows us to differentiate many more
functions. If we wanted to compute the derivative of $f(x) =
x\sin(x)$ for example, we would have to get under the hood of
the function and compute the limit $\lim (f(x + h) - f(x))/h$.
We are too lazy for that. Let's start with the identity
$f(x + h)g(x + h) - f(x)g(x) = [f(x + h) - f(x)] \cdot g(x + h) + f(x) \cdot [g(x + h) - g(x)]$
which can be written as $D(fg) = Df\,g^+ + f\,Dg$ with $g^+(x) =
g(x + h)$. This quantum Leibniz rule can also be seen
geometrically: the rectangle of area $(f + df)(g + dg)$ is the
union of rectangles with area $f \cdot g$, $f \cdot dg$ and $df \cdot g^+$. Divide
this relation by $h$ to see
$\frac{[f(x + h) - f(x)]}{h} \cdot g(x + h) \to f'(x) \cdot g(x)$
$f(x) \cdot \frac{[g(x + h) - g(x)]}{h} \to f(x) \cdot g'(x)$.
[Figure: rectangle with sides $f + df$ and $g + dg$, divided into pieces of area $f \cdot g$, $f \cdot dg$ and $df \cdot g^+$.]
We get the extraordinarily important product rule:
$\frac{d}{dx}(f(x)g(x)) = f'(x)g(x) + f(x)g'(x)$.
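Both versions of the rule can be tested numerically. In the sketch below the quantum Leibniz rule $D(fg) = Df\,g^+ + f\,Dg$ is checked for a rather large step size, and the classical product rule is compared with a difference quotient for a tiny step size; the functions $x^3$ and $\sin(x)$ and all numerical values are arbitrary test choices.

```python
import math

def D(u, h):
    """Difference quotient of u with step size h."""
    return lambda x: (u(x + h) - u(x)) / h

f = lambda x: x**3
g = math.sin
fg = lambda x: f(x) * g(x)
x = 1.2

# Quantum Leibniz rule: exact for any h, here h = 0.1.
h = 0.1
lhs = D(fg, h)(x)
rhs = D(f, h)(x) * g(x + h) + f(x) * D(g, h)(x)

# Classical product rule, compared with a tiny step size.
classical = 3 * x**2 * math.sin(x) + x**3 * math.cos(x)
approx = D(fg, 1e-6)(x)
print(lhs, rhs)
print(classical, approx)
```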
1 Find the derivative function $f'(x)$ for $f(x) = x^3\sin(x)$. Solution: We know how to differentiate
$x^3$ and $\sin(x)$ so that $f'(x) = 3x^2\sin(x) + x^3\cos(x)$.
2 While we know $\frac{d}{dx}x^5 = 5x^4$, let's compute this with the Leibniz rule and write $x^5 = x^3 \cdot x^2$. We have
$\frac{d}{dx}x^3 = 3x^2$, $\frac{d}{dx}x^2 = 2x$.
The Leibniz rule gives us $\frac{d}{dx}x^5 = 3x^4 + 2x^4 = 5x^4$.
3 Water powered JetLev systems have now gone into production.
The water is sucked up from the water surface through a
four-inch diameter polyester hose. Consider a system where
the water is carried with you. By Newton's law the force $F$
satisfies $F = p'$, where $p = mv$ is the momentum, the product
of your mass and velocity. Written out, this is
$F(t) = \frac{d}{dt}(m(t)v(t))$.
How big is the acceleration $v'$? The product rule tells us
$F = m'v + mv'$ which gives $v' = (F - m'v)/m$. Since we
throw out water, $m'(t)$ is negative and $m(t)$ decreases, we
accelerate if the force $F$ is kept constant.
The Leibniz rule is also called product rule. It suggests a quotient rule. One can avoid
the quotient rule by writing it as a product $f(x)/g(x) = f(x) \cdot 1/g(x)$ and by using the
reciprocal rule:
If $g(x) \neq 0$, then
$\frac{d}{dx}\frac{1}{g(x)} = \frac{-g'(x)}{g(x)^2}$.
To verify it, stare at the identity
1
g(x + h)

1
g(x)
=
g(x) g(x + h)
g(x)g(x + h)
.
Dividing it by h gives D(1/g(x)) = Dg(x)/(g(x)g
+
(x)). Taking the limit h 0 leads to
the identity. An other way to derive this is to write h = 1/g and dierentiate 1 = gh on
both sides. The product rule gives 0 = g
h + gh
so that h
= hg
/g = g
/g
2
.
4 Find the derivative of $f(x) = 1/x^4$. Solution: $f'(x) = -4x^3/x^8 = -4/x^5$. The same
computation shows that $\frac{d}{dx} x^n = nx^{n-1}$ holds for all integers n.
The formula
$$ \frac{d}{dx} x^n = nx^{n-1} $$
holds for all integers n.
The quotient rule is obtained by applying the product rule to $f(x) \cdot (1/g(x))$ and using
the reciprocal rule:
If $g(x) \neq 0$, then
$$ \frac{d}{dx} \frac{f(x)}{g(x)} = \frac{f'(x)g(x) - f(x)g'(x)}{g^2(x)} . $$
5 Find the derivative of f(x) = tan(x). Solution: because tan(x) = sin(x)/cos(x) we have
$$ \tan'(x) = \frac{\sin^2(x) + \cos^2(x)}{\cos^2(x)} = \frac{1}{\cos^2(x)} . $$
6 Find the derivative of $f(x) = \frac{2 - x}{x^2 + x^4 + 1}$. Solution. We apply the quotient rule and get
$$ \frac{(-1)(x^2 + x^4 + 1) - (2 - x)(2x + 4x^3)}{(x^2 + x^4 + 1)^2} . $$
Here are some more problems with solutions:
7 Find the second derivative of tan(x). Solution. We have already computed
$\tan'(x) = 1/\cos^2(x)$. Differentiating this again with the quotient rule gives
$$ -\frac{\frac{d}{dx} \cos^2(x)}{\cos^4(x)} . $$
We still have to find the derivative of $\cos^2(x)$. The product rule gives
$-\cos(x)\sin(x) - \sin(x)\cos(x) = -2\cos(x)\sin(x)$. Our final result is
$$ 2 \sin(x)/\cos^3(x) . $$
8 A cylinder has volume $V = \pi r^2 h$, where r is the radius and h is the height. Assume the radius
grows like $r(t) = 1 + t$ and the height shrinks like $1 - \sin(t)$. Does the volume grow or decrease
at t = 0? Solution: The volume $V(t) = \pi (1+t)^2 (1 - \sin(t))$ is a product of two functions
$f(t) = \pi (1+t)^2$ and $g(t) = 1 - \sin(t)$. We have $f(0) = \pi$, $g'(0) = -1$, $f'(0) = 2\pi$, $g(0) = 1$.
The product rule gives $V'(0) = \pi \cdot (-1) + 2\pi \cdot 1 = \pi$. The volume increases
at first.
9 On the Moscow papyrus dating back to 1850 BC, the general formula $V = h(a^2 + ab + b^2)/3$
for a truncated pyramid with base length a, roof length b and height h appeared. Assume
$h(t) = 1 + \sin(t)$, $a(t) = 1 + t$, $b(t) = 1 - 3t$. Does the volume of the truncated pyramid grow
or decrease at first? Solution. We could fill in a(t), b(t), h(t) into the formula for V and
compute the derivative using the product rule. A bit faster is to write
$f(t) = a^2 + ab + b^2 = (1+t)^2 + (1-3t)^2 + (1+t)(1-3t)$ and note $f(0) = 3$, $f'(0) = -6$, then get
from $h(t) = 1 + \sin(t)$ the data $h(0) = 1$, $h'(0) = 1$. So
$V'(0) = (h'(0)f(0) + h(0)f'(0))/3 = (1 \cdot 3 + 1 \cdot (-6))/3 = -1$. The pyramid shrinks in volume at first.
10 We pump up a balloon and let it fly. Assume that the thrust increases like t and the
resistance decreases like $1/\sqrt{1-t}$ since the balloon gets smaller. The distance traveled is
$f(t) = t/\sqrt{1-t}$. Find the velocity $f'(t)$ at time t = 0.
Homework
1 Find the derivatives of the following functions:
a) f(x) = sin(3x) cos(10x).
b) $f(x) = \sin^2(x)/x^2$.
c) $f(x) = x^4 \sin(x) \cos(x)$.
d) $f(x) = 1/\sqrt{x}$.
e) $f(x) = \cot(x) + (1 + x)/(1 + x^2)$.
2 a) Verify that for f(x) = g(x)h(x)k(x) the formula $f' = g'hk + gh'k + ghk'$ holds.
b) Verify the following formula for the derivative of $f(x) = g(x)^3$: $f'(x) = 3g^2(x)g'(x)$.
3 a) If f(x) = sinc(x) = sin(x)/x, find its derivative $g(x) = f'(x)$ and then the derivative
of g(x). Then evaluate it at x = 0.
b) If you evaluate g(x) at x = 0 you obtain $g(0) = f'(0) = 0$. Is the result in a) not a
contradiction to the fact that for g = 0 the derivative $g'$ is 0?
4 Find the derivative of
$$ \frac{\sin(x)}{1 + \cos(x) + \frac{x^4}{1 + \cos^2(x)}} $$
at x = 0.
5 a) Verify that in general the derivative of $g(x) = f(x)^2$ is $2f(x)f'(x)$.
b) We have already computed the derivative of $f(x) = \sqrt{x}$ in the last homework by
directly computing the limit. Let's do it using the product rule. Use part a) of this
problem to compute the derivative of
g(x) = f(x) f(x)
Use the obtained identity $g'(x) = ...$ to get a formula for $f'(x)$, the derivative of $\sqrt{x}$.
c) Use the same method and the above homework problem 2b) in this homework set
to compute the derivative of the cube root function $f(x) = x^{1/3}$.
This last problem 5) is a preparation for the chain rule, which we see next Monday. Avoid
using the chain rule already here.
Remarks: Like quantum calculus, the quantum Leibniz rule is old. The above picture
explaining the discrete rule (without having to consider any error terms) appears in the
article: John Dawson, Wavefronts, Box Diagrams and the Product Rule: A Discovery
Approach, Two Year College Mathematics Journal 11, pages 102-106, 1980.
The product rule
We practice the product, reciprocal and quotient rule
1 Find the derivative of the sinc-function sin(x)/x at the point
x = 0.
2 What is the slope of the graph of the function $f(x) = x e^{x^2}$ at x = 0?
3 Find the derivative of $\sqrt{x}/x$ at x = 1. (Look first!)
4 Find the derivative of $1/e^x$ at x = 1. (Look first!)
5 Assume we remember the formula sin(2x) = 2 sin(x) cos(x).
Differentiate both sides to get a formula for cos(2x).
6 Find the derivative of $x \cdot 1/(x^2 + 1)$ at x = 0.
Source: XKCD
Leibniz 1684 paper. The product and quotient rule is introduced.
Lecture 10: The chain rule
In this lecture, we look at the derivative of a composition of functions. This rule is also important.
It will allow us to compute derivatives like the one of $\sin(x^3)$, which is the composition f(g(x)) of the two
functions $g(x) = x^3$ and $f(x) = \sin(x)$. In this example we can not use the product rule since
we do not have a product of functions. It is a composition of functions. How do we compute the
derivative of functions which are chained together like this? The answer to this question is given
by the chain rule:
$$ \frac{d}{dx} f(g(x)) = f'(g(x))\, g'(x) . $$
The chain rule follows from the identity
$$ \frac{f(g(x+h)) - f(g(x))}{h} = \frac{f(g(x) + (g(x+h) - g(x))) - f(g(x))}{g(x+h) - g(x)} \cdot \frac{g(x+h) - g(x)}{h} . $$
Write $H(x) = g(x+h) - g(x)$ in the first part on the right hand side:
$$ \frac{f(g(x+h)) - f(g(x))}{h} = \frac{f(g(x) + H) - f(g(x))}{H} \cdot \frac{g(x+h) - g(x)}{h} . $$
As $h \to 0$, we also have $H \to 0$ and the first part goes to $f'(g(x))$ and the second factor has $g'(x)$
as a limit.
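The chain rule can be tested numerically on the example sin(x^3). This is a minimal sketch under stated assumptions: the helper `derivative` (a central difference) and the test point x = 0.7 are choices made here for illustration, not something taken from the lecture.

```python
import math

def derivative(f, x, h=1e-6):
    """Central difference approximation of f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)

f = math.sin                 # outer function
g = lambda x: x ** 3         # inner function
composite = lambda x: f(g(x))

x = 0.7
numeric = derivative(composite, x)                 # slope of sin(x^3), measured directly
chain = derivative(f, g(x)) * derivative(g, x)     # f'(g(x)) * g'(x), each factor measured
exact = math.cos(x ** 3) * 3 * x ** 2              # closed form given by the chain rule
print(numeric, chain, exact)
```

The three printed numbers agree to many digits; note that the outer derivative is evaluated at g(x), not at x.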
1 Find the derivative of $f(x) = (4x - 1)^{17}$. Solution: The inner function is $4x - 1$ which has
the derivative 4. We get therefore $f'(x) = 17(4x - 1)^{16} \cdot 4 = 68(4x - 1)^{16}$. Remark. We could
have expanded out the power $(4x - 1)^{17}$ first and avoided the chain rule. Avoiding the chain
rule is called the pain rule.
2 Find the derivative of $f(x) = \sin(\pi \cos(x))$ at x = 0. Solution: applying the chain rule
gives $\cos(\pi \cos(x)) \cdot (-\pi \sin(x))$, which is 0 at x = 0.
3 For linear functions f(x) = ax + b, g(x) = cx + d, the chain rule can readily be checked.
We have f(g(x)) = a(cx + d) + b = acx + ad + b which has the derivative ac. Indeed this
is the derivative of f times the derivative of g. You can convince yourself that the chain rule is
true also from this example since if you look closely at a point, then the function is close to
linear.
One of the cool applications of the chain rule is that we can compute derivatives of inverse
functions:
4 Find the derivative of the natural logarithm function log(x). [1] Solution: Differentiate the
identity exp(log(x)) = x. On the right hand side we have 1. On the left hand side the chain
rule gives $\exp(\log(x)) \log'(x) = x \log'(x) = 1$. Therefore $\log'(x) = 1/x$.
$$ \frac{d}{dx} \log(x) = 1/x . $$
[1] We always write log(x) for the natural log. The ln notation is old fashioned and only used in obscure places
like calculus books and calculators from the last millennium.
Denote by arccos(x) the inverse of cos(x) on $[0, \pi]$ and by arcsin(x) the inverse of
sin(x) on $[-\pi/2, \pi/2]$.
5 Find the derivative of arcsin(x). Solution. We write x = sin(arcsin(x)) and differentiate.
$$ \frac{d}{dx} \arcsin(x) = \frac{1}{\sqrt{1 - x^2}} . $$
6 Find the derivative of arccos(x). Solution. We write x = cos(arccos(x)) and differentiate.
$$ \frac{d}{dx} \arccos(x) = -\frac{1}{\sqrt{1 - x^2}} . $$
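The two solutions above only state the result. One way to fill in the intermediate steps for arcsin is sketched below; it assumes |x| < 1 and uses that cos(u) is nonnegative for u in $[-\pi/2, \pi/2]$, so that the positive square root is the right one.

```latex
\begin{aligned}
x &= \sin(\arcsin(x)) \\
1 &= \cos(\arcsin(x)) \cdot \arcsin'(x) \\
\cos(\arcsin(x)) &= \sqrt{1 - \sin^2(\arcsin(x))} = \sqrt{1 - x^2} \\
\arcsin'(x) &= \frac{1}{\sqrt{1 - x^2}}
\end{aligned}
```

The same steps with cos in place of sin give the extra minus sign for arccos, because the derivative of cos is -sin and sin(u) is nonnegative on $[0, \pi]$.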
7 $f(x) = \sin(x^2 + 3)$. Then $f'(x) = \cos(x^2 + 3) \cdot 2x$.
8 f(x) = sin(sin(sin(x))). Then $f'(x) = \cos(\sin(\sin(x))) \cos(\sin(x)) \cos(x)$.
Why is the chain rule called chain rule? The reason is that we can chain even more
functions together.
9 Let's compute the derivative of $\sin(\sqrt{x^5 - 1})$ for example. Solution: This is a composition
of three functions f(g(h(x))), where $h(x) = x^5 - 1$, $g(x) = \sqrt{x}$ and $f(x) = \sin(x)$. The chain
rule applied to the function sin(x) and $\sqrt{x^5 - 1}$ gives $\cos(\sqrt{x^5 - 1}) \frac{d}{dx} \sqrt{x^5 - 1}$. Apply now
the chain rule again for the derivative on the right hand side.
10 Here is the famous falling ladder problem. A stick of
length 1 slides down a wall. How fast does it hit the floor
if it slides horizontally on the floor with constant speed?
The ladder connects the point (0, y) on the wall with
(x, 0) on the floor. We want to express y as a function of
x. We have $y = f(x) = \sqrt{1 - x^2}$. Taking the derivative,
assuming $x' = 1$, gives $f'(x) = -x/\sqrt{1 - x^2}$.
[Figure: ladder of length 1 leaning against the wall, with axes x and y.]
In reality, the ladder breaks away from the wall. One can calculate the force of the ladder
to the wall. The force becomes zero at the break-away angle $\theta = \arcsin((2v^2/(3g))^{2/3})$,
where g is the gravitational acceleration and $v = x'$ is the velocity.
11 For the brave: find the derivative of f(x) = cos(cos(cos(cos(cos(cos(cos(x))))))).
Homework
1 Find the derivatives of the following functions:
a) $f(x) = \cos(\sqrt{x})$
b) $f(x) = \tan(1/x^5)$
c) f(x) = exp(1/(1 + x))
d) $f(x) = (2 + \sin(x))^5$
2 Find the derivatives of the following functions at x = 1. (Problems c), d) were cut off
on the originally distributed pset and are not required. Do them nevertheless to have
more practice.)
a) $f(x) = x^4 \log(x)$
b) $f(x) = \sqrt{x^5 + 1}$
c) $f(x) = (1 + x^2 + x^4)^{100}$
d) $f(x) = \frac{5x^4}{2\sqrt{x^5 + 1}}$
3 a) Find the derivative of f(x) = 1/x by differentiating the identity xf(x) = 1.
b) Find the derivative of f(x) = arccot(x) by differentiating cot(arccot(x)) = x.
4 a) Find the derivative of $f(x) = \sqrt{x}$ by differentiating the identity $f(x)^2 = x$.
b) Find the derivative of $f(x) = x^{m/n}$ by differentiating the identity $f(x)^n = x^m$.
The function $f(x) = [\exp(x) + \exp(-x)]/2$ is called cosh(x).
The function $f(x) = [\exp(x) - \exp(-x)]/2$ is called sinh(x).
They are called hyperbolic cosine and hyperbolic sine. The first is even, the second
is odd. You can see directly using $\exp'(x) = \exp(x)$ and $\frac{d}{dx} \exp(-x) = -\exp(-x)$
that $\sinh'(x) = \cosh(x)$ and $\cosh'(x) = \sinh(x)$. Furthermore exp = cosh + sinh
writes exp as a sum of an even and an odd function.
5 a) Find the derivative of the inverse arccosh(x) of cosh(x).
b) Find the derivative of the inverse arcsinh(x) of sinh(x).
[Figure: graphs of cosh(x) and sinh(x).]
Apropos chain: if you look at the shape of a chain hanging at two points, then it is in
the shape of the hyperbolic cosine.
The chain rule
On this Valentine's day, we preview a nice application of the chain rule.
We will cover this later in the course.
The Valentine equation $(x^2 + y^2 - 1)^3 - x^2 y^3 = 0$ relates x with
y, but we can not write the curve as a graph of a function y = g(x).
Extracting y or x is difficult since they are in love. The set of points
satisfying the equation looks like a heart. Well, romance is known to
be complicated!
You can check that (1, 1) satisfies the Valentine equation. Near it, the
curve looks like the graph of a function g(x). Let's fill that in and look
at the function
$$ f(x) = (x^2 + g(x)^2 - 1)^3 - x^2 g(x)^3 . $$
The key is that f(x) is actually zero and if we take the derivative,
then we get zero too. Using the chain rule, we can take the derivative
$$ f'(x) = 3(x^2 + g(x)^2 - 1)^2 (2x + 2g(x)g'(x)) - 2x g(x)^3 - x^2 \cdot 3g(x)^2 g'(x) = 0 . $$
Magically, we can solve for $g'(x)$:
$$ g'(x) = -\frac{3(x^2 + g(x)^2 - 1)^2 \cdot 2x - 2x g(x)^3}{3(x^2 + g(x)^2 - 1)^2 \cdot 2g(x) - 3x^2 g(x)^2} . $$
Filling in x = 1, g(x) = 1, we see this is -4/3. We have computed
the slope of g without knowing g. Isn't that magic? If this was a bit
too complicated, don't worry. We will have an entire lecture on this
later in the course.
1 Compute the derivative of f(x) using the chain rule and verify
the formula above.
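The slope at the point (1, 1) of the Valentine curve can be cross-checked numerically. The sketch below is an illustration under stated assumptions: the names `F`, `F_x`, `F_y` and `solve_y` are hypothetical helpers, and the unknown function g is recovered near x = 1 by a Newton iteration in y (a method that is only introduced later in these notes).

```python
def F(x, y):
    """Left hand side of the Valentine equation."""
    return (x * x + y * y - 1) ** 3 - x * x * y ** 3

def F_x(x, y):
    """Partial derivative of F with respect to x."""
    return 3 * (x * x + y * y - 1) ** 2 * 2 * x - 2 * x * y ** 3

def F_y(x, y):
    """Partial derivative of F with respect to y."""
    return 3 * (x * x + y * y - 1) ** 2 * 2 * y - 3 * x * x * y * y

def solve_y(x, y0=1.0, steps=50):
    """Find y near y0 with F(x, y) = 0, by Newton steps in the variable y."""
    y = y0
    for _ in range(steps):
        y -= F(x, y) / F_y(x, y)
    return y

# Slope from the formula: g'(x) = -F_x / F_y at the point (1, 1).
formula = -F_x(1, 1) / F_y(1, 1)

# Slope measured on the curve itself with a central difference.
h = 1e-5
numeric = (solve_y(1 + h) - solve_y(1 - h)) / (2 * h)
print(formula, numeric)
```

Both numbers are close to -4/3, so the curve really does descend at the point (1, 1).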
Lecture 11: Local extrema
Maximizing and minimizing functions is an important task. The reasons are obvious: we want
to maximize nice quantities and minimize unpleasant ones. Extremizing quantities is also the
most important principle nature follows. Important laws in physics like Newton's law, equations
describing light, or matter can be based on the principle of extremization. The intuition is that
at maxima or minima the tangent to the graph is horizontal. This leads to a zero derivative and
the notion of critical points:
A point $x_0$ is a critical point of a differentiable function f if $f'(x_0) = 0$.
In some textbooks, critical points include points where $f'$ is not defined. In this course we do not
include these points in the list of critical points. They are points outside the domain of definition
of $f'$ and will be treated separately.

1 Find the critical points of the function $f(x) = x^3 + 3x^2 - 24x$. Solution: we compute the derivative as
$f'(x) = 3x^2 + 6x - 24 = 3(x - 2)(x + 4)$. The roots of $f'$ are 2 and -4.
A point p is called a local maximum of f, if there exists a neighborhood
U = (p - a, p + a) of p, such that $f(p) \geq f(x)$ for all $x \in U$. Similarly, we define a local
minimum. Local maxima and minima together are called local extrema.
2 The point x = 0 is a local maximum for f(x) = cos(x). The reason is that f(0) = 1 and
f(x) < 1 nearby.
3 The point x = 1 is a local minimum for $f(x) = (x - 1)^2$. The function is zero at x = 1
and positive everywhere else.
Fermat: If f is differentiable and has a local extremum at x, then $f'(x) = 0$.
Why? Assume the derivative $f'(x) = c$ is not zero. We can assume c > 0, otherwise
replace f with $-f$. By the definition of limits, for all small enough h > 0, we have
$(f(x+h) - f(x))/h \geq c/2$. But this means $f(x+h) \geq f(x) + hc/2$ and x can not be a
local maximum. Since also $(f(x) - f(x-h))/h \geq c/2$ for small enough h, we also have
$f(x-h) \leq f(x) - hc/2$ and x can not be a local minimum.
4 The derivative of $f(x) = 72x - 30x^2 - 8x^3 + 3x^4$ is
$f'(x) = 72 - 60x - 24x^2 + 12x^3$. By plugging in integers (calculus
teachers like integer roots because students like integer
roots!) we can guess the roots x = 1, x = 3, x = -2 and
see $f'(x) = 12(x - 1)(x + 2)(x - 3)$. The critical points
are 1, 3, -2.
5 We have already seen that $f'(x) = 0$ does not assure
that x is a local extremum. The function $f(x) = x^3$ is
a counter example. It satisfies $f'(0) = 0$ but 0 is not
a local extremum. It is an example of an inflection
point, a point where $f''$ changes sign.
6 Let's look at one nasty example. The function f(x) =
x sin(1/x) is continuous at 0 but there are infinitely
many critical points near 0.
If $f''(x) > 0$, then the graph of the function is concave up. If $f''(x) < 0$ then the graph
of the function is concave down.
Second derivative test. If x is a critical point of f and $f''(x) > 0$, then x is a
local minimum. If $f''(x) < 0$, then x is a local maximum.
If $f''(x_0) > 0$ then, near $x_0$, $f'(x)$ is negative for $x < x_0$ and positive for $x > x_0$. This means
that the function decreases left from the critical point and increases right from the critical
point. Similarly, if $f''(x_0) < 0$ then $f'(x)$ is positive for $x < x_0$ and negative for
$x > x_0$. This means that the function increases left from the critical point and decreases
right from the critical point.
7 The function $f(x) = x^2$ has one critical point at x = 0. Its second derivative is 2 there.
8 Find the local maxima and minima of the function $f(x) = x^3 - 3x$ using the second
derivative test. Solution: $f'(x) = 3x^2 - 3$ has the roots 1, -1. The second derivative
$f''(x) = 6x$ is negative at x = -1 and positive at x = 1. The point x = -1 is therefore a
local maximum and the point x = 1 is a local minimum.
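The second derivative test is mechanical enough to write down as a small routine. This is a minimal sketch: the function name `second_derivative_test` and its string return values are choices made here for illustration, and the derivatives are entered by hand rather than computed.

```python
def second_derivative_test(fpp, x):
    """Classify a critical point x, given the second derivative fpp."""
    value = fpp(x)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"

# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3 and f''(x) = 6x.
fp = lambda x: 3 * x ** 2 - 3
fpp = lambda x: 6 * x

for c in (-1, 1):
    assert fp(c) == 0                      # c really is a critical point
    print(c, second_derivative_test(fpp, c))

# f(x) = x^4 has f''(0) = 0, so the test gives no information at 0.
print(0, second_derivative_test(lambda x: 12 * x ** 2, 0))
```

The third case matters: a vanishing second derivative does not mean there is no extremum, only that the test cannot decide.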
9 Find the local maxima and minima of the function f(x) = cos(x) using the second
derivative test.
10 For the function $f(x) = x^5 - x^3$, the second derivative test is inconclusive at x = 0. Can
you nevertheless see the critical points?
11 Also for the function $f(x) = x^4$, the second derivative test is inconclusive at x = 0. The
second derivative is zero. Can you nevertheless see whether the critical point 0 is a local
maximum or a local minimum?
Finally, let's look at an example where we can practice the chain rule a bit.
12 Find the critical points of $f(x) = 4 \arctan(x) + x^2$. Solution. The derivative is
$$ f'(x) = \frac{4}{1 + x^2} + 2x = \frac{2x + 2x^3 + 4}{1 + x^2} . $$
We see that x = -1 is a critical point. There are no other roots of $2x + 2x^3 + 4 = 0$.
How did we get the derivative of arctan again? Differentiate
$$ \tan(\arctan(x)) = x $$
and write u = arctan(x):
$$ \frac{1}{\cos^2(u)} \arctan'(x) = 1 . $$
Use the identity $1 + \tan^2(u) = 1/\cos^2(u)$ to write this as
$$ (1 + \tan^2(u)) \arctan'(x) = 1 . $$
But tan(u) = tan(arctan(x)) = x so that $\tan^2(u) = x^2$. And we have
$$ (1 + x^2) \arctan'(x) = 1 . $$
Now solve for $\arctan'(x)$:
$$ \arctan'(x) = \frac{1}{1 + x^2} . $$
Homework
1 Find all critical points for the following functions. If there are infinitely many, indicate
their structure. For f(x) = sin(x) for example, the critical points can be written as
$\pi/2 + k\pi$, where k is an integer.
a) $f(x) = x^4 - 3x^2$.
b) f(x) = 3 + sin(x)
c) $f(x) = \exp(-x^2) x^2$.
d) f(x) = cos(sin(x))
2 For the following functions, find all the maxima and minima using the second derivative
test:
a) f(x) = x log(x), where x > 0.
b) $f(x) = 1/(1 + x^2)$
c) $f(x) = x^2 - 2x + 1$.
d) f(x) = 2xtan(x), where $-\pi/2 < x < \pi/2$
3 Verify that a cubic equation $f(x) = x^3 + ax^2 + bx + c$ always has an inflection point,
a point where $f''(x)$ changes sign.
Hint. Remember the wobbling table!
4 Depending on c, the function $f(x) = x^4 - cx^2$ has either one or three critical points.
Find these points for a general c and use the second derivative test to see whether
they are maxima or minima. The answer will depend on c. Where does the answer
change?
5 This creative problem is motivated by an interesting observation Kent made last
week in class. You can write down explicit formulas (of course you can experiment with
graphing software) or just draw the graph. If you think no solution exists, indicate
so.
a) Find a function which has exactly 2 local maxima and 1 local minimum.
b) Find a function which has exactly 2 local maxima and no local minimum.
Critical points and extrema
Which rectangle of fixed area xy = 1 has minimal circumference 2x +
2y?
We have to extremize the function
$$ f(x) = 2x + \frac{2}{x} . $$
1 Differentiate the function f. For which x is it continuous?
2 Find the critical points of f, the places where $f'(x) = 0$.
3 Sketch the graph of f on the interval (0, 4].
[Figure: rectangle with sides x and y and area xy = 1, and axes for the graph of f(x) on the interval from 0 to 4.]
A related but much more difficult problem is to find the shape with
fixed area 1 which has minimal circumference. A more advanced flavor
of calculus allows to solve this: the calculus of variations.
So, which is the winner? You might have guessed. The circle. The
rectangle example illustrates already that symmetry is often favored
in extremization problems.
Lecture 12: Global extrema
In this lecture we look at super maxima. Local maxima are great, global maxima are the greatest.
These extrema can occur at critical points of f or at the boundary of the domain, where f is
defined.
A point p is called a global maximum of f if $f(p) \geq f(x)$ for all x. A point p is
called a global minimum of f if $f(p) \leq f(x)$ for all x.
How do we find global maxima? We just make a list of all local extrema and boundary points,
then pick the largest. Global extrema do not need to exist on the real line. The function $f(x) = x^2$
has a global minimum at x = 0 but no global maximum. We can however look at global maxima
on finite intervals.
1 Find the global maximum of $f(x) = x^2$ on the interval $[-1, 4]$. Solution. We look for
local extrema at critical points and at the boundary. Then we compare all these extrema
to find the maximum or minimum. The only critical point is x = 0. The boundary points
are -1, 4. Comparing the values f(-1) = 1, f(0) = 0 and f(4) = 16 shows that f has a
global maximum at 4 and a global minimum at 0.
Extreme value theorem: A continuous function f on a finite interval [a, b] attains
a global maximum and a global minimum.
Here is the argument: Because the function is continuous, the image f([a, b]) is a closed interval
[c, d]. [1] There is a point such that f(x) = c, which is a global minimum and a point where
f(x) = d which is a global maximum.
Note that the global maximum or minimum can also occur on the boundary or at points where the
derivative does not exist.
2 Find the global maximum and minimum of the function f(x) = |x|. The function has no
absolute maximum as it goes to infinity for $x \to \pm\infty$. The function has no critical point on
the domain of definition $\mathbb{R} \setminus \{0\}$ of the function $f'$. To see the minimum, we have also to
look at the point x = 0.
3 A soda can is a cylinder of volume $\pi r^2 h$. The surface
area $2\pi r h + 2\pi r^2$ measures the amount of material used
to manufacture the can. Assume the surface area is $2\pi$.
We can solve the equation for $h = (1 - r^2)/r = 1/r - r$.
Find the can with maximal volume. Solution: The volume is
$f(r) = \pi (r - r^3)$. The equation $f'(r) = \pi - 3\pi r^2 = 0$ shows
$r = 1/\sqrt{3}$. This leads to $h = 2/\sqrt{3}$.
[1] This statement needs more justification but is intuitive enough that we can accept it.
4 Take a US Letter size paper of $8 \times 11$ inches. [2] If we cut out 4 squares of equal size at the
corners, we can fold up the paper to a tray with width $8 - 2x$, length $11 - 2x$ and height
x. Find the $x \in [0, 4]$ for which the volume $f(x) = (8 - 2x)(11 - 2x)x = 4x^3 - 38x^2 + 88x$
is maximal. The solutions to $f'(x) = 12x^2 - 76x + 88 = 0$ are $x = (19 \pm \sqrt{97})/6$, which is
about 1.5 or 4.8. The second one is larger than 4, so the maximum is attained at $x \approx 1.5$.
What is the minimal volume? This example illustrates that we might have to look at
the boundary of the interval for extrema.
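The tray example can be redone numerically. The following is a minimal sketch, assuming the 8 by 11 inch sheet of the example; the names `volume`, `roots` and `candidates` are illustrative choices. It solves the quadratic equation for the critical points and compares them with the boundary points 0 and 4.

```python
import math

def volume(x):
    """Volume of the tray with cut-out squares of side x."""
    return (8 - 2 * x) * (11 - 2 * x) * x

# Roots of f'(x) = 12x^2 - 76x + 88 by the quadratic formula.
a, b, c = 12, -76, 88
disc = math.sqrt(b * b - 4 * a * c)
roots = [(-b - disc) / (2 * a), (-b + disc) / (2 * a)]

# Keep the critical points inside [0, 4] and add the two boundary points.
candidates = [0.0, 4.0] + [r for r in roots if 0 <= r <= 4]
best = max(candidates, key=volume)
worst = min(candidates, key=volume)
print(roots)
print(best, volume(best))
print(worst, volume(worst))
```

The maximum sits at the interior critical point, while the minimal volume 0 is attained at the boundary, where the tray degenerates.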
Assume we have a function f which is differentiable except at some points $a_1, \dots, a_n$.
We include the end points of the domain of definition in this list. The task is to find the
global maximum. How do we proceed?
1. Evaluate the function at the points $a_1, \dots, a_n$.
2. Find the candidates for local maxima by looking at the critical points $b_1, \dots, b_m$.
3. Find the maximum of $f(a_1), f(a_2), \dots, f(a_n), f(b_1), \dots, f(b_m)$.
5 Find the global maxima and minima of the function
$f(x) = |x| - 2x^2 + x^3$ on the interval $[-1, 2]$.
1) The function is differentiable except at x = 0. On
x > 0 the function is $f(x) = x - 2x^2 + x^3$.
It has derivative $1 - 4x + 3x^2$ which has the roots 1/3, 1.
On x < 0 the function is $f(x) = -x - 2x^2 + x^3$, which
has a critical point at $(2 - \sqrt{7})/3 = -0.215...$. There is
another root $(2 + \sqrt{7})/3$ of the derivative, but it does not satisfy x < 0. So
we have the three critical points 1/3, 1, $(2 - \sqrt{7})/3$.
2) The function is not differentiable at x = 0 and has
the boundary points -1, 2.
3) If we evaluate f at the points 0, 1/3, 1, $(2 - \sqrt{7})/3$, -1, 2, we get the
values (0, 0.148, 0, 0.1126, -2, 2). The global maximum is
at x = 2 and the global minimum is at x = -1.
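The three-step recipe can be replayed on this example. This is a minimal sketch: the lists `boundary`, `not_differentiable` and `critical` are entered by hand from the computation above, and the code only does the final comparison of values.

```python
import math

def f(x):
    return abs(x) - 2 * x ** 2 + x ** 3

boundary = [-1.0, 2.0]
not_differentiable = [0.0]
critical = [1 / 3, 1.0, (2 - math.sqrt(7)) / 3]

# Evaluate f at every candidate point and compare the values.
candidates = boundary + not_differentiable + critical
for x in candidates:
    print(round(x, 4), round(f(x), 4))

x_max = max(candidates, key=f)
x_min = min(candidates, key=f)
print("global maximum at", x_max, "global minimum at", x_min)
```

In this example both global extrema are on the boundary of the interval, which is why the boundary points must be in the list.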
[2] The correct size is 17/2 x 11 inches; we avoid fractions.
Homework
1 Find the global maxima and minima of the function $f(x) = (x - 2)^2$ on the interval
[0, 3].
2 Find the global maximum and minimum of the function $f(x) = 2x^3 - 3x^2 - 36x$ on
the interval $[-5, 5]$.
3 A candy manufacturer builds spherical candies. Its effectiveness is $A(r) - V(r)$, where
A(r) is the surface area and V(r) the volume of a candy of radius r. Find the radius
where $f(r) = A(r) - V(r)$ has a global maximum for $r \geq 0$.
4 Let's look at the falling ladder again. But now x denotes the angle the ladder makes
with the floor. Find the angle where the distance f(x) of the ladder to the wall-floor
corner is maximal.
P.S. You can assume the ladder has length 1 but it will not matter how long the ladder
is.
[Figure: ladder leaning against a wall at angle x, with the distance f(x) to the corner.]
5 a) The function $S(p) = -p \log(p)$ is called the entropy function. [3] Find the
probability $0 < p \leq 1$ which maximizes entropy. One of the most important principles
in all science is that nature tries to maximize entropy. In some sense we compute here
the number of maximal entropy.
b) We can write $1/x^x = e^{-x \log(x)}$. [4] Find the value x where $1/x^x$ has a local maximum,
which is the point where $x^x$ has a local minimum.
[3] If W = 1/p is the Wahrscheinlichkeit, the number of microstates, then $S(p) = p \log(W)$ is the expectation
of W, also written just as log(W). This relation between probability and entropy is inscribed on Boltzmann's
tombstone as $S = k \log(W)$, where k is an additional constant which depends on units. The Boltzmann entropy
formula has far reaching consequences to very concrete problems in chemistry.
[4] The identity $a^b = e^{b \log(a)}$ is one of the three properties to remember for exponentials. The other two are
$a^b a^c = a^{b+c}$ and $(a^b)^c = a^{bc}$.
Extrema with boundaries
The following famous problem is usually asked with the Statue of
Liberty. At Harvard, we of course want to use the John Harvard Statue.
It is a common situation. You want to look at a statue. If you are too
close below it, the viewing angle becomes small. If you are far away,
the viewing angle decreases again. There is an optimal distance where
the viewing angle is maximal.
At which distance x do you see most of the John Harvard Statue?
Assume the part you want to see is 4 to 9 feet higher than your eyes.
1 Verify that the angle under which you see the statue is
$$ f(x) = \arctan(9/x) - \arctan(4/x) . $$
2 Differentiate f(x) to find the maximum.
3 Are there any boundary points or points where f is not differentiable?
4 Find the global maximum of f.
5 Is there a global minimum of f?
Here is a graph of part of the function f.
Lecture 13: Hopital's rule
The rule
In this lecture, we look at a powerful rule to compute limits. This Hopital's rule works miracles
and solves all our remaining worries about limits:
Hopital's rule. If f, g are differentiable and f(p) = g(p) = 0 and $g'(p) \neq 0$, then
$$ \lim_{x \to p} \frac{f(x)}{g(x)} = \frac{f'(p)}{g'(p)} . $$
Let's see how it works:
1 Let's prove the fundamental theorem of trigonometry again:
$$ \lim_{x \to 0} \frac{\sin(x)}{x} = \lim_{x \to 0} \frac{\cos(x)}{1} = 1 . $$
Why did we work so hard for this? Note that we used the fundamental theorem to derive
the derivatives for cos and sin at all points. In order to apply l'Hopital, we had to know
the derivative. Our work to establish the limit was not in vain.
The proof of the rule is almost comic in its simplicity if we compare it with how fantastically
useful it is:
Since f(p) = g(p) = 0 we have Df(p) = f(p + h)/h and Dg(p) = g(p + h)/h so that for every
h > 0 with $g(p + h) \neq 0$ the quantum l'Hopital rule holds:
$$ \frac{f(p+h)}{g(p+h)} = \frac{Df(p)}{Dg(p)} . $$
Now take the limit $h \to 0$. The left side is what we want to know, the right side is a quotient of
two limits which exist since $g'(p) \neq 0$. [1]
Sometimes, we have to administer a medicine twice. To use this, l'Hopital can be improved in
that the condition $g'(p) \neq 0$ can be replaced by the requirement that the limit $\lim_{x \to p} f'(x)/g'(x)$
exists. Instead of having a rule which replaces a limit with another limit (we cure a disease with
a new one!) we formulate it in the way it is actually used. The second derivative case could
easily be generalized to higher derivatives. There is no need to memorize this. Just remember
that you can check in several times to a hospital.
If $f(p) = g(p) = f'(p) = g'(p) = 0$ then $\lim_{x \to p} \frac{f(x)}{g(x)} = \frac{f''(p)}{g''(p)}$ if $g''(p) \neq 0$.
[1] Some books refer to the intermediate value theorem here. This is not necessary.
2 Find the limit $\lim_{x \to 0} (1 - \cos(x))/x^2$. Remember that this limit had also been pivotal to
compute the derivatives of trigonometric functions. Solution: differentiation gives
$$ \lim_{x \to 0} \sin(x)/(2x) . $$
This limit can be obtained with l'Hopital again.
$$ \lim_{x \to 0} \sin(x)/(2x) = \lim_{x \to 0} \cos(x)/2 = 1/2 . $$
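The double application of the rule can be watched numerically. This is a minimal sketch; the function names `ratio`, `once` and `twice` are illustrative labels for the original quotient and the quotients after one and two differentiations.

```python
import math

def ratio(x):
    """The quotient (1 - cos(x)) / x^2, which is 0/0 at x = 0."""
    return (1 - math.cos(x)) / x ** 2

def once(x):
    """After one l'Hopital step: sin(x) / (2x), still 0/0 at x = 0."""
    return math.sin(x) / (2 * x)

def twice(x):
    """After two l'Hopital steps: cos(x) / 2, which can be evaluated at 0."""
    return math.cos(x) / 2

for x in (0.1, 0.01, 0.001):
    print(x, ratio(x), once(x))
print(twice(0.0))
```

The first two columns creep towards the value that the third expression gives directly at x = 0. For much smaller x the first quotient suffers from rounding, since 1 - cos(x) loses digits, which is one practical reason to prefer the differentiated form.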
3 Find the limit of $f(x) = (\exp(x^2) - 1)/\sin(x^2)$ for $x \to 0$.
4 What do you get if you apply l'Hopital to the limit $[f(x + h) - f(x)]/h$ as $h \to 0$?
5 Find $\lim_{x \to \infty} x \sin(1/x)$. Solution. Write y = 1/x, then the expression is sin(y)/y. Now we have a limit
for $y \to 0$, where the denominator and numerator both go to zero.
The case when both sides converge to infinity can be reduced to the other case by looking
at $A = f/g = (1/g(x))/(1/f(x))$ which has the limit
$(g'(x)/g^2(x))/(f'(x)/f^2(x)) = (g'(x)/f'(x)) ((1/g)/(1/f))^2 = (g'/f')(f^2/g^2) = (g'/f') A^2$,
so that $A = f'(p)/g'(p)$. We see:
If $\lim_{x \to p} f(x) = \lim_{x \to p} g(x) = \infty$ and $g'(p) \neq 0$, then
$$ \lim_{x \to p} \frac{f(x)}{g(x)} = \frac{f'(p)}{g'(p)} . $$
2 What is the limit $\lim_{x \to 0} x^x$? This answers the intriguing question: what is $0^0$?
Solution: Because $x^x = e^{x \log(x)}$, it is enough to understand the limit of x log(x) for
$x \to 0$.
$$ \lim_{x \to 0} \frac{\log(x)}{1/x} . $$
Now the limit can be seen as the limit of $(1/x)/(-1/x^2) = -x$ which goes to 0. Therefore
$\lim_{x \to 0} x^x = 1$. (We assume x > 0 to have real values $x^x$.)
3 Find the limit $\lim_{x \to 2} \frac{x^2 - 4x + 4}{\sin^2(x - 2)}$.
Solution: this is a case where $f(2) = f'(2) = g(2) = g'(2) = 0$ but $g''(2) = 2$. The
limit is $f''(2)/g''(2) = 2/2 = 1$.
Hopital's rule always works in calculus situations, where functions are differentiable.
The rule can fail if differentiability of f or g fails. Here is another rare example:
4 Deja Vu: Find the limit of $\sqrt{x^2 + 1}/x$ for $x \to \infty$. L'Hopital gives $x/\sqrt{x^2 + 1}$ which in turn gives
again $\sqrt{x^2 + 1}/x$. Applying l'Hopital again, we get the original function. We got an infinite
loop. If the limit is A, then the procedure tells that it is equal to 1/A. The limit must
therefore be 1. This case can be covered easily without l'Hopital: divide numerator and denominator
by x to get $\sqrt{1 + 1/x^2}$. Now, we can see the limit 1.
5 Scarecrow: given $f(x) = x \sin(1/x^4) e^{-1/x^2}$ and $g(x) = e^{-1/x^2}$.
What is the limit f(x)/g(x) as $x \to 0$? Solution.
Since the functions f and g are not differentiable
at x = 0, l'Hopital is not appropriate. The example appears
in textbooks because the limit still exists. Look at
$f/g = x \sin(1/x^4)$ which satisfies $|f(x)/g(x)| \leq |x|$ and
converges to 0 for $x \to 0$. [2]
[2] It appears in http://mathworld.wolfram.com/LHospitalsRule.html
6 Given a differentiable function g satisfying g(0) = 0. Verify that the limit $\lim_{x \to 0} f(g(x))/g(x)$
is $f'(0)$. Solution: You check in the homework that the result is $f'(g(0))$.
History
The rule appeared in the first calculus book the world has known. The book with
name Analyse des Infiniment Petits pour l'intelligence des Lignes Courbes appeared
in 1696 and was written by Guillaume de l'Hopital, a text which if typeset in a modern font
would probably fit onto 50-100 pages. [3] It is now clear that the mathematical content
of Hopital's book is mostly due to Johannes Bernoulli who became a mathematical
mercenary for l'Hopital: Clifford Truesdell writes in his article The New
Bernoulli Edition [4] about this most extraordinary agreement in the history of science:
l'Hopital wrote: "I will be happy to give you a retainer of 300 pounds, beginning
with the first of January of this year ... I promise shortly to increase this retainer,
which I know is very modest, as soon as my affairs are somewhat straightened out ...
I am not so unreasonable as to demand in return all of your time, but I will ask you
to give me at intervals some hours of your time to work on what I request and also
to communicate to me your discoveries, at the same time asking you not to disclose
any of them to others. I ask you even not to send here to Mr. Varignon or to others
any copies of the writings you have left with me; if they are published, I will not be at
all pleased. Answer me regarding all this ..." Bernoulli's response is lost, but a letter
from l'Hopital indicates that it was quickly accepted. From this point on, Bernoulli
was a giant enchained (Truesdell). Clifford Truesdell also mentions that the book
of l'Hopital has remained the standard for Calculus for a century.
[3] Stewart's book with 1200 pages probably contains about 4 million characters, about 12 times more than
l'Hopital's book. It also contains more material of course. The OCRed text of l'Hopital's book of 200 pages has
300000 characters.
[4] Isis, Vol. 49, No. 1, 1958, pages 54-62
Homework
1 For the following functions, find the limits as $x \to 0$:
a) $(x^2 - x)/\sin(x)$
b) $(\exp(x) - 1)/(\exp(3x) - 1)$
c) $\sin^3(3x)/\sin^3(5x)$
d) $x + \log(x) x + \frac{\sin(x^2)}{\sin^2(x)}$
e) sin(sin(sin(exp(sin(x)))))/sin(sin(exp(sin(x)))).
2 For the following functions, find the limits as $x \to 1$:
a) $(x^2 - x - 1)/(\cos(x - 1) - 1)$
b) $(\exp(x) - e)/(\exp(3x) - e^3)$
For the following functions, find the limits as $x \to \infty$:
c) $(x^2 - x - 1)/\sqrt{x^4 + 1}$
d) $(x - 4)/(4x + \sin(x) + 8)$
3 Here is a FUD attempt on l'Hopital's rule: Define f(x) = x + cos(x) sin(x) and
g(x) = exp(sin(x))(x + cos(x) sin(x)).
a) Show that $f'(x)/g'(x)$ converges to zero as $x \to \infty$.
b) Verify that f(x)/g(x) remains in the interval [1/e, e] but does not converge.
The function is not differentiable at $\infty$. There is no problem with l'Hopital.
4 Take the same functions from the previous example and look at the limit
f(x)/g(x) for $x \to 0$. Now things are nice and dandy because the functions
are differentiable at 0.
5 a) Assume a function f(x) satisfies f(0) = 0 and $f'(0) \neq 0$. Verify the following
formula
$$ \lim_{x \to 0} f(ax)/f(bx) = a/b . $$
b) Given a differentiable function g satisfying g(0) = 0 and a differentiable
function f. Verify that
$$ \lim_{x \to 0} \frac{f(g(x))}{g(x)} = f'(0) . $$
l'Hôpital's rule
1 What does l'Hôpital's rule say about
$$ \lim_{x \to 0} \frac{\exp(2x) - 1}{x} . $$
2 Apply l'Hôpital's rule to get the limit of
$$ f(x) = \frac{\sin(100x)}{\sin(200x)} $$
for $x \to 0$.
Lecture 14: Newton's method
In the intermediate value theorem lecture, we have seen a simple method to find a root of a
function: start with an interval [a, b] such that f(a) < 0 and f(b) > 0, then successively halve the
interval, always choosing the side on which the function takes different signs at the boundary. We
are then within $(b - a)/2^n$ of the root after n steps. If the function is differentiable we can do much
better and use the value of the derivative at a boundary point to get closer. If we draw a tangent
at (x, f(x)) and call T(x) the point where it hits the x-axis, then
$$ f'(x) = \frac{f(x)}{x - T(x)} $$
because $f'(x)$ is the slope of the tangent and the right hand side is "rise" over "run". Solving
for T(x) gives the following formula:
The Newton map is defined as
$$ T(x) = x - \frac{f(x)}{f'(x)} . $$
Newton's method is the process of applying this map again and again until we are sufficiently close
to the root. It is an extremely fast method to find the root of a function. Start with a point $x_0$,
then compute a new point $x_1 = T(x_0)$, where
$$ T(x) = x - f(x)/f'(x) . $$
Now iterate this again and again.
If p is a root such that $f'(p) \neq 0$, and $x_0$ is close to p, then the sequence $x_1 = T(x_0)$, $x_2 = T^2(x_0)$, ...
converges to the root p.
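The iteration can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original handout: the function name `newton`, the stopping tolerance `tol` and the step limit `max_steps` are assumptions made for the example.

```python
def newton(f, df, x0, max_steps=50, tol=1e-12):
    """Iterate the Newton map T(x) = x - f(x)/f'(x) starting at x0.

    The names and the stopping rule are illustrative assumptions.
    """
    x = x0
    for _ in range(max_steps):
        slope = df(x)
        if slope == 0:
            # At a critical point the tangent is horizontal and T(x) is undefined.
            raise ZeroDivisionError("f'(x) = 0: the Newton map is undefined here")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


# Cube root of 10 as a root of f(x) = x^3 - 10, starting at x = 2.
root = newton(lambda x: x**3 - 10, lambda x: 3 * x**2, 2.0)
# A linear function f(x) = 2x + 3 is solved by a single Newton step.
linear_root = newton(lambda x: 2 * x + 3, lambda x: 2.0, 7.0)
```

The derivative is passed in explicitly so that the sketch stays exact about what T does; a numerical derivative would also work but adds its own error.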
[Figure: graph of f with the tangent at (x, f(x)) hitting the axis at T(x), and the Newton steps $x_0$, $x_1$, $x_2$.]
1 If $f(x) = ax + b$, we reach the root in one step.
2 If $f(x) = x^2$ then $T(x) = x - x^2/(2x) = x/2$. We get exponentially fast to the root 0 but
not as fast as the method promises. Indeed, the root is also a critical point which slows us
down.
3 The Newton map brings us to infinity if we start at a critical point.
Newton used the method to find the roots of polynomials. The method is so fast that it amazes:
Starting 0.1 close to the root, we are 0.01 close after one step, 0.0001 after 2 steps,
0.00000001 after 3 steps and 0.0000000000000001 after 4 steps.
The Newton method converges extremely fast to a root p with $f(p) = 0$ if $f'(p) \neq 0$ and if we
start sufficiently close to the root.
In 10 steps we can get $2^{10} = 1024$ digits accuracy. Having a fast method to compute roots is
useful, for example in computer graphics, where things can never be fast enough. Also in number
theory, when working with integers having thousands of digits, the Newton method can help.
Besides that, there is theoretical use which can explain for example the stability of planetary
motion.
4 Verify that the Newton map T(x) in the case $f(x) = (x - 1)^3$ has the property that we
approach the root x = 1. Solution. You see that the approach is not that fast: we get
$T(x) = x + (1 - x)/3 = (1 + 2x)/3$. It converges exponentially fast, but not superexponentially.
The reason is that the derivative at x = 1 is zero. That slows us down.
If we have several roots, and we start at some point, to which root will the Newton
method converge? Does it converge at all? This is an interesting question. It is also
historically intriguing because it is one of the first cases where chaos could be observed, at the
end of the 19th century.
5 Find the Newton map in the case $f(x) = x^5 - 1$. Solution: $T(x) = x - (x^5 - 1)/(5x^4)$.
If we look for roots in the complex plane, like for $f(x) = x^5 - 1$ which has 5 roots in the
complex plane, the basin of attraction of each of the roots is a complicated set, a so called
Newton fractal.
[Figure: the Newton fractal for $x^5 - 1$.]
6 Let's compute $\sqrt{2}$ to 12 digits accuracy - by hand! We want to find a root of $f(x) = x^2 - 2$.
The Newton map is $T(x) = x - (x^2 - 2)/(2x)$. Let's start with x = 1.
$T(1) = 1 - (1 - 2)/2 = 3/2$
$T(3/2) = 3/2 - ((3/2)^2 - 2)/3 = 17/12$
$T(17/12) = 577/408$
$T(577/408) = 665857/470832$
This is already $1.6 \cdot 10^{-12}$ close to the real root!
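The hand computation can be replayed with exact rational arithmetic. The following is a small sketch, assuming Python's standard `fractions` module; the helper name `newton_sqrt2_step` is made up for this example.

```python
from fractions import Fraction


def newton_sqrt2_step(x):
    # Newton map for f(x) = x^2 - 2:  T(x) = x - (x^2 - 2)/(2x)
    return x - (x * x - 2) / (2 * x)


x = Fraction(1)
iterates = []
for _ in range(4):
    x = newton_sqrt2_step(x)
    iterates.append(x)

# Distance of the fourth iterate from the floating point value of sqrt(2).
error = abs(float(iterates[-1]) - 2 ** 0.5)
```

The list `iterates` holds the fractions 3/2, 17/12, 577/408 and 665857/470832 from the text, and `error` is of the order $10^{-12}$. Working with fractions avoids rounding until the very last comparison.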
7 To find the cube root of 10 we have to find a root of $f(x) = x^3 - 10$. The Newton
map is $T(x) = x - (x^3 - 10)/(3x^2)$. If we start with x = 2, we get the following steps
2, 13/6, 3277/1521, 105569067476/49000820427. After three steps we have a result which
is already $2.2 \cdot 10^{-9}$ close to the root.
The Newton method is an incredibly fast algorithm to get roots $x_0$ of equations.
Simply scrumtrulescent.
Homework
1 Find the Newton map $T(x) = x - f(x)/f'(x)$ in the following cases
a) $f(x) = x^3$
b) $f(x) = e^x$
c) $f(x) = e^{x^2}$
d) $f(x) = 2\tan(x)$.
2 a) The sinc function $f(x) = \sin(x)/x$ has a root between 1 and 4. We get closer to
the root by doing a Newton step starting with $x = \pi/2$. Do this step.
3 The Newton map is handy to compute square roots. Assume we want to find the
square root of 99. We have to solve $\sqrt{99} = x$ or $f(x) = x^2 - 99 = 0$. Perform two
Newton steps $T(x) = x - (x^2 - 99)/(2x)$ starting at x = 10.
4 a) Find the Newton step $T(x) = x - f(x)/f'(x)$ in the case $f(x) = 1/x$ and $f(x) = x^6$.
b) Find the Newton step T(x) in general if $f(x) = x^\alpha$, where $\alpha$ is a real number.
5 A chaotic Newton map. Verify that the Newton map in the case $f(x) = (4 - 3/x)^{1/3}$
is the quadratic map $T(x) = 4x(1 - x)$. We will see a demonstration in class which
shows that this map is a true random number generator. The Newton map does not
converge.
[Figure: the graph of the function f(x) and a few Newton steps.] The function is continuous
on (0, 1). Its derivative too, except at x = 3/4, where $4 - 3/x = 0$.
Newton Method
1 In the following graph, trace the two Newton steps already done. Add one more!
[Figure: graph of a function with the Newton steps $x_0$, $x_1$, $x_2$ marked.]
2 In the following graph, try a few Newton steps. Let your starting point $x_0$ be around 0.4.
[Figure: graph of a function on the interval from 0 to 1.4.]
3 We will together compute the square root of 2 to an accuracy
of 12 digits. Without computer.
Lecture 15: Review for first midterm
Major points
A function is continuous, if the closeness of x, y implies the closeness of f(x), f(y).
Intermediate value theorem: f(a) > 0, f(b) < 0 implies f having a root in (a, b).
At a local extremum, $f'(x) = 0$. If $f''(x) > 0$, it is a local minimum. If $f''(x) < 0$,
it is a local maximum. Global extrema: compare local extrema and boundary
values.
If $f' > 0$ then f is increasing, if $f' < 0$ it is decreasing. If $f''(x) > 0$ it is concave
up, if $f''(x) < 0$ it is concave down. If $f'(x) = 0$ then f has a horizontal tangent.
Hôpital tells that limits $\lim_{x \to p} f(x)/g(x)$, where $f(p) = g(p) = 0$ or $f(p) = g(p) = \infty$
with $g'(p) \neq 0$ are given by $f'(p)/g'(p)$.
With $Df(x) = (f(x + h) - f(x))/h$ and $Sf(kh) = h(f(0) + f(h) + \dots + f((k-1)h))$ we
have $SDf(kh) = f(kh) - f(0)$ and $DSf(kh) = f(kh)$. This is a preliminary
fundamental theorem of calculus.
Roots of f(x) with f(a) < 0, f(b) > 0 can be obtained numerically by dissection or
by applying the Newton map $T(x) = x - f(x)/f'(x)$ again and again.
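The preliminary fundamental theorem from the list above can be checked numerically. The sketch below is an illustration only: it assumes the indexing $Sf(kh) = h(f(0) + f(h) + \dots + f((k-1)h))$, for which both identities hold exactly, and the test function and step size are arbitrary choices.

```python
def D(f, h):
    # Discrete derivative: Df(x) = (f(x + h) - f(x)) / h
    return lambda x: (f(x + h) - f(x)) / h


def S(f, h):
    # Discrete integral evaluated at x = k*h (assumed indexing):
    # Sf(kh) = h * (f(0) + f(h) + ... + f((k-1)h))
    return lambda k: h * sum(f(j * h) for j in range(k))


h, k = 0.01, 150


def f(x):
    return x**3 - 2 * x + 1


# S applied to Df telescopes to f(kh) - f(0).
sd = S(D(f, h), h)(k)
# D applied to Sf gives back f(kh): only the last summand survives.
ds = (S(f, h)(k + 1) - S(f, h)(k)) / h
```

Up to floating point rounding, `sd` agrees with `f(k*h) - f(0)` and `ds` agrees with `f(k*h)`, for any step size h and not only in the limit.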

Algebra reminders
Healing: $(a + b)(a - b) = a^2 - b^2$ or $1 + a + a^2 + a^3 + a^4 = (a^5 - 1)/(a - 1)$
Denominator: $1/a + 1/b = (a + b)/(ab)$
Exponential: $(e^a)^b = e^{ab}$, $e^a e^b = e^{a+b}$, $a^b = e^{b \log(a)}$
Logarithm: $\log(ab) = \log(a) + \log(b)$, $\log(a^b) = b \log(a)$
Trig functions: $\cos^2(x) + \sin^2(x) = 1$, $\sin(2x) = 2 \sin(x) \cos(x)$, $\cos(2x) = \cos^2(x) - \sin^2(x)$
Square roots: $a^{1/2} = \sqrt{a}$, $a^{-1/2} = 1/\sqrt{a}$
Important functions
Polynomials               $x^3 + 2x^2 + 3x + 1$
Rational functions        $(x + 1)/(x^3 + 2x + 1)$
Trig functions            $2 \cos(3x)$
Exponential               $5 e^{3x}$
Logarithm                 $\log(3x)$
Inverse trig functions    $\arctan(x)$
Important derivatives
f(x)            f'(x)
$x^n$           $n x^{n-1}$
$e^{ax}$        $a e^{ax}$
$\cos(ax)$      $-a \sin(ax)$
$\sin(ax)$      $a \cos(ax)$
$\tan(x)$       $1/\cos^2(x)$
$\log(x)$       $1/x$
Differentiation rules
Addition rule   $(f + g)' = f' + g'$.
Scaling rule    $(cf)' = c f'$.
Product rule    $(fg)' = f' g + f g'$.
Quotient rule   $(f/g)' = (f' g - f g')/g^2$.
Chain rule      $(f(g(x)))' = f'(g(x)) g'(x)$.
Easy rule       simplify before deriving
Extremal problems
1 Build a fence of length x + 2y = 12 with dimensions x and y with maximal area A = xy.
2 Find the largest area A = 4xy of a rectangle with vertices (x, y), (-x, y), (-x, -y), (x, -y)
inscribed in the ellipse $x^2 + 2y^2 = 1$.
3 Which isosceles triangle of height h and base 2x and area xh = 1 has minimal circumference
$2x + 2\sqrt{x^2 + h^2}$?
4 Where is the distance $\sqrt{x^2 + y^2}$ of the parabola $y = x^2 - 2$ to the point (0, 0) minimal?
5 A cone of height h = 1 + x and radius $r = \sqrt{1 - x^2}$ is tightly enclosed by a unit sphere
centered at height x. Maximize the volume $\pi r^2 h/3$ of the cone.
6 Maximize $f(x) = \sin(x)$ on $[0, \pi]$.
Limit examples
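$\lim_{x \to 0} \sin(x)/x$                     l'Hôpital 0/0
$\lim_{x \to 0} (1 - \cos(x))/x^2$             l'Hôpital 0/0 twice
$\lim_{x \to 0} x \log(x)$                     l'Hôpital $\infty/\infty$
$\lim_{x \to -1} (x^2 - 1)/(x + 1)$            heal directly
$\lim_{x \to \infty} \exp(x)/(1 + \exp(x))$    l'Hôpital
$\lim_{x \to 0} (x + 1)/(x + 5)$               no work necessary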
Important things
Summation and taking differences is at the heart of calculus.
The 3 major types of discontinuities are jump, oscillation, infinity.
The Newton method is an algorithm to find roots.
Remember the fundamental theorem of trigonometry $\lim_{x \to 0} \sin(x)/x = 1$.
The derivative is the limit of $Df(x) = [f(x + h) - f(x)]/h$ as $h \to 0$. It is called rate of change.
The rule $D(1 + h)^{x/h} = (1 + h)^{x/h}$ leads to $\exp'(x) = \exp(x)$.
More Examples
1 Find $\lim_{x \to 1} (x^{1/4} - 1)/(x^{1/5} - 1)$. Answer: 5/4.
2 Find $\lim_{x \to 1} \sin(4x - 4)/(x - 1)$. Answer: 4.
3 Find $\lim_{x \to 2} \dfrac{3 - \sqrt{7 + x}}{x - 2}$. Answer: -1/6.
4 Find the derivative of $\arcsin(5x^2)$. Answer: $10x (1 - 25x^4)^{-1/2}$.
5 Is $1/\log|x|$ continuous at x = 0? Answer: yes.
6 Is $\log(1/|x|)$ continuous at x = 0? Answer: no.
Checklist
Make a list of the most important definitions and a list of the most
important results in this course.
Mind map
Produce your own mind map of the course. Here are some starting
points. On the back is a suggestion.
Lecture 16: Mean value theorem
In this lecture, we look at the mean value theorem and a special case called Rolle's theorem.
It is important later when we study the fundamental theorem of calculus. Unlike the intermediate
value theorem which applied for continuous functions, the mean value theorem involves derivatives:
Mean value theorem: For a differentiable function f and an interval (a, b), there
exists a point p inside the interval such that
$$ f'(p) = \frac{f(b) - f(a)}{b - a} . $$
Here are a few examples which illustrate this:
1 $f(x) = x^2 + ax + b$ has roots at $u = (-a \pm \sqrt{a^2 - 4b})/2$. The derivative
$2x + a$ is zero for $x = -a/2$, the midpoint between the two roots.
2 $f(x) = cx$, then $f'(x) = c$ and $[f(b) - f(a)]/(b - a) = (cb - ca)/(b - a) = c$. So, every point x has
the derivative c.
3 $f(x) = \arcsin(x)$ has the property that for any x, y in (-1, 1), we have
$|\arcsin(y) - \arcsin(x)| \geq |x - y|$. Solution. The derivative of $\arcsin(x)$ is $1/\sqrt{1 - x^2} \geq 1$.
By the mean value theorem, $\arcsin(y) - \arcsin(x)$ is this derivative at some point p between x and y, times $y - x$.
4 A biker drives with velocity $f'(t)$ and is at position f(a) at time a and at position f(b) at time b.
The value $f(b) - f(a)$ is the distance traveled. The fraction $[f(b) - f(a)]/(b - a)$ is the
average speed. The theorem tells that there was a time when the bike had exactly the
average speed.
5 The function $f(x) = \sqrt{1 - x^2}$ has a graph on (-1, 1) on which every possible slope is taken.
Solution: We can see this with the intermediate value theorem because $f'(x) = -x/\sqrt{1 - x^2}$
gets arbitrarily large near x = -1 and arbitrarily negative near x = 1. The mean value theorem shows this too because
we can take intervals $[a, b] = [-1, -1 + c]$ for which $[f(b) - f(a)]/(b - a) = f(-1 + c)/c \geq \sqrt{c}/c = 1/\sqrt{c}$
gets arbitrarily large.
Why is the theorem true? The function $h(x) = f(a) + c(x - a)$, where $c = (f(b) - f(a))/(b - a)$, also
connects the beginning and end point. The function $g(x) = f(x) - h(x)$ has now the property that
$g(a) = g(b)$. If we can show that for such a function, there exists x with $g'(x) = 0$, then we are
done. By tilting the picture, we have reduced the statement to a special case which is important
by itself:
Rolle's theorem: If f(a) = f(b) and f is differentiable, then there exists a critical
point p of f in the interval (a, b).
Here is the proof: If it were not true, then $f'(x) \neq 0$ and we would have $f'(x) > 0$ everywhere or
$f'(x) < 0$ everywhere. This would mean however that f(b) > f(a) or f(b) < f(a).
Here is a second proof: Fermat's theorem assures that there is a local maximum or local minimum
of f in (a, b). At this point the derivative is zero. This means $f'(x) = 0$.
We have also seen a related fact that if f is continuous and f(a) = f(b) then there is a local
maximum or local minimum in the interval (a, b). This fact is more general and applies to every
continuous function. The derivative does not need to exist.
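The point p promised by the mean value theorem can be located numerically. The sketch below is only an illustration: it uses bisection on $f'(x) - c$ with $c = (f(b) - f(a))/(b - a)$, and it assumes that this difference changes sign between a and b, which the theorem itself does not guarantee. The function name `mean_value_point` is made up for the example.

```python
def mean_value_point(f, df, a, b, steps=60):
    """Find p in (a, b) with f'(p) = (f(b) - f(a)) / (b - a) by bisection.

    Assumption: f'(x) - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return df(x) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change of f' - slope at the endpoints")
    for _ in range(steps):
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# f(x) = x^3 on [0, 2]: the average slope is 4, so 3 p^2 = 4 and p = 2/sqrt(3).
p = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
```

Bisection is the halving method from the intermediate value theorem lecture, applied here to the derivative instead of the function.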
6 There is a point in [0, 1] where $f'(x) = 0$ with $f(x) = x(1 - x^2)(1 - \sin(x))$. Solution:
We have f(0) = f(1) = 0. Use Rolle.
7 Show that the function $f(x) = \sin(x) + x(\pi - x)$ has a critical point in $[0, \pi]$. Solution: The
function is nonnegative and zero at 0 and $\pi$. It is also differentiable and so by Rolle's theorem
there is a critical point. Remark. We can not use Rolle's theorem to show that there is a
local maximum even though the extremal value theorem assures us that one exists.
8 Verify that the function $f(x) = 2x^3 + 3x^2 + 6x + 1$ has only one real root. Solution:
There is one real root by the intermediate value theorem: f(-1) = -4, f(1) = 12. Assume
there would be two roots. Then by Rolle's theorem there would be a value x where
$g(x) = f'(x) = 6x^2 + 6x + 6$ has a root. But there is no root of g. [The graph of g has its minimum where
$g'(x) = 6 + 12x = 0$, which is at x = -1/2, where g(-1/2) = 9/2 > 0.]
Who was the first to find the mean value theorem? It is not so clear. Joseph Louis Lagrange
is one candidate. Also Augustin Louis Cauchy (1789-1857) is credited for a modern formulation
of the theorem.
Joseph Louis Lagrange, 1736-1813. Augustin Louis Cauchy, 1789-1857.
What about Michel Rolle? He lived from 1652 to 1719, mostly in Paris. No picture of him
seems available. Rolle also introduced the nth root notation like $\sqrt[3]{x}$.
Homework
1 The function $f(x) = 1 - |x|$ satisfies f(-1) = f(1) = 0 but there is no point where
$f'(x) = 0$. Is this a counter example to Rolle's theorem?
2 Use Rolle's theorem and the intermediate value theorem to show that the function
$f(x) = x^3 + 3x + 1$ has exactly one root. You do not have to find the root.
3 We look at the function $f(x) = \log|x| + \sin(x)$ on the positive real line.
Use the intermediate value theorem applied to $f'(x)$ to assure that for every M > 0
there is a positive x for which $f'(x) = M$. Use the mean value theorem to assure
that we can find for every M two values a, b such that $[f(b) - f(a)]/(b - a) = M$.
4 Cauchy's mean value theorem states that for any two differentiable functions f, g and any
interval (a, b), there exists c for which $(f(b) - f(a)) g'(c) = (g(b) - g(a)) f'(c)$. To prove
this, define the function $h(x) = (f(b) - f(a))(g(x) - g(a)) - (g(b) - g(a))(f(x) - f(a))$.
a) Verify that h(a) = h(b) = 0.
b) Compute $h'(x)$.
c) Use Rolle's theorem to verify that there is a c for which $h'(c) = 0$.
Hint. If stuck, there is more explanation in
http://en.wikipedia.org/wiki/Mean_value_theorem. But first give it a shot on your
own.
5 Given the function $f(x) = x \sin(x)$ and the function $g(x) = \cos(x)$. Verify (using Cauchy's
mean value theorem) that there is a point $p \in (0, \pi/2)$ for which $f'(p)/g'(p) = -\pi/2$. You
do not have to find the point.
The mean value theorem
In this class, we have at various places looked at calculus with discrete eyes, where
$$ Df(x) = [f(x + h) - f(x)]/h . $$
We look here at the question whether there is a discrete version of Rolle's theorem. You may
have the opportunity to find a new result. Note that quantum results hold in general for
functions which are only continuous. No differentiability is needed.
This worksheet might give you an idea what research is about. You do not know the answer yet,
whether a result works or not. It is exciting because nobody else knows either, simply because
nobody has studied the question yet!
1 Quantum Rolle: Given an interval [a, b] of which we assume that its length is larger
than h. Given a continuous function f such that f(a) = f(b) = 0. Is it true that there is a point p
in that interval for which Df(p) = 0? Play and doodle around with examples.
2 Quantum mean value theorem: Given an interval [a, b] and a function f. Is it true that
there is a point p such that
$$ Df(p) = \frac{f(b) - f(a)}{b - a} ? $$
Play around with examples.
3 Argue that you can tilt the setting as in the continuum so that if the quantum Rolle
result holds, then the quantum mean value theorem holds.
Lecture 17: Catastrophes
In this lecture, we once more cover extrema problems. We are interested in how extrema change when
a parameter changes. Nature, economies, processes favor extrema. Extrema change smoothly with
parameters. How come that the outcome is often not smooth? What is the reason that political
change can go so fast once a tipping point is reached? One can explain this with mathematical
models. We look at a simple example, which explains it. In reality, the situation is more
complicated. In the New York Times of February 24, 2011, Jennifer E. Sims, the director
of intelligence studies at Georgetown University's School of Foreign Service and senior fellow at
the Chicago Council on Global Affairs asked: Why, with the U.S. spending 80 billion dollars on
intelligence, were we apparently surprised by recent regime changes in the Middle East? Why did
change happen at all? These are complex questions. Obviously, some tipping point has been
reached and the smallest event like the confiscation of a fruit stand in Tunisia or increasing food
prices in Egypt has produced change. In these complex examples, it will never be possible to
understand everything. Let's look here at a simple mathematical model which illustrates the general
principle that:
If a local minimum ceases to be a local minimum, a new stable position is
favored. This can be far away from the original situation.
To get started, let's look at an extremization problem.
1 Find all the extrema of the function $f(x) = x^4 - x^2$. Solution: $f'(x) = 4x^3 - 2x$
is zero for $x = 0, 1/\sqrt{2}, -1/\sqrt{2}$. The second derivative is $12x^2 - 2$. It is
negative for x = 0 and positive at the other two points. We have two local
minima and one local maximum.
[Figure: graph of $x^4 - x^2$.]
2 Now find all the extrema of the function $f(x) = x^4 - x^2 - 2x$. There is only one
critical point. It is x = 1.
[Figure: graph of $x^4 - x^2 - 2x$.]
Something has happened from the first example to the second example. The local minimum to
the left has disappeared. Assume the function f measures the prosperity of some kind and c is
a parameter. We look at the position of the first equilibrium point of the function. Catastrophe
theorists usually assume the so called Delay assumption.
A stable equilibrium is here used as another name for a local minimum. A system
state remains in a stable equilibrium until it disappears. If that happens, the system
settles in a neighboring stable equilibrium.
A parameter value for which a stable minimum disappears is called a catastrophe.
Here is the position of the equilibrium point plotted in dependence of c.
[Figure: the value f at the equilibrium point as a function of the parameter c.]
A parameter value for which a local minimum disappears is called a catastrophe.
Bifurcation diagram: The picture shows the equilibrium points as they change in dependence
of the parameter c. The vertical axis is the parameter c, the horizontal axis is x. At the bottom
for c = 0, we have three equilibrium points, two local minima and one local maximum. At the
top for c = 1 we have only one local minimum.
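The change from three equilibrium points to one can be watched numerically. The sketch below is an illustration under an assumption: the notes only show the cases $x^4 - x^2$ and $x^4 - x^2 - 2x$, and the family $f_c(x) = x^4 - x^2 - 2cx$ used here is one natural way to connect them (c = 0 and c = 1). The function names, the grid and the scan step are made up for the example.

```python
def count_critical_points(c, lo=-2.0, hi=2.0, n=4000):
    """Count sign changes of f_c'(x) = 4x^3 - 2x - 2c on a grid.

    f_c(x) = x^4 - x^2 - 2cx is an assumed family, not stated in the notes.
    """
    count = 0
    last_sign = 0
    for i in range(n + 1):
        x = lo + (hi - lo) * i / n
        v = 4 * x**3 - 2 * x - 2 * c
        sign = (v > 0) - (v < 0)
        if sign != 0:
            # Grid points where f_c' is exactly zero are skipped; the sign
            # change around them is still counted once.
            if last_sign != 0 and sign != last_sign:
                count += 1
            last_sign = sign
    return count


def first_catastrophe(step=0.01):
    # Scan c upwards from 0 and report the first value with one critical point.
    for i in range(int(round(1 / step)) + 1):
        c = i * step
        if count_critical_points(c) == 1:
            return c
    return None


c_star = first_catastrophe()
```

For this assumed family the left minimum and the local maximum merge when the cubic $4x^3 - 2x - 2c$ gets a double root, which happens at $c = \sqrt{2/27}$, roughly 0.27; the scan finds the first grid value of c beyond that point.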
Catastrophes always go for the worse in the sense that the value decreases. It is not
possible to reverse the process and have a catastrophe, where the minimum jumps
up.
Look again at the above movie of graphs. But run it backwards and use the same principle.
We do not end up at the position we started with. The new equilibrium stays the equilibrium.
Decreasing the food prices again did not reverse the process of change in Egypt for example.
Catastrophes are in general irreversible.
We see that in real life: It is easy to screw up a relationship, get sick, have a ligament torn or
lose trust. Building up a relationship, getting healthy or gaining trust on the other hand happens
slowly. Ruining a country or a company or losing a good reputation of a brand is very easy. It
takes a long time to regain it.
Local minima can change discontinuously, when a parameter is changed. This can
happen with perfectly smooth functions and smooth parameter changes.
3 Let's look at $f(x) = x^4 + cx^2$, where $-1 \leq c \leq 1$. We will look at that in class.
Homework
In this homework, we study a catastrophe for the function
$$ f(x) = x^6 - x^4 + cx^2 , $$
where c is a parameter between 0 and 1.
1 a) Find all the critical points in the case c = 0 and analyze their stability. b) Find all the
critical points in the case c = 1 and analyze their stability.
2 Plot the graph of f for at least 10 values of c between 0 and 1. You can of course use
software, a graphing calculator or Wolfram Alpha. Mathematica code is below.
3 If you change from c = 0.3 to 0.6 pinpoint the value for the catastrophe and show a rough
plot of $c \to f(x_c)$, the value at the first local minimum $x_c$ in dependence of c. The text
above provides this graph for another function. It is the graph with a discontinuity.
4 If you change back from c = 0.6 to 0.3 pinpoint the value for the catastrophe (it will be
different from the one in the previous question).
5 Sketch the bifurcation diagram. That is, if $x_k(c)$ is the kth equilibrium point, then draw
the union of all graphs of $x_k(c)$ as a function of c (the c-axis pointing upwards). As in the
two examples provided, draw the local maximum with dotted lines.
Manipulate[Plot[x^6 - x^4 + c x^2, {x, -1, 1}], {c, 0, 1}]
Catastrophes
We see here graphs of the function $f(x) = x^4 - cx^2$ for c between 0
and 1:
1 Draw the bifurcation diagram in this case. The vertical axis
is the c axis.
Lecture 18: Riemann integral
In this lecture we define the integral $\int_0^x f(t) \, dt$ if f is a differentiable function and compute it for
some basic functions.
First a reminder. We have defined the Riemann sums
$$ Sf(x) = h[ f(0) + f(h) + f(2h) + \dots + f(kh) ] , $$
where k is the largest integer such that kh < x. Let's write $S_n$ if we want to stress that the
parameter h = 1/n was used in the sum. We define the integral as the limit of these sums $S_n f$
when the mesh size h = 1/n goes to zero.
Define
$$ \int_0^x f(t) \, dt = \lim_{n \to \infty} S_n f(x) . $$
[Figure: Riemann sum rectangles over the points $x_k$ between 0 and x.]
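The Riemann sum from the definition can be written down directly. This is a minimal sketch, assuming the mesh h = 1/n and the rule that k is the largest integer with kh < x; the function name `riemann_sum` is made up for the example.

```python
import math


def riemann_sum(f, x, n):
    """S_n f(x) = h [f(0) + f(h) + ... + f(kh)] with h = 1/n and kh < x."""
    h = 1.0 / n
    k = math.ceil(x * n) - 1  # largest integer k with k*h < x (up to rounding)
    return h * sum(f(j * h) for j in range(k + 1))


# For f(t) = t^2 on [0, 1] the sums approach 1/3 as n grows.
approx = riemann_sum(lambda t: t * t, 1.0, 10000)
```

With n = 10000 the value differs from 1/3 by about 1/(2n), which illustrates that the convergence of these left sums is only linear in the mesh size.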
For any differentiable function, the limit exists.
Proof: Let's first assume $f \geq 0$ on [0, x]. Let M be such that $f \leq M$ and $|f'| \leq M$ on [0, x].
The Riemann sum $S_n f(x)$ is the total area of k rectangles. Let Sf(x) denote the area under the
curve. If M is the maximal slope of f on [0, x], then on each interval [j/n, (j + 1)/n], we have
$|f(x) - f(j/n)| \leq M/n$ so that the area error is smaller than $M/n^2$. Additionally, we have a piece
above the interval [kh, x] with area $\leq M/n$. If we add all the $k \leq xn$ "roof area errors" and the
"side area" up, we get
$$ |Sf(x) - S_n f(x)| \leq \frac{kM}{n^2} + \frac{M}{n} \leq \frac{xnM}{n^2} + \frac{M}{n} = \frac{xM + M}{n} . $$
This converges to 0 for $n \to \infty$. The limit is therefore the area Sf(x). For a general, not
necessarily nonnegative function, we write $f = g - h$, where g, h are nonnegative (see homework)
and have
$$ \int_0^x f(x) \, dx = \int_0^x g(x) \, dx - \int_0^x h(x) \, dx . $$
For nonnegative f, the value $\int_0^x f(x) \, dx$ is the area between the x-axis and the
graph of f. For general f, it is a signed area, the difference between two areas.
Remark: the Riemann integral is defined here as the limit of $h \sum_{x_k = kh \in [0, x)} f(x_k)$. It converges
to the area under the curve for all continuous functions but since we work with differentiable
functions in calculus we restricted to that. Not all bounded functions can be integrated naturally
like this. There are discontinuous functions like the salt and pepper function which is defined
to be f(x) = 1 if x is rational and zero else. Now Sf(x) = 1 for rational h and Sf(x) = 0 if h
is irrational. Therefore, another integral, the Lebesgue integral is used too: it can be defined
as the limit of $\frac{1}{n} \sum_{k=1}^{n} f(x_k)$ where $x_k$ are random points in [0, x]. This Monte-Carlo integral
definition of the Lebesgue integral gives the integral 0 for the salt and pepper function because
rational numbers have zero probability.
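The Monte-Carlo idea is easy to try out. The following sketch is an illustration only: the function name, the fixed seed and the sample size are assumptions, and the average is multiplied by the interval length x, which matches the formula in the text when x = 1.

```python
import random


def monte_carlo_integral(f, x, n, seed=0):
    """Estimate the integral of f over [0, x] from n random sample points.

    The average of f is scaled by the interval length x (an assumption
    that agrees with the formula in the text for x = 1).
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(f(rng.uniform(0.0, x)) for _ in range(n))
    return x * total / n


# Estimate of the integral of t^2 over [0, 1], whose exact value is 1/3.
estimate = monte_carlo_integral(lambda t: t * t, 1.0, 200000)
```

The error of such an estimate shrinks only like $1/\sqrt{n}$, so this is much slower than Riemann sums for smooth functions; its strength is that it does not care how irregular f is.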
Remark: Many calculus books define the Riemann integral using partitions $x_0 < x_1 < \dots < x_n$ of
points of the interval [0, x] such that the maximal distance $(x_{k+1} - x_k)$ between neighboring $x_j$
goes to zero. The Riemann sum is then $S_n f = \sum_k f(y_k)(x_{k+1} - x_k)$, where $y_k$ is arbitrarily chosen
inside the interval $(x_k, x_{k+1})$. For continuous functions, the limiting result is the same as the Sf(x)
sum done here. There are numerical reasons to allow more general partitions because it allows to
adapt the mesh size: use more points where the function is complicated and keep a wide mesh,
where the function does not change much. This leads to numerical analysis of integrals.
1 Let f(x) = c be constant everywhere. Now $\int_0^x f(t) \, dt = cx$. We can see also that
$cnx/n \leq S_n f(x) \leq c(n + 1)x/n$.
2 Let f(x) = cx. The area is half of a rectangle of width x and height cx so that the area
is $cx^2/2$. Remark: we could also have added up the Riemann sum but that's more painful:
for every h = 1/n, let k be the largest integer smaller than xn = x/h. Then (remember
Gauss's punishment?)
$$ S_n f(x) = \frac{1}{n} \sum_{j=1}^{k} \frac{cj}{n} = \frac{c k(k + 1)/2}{n^2} . $$
Taking the limit $n \to \infty$ and using that $k/n \to x$ shows that $\int_0^x f(t) \, dt = cx^2/2$.
3 Let $f(x) = x^2$. In this case, we can not see the numerical value of the area geometrically.
But since we have computed $S[x^2]$ in the first lecture of this course and seen that it is
$[x^3]/3$ and since we have defined $S_h f(x) \to \int_0^x f(t) \, dt$ for $h \to 0$ and $[x^k] \to x^k$ for $h \to 0$,
we know that
$$ \int_0^x t^2 \, dt = \frac{x^3}{3} . $$
This example actually computes the volume of a pyramid which has at distance t from
the top a cross section of area $t^2$. Think about $t^2 \, dt$ as a slice of the pyramid of area $t^2$ and
height dt. Adding up the volumes of all these slices gives the volume.
Linearity of the integral (see homework)
$$ \int_0^x f(t) + g(t) \, dt = \int_0^x f(t) \, dt + \int_0^x g(t) \, dt $$
and
$$ \int_0^x \lambda f(t) \, dt = \lambda \int_0^x f(t) \, dt . $$
Upper bound: If $0 \leq f(x) \leq M$ for all x, then $\int_0^x f(t) \, dt \leq Mx$.
4 $\int_0^x \sin^2(\sin(\sin(t)))/x \, dt \leq x$. Solution. The function f(t) inside the integral is nonnegative
and smaller or equal to 1. The graph of f is therefore contained in a rectangle of width
x and height 1.
We see that if two functions are close then their difference is a function which is included
in a small rectangle and therefore has a small integral:
If f and g satisfy $|f(x) - g(x)| \leq c$, then
$$ \int_0^x |f(x) - g(x)| \, dx \leq cx . $$
We know identities like $S_n [x]_h^n = \frac{[x]_h^{n+1}}{n+1}$ and $S_n \exp_h(x) = \exp_h(x)$ already. Since
$[x]_h^k - [x]^k \to 0$ we have $S_n [x]_h^k - S_n [x]^k \to 0$, and from $S_n [x]_h^k = [x]_h^{k+1}/(k + 1)$ the first formula below follows.
The other equalities are the same since $\exp_h(x) - \exp(x) \to 0$. This gives us:
$\int_0^x t^n \, dt = \frac{x^{n+1}}{n+1}$
$\int_0^x e^t \, dt = e^x - 1$
$\int_0^x \cos(t) \, dt = \sin(x)$
$\int_0^x \sin(t) \, dt = 1 - \cos(x)$
Homework
1 a) Find the integral $\int_0^x t^5 + 4t^3 + e^t \, dt$.
b) Calculate $\int_0^{10} t^3 - t + t^2 \, dt$.
c) Find $\int_5^3 \cos(t) \, dt$.
2 Verify that the following statements hold for differentiable functions f, g and a < b < c
and any real number $\lambda$. You can argue geometrically with areas.
$\int_a^b f(x) \, dx + \int_b^c f(x) \, dx = \int_a^c f(x) \, dx$.
$\int_a^b f(x) \, dx + \int_a^b g(x) \, dx = \int_a^b f(x) + g(x) \, dx$.
$\int_a^b \lambda f(x) \, dx = \lambda \int_a^b f(x) \, dx$.
$0 \leq m \leq f(x) \leq M$ implies $(b - a)m \leq \int_a^b f(x) \, dx \leq (b - a)M$.
3 a) Verify that every differentiable function f can be written as a difference of two
nonnegative functions. To do so, show that $g(x) = \max(f(x), 0)$ and $h(x) = \max(-f(x), 0)$
have the property that $f(x) = g(x) - h(x)$ and that $g(x) \geq 0$ and $h(x) \geq 0$.
b) Draw the graphs of the two functions g(x), h(x) in the case $f(x) = \sin(3x)$ where
$0 \leq x \leq 2\pi$.
4 a) The region enclosed by the graph of x and the graph of $x^3$ has a propeller type
shape as seen in the picture. Find its (positive) area.
b) What is the integral $\int_0^{2\pi} |\sin(x)| \, dx$?
5 a) Find $\int_0^3 |x - 1| \, dx$. Distinguish cases.
b) Find $\int_0^3 f(x) \, dx$ for $f(x) = |x - |x - 1|| - |x - 2|$.
Riemann sums
We look at the function $f(x) = \exp(x^2)$. It is a function for which
the integral $\int_0^x f(t) \, dt$ is not elementary. We can not express it with
polynomials, trig, exponential functions or their inverses. We want
here to get estimates for $\int_0^1 \exp(x^2) \, dx$.
[Figure: graph of f with a grid for drawing Riemann sum rectangles.]
Lecture 19: Fundamental theorem
In this lecture we prove the fundamental theorem of calculus for differentiable functions. This
will allow us in general to compute integrals of functions which appear as derivatives.
We have seen earlier that with $Sf(x) = h(f(0) + \dots + f(kh))$ and $Df(x) = (f(x + h) - f(x))/h$
we have $SDf(x) = f(x) - f(0)$ and $DSf(x) = f(x)$ if x = nh. This becomes now:
Fundamental theorem of calculus: Assume f is differentiable. Then
$$ \int_0^x f'(t) \, dt = f(x) - f(0) \quad \text{and} \quad \frac{d}{dx} \int_0^x f(t) \, dt = f(x) . $$
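Both statements of the theorem can be tested numerically before proving them. The sketch below is an illustration only: it approximates the integral by a left Riemann sum with n subintervals and the derivative by a central difference; the function name, n, the point x and the step d are arbitrary choices.

```python
import math


def integral(f, x, n=20000):
    # Left Riemann sum for the integral of f over [0, x] with mesh h = x/n.
    h = x / n
    return h * sum(f(j * h) for j in range(n))


x = 1.2

# First statement: the integral of f' from 0 to x equals f(x) - f(0), here f = sin.
lhs = integral(math.cos, x)
rhs = math.sin(x) - math.sin(0.0)

# Second statement: d/dx of the integral of f from 0 to x equals f(x), here f = cos.
d = 1e-3
slope = (integral(math.cos, x + d) - integral(math.cos, x - d)) / (2 * d)
```

Here `lhs` and `rhs` agree to about four decimal places, and `slope` is close to `math.cos(x)`; the remaining differences come from the finite mesh and the finite step d.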
Proof. Using notation of Euler we write $A \sim B$ for "A and B are close", meaning $A - B \to 0$ for
$h \to 0$. From $DSf(x) = f(x)$ for x = kh we have $DSf(x) \sim f(x)$ for kh < x < (k + 1)h because
f is continuous. We also know $\int_0^x Df(t) \, dt \sim \int_0^x f'(t) \, dt$ because $Df(t) \sim f'(t)$ uniformly for
$0 \leq t \leq x$ by the definition of the derivative and the assumption that $f'$ is continuous. We also
know $SDf(x) = f(x) - f(0)$ for x = kh. By definition of the Riemann integral $Sf(x) \sim \int_0^x f(t) \, dt$
and so $SDf(x) \sim \int_0^x Df(t) \, dt$. Therefore
$$ f(x) - f(0) \sim SDf(x) \sim \int_0^x Df(t) \, dt \sim \int_0^x f'(t) \, dt $$
as well as
$$ f(x) \sim DSf(x) \sim D \int_0^x f(t) \, dt \sim \frac{d}{dx} \int_0^x f(t) \, dt . $$
1 $\int_0^5 t^7 \, dt = \frac{t^8}{8} \Big|_0^5 = \frac{5^8}{8}$. You can always leave such expressions as your final result. It is even
more elegant than the actual number 390625/8.
2 $\int_0^{\pi/2} \cos(t) \, dt = \sin(t) \big|_0^{\pi/2} = 1$. This is an important example which should become routine
in a while.
3 $\int_0^x \sqrt{1 + t} \, dt = \int_0^x (1 + t)^{1/2} \, dt = (1 + t)^{3/2} (2/3) \big|_0^x = [(1 + x)^{3/2} - 1](2/3)$. Here the difficulty
was to see that the 1 + t in the interior of the function does not make a big difference.
Keep such examples in mind.
4 Also in this example $\int_0^2 \cos(t + 1) \, dt = \sin(t + 1) \big|_0^2 = \sin(3) - \sin(1)$ the additional term
+1 does not make a big dent.
5 $\int_{\pi/6}^{\pi/4} \cot(x) \, dx$. This is an example where the antiderivative is difficult to spot. It is easy if
we know where to look: the function $\log(\sin(x))$ has the derivative $\cos(x)/\sin(x)$. So, we
know the answer is $\log(\sin(x)) \big|_{\pi/6}^{\pi/4} = \log(\sin(\pi/4)) - \log(\sin(\pi/6)) = \log(1/\sqrt{2}) - \log(1/2) = -\log(2)/2 + \log(2) = \log(2)/2$.
6 The example $\int_1^2 1/(t^2 - 9) \, dt$ is a bit challenging. We need a hint and write
$-6/(x^2 - 9) = 1/(x + 3) - 1/(x - 3)$. The function $f(x) = \log|x + 3| - \log|x - 3|$ has therefore $-6/(x^2 - 9)$
as a derivative. We know therefore $\int_1^2 -6/(t^2 - 9) \, dt = \log|3 + x| - \log|3 - x| \big|_1^2 = \log(5) - \log(1) - \log(4) + \log(2) = \log(5/2)$.
The answer to the original task is now $-(1/6) \log(5/2)$. It is negative because the integrand is negative on [1, 2].
7 $\int_0^x \cos(\sin(t)) \cos(t) \, dt = \sin(\sin(x))$ because the derivative of $\sin(\sin(x))$ is $\cos(\sin(x)) \cos(x)$.
The function $\sin(\sin(x))$ is called the antiderivative of f. If we differentiate this function,
we get $\cos(\sin(x)) \cos(x)$.
8 Find $\int_0^{\pi} \sin(x) \, dx$. Solution: This has a very nice answer.
Here is an important notation, which we have used in the example and which might at first look
silly. But it is a handy intermediate step when doing the computation.
$$ F \big|_a^b = F(b) - F(a) . $$
We give reformulations of the fundamental theorem in ways in which it is mostly used:
If f is the derivative of a function F then
$$ \int_a^b f(x) \, dx = F(x) \big|_a^b = F(b) - F(a) . $$
In some textbooks, this is called the "second fundamental theorem" or the "evaluation part" of
the fundamental theorem of calculus. The statement $\frac{d}{dx} \int_0^x f(t) \, dt = f(x)$ is the "antiderivative
part" of the fundamental theorem. They obviously belong together and are two different sides of
the same coin.
Here is a version of the fundamental theorem, where the boundaries are functions of x.
Given functions g, h and if F is a function such that $F' = f$, then
$$ \int_{h(x)}^{g(x)} f(t) \, dt = F(g(x)) - F(h(x)) . $$
9 $\int_{x^4}^{x^2} \cos(t) \, dt = \sin(x^2) - \sin(x^4)$.
The function F is called an antiderivative. It is not unique but the above formula does
always give the right result.
Let's look at a list of important antiderivatives. You should have as many antiderivatives
as possible hard wired in your brain. It really helps. Here are the core functions you should know.
They appear a lot.
function           antiderivative
$x^n$              $x^{n+1}/(n+1)$
$\sqrt{x}$         $x^{3/2} \cdot 2/3$
$e^{ax}$           $e^{ax}/a$
$\cos(ax)$         $\sin(ax)/a$
$\sin(ax)$         $-\cos(ax)/a$
$1/x$              $\log(x)$
$1/(1+x^2)$        $\arctan(x)$
$\log(x)$          $x \log(x) - x$
Make your own table!
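A table of antiderivatives can be checked by differentiating the right column numerically. The sketch below is only an illustration: the sample values a = 2 and n = 3, the test point x = 0.7 and the step d are arbitrary choices, and the derivative is approximated by a central difference.

```python
import math

a = 2.0  # sample value for the parameter a
n = 3    # sample value for the exponent n

# Each entry: (label, function f, claimed antiderivative F).
table = [
    ("x^n", lambda x: x**n, lambda x: x ** (n + 1) / (n + 1)),
    ("sqrt(x)", math.sqrt, lambda x: (2.0 / 3.0) * x**1.5),
    ("e^(ax)", lambda x: math.exp(a * x), lambda x: math.exp(a * x) / a),
    ("cos(ax)", lambda x: math.cos(a * x), lambda x: math.sin(a * x) / a),
    ("sin(ax)", lambda x: math.sin(a * x), lambda x: -math.cos(a * x) / a),
    ("1/x", lambda x: 1.0 / x, math.log),
    ("1/(1+x^2)", lambda x: 1.0 / (1 + x * x), math.atan),
    ("log(x)", math.log, lambda x: x * math.log(x) - x),
]


def max_table_error(x=0.7, d=1e-5):
    # Largest gap between the numerical derivative of F and f at the point x.
    worst = 0.0
    for _label, f, F in table:
        derivative = (F(x + d) - F(x - d)) / (2 * d)
        worst = max(worst, abs(derivative - f(x)))
    return worst


worst_error = max_table_error()
```

A small value of `worst_error` means every row passes at the test point. A sign error, such as forgetting the minus in the antiderivative of sin(ax), would show up immediately as a large gap.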
Meet Isaac Newton and Gottfried Leibniz. They have discovered the fundamental
theorem of calculus. You can see from the expression of their faces how honored they are
to find themselves on the same handout with Austin Powers and Doctor Evil. Culture
clash ...
Homework
1 For any of the following functions f, find a function F such that $F' = f$.
a) $e^x + \sin(3x) + x^3 + 5x$.
b) $(x + 4)^3$.
c) $1/x + 1/(x - 1)$.
d) $\cos(x^2) 2x + \sin(x^3) 3x^2 + 1/\sqrt{x}$
2 Find the following integrals by finding a function g satisfying $g' = f$. We will learn
techniques to find the function. Here, we just use our knowledge about derivatives:
a) $\int_2^3 5x^4 + 4x^3 \, dx$.
b) $\int_{\pi/4}^{\pi/2} \sin(3x) + \cos(x) \, dx$.
c) $\int_{\pi/4}^{\pi/2} \frac{1}{\sin^2(x)} \, dx$.
d) $\int_2^3 \frac{1}{x - 1} \, dx$.
3 Evaluate the following integrals:

a)
2
1
2
x
dx.
b)
1
1
cosh(x) dx. (Remember cosh(x) = (e
x
+ e
x
)/2.)
c)
1
0
1
1+x
2
dx.
d)
2/3
1/3
1
1x
2
dx.
4 a) Compute F(x) =
x
3
0
sin(t) dt, then nd F
(x).
b) Compute G(x) =
cos(x)
sin(x)
exp(t) dt then nd G
(x)
5
a) Be clever: Evaluate the following integral:
2
0
sin(sin(x)) dx
Give the answer and the reason in a short sentence.
b) Be evil: Take a function F of your choice. Find
its derivative and call it f. Now pose an integration
problem to nd
b
a
f(x) dx. Submit this problem to
[email protected] I will select the most evil one.
A good problem should lead to a short function f but
the integral F should be dicult to nd or guess. These
problems will make perfect exam problems for the sec-
ond midterm .... (evil laugh).
You can submit your version of Problem 5b) electronically by email ([email protected].
Just send the function in the subject line. Mail can otherwise be empty). For any
submission, independent how clever or evil, 10 points maximal will be added to your
score (maxing up at 50). So, if your HW score of Lecture 19 is 45 and you submitted
a function, it will be bumped to 50. If your HW score is 50 already you get nothing
... (more evil laugh).
Fundamental theorem
Find the following integrals
1 $\int_1^3 x^3 \, dx$
2

1
2
1 x
5
dx
3 $\int_0^1 \frac{1}{x^2 + 1} \, dx$
4 $\int_2^4 \frac{1}{x} \, dx$
Now a bit harder ones:
5 $\int_1^4 x^{1/3} \, dx$
6

3
1
1 + x
5
dx
7 $\int_1^2 \frac{1}{x} + \frac{5}{x^2} \, dx$
8

2
1
1
1+x
5
dx
Lecture 20: Antiderivatives
We have looked at $\int_0^x f(t) \, dt$ and seen that it is the signed area under the curve. We have
seen that the area of the region below the x-axis is counted in a negative way. There is something
else to mention:
For x < 0, we define $\int_0^x f(t) \, dt$ as $-\int_x^0 f(t) \, dt$. This is compatible with the fundamental
theorem $\int_a^b f'(t) \, dt = f(b) - f(a)$.
We call $g(x) = \int_0^x f(t) \, dt + C$ an anti-derivative of f. The constant C is arbitrary
and not fixed. As we will see below, we can often adjust the constant such that
some condition is satisfied.
The fundamental theorem of calculus assured us that
The antiderivative is the inverse operation of the derivative. Two different anti
derivatives differ by a constant.
Finding the anti-derivative of a function is much harder than finding the derivative. We will learn
some techniques but it is in general not possible to give anti derivatives for even very simple
functions.
1 Find the anti-derivative of $f(x) = \sin(4x) + 20x^3 + 1/x$. Solution: We can take the anti-derivative
of each term separately. The antiderivative is $F(x) = -\cos(4x)/4 + 5x^4 + \log(x) + C$.
2 Find the anti derivative of $f(x) = 1/\cos^2(x) + 1/(1 - x)$. Solution: we can find the anti
derivatives of each term separately and add them up. The result is $F(x) = \tan(x) - \log|1 - x| + C$.
3
We mentioned Galileo Galilei, who measured free fall
motion with constant acceleration. Assume s(t) is the
position of the ball at time t. Assume the ball has zero
velocity initially and is located at height s(0) = 20. We
know that the velocity v(t) is the derivative of s(t)
and the acceleration a(t) is constant equal to $-10$. So,
$v(t) = -10t + C$ is the antiderivative of a. By looking
at v at time t = 0 we see that C = v(0) is the initial
velocity and so zero. We know now $v(t) = -10t$. We
need now to compute the anti derivative of v(t). This is
$s(t) = -10t^2/2 + C$. Comparing t = 0 shows C = 20.
Now $s(t) = 20 - 5t^2$. The graph of s is a parabola. If we
give the ball an additional horizontal velocity, such that
time t is equal to x then $s(x) = 20 - 5x^2$ is the visible
trajectory. We see that jumping from 20 meters leads
to a fall which lasts 2 seconds.
[Figure: graph of the trajectory $s(x) = 20 - 5x^2$ for $0 \le x \le 2$.]
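The two integration steps of the free fall example can be replayed in a few lines. This Python sketch is an illustration only: the names `G`, `S0`, `velocity`, `position` and `fall_time` are invented here, and the constants are the ones used in the example (acceleration of magnitude 10, initial height 20, zero initial velocity).

```python
import math

G = 10.0    # magnitude of the constant acceleration used in the example
S0 = 20.0   # initial height s(0)

def velocity(t):
    # antiderivative of the acceleration -G with v(0) = 0
    return -G * t

def position(t):
    # antiderivative of the velocity with s(0) = S0
    return S0 - G * t * t / 2

# the ball reaches the ground when position(t) = 0
fall_time = math.sqrt(2 * S0 / G)

print(fall_time, position(fall_time), velocity(fall_time))
```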
4 The total cost is the antiderivative of the marginal cost of a good. Both the marginal
cost as well as the total cost are a function of the quantity produced. For instance, suppose
the total cost of making x shoes is 300 and the total cost of making x+4 shoes is 360 for all
x. The marginal cost is 60/4 = 15 dollars. In general the marginal cost changes with the
number of goods. There is additional cost needed to produce one more shoe if 300 shoes
are produced. Problem: Assume the marginal cost of a book is $f(x) = 5 - x/100$ and
that producing the first 10 books costs 1000 dollars. What is the total cost of producing
100 books? Answer: The anti derivative of $f(x) = 5 - x/100$ is $F(x) = 5x - x^2/200 + C$ where
C is a constant. By comparing F(10) = 1000 we get $50 - 100/200 + C = 1000$ and so
C = 950.5. The result is $950.5 + 5 \cdot 100 - 10000/200 = 1400.5$. The average book price has gone
down from 100 to about 14 dollars.
A function f is called elementary, if it can be constructed using addition,
subtraction, multiplication, division, compositions from polynomials or
roots. In other words, an elementary function is built up with functions like
$x^3$, $\sqrt{x}$, exp, log, sin, cos, tan and arcsin, arccos, arctan.
5 The function $f(x) = \sin(\sin(\pi + \sqrt{x} + x^2)) + \log(1 + \exp((x^6 + 1)/(x^2 + 1))) + (\arctan(e^x))^{1/3}$
is an elementary function.
6 The anti derivative of the sinc function is called the sine-integral
$Si(x) = \int_0^x \frac{\sin(t)}{t} \, dt$.
The function Si(x) is not an elementary function.
[Figure: graph of Si(x).]
7 The offset logarithmic integral is defined as
$Li(x) = \int_2^x \frac{dt}{\log(t)}$.
It is a specific anti-derivative. It is a good approximation of the number of prime numbers
less than x. The graph below illustrates this. The second stair graph shows the number
$\pi(x)$ of primes below x. For example, $\pi(10) = 4$ because 2, 3, 5, 7 are the only primes below
it. The function Li is not an elementary function.
[Figure: graph of Li(x) together with the stair graph of $\pi(x)$.]
8 The error function
$erf(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt$
is important in statistics. It is not an elementary function.
[Figure: graph of erf(x).]
The Mathematica command Integrate uses about 500 pages of Mathematica code and
600 pages of C code (see footnote 1). Before software was doing this, tables of integrals like Gradshteyn
and Ryzhik's work were used. This 1200 page book is still useful and contains some
integrals, which computer algebra systems have trouble with.
Numerical evaluation
What do we do when we can not find the integral analytically? We can still compute it
numerically. Here is an example: the function sin(sin(x)) also does not have an elementary
anti-derivative. But we can compute the integral numerically with a computer algebra system like
Mathematica:
NIntegrate[Sin[Sin[x]], {x, 0, 10}]
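If no computer algebra system is at hand, the same number can be approximated with a few lines of code. The Python sketch below uses Simpson's rule, which is an assumption of this illustration and not necessarily the method that Mathematica applies internally; the names `simpson` and `g` are made up here.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f from a to b with Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def g(x):
    return math.sin(math.sin(x))

# the integral from the text, computed with two step sizes as a sanity check
coarse = simpson(g, 0, 10, 1000)
fine = simpson(g, 0, 10, 2000)
print(coarse, fine)
```

If both step sizes give nearly the same value, the approximation can be trusted to about that many digits.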

Pillow problems
We do not assign homework over spring break. If you have time, here are some integration riddles.
We will learn techniques to deal with them. If you can not crack them, no problem. Maybe pick
one or two and keep thinking about them over spring break. They also make good pillow problems,
problems to think about while falling asleep. Try it. Sometimes, you might know the answer in
the morning. Maybe you can guess a function which has f(x) as a derivative.
1 $f(x) = \log(x)/x$.
2 $f(x) = \frac{1}{x^4 - 1}$.
3 $f(x) = \tan^2(x)$.
4 $f(x) = \cos^4(x)$.
5 $f(x) = \frac{1}{x \log(x)}$.
1
http://reference.wolfram.com/legacy/v3/MainBook/A.9.5.html
Anti derivatives
Here are some trickier anti derivative puzzles. We still have no integration
techniques and must rely on intuition and experiments to find
the anti derivatives.
It is often a puzzle because we can try to combine derivatives
of known functions to get the given function.
1 Find the anti-derivative of the function
$f(x) = \frac{1 + x}{1 - x}$.
Hint. First compute the anti derivative of
$g(x) = \frac{1}{1 - x}$.
Can you combine f and g in some way to make it fit?
2 Find the anti derivative of the function
$f(x) = \sin(x^3) 3x^2$.
Hint. Think about the chain rule.
$f(x) = \sin(\sin(x)) \cos(x)$.
Hint. Think about the chain rule.
$f(x) = 2x \sin(x) + x^2 \cos(x)$.
Hint. Think about the product rule.
$f(x) = e^{e^{e^{e^{e^{e^x}}}}} e^{e^{e^{e^{e^x}}}} e^{e^{e^{e^x}}} e^{e^{e^x}} e^{e^x} e^x$.
Hint. There is no hint.
Lecture 21: Area computation
If $f(x) \ge 0$, then $\int_a^b f(x) \, dx$ is the area under the graph of f(x) and above the
interval [a, b] on the x axes.
As you have seen in a homework, any function can be written as $f(x) = f^+(x) - f^-(x)$, where
$f^+(x) \ge 0$ and $f^-(x) \ge 0$. This means that we can write any integral $\int_a^b f(x) \, dx$ as the difference
of the area above the graph minus the area below the graph.
$\int_a^b f(x) \, dx = \int_a^b f^+(x) - f^-(x) \, dx$.
Here is the most common situation:
If a region is enclosed by two graphs $f \le g$ and x is also enclosed between a and b
then its area is $\int_a^b g(x) - f(x) \, dx$.
1 Find the area of the region enclosed by the x-axes, the y-axes and the graph of the cos
function. Solution: $\int_0^{\pi/2} \cos(x) \, dx = 1$.
2 Find the area of the region enclosed by the graphs $f(x) = x^2$ and $g(x) = x^4$.
3 Find the area of the region enclosed by the graphs $f(x) = 1 - x^2$ and $g(x) = x^4$.
4 Find the area of the region enclosed by a half circle of radius 1. Solution: The half circle
is the graph of the function $f(x) = \sqrt{1 - x^2}$. The area under the graph is
$\int_{-1}^{1} \sqrt{1 - x^2} \, dx$.
Finding the anti-derivative is not so easy. We will find techniques to do so, for now we
put it together: we know that arcsin(x) has the derivative $1/\sqrt{1 - x^2}$ and $x\sqrt{1 - x^2}$ has
the derivative $\sqrt{1 - x^2} - x^2/\sqrt{1 - x^2}$. The sum of these two functions has the derivative
$\sqrt{1 - x^2} + (1 - x^2)/\sqrt{1 - x^2} = 2\sqrt{1 - x^2}$. We find the anti derivative to be
$(x\sqrt{1 - x^2} + \arcsin(x))/2$. The area is therefore
$\frac{x\sqrt{1 - x^2} + \arcsin(x)}{2} \Big|_{-1}^{1} = \frac{\pi}{2}$.
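Since the anti-derivative of the half circle function was put together by hand, it is worth a quick test. The Python sketch below (the names `F`, `f`, `slope` and `area` are chosen for this illustration only) compares a numerical derivative of the anti-derivative with the half circle function and evaluates the area between -1 and 1.

```python
import math

def F(x):
    # anti-derivative from the text: (x*sqrt(1 - x^2) + arcsin(x)) / 2
    return (x * math.sqrt(1 - x * x) + math.asin(x)) / 2

def f(x):
    # the half circle
    return math.sqrt(1 - x * x)

def slope(x, h=1e-6):
    """Central difference approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

area = F(1) - F(-1)
print(area, math.pi / 2)       # both values should agree
print(slope(0.3), f(0.3))      # F' should reproduce the half circle
```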
5 Find the area of the region between the graphs of $f(x) = 1 - |x|^{1/4}$ and $g(x) = -1 + |x|^{1/4}$.
6 Find the area under the curve of $f(x) = 1/x^2$ between $-6$ and 6. Solution.
$\int_{-6}^{6} x^{-2} \, dx = -x^{-1}|_{-6}^{6} = -1/6 - 1/6 = -1/3$. There is something fishy with this computation because
f(x) is nonnegative so that the area should be positive. But we obtained a negative answer.
What is going on?
7 Find the area between the curves x = 0 and x = 2 + sin(y), y = 2 and y = 0. Solution:
We turn the picture by 90 degrees so that we compute the area under the curve y = 0, y =
2 + sin(x) and x = 2 and x = 0.
4
8 The grass problem. Find the area between the curves $|x|^{1/3}$ and $|x|^{1/2}$. Solution. This
example illustrates how important it is to have a picture. This is good advice for any
word problem in mathematics.
Use a picture of the situation while doing the computation.
Homework
1 Find the area of the region enclosed by the graphs $f(x) = x^3$ and $g(x) = \sqrt{|x|}$.
2 Find the area of the region enclosed by the four lines $y = x$, $y = 3 - x$, $x = 1$.
3 Find the area of the region enclosed by the curves y = 4, y = 2, x = 3 + sin(3y), x =
2 + sin(2y).
4 Write down an integral which gives the area of the elliptical region $4x^4 + y^2 \le 1$. Evaluate
the integral numerically using Wolfram alpha, Mathematica or any other software.
5 The graphs sin(x) and cos(x) - 1 intersect at $x = 0$, $x = 2\pi$ and a point between. They define
a humming bird region, consisting of a larger region and a tail region. Find the area of
each and assume the bird has its eye closed.
Area Computation
In this worksheet we look at other regions. In order to find the area
we have to turn our heads.
1 Let's compute the area of the region enclosed by the lines x = 0, $x = \sqrt{y}$, y = 0 and
y = 4. In order to solve such an area problem, we have to draw a picture. We started doing
that. Find ways to find the area.
2 Let's compute the area of the region enclosed by the lines x = 0, $x = y^2$, y = 0 and y = 2.
Now it's your turn to draw a picture and compute the area.
Lecture 22: Volume computation
To compute the volume of a solid, we cut it into slices perpendicularly along a line
x. If A(x) is the area of the slice and the body is enclosed between a and b then
$V = \int_a^b A(x) \, dx$ is the volume. Think of A(x)dx as the volume of a slice. The
integral adds them up.
1 Compute the volume of a pyramid with square base length 2 and height 2. Solution: we
can assume the pyramid is built over the square $-1 \le x \le 1$ and $-1 \le y \le 1$. The cross
section area at height h is $A(h) = (2 - h)^2$. Therefore,
$V = \int_0^2 (2 - h)^2 \, dh = 8/3$.
This is base area 4 times height 2 divided by 3.
A solid of revolution is a solid enclosed by the surface obtained by rotating the
graph of a function f(x) around the x axis.
The area of the cross section at x of a solid of revolution is $A(x) = \pi f(x)^2$. The
volume of the solid is $\int_a^b \pi f(x)^2 \, dx$.
2 Find the volume of a round cone of height 2 and where the circular base has the radius
1. Solution. This is a solid of revolution obtained by rotating the graph of f(x) = x/2
around the x axes. The area of a cross section is $\pi x^2/4$. Integrating this up from 0 to 2
gives
$\int_0^2 \pi x^2/4 \, dx = \frac{\pi x^3}{4 \cdot 3} \Big|_0^2 = \frac{2\pi}{3}$.
This is the height 2 times the base area $\pi$ divided by 3.
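The slicing recipe translates directly into a small program: give the cross section area as a function and integrate it. The Python sketch below is only an illustration with invented names (`simpson`, `volume`, `pyramid`, `cone`); it uses Simpson's rule as an assumed numerical integrator and reproduces the pyramid and cone volumes found above.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f from a to b with Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def volume(area, a, b):
    """Volume of a solid whose slice at position x has area area(x)."""
    return simpson(area, a, b)

# pyramid: square cross section with side 2 - h at height h
pyramid = volume(lambda h: (2 - h) ** 2, 0, 2)

# cone: disc cross section with radius f(x) = x/2
cone = volume(lambda x: math.pi * (x / 2) ** 2, 0, 2)

print(pyramid, 8 / 3)
print(cone, 2 * math.pi / 3)
```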
3 Find the volume of a half sphere of radius 1. Solution: The area of the cross section at
height h is $\pi(1 - h^2)$.
4 We rotate the graph of the function f(x) = sin(x) around the x axes. But now we cut out
a slice of $60^\circ = \pi/3$. Find the volume of the solid.
Solution: The area of a slice without the missing piece is $\pi \sin^2(x)$. The integral $\int_0^\pi \sin^2(x) \, dx$
is $\pi/2$ as derived in the lecture. Having cut out 1/6th the area is $(5/6)\pi \sin^2(x)$. The
volume is $\int_0^\pi (5/6)\pi \sin(x)^2 \, dx = (5/6)\pi^2/2$.
Homework
1 Find the volume of the paraboloid for which the radius at position x is $1 - x^2$ and x ranges
from 0 to 1.
2 A catenoid is the surface obtained by rotating the graph of f(x) = cosh(x) around the
x-axes. We have seen that the graph of f is the chain curve, the shape of a hanging chain.
Find the volume of the solid enclosed by the catenoid between $x = -1$ and $x = 1$.
Hint. You might want to check first the identity $2\cosh(x)^2 = 1 + \cosh(2x)$ using the
definition $\cosh(x) = (\exp(x) + \exp(-x))/2$.
3 A tomato is given by $z^2 + x^2 + 4y^2 = 1$. If we slice perpendicular to the y axes, we get a
circular slice $z^2 + x^2 \le 1 - 4y^2$ of radius $\sqrt{1 - 4y^2}$.
a) Find the area of this slice.
b) Determine the volume of the tomato.
c) Fix yourself a tomato salad by cutting a fresh tomato into slices and eat it, except for
one slice which you staple to your homework paper as proof that you really did it.
4 As we have seen in the movie of the first class,
Archimedes was so proud of his formula for the volume
of a sphere that he wanted the formula on his tomb
stone. He wrote the volume of a half sphere of radius
1 as the difference between the volume of a cylinder of
radius 1 and height 1 and the volume of a cone of base radius
1 and height 1. Relate the cross section area of the
cylinder-cone complement with the cross section area of
the sphere to recover his argument! If stuck, draw in
the sand or soak in the bath tub for a while eating your
tomato salad. There is no need to streak and scream
Eureka when the solution is found.
5 Volumes were among the first quantities mathematicians wanted to measure and compute.
One problem on the Moscow Egyptian papyrus dating back to 1850 BC explains the general
formula $h(a^2 + ab + b^2)/3$ for a truncated pyramid with base length a, roof length b and
height h.
a) Verify that if you slice the frustum at height z, the area is $(a + (b - a)z/h)^2$.
b) Find the volume using calculus.
Here is the translated formulation from the papyrus (see footnotes 1 and 2):
You are given a truncated pyramid of 6 for the vertical height by 4 on the base by
2 on the top. You are to square this 4 result 16. You are to double 4 result 8. You
are to square 2, result 4. You are to add the 16, the 8 and the 4, result 28. You are
to take one-third of 6 result 2. You are to take 28 twice, result 56. See it is 56. You
will find it right.
1
Howard Eves, Great moments in mathematics, Volume 1, MAA, Dolciani Mathematical Expositions, 1980,
page 10
2
Image Source: http://www-history.mcs.st-and.ac.uk/HistTopics/Egyptian papyri.html
Math 1A: introduction to functions and calculus O. Knill and B. Luko, 2011
Volume Computation
1 Find the volume of the solid that is formed by rotating the graph of $y = x^2$ around the
x-axis, for $1 \le x \le 3$.
2 Find the volume of the solid that is formed by rotating the graph of $y = x^2$ around the
y-axis, for $1 \le y \le 3$.
3 Derive the formula for the volume of a sphere ($\frac{4}{3} \pi r^3$).
4 Find the volume of the solid of revolution for which the radius at height z is $2 - |z|$ and
$-1 \le z \le 1$.
5 The solid of revolution for which the radius at position x is $x^4 + 1$ and $x \in [-2, 2]$ is
taken only above the xy plane as in the picture. Find the volume.
Lecture 23: Improper integrals
In this lecture, we look at integrals on infinite intervals or integrals, where the function can get
infinite at some point. These integrals are called improper integrals. The area under the curve
can remain finite or become infinite.
1 What is the integral
$\int_1^\infty \frac{1}{x^2} \, dx$ ?
Since the anti-derivative is $-1/x$, we have
$-\frac{1}{x} \Big|_1^\infty = -1/\infty + 1 = 1$.
To justify this, compute the integral $\int_1^b 1/x^2 \, dx = 1 - 1/b$ and see that in the limit $b \to \infty$,
the value 1 is achieved.
In a previous lecture, we have seen a shocking example similar to the following one:
2 $\int_{-1}^{1} \frac{1}{x^2} \, dx = -\frac{1}{x} \Big|_{-1}^{1} = -1 - 1 = -2$.
This does not make any sense because the function is positive so that the integral should
be a positive area. The problem is this time not at the boundary points $-1$, 1. The sore point is
x = 0 over which we have carelessly integrated.
The next example illustrates the problem with the previous example better:
3 The computation
$\int_0^1 \frac{1}{x^2} \, dx = -\frac{1}{x} \Big|_0^1 = -1 + \infty$
indicates that the integral does not exist. We can justify this by looking at integrals
$\int_a^1 \frac{1}{x^2} \, dx = -\frac{1}{x} \Big|_a^1 = -1 + \frac{1}{a}$
which are fine for every a > 0. But this does not converge for $a \to 0$.
Do we always have a problem if the function goes to infinity at some point?
4 Find the following integral
$\int_0^1 \frac{1}{\sqrt{x}} \, dx$.
Solution: Since the point x = 0 is problematic, we integrate from a to 1 with positive a
and then take the limit $a \to 0$. Since $x^{-1/2}$ has the antiderivative $x^{1/2}/(1/2) = 2\sqrt{x}$, we
have
$\int_a^1 \frac{1}{\sqrt{x}} \, dx = 2\sqrt{x} \Big|_a^1 = 2\sqrt{1} - 2\sqrt{a} = 2(1 - \sqrt{a})$.
There is no problem with taking the limit $a \to 0$. The answer is 2. Even though the region is
infinite its area is finite. This is an interesting example. Imagine this to be a container for
paint. We can fill the container with a finite amount of paint but the wall of the region
has infinite length.
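The difference between the last two examples becomes visible when the lower bound a is pushed towards 0. The Python sketch below tabulates the two closed form expressions derived above; the function names `integral_inv_sqrt` and `integral_inv_square` are invented for this illustration.

```python
import math

def integral_inv_sqrt(a):
    # integral of 1/sqrt(x) from a to 1, which is 2(1 - sqrt(a))
    return 2 * (1 - math.sqrt(a))

def integral_inv_square(a):
    # integral of 1/x^2 from a to 1, which is -1 + 1/a
    return -1 + 1 / a

for k in range(1, 7):
    a = 10.0 ** (-k)
    print(a, integral_inv_sqrt(a), integral_inv_square(a))
```

The first column of results settles down near 2, while the second one grows without bound.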
5 Evaluate the integral $\int_0^1 1/\sqrt{1 - x^2} \, dx$. Solution: The antiderivative is arcsin(x). In this
case, it is not the point x = 0 which produces the difficulty. It is the point x = 1. Take
a > 0 and evaluate
$\int_0^{1-a} \frac{1}{\sqrt{1 - x^2}} \, dx = \arcsin(x) \Big|_0^{1-a} = \arcsin(1 - a) - \arcsin(0)$.
Now arcsin(1 - a) has no problem in the limit $a \to 0$, since $\arcsin(1) = \pi/2$ exists. We get
therefore the answer $\arcsin(1) = \pi/2$.
6 Rotate the graph of f(x) = 1/x around the x-axes and compute the volume of the solid
between 1 and $\infty$. The cross section area is $\pi/x^2$. If we look at the integral from 1 to a
fixed R, we get
$\int_1^R \frac{\pi}{x^2} \, dx = -\frac{\pi}{x} \Big|_1^R = -\pi/R + \pi$.
This converges for $R \to \infty$. The volume is $\pi$. This famous solid is called Gabriel's
trumpet. This solid is so prominent because if you look at the surface area of the small
slice, then it is larger than $dx \, 2\pi/x$. The total surface area of the trumpet from 1 to R is
therefore larger than $\int_1^R 2\pi/x \, dx = 2\pi(\log(R) - \log(1))$, which goes to infinity. We can
fill the trumpet with a finite amount of paint but we can not paint its surface.
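To see how differently the two quantities of the trumpet behave, one can tabulate the volume up to R and the lower bound for the surface area up to R. This Python sketch only evaluates the two formulas from the example; the names `trumpet_volume` and `surface_lower_bound` are made up for the illustration.

```python
import math

def trumpet_volume(R):
    # integral of pi/x^2 from 1 to R
    return math.pi * (1 - 1 / R)

def surface_lower_bound(R):
    # integral of 2*pi/x from 1 to R, a lower bound for the surface area
    return 2 * math.pi * math.log(R)

for R in (10.0, 1e3, 1e6, 1e9):
    print(R, trumpet_volume(R), surface_lower_bound(R))
```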
7 Finally, let's look at the following example $\int_0^\infty \sin(x) \, dx$. Solution. There is no problem at the boundary 0 nor
at any other point. We have to investigate however, what happens at $\infty$. Therefore, we
look at the integral $\int_0^b \sin(x) \, dx = -\cos(x) \Big|_0^b = 1 - \cos(b)$. We see that the limit $b \to \infty$
does not exist. The integral fluctuates between 0 and 2.
The next example leads to a topic in a follow-up course. It is not covered here, but could
make you curious:
8 What about the integral
$I = \int_0^\infty \frac{\sin(x)}{x} \, dx$ ?
Solution. The anti derivative is the Sine integral Si(x) so that we can write $\int_0^b \sin(x)/x \, dx = Si(b)$.
It turns out that the limit $b \to \infty$ exists and is equal to $\pi/2$ but this is a topic for a
second semester course like Math 1b. The integral can be written as an alternating series,
which converges and there are many ways to compute it (see footnote 1).
Let's summarize the two cases of improper integrals: infinitely long intervals and a point where
the function becomes infinite.
1) To investigate the improper integral $\int_a^\infty f(x) \, dx$ we look at the limit of $\int_a^b f(x) \, dx$
for $b \to \infty$.
2) To investigate the improper integral $\int_0^b f(x) \, dx$ where f(x) is not continuous at 0,
we take the limit of $\int_a^b f(x) \, dx$ for $a \to 0$.
Homework
2
1
5
x1
+ cos(x) dx.
2 Evaluate the following integrals
a) $\int_0^1 x/\sqrt{1 - x^2} \, dx$.
b) $\int_0^1 1/\sqrt{1 - x^2} \, dx$.
Hint: For a) think about the chain rule $d/dx \, f(g(x)) = f'(g(x)) g'(x)$.
4
3
(x
2
)
1/3
dx. To make sure that the integral is ne, check whether
0
3
and
4
0
work.
4 The integral $\int_{-2}^{1} 1/x \, dx$ does not exist. We can however take a positive b > 0 and look at
$\int_{-2}^{-b} 1/x \, dx + \int_b^1 1/x \, dx = (\log|-b| - \log|-2|) + (\log|1| - \log|b|) = -\log(2)$.
This value is called the Cauchy principal value of the integral. Find the principal value
of
$\int_{-4}^{5} 3/x^3 \, dx$
using the same process as before, by cutting out $[-a, a]$ and then taking the limit $a \to 0$.
5 Could we have given a principal value integral value to $\int_{-1}^{1} \frac{1}{x^2} \, dx$? If yes, find the value. If
not, tell why not.
1
Hardy, Mathematical Gazette, 5, 98-103, 1909.
Improper integrals
1 Find the value of the improper integral

1
1
x
11
dx
2 Find the following improper integral
1
0
1
1 x
.
3 We have met the Maria Agnesi function
$f(x) = \frac{1}{1 + x^2}$
early in the course already. Evaluate the integral
$I = \int_{-\infty}^{\infty} \frac{1}{1 + x^2} \, dx$.
The function $g(x) = \frac{1}{I} \frac{1}{1 + x^2}$ is a probability distribution called
Cauchy distribution. It is a nonzero function which has the
property that $\int_{-\infty}^{\infty} g(x) \, dx = 1$.
Lecture 24: Applications of integration
You have seen these integration applications:
the computation of area
the computation of volume
position from acceleration
cost from marginal cost
Here are some more:
probabilities and distributions
averages and expectations
finding moments of inertia
work from power
Probability
In probability theory functions are used as observables or to define probabilities.
Assuming our probability space to be the real line, an interval [a, b] is called
an event. Given a nonnegative function f(x) which has the property that
$\int_{-\infty}^{\infty} f(x) \, dx = 1$, we call
$P[A] = \int_a^b f(x) \, dx$
the probability of the event. The function f(x) is called the probability density
function.
The most famous and most important probability density is the normal distribution:
The normal distribution has the density
$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
It is the distribution which appears most often if data can take both positive and negative values.
The reason why it appears so often is that if one observes different unrelated quantities with the
same statistical properties, then their sum, suitably normalized, becomes the normal distribution.
If we measure errors for example, then these errors often have a normal distribution.
[Figure: graph of a density with the interval between a and b marked.]
1 The probability density function of the exponential distribution is defined as $f(x) = e^{-x}$
for $x \ge 0$ and f(x) = 0 for x < 0. It is used to measure lengths of arrival times like
the time until you get the next phone call. The density is zero for negative x because there
is no way we can travel back in time. What is the probability that you get a phone call
between times x = 1 and times x = 2 from now? The answer is $\int_1^2 f(x) \, dx$.
Assume f is a probability density function (PDF). The antiderivative $F(x) = \int_{-\infty}^x f(t) \, dt$
is called the cumulative distribution function (CDF).
2 For the exponential function the cumulative distribution function is
$\int_{-\infty}^x f(t) \, dt = \int_0^x f(t) \, dt = -e^{-t} \Big|_0^x = 1 - e^{-x}$.
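The phone call question can be answered with the cumulative distribution function just found. The Python sketch below is an illustration with invented names (`pdf`, `cdf`, `p_call`): it evaluates the probability of a call between x = 1 and x = 2 as a difference of two CDF values.

```python
import math

def pdf(x):
    # density of the exponential distribution: e^(-x) for x >= 0, else 0
    return math.exp(-x) if x >= 0 else 0.0

def cdf(x):
    # cumulative distribution function: 1 - e^(-x) for x >= 0, else 0
    return 1 - math.exp(-x) if x >= 0 else 0.0

# probability of a call between time 1 and time 2
p_call = cdf(2) - cdf(1)
print(p_call)
```

The value is $e^{-1} - e^{-2}$, a bit less than a quarter.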
The probability density function $f(x) = \frac{1}{\pi} \frac{1}{1 + x^2}$ is called the Cauchy distribution.
3 Find its cumulative distribution function. Solution:
$F(x) = \int_{-\infty}^x f(t) \, dt = \frac{1}{\pi} \arctan(t) \Big|_{-\infty}^x = \frac{1}{\pi} \arctan(x) + \frac{1}{2}$.
Average
Here is an example for computing the average.
4 Assume the level in a honey jar over $[0, 2\pi]$ containing crystallized honey is given by a
function $f(x) = 3 + \sin(3x)/5 + x(2\pi - x)/10$. In order to restore the honey, it is placed
into hot water. The honey melts to its normal state. What height does it have? Solution:
The average height is $\int_0^{2\pi} f(x) \, dx/(2\pi)$ which is the area divided by the base length.
In probability theory we would call f(x) a random variable and denote the average of f by
E[f], the expectation.
Moment of inertia
If we spin a wire of length L of mass density f(x) around an axes, the moment of
inertia is defined as $I = \int_0^L x^2 f(x) \, dx$.
The significance is that if we spin it with angular velocity w, then the energy is $I w^2/2$.
5 Assume a wire has density 1 + x and length 3. Find its moment of inertia. Solution:
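The handout leaves the solution of this example open. As a sketch of how one could check an answer, the following Python code integrates $x^2 (1 + x)$ from 0 to 3 numerically; Simpson's rule and the names `simpson`, `density` and `inertia` are assumptions of this illustration, not part of the original notes.

```python
def simpson(f, a, b, n=1000):
    """Approximate the integral of f from a to b with Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def density(x):
    # mass density of the wire in the example
    return 1 + x

# moment of inertia: integral of x^2 * density(x) over the wire [0, 3]
inertia = simpson(lambda x: x * x * density(x), 0, 3)
print(inertia)
```

By hand, the anti-derivative $x^3/3 + x^4/4$ evaluated at 3 gives $9 + 81/4 = 29.25$, which the program should reproduce.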
6 Flywheels have a comeback for powerplants to absorb energy. If there is not enough
power, the flywheels are charged, in peak times, the energy is recovered. They work with
80 percent efficiency. Assume a flywheel is a cylinder of radius 1, density 1 and height 1,
then the moment of inertia integral is $\int_0^1 z^2 f(z) \, dz$, where f(z) is the mass in distance
z.
Work from power
If P(t) is the amount of power produced at time t, then $\int_0^T P(t) \, dt$ is the work=energy
produced in the time interval [0, T].
Energy is the anti-derivative of power.
7 Assume a power plant produces power $P(t) = 1000 + \exp(t) + t^2 - t$. What is the energy
produced from t = 1 to t = 10? Solution.
Wouldn't it be nice to have one of those bikes
with interactive training environments in the
gym, allowing to ride in the Peruvian or Swiss
Mountains, the California coast or in the Italian
Tuscany?
Additionally, there should be some computer game
features, racing other riders through beaches,
deserts or Texan highways (could be on google
earth). Training would be so much more entertaining.
Business opportunities everywhere. The
first offering such training equipment will make a
fortune. Until then we are stuck with TV programs
which really suck.
Homework
1 The probability distribution which describes the time you have to wait for your next email
is $f(x) = 3e^{-3x}$. What is the probability that you get your next email in the next 2 hours,
that is between x = 0 and x = 2?
2 Assume the probability distribution for the waiting time to the next warm day is $f(x) = (1/4)e^{-x/4}$,
where x has days as unit. What is the probability to get a warm day between
tomorrow and the day after tomorrow, that is between x = 1 and x = 2?
3 A rod modeled over the interval [0, 4] has temperature $f(x) = 5 + x^2 - 3x$ at position x.
Find the average temperature.
4 A CD Rom has radius 6. If we would place the material at radius x onto one point, we
get a density of $f(x) = 2\pi x$. Find the moment of inertia I of the disc. If we spin it with
an angular velocity of w = 20 rounds per second, find the energy $E = I w^2/2$.
Without credit: Explode a CD: http://www.powerlabs.org/cdexplode.htm. Careful!
5 a) You are on a stationary bike in the Hemenway gym and pedal with power
$P(t) = 200 + 100 \sin(10t) - \frac{t}{300} + \frac{t^2}{19440}$
(in Watts=W). The periodic fluctuations come from a hilly route. The linear term is the
tiring effect and the quadratic term is due to endorphins kicking in eventually. What
energy (Joules J=W s) have you produced in the time $t \in [0, 1800]$ (s=seconds)?
b) Since we do math not physics, we usually ignore all the units but this one is just too
much fun. If you divide the result by 4184, you get kilo calories = food calories. Eating
an apple gives you about 80 food calories. How many apples can you eat after your half
hour workout, just to get even?
Applications of integration
1 Find the cumulative distribution function
$F(x) = \int_{-\infty}^x f(t) \, dt$
of the exponential distribution in the case $f(x) = 2 \exp(-2x)$.
2 Find the moment of inertia $\int_0^L x^2 f(x) \, dx$ of a rod which has density f(x) = x
and length 10.
3 A light bulb produces 100W. How much energy in kWh
does it use in 1 year? Assume you pay 10 cents for each kWh.
How much does it cost?
Lecture 25: Related rates
Before we continue with integration, we include a short flash-back on differentiation. This allows
us to solidify the chain rule
$\frac{d}{dx} f(g(x)) = f'(g(x)) g'(x)$
which will be very useful for the integration technique called substitution. Since the chain rule
is often perceived as a difficult concept in calculus, it is good to come back to it again. We take
the opportunity also to review a bit our differentiation skills and to take some fresh breath before
launching into more advanced integration techniques.
1 Assume we inflate a balloon and pump 5 volume units per unit time into it. If the balloon
has radius 7, what is the rate of change of the radius? Solution. Let V(r) be the volume
and r(t) the radius at time t. Since $V(r(t)) = 4\pi r(t)^3/3$, we have by the chain rule
$5 = d/dt \, V(r(t)) = 4\pi r(t)^2 r'(t)$.
This relation allows us to compute $r'(t) = 5/(4\pi r^2) = 5/(4\pi 7^2)$.
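The balloon computation can be double checked without doing the differentiation by hand. In the Python sketch below the derivative of the volume with respect to the radius is approximated by a central difference; the names `volume`, `dV_dr`, `pump_rate`, `radius` and `radius_rate` are chosen for this illustration only.

```python
import math

def volume(r):
    # volume of a ball of radius r
    return 4 * math.pi * r**3 / 3

def dV_dr(r, h=1e-4):
    """Central difference approximation of V'(r)."""
    return (volume(r + h) - volume(r - h)) / (2 * h)

pump_rate = 5.0   # dV/dt given in the example
radius = 7.0      # radius at the moment of interest

# chain rule: dV/dt = V'(r) * r'(t), so r'(t) = (dV/dt) / V'(r)
radius_rate = pump_rate / dV_dr(radius)
print(radius_rate, 5 / (4 * math.pi * 7**2))
```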
2 Hydrophilic water gel spheres made from polyacrylamide polymer can expand 300
times their original size as you see in class. Assume they have initially a diameter of 1
(cm) and that they expand in 10 hours to 300 times their volume. Find the rate of change of
the radius in time when they have a volume of 100 (cm$^3$). Solution. We have the same
rule $V = 4\pi r^3/3$. The problem gives us d/dt V(r(t)) = 300/10 = 30. The rest is now the
same as in the previous problem: $30 = 4\pi r^2 r'$. Since V = 100 means $r = (75/\pi)^{1/3}$, we get
$r' = 30/(4\pi r^2) = 30/(4\pi (75/\pi)^{2/3})$.
3 The upper part of a wine glass has a shape $y = x^2$ with $0 \le y \le 2$. We assume the glass
is half full, meaning that the wine level is at y = 1. We taste the wine with 1 ml/sec using
a straw, ignoring any political and behavioral correctness. How fast does the wine level
sink at that moment?
Solution: The area of the wine layer at height y is $A(y) = \pi x^2 = \pi y$. The volume is
$V(y) = \int_0^y \pi s \, ds = \pi y^2/2$. We know
$-1 = d/dt \, V(y(t)) = V'(y) y'(t) = \pi y y'(t)$
so that $y'(t) = -1/(\pi y)$ and for y = 1 this is $-1/\pi$.
4 A person of height 6 feet is located at x = 6 and walks with constant speed 1. A lamp
at x = 0 is at height 10 feet. With what speed does the shadow of the person proceed
on the floor? Solution: If the person is at position x, the shadow's length L satisfies
L/6 = (L + x)/10, which for x = 6 gives L = 9. The relation L/6 = (L + x)/10 means L = 3x/2 so
that $L' = 3x'/2 = 3/2$.
5 Romeo and Juliet have met secretly at position (0, 0) and rush home. Romeo runs
with speed 4 meters/seconds to the east. Assume their distance satisfies $l(t) = t^3$. After
10 seconds, they wave back to each other. With what speed does Juliet run at this time?
Solution. What do we know? x(t) = 4t is the position of Romeo and $l(t) = t^3$. If y(t) is
the y position of Juliet, the law we use is Pythagoras $l^2 = x^2 + y^2$ so that $y(t) = \sqrt{l^2 - x^2}$
and $y(10) = \sqrt{1000^2 - 40^2} = \sqrt{998400} \approx 999.2$. Now differentiate the law to get $2 l l' = 2 x x' + 2 y y'$.
We know all quantities at time t = 10: we know l = 1000, $l' = 300$, x = 40, $x' = 4$, $y \approx 999.2$
and compute $y' = (2000 \cdot 300 - 80 \cdot 4)/(2y) = 299840/\sqrt{998400} \approx 300.08$.
We have seen the ladder example twice already:
6 A ladder has length 1. Assume it slips on the ground away with constant speed 2 in the
x-direction. What is the speed of the top part of the ladder sliding down the wall at
the time when x = y? Solution: We know $x'(t) = 2$ and that x(t), y(t) are related
by $x^2(t) + y^2(t) = 1$. Differentiation gives $2x(t)x'(t) + 2y(t)y'(t) = 0$. We get
$y'(t) = -x'(t)x(t)/y(t) = -2 \cdot 1 = -2$.
7 A kid slides down a slide of the shape y = 2/x. Assume at height y = 2 we have
$dy/dt = -7$. What is dx/dt? Solution: differentiate the relation to get $y' = -2x'/x^2$. At
y = 2 we have x = 1. Now solve for $x'$ to get $x' = -y' x^2/2 = 7/2$.
Image source: http://www.dmfco.com
8 A canister of oil releases oil at a constant rate 5. With what rate does the radius of
the oil spill increase, when the radius is 1? Solution. We have $A(r) = \pi r^2$ and so
$5 = A'(r) = 2\pi r r'$. Solving for $r'$ gives $r' = 5/(2\pi r)$ which is $5/(2\pi)$.
Related rates problems link quantities by a rule. These quantities can depend on
time. To solve a related rates problem, differentiate the rule with respect to time
and solve for the unknown quantity.
Related rates problems are not so easy. The difficulty comes from the fact that they
are often word problems which first have to be parsed. We have to find the rule and
differentiate it. In all the problems on this handout, the rule is boxed. It is important
to understand which variables depend on time. If a term $x^3$ appears for example and x
depends on time, then $d/dt \, x^3 = 3x^2 x'$.
Homework
1 The ideal gas law pV = T relates pressure p and volume
V and temperature T. Assume the temperature
T = 50 is fixed and the volume is at V = 2 and decreased
by $V' = -3$. Find the rate $p'$ with which the
pressure increases.
2 Assume the total production rate P of a new tablet
computer product for kids is constant 100 and given
by the famous Cobb-Douglas formula $P = L^{1/3} K^{2/3}$
where L = 64 is the labor and K = 125 is the cost.
Assume labor is increased at a rate $L' = 2$. What is the
cost change $K'$?
3 You observe an airplane at height h = 10000 meters
directly above you and see that it moves with rate $\phi' = 5$
degrees per second (which is $5\pi/180$ radians per second).
What is the speed $x'$ of the airplane directly above you
where x = 0? Hint: Use $\tan(\phi) = x/h$ and make a
picture to figure out what $\phi$ is.
4 An isosceles triangle with base 2a and height h has
fixed area A = ah = 1. Assume the height is decreased
by a rate $h' = -2$. With what rate does a increase if
h = 1/2?
5 There are cosmological models which see our universe
as a four dimensional sphere which expands in space
time. Assume the volume $V = \pi^2 r^4/2$ increases at a
rate $d/dt \, V(r(t)) = 100 \pi^2 r^2$. What is $r'$ if the current
radius is r = 47 (billion light years).
Related rates
1 An underwater oil spill releases oil at a constant rate. The area A(r) of the oil increases with $A'(r(t)) = 2$. If the radius is r = 4, what is the rate of change of r?
2 The resistance R, voltage U and current I are related by U = RI. Assume the temperature increases and the resistance R(t) increases at the constant rate R' = 2. If the voltage stays constant U = 4, what is the rate of change of I?
Lecture 26: Implicit differentiation
We have seen an implicit differentiation example in the Valentine's day lecture and will repeat this topic more. Implicit differentiation is also crucial to find the derivative of inverse functions. We will review this here because this will give us handy tools for integration.
The chain rule, related rates and implicit differentiation are all the same concept, but viewed from different angles. You can see implicit differentiation as a special case of related rates where one of the quantities is time, meaning that this is the variable with respect to which we differentiate.
1 Points (x, y) in the plane which satisfy $x^2 + 9y^2 = 10$ form an ellipse. Find the slope of the tangent line at the point (1, 1).
Solution: We want to know the derivative dy/dx. We have $2x + 18yy' = 0$. Using x = 1, y = 1 we see $y' = -2x/(18y) = -1/9$.
Remark. We could have looked at this as a related rates problem where x(t), y(t) are related and x' = 1. Now $2xx' + 9 \cdot 2yy' = 0$ allows to solve for $y' = -2xx'/(18y) = -1/9$.
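Differentiating a relation F(x, y) = 0 with respect to x gives $F_x + F_y y' = 0$, so the slope is $y' = -F_x/F_y$. The sketch below estimates the two partial derivatives with difference quotients and applies this to the ellipse of the example. The function names are hypothetical and the step size h is an arbitrary choice.

```python
def implicit_slope(F, x, y, h=1e-6):
    """Slope dy/dx of the curve F(x, y) = 0 at (x, y), using y' = -F_x / F_y.

    The partial derivatives are estimated with central differences,
    so the result is an approximation.
    """
    F_x = (F(x + h, y) - F(x - h, y)) / (2 * h)
    F_y = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return -F_x / F_y

def ellipse(x, y):
    # The ellipse x^2 + 9 y^2 = 10 written in the form F(x, y) = 0
    return x**2 + 9 * y**2 - 10

if __name__ == "__main__":
    print(implicit_slope(ellipse, 1.0, 1.0))
```

The same helper works for any curve given by a relation, as long as $F_y$ is not zero at the point.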
2 The points (x, y) which satisfy the equation $x^4 - 3x^2 + y^2 = 0$ form a figure 8 called lemniscate of Gerono. It contains the point $(1, \sqrt{2})$. Find the slope of the curve at that point. Solution: We differentiate the law describing the curve with respect to x. This gives
$4x^3 - 6x + 2yy' = 0$.
We can now solve for $y' = (6x - 4x^3)/(2y) = 1/\sqrt{2}$.
3 The Valentine equation $(x^2 + y^2 - 1)^3 - x^2 y^3 = 0$ contains the point (1, 1). Near (1, 1), we have y = y(x) so that $(x^2 + y(x)^2 - 1)^3 - x^2 y(x)^3 = 0$. Find y' at the point x = 1.
Solution: Take the derivative
$0 = 3(x^2 + y^2 - 1)^2 (2x + 2yy') - 2xy^3 - x^2 3y^2 y'$
and solve for
$y' = -\frac{3(x^2 + y^2 - 1)^2 2x - 2xy^3}{3(x^2 + y^2 - 1)^2 2y - 3x^2 y^2}$.
For x = 1, y = 1, we get $-4/3$.
4 The energy of a pendulum with angle x and angular velocity y is
$y^2 - \cos(x) = 1$
and is constant. What is y'? We could solve for y and then differentiate. Simpler is to differentiate directly and get $2yy' + \sin(x) = 0$ so that $y' = -\sin(x)/(2y)$. At the point $(\pi/2, 1)$ for example we have $y' = -1/2$.
What is the difference between related rates and implicit differentiation?
Implicit differentiation is the special case of related rates where one of the variables is time.
Derivatives of inverse functions
Implicit differentiation has an important application: it allows to compute the derivatives of inverse functions. It is good that we review this, because we can use these derivatives to find anti-derivatives.
5 Find the derivative of log(x) by differentiating exp(log(x)) = x.
Solution:
$1 = \frac{d}{dx} x = \frac{d}{dx} \exp(\log(x)) = \exp(\log(x)) \frac{d}{dx} \log(x) = x \log'(x)$.
Solve for $\log'(x) = 1/x$. Since the derivative of log(x) is 1/x, the anti-derivative of 1/x is log(x) + C.
6 Find the derivative of arccos(x) by differentiating cos(arccos(x)) = x.
Solution:
$1 = \frac{d}{dx} x = \frac{d}{dx} \cos(\arccos(x)) = -\sin(\arccos(x)) \arccos'(x) = -\sqrt{1 - \cos^2(\arccos(x))} \arccos'(x) = -\sqrt{1 - x^2} \arccos'(x)$.
Solving gives $\arccos'(x) = -1/\sqrt{1 - x^2}$. The anti-derivative of $-1/\sqrt{1 - x^2}$ is therefore arccos(x).
7 Find the derivative of arctan(x) by differentiating tan(arctan(x)) = x.
Solution: This is a derivative which we have seen several times by now. We use the identity $1/\cos^2(x) = \tan^2(x) + 1$ to get
$1 = \frac{d}{dx} x = \frac{d}{dx} \tan(\arctan(x)) = \frac{1}{\cos^2(\arctan(x))} \arctan'(x) = (1 + \tan^2(\arctan(x))) \arctan'(x) = (1 + x^2) \arctan'(x)$.
Solve for $\arctan'(x) = 1/(1 + x^2)$. The anti-derivative of $1/(1 + x^2)$ is therefore arctan(x).
8 Find the derivative of $f(x) = \sqrt{x}$ by differentiating $(\sqrt{x})^2 = x$.
Solution:
$1 = \frac{d}{dx} x = \frac{d}{dx} (\sqrt{x})^2 = 2\sqrt{x} f'(x)$
so that $f'(x) = 1/(2\sqrt{x})$.
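All four examples follow the same pattern: if g is the inverse of f, then differentiating f(g(x)) = x gives $g'(x) = 1/f'(g(x))$. A minimal sketch of that pattern follows; the helper names are hypothetical and chosen only for this illustration.

```python
import math

def inverse_derivative(f_prime, f_inverse, x):
    """Derivative of the inverse g of f at x, from f(g(x)) = x: g'(x) = 1 / f'(g(x))."""
    return 1.0 / f_prime(f_inverse(x))

def log_prime(x):
    # f = exp, f' = exp, inverse g = log
    return inverse_derivative(math.exp, math.log, x)

def arctan_prime(x):
    # f = tan, f'(u) = 1 + tan(u)^2, inverse g = arctan
    return inverse_derivative(lambda u: 1 + math.tan(u) ** 2, math.atan, x)

def arccos_prime(x):
    # f = cos, f' = -sin, inverse g = arccos
    return inverse_derivative(lambda u: -math.sin(u), math.acos, x)

if __name__ == "__main__":
    print(log_prime(2.0), arctan_prime(2.0), arccos_prime(0.5))
```

The printed values can be compared with 1/x, $1/(1 + x^2)$ and $-1/\sqrt{1 - x^2}$ at the same points.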
Homework
1 The equation $y^2 = x^2 - x$ defines the graph of the function $f(x) = \sqrt{x^2 - x}$. Find the slope of the graph at x = 2 directly by differentiating f. Then use the implicit differentiation method and differentiate $y^2 = x^2 - x$ assuming y(x) is a function of x and solving for y'.
2 The equation $x^2 + y^2 = 5$ defines a circle. Find the slope of the tangent at (1, 2).
3 The equation $x^{100} + y^{100} = 1 + 2^{100}$ defines a curve which looks close to a square. Find the slope of the curve at (2, 1).
4 Derive again the derivative of arccot(x) as we did before in this course and also during the first midterm.
5 a) The relation $\sin(x - y) - 2\cos((\pi/2)xy) = 0$ relates x(t) and y(t). Assume x' = 1 at (1, 1). What is y'? This is a related rates problem.
b) Now do it directly. Since x' = 1 we can use x as the variable. Find y'(x) by implicit differentiation. You should get the same result as in a).
Lecture 26: April First Worksheet
Implicit differentiation
1 Find the slope y'(x) if $2x^3 - y^3 = y$ at the point (1, 1).
2 Find the derivative of $y(x) = x^{1/5}$ by differentiating $y^5 = x$.
3 The equation y = x relates two quantities and defines y in terms of x. Assume x = 2, find $\frac{d}{dx} y$.
Hint. This problem tries hard to be an April first joke but does not quite succeed.
Lecture 27: Review for second midterm
Major points
The mean value theorem assures that there is $x \in (a, b)$ with $f'(x) = (f(b) - f(a))/(b - a)$. A special case is Rolle's theorem, where f(b) = f(a).
Catastrophes are parameter values where a local minimum disappears. Typically the system jumps then to a lower minimum.
Definite integrals $F(x) = \int_0^x f(x)\,dx$ are defined as a limit of Riemann sums $S_n/n$.
A function F(x) satisfying F' = f is called the anti-derivative of f. The general anti-derivative is F + C where C is a constant.
The fundamental theorem of calculus tells $\frac{d}{dx} \int_0^x f(x)\,dx = f(x)$ and $\int_0^x f'(x)\,dx = f(x) - f(0)$.
The integral $\int_a^b g(x) - f(x)\,dx$ is the signed area between the graphs of f and g. Places where g < f are counted negative.
The integral $\int_a^b A(x)\,dx$ is a volume if A(x) is the area of a slice of the solid perpendicular to an axis at the point x.
Write improper integrals as limits of definite integrals $\int_1^\infty f(x)\,dx = \lim_{R \to \infty} \int_1^R f(x)\,dx$. We similarly treat points where f is discontinuous.
Besides area, volume, total cost, or position, we can compute averages, inertia or work using integrals.
If x, y are related by F(x(t), y(t)) = 0 and x(t) is known we can compute y'(t) using the chain rule. This is related rates.
If f(g(x)) is known we can compute g'(x) using the chain rule. This works for inverse functions. This is implicit differentiation.
To determine the catastrophes for a family $f_c(x)$ of functions, determine the critical points in dependence of c and find values c, where a critical point changes from a local minimum to a local maximum.
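Important integrals
function                  anti-derivative
$\cos(x)$                 $\sin(x)$
$\sin(x)$                 $-\cos(x)$
$1/\cos^2(x)$             $\tan(x)$
$1/(1 + x^2)$             $\arctan(x)$
$1/\sqrt{1 - x^2}$        $\arcsin(x)$
$\exp(x)$                 $\exp(x)$
$\log(x)$                 $x\log(x) - x$
$1/x$                     $\log(x)$
$-1/(1 + x^2)$            $\mathrm{arccot}(x)$
$-1/\sqrt{1 - x^2}$       $\arccos(x)$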
Improper integrals
$\int_1^\infty 1/x^2\,dx$ Prototype of first type improper integral which exists.
$\int_1^\infty 1/x\,dx$ Prototype of first type improper integral which does not exist.
$\int_0^1 1/x\,dx$ Prototype of second type improper integral which does not exist.
$\int_0^1 1/\sqrt{x}\,dx$ Prototype of second type improper integral which does exist.
The fundamental theorem
$\frac{d}{dx} \int_0^x f(x)\,dx = f(x)$
$\int_0^x f'(x)\,dx = f(x) - f(0)$.
This implies
$\int_a^b f'(x)\,dx = f(b) - f(a)$.
Without limits of integration, we call $\int f(x)\,dx$ the anti derivative. It is defined up to a constant.
For example $\int \sin(x)\,dx = -\cos(x) + C$.
Applications
Calculus applies directly if there are situations where one quantity is the derivative of the other.
function                        anti derivative
acceleration                    velocity
velocity                        position
function                        area under the graph
length of cross section         area of region
area of cross section           volume of solid
marginal price                  total price
power                           work
probability density function    cumulative distribution function
Tricks
Whenever dealing with an area or volume computation, make a picture.
In related rates problems, make sure you understand what are variables and what are constants.
For volume computations, find the area of the cross section A(x) and integrate.
For area computations find the length of the slice f(x) and integrate.
Lecture 27: Review Problems
Definite integral
1 The following integral defines the area of a region. Draw it:

/2
x sin(x) dx .
Catastrophes
2 Let's look at the family of functions $f_c(x) = x^5 + c x^3$. You see three graphs. They display the function for c = −1, c = 0 and c = 1. What can you say about catastrophes?
[Figure: three graphs of $f_c$ on the interval from −1 to 1]
Area
3 Find the area of the region bound by y = 2 x, x = y, y = 0
and y = 1.
Volumes
4 If we rotate the witch of Agnesi $y = (1 + x^2)^{-1}$ around the x axis, we obtain a solid. Find its volume. Hint. To find the integral, compute the derivative of $x/(1 + x^2)$ and get inspired.
Related Rates
5 The curve $x^2 - y^2 = 3y$ is an example of a hyperbola. If x(t) = 2 + t, find the related rate y' near (2, 1).
[Figure: the curve $x^2 - y^2 = 3y$ for x and y between −6 and 6]
Lecture 28: Substitution
If we differentiate the function $\sin(x^2)$ and use the chain rule, we get $\cos(x^2) 2x$. By the fundamental theorem of calculus, the anti derivative of $\cos(x^2) 2x$ is $\sin(x^2)$. We know therefore
$\int \cos(x^2) 2x\,dx = \sin(x^2) + C$.
Spotting the chain rule
How can we see the integral without knowing the result already? Here is a very important case:
If we can spot that $f(x) = g(u(x)) u'(x)$, then the anti derivative of f is G(u(x)) where G is the anti derivative of g.
1 Find the anti derivative of
$e^{x^4 + x^2} (4x^3 + 2x)$.
Solution: The derivative of the inner function is to the right. The anti derivative is therefore $e^{x^4 + x^2} + C$.
2 Find
$\int \sqrt{x^5 + 1}\, x^4\,dx$.
Solution. The derivative of $x^5 + 1$ is $5x^4$. This is almost what we have there but the constant can be adapted. Since the anti derivative of $\sqrt{u}$ is $(2/3) u^{3/2}$, the answer is $(2/15)(x^5 + 1)^{3/2}$.
3 Find the anti derivative of
$\log(x)/x$.
Solution: The derivative of log(x) is 1/x. The antiderivative is $\log(x)^2/2$.
4 Find the anti derivative of
$\cos(\sin(x^2)) \cos(x^2) 2x$.
Solution. We see the derivative of $\sin(x^2)$ appear on the right. Therefore, we have $\sin(\sin(x^2))$.
In the next examples, substitution is actually not necessary. You can just write down the anti derivative, and adjust the constant. It uses the following speedy rule:
$\int f(ax + b)\,dx = F(ax + b)/a$ where F is the anti derivative of f.
5 $\int \sqrt{x + 1}\,dx$. Solution: $(x + 1)^{3/2} (2/3)$.
6 $\int \frac{1}{1 + (5x + 2)^2}\,dx$. Solution: $\arctan(5x + 2)(1/5)$.
Doing substitution
Spotting things is sometimes not easy. The method of substitution helps to formalize this. To do so, identify a part of the formula to integrate and call it u, then replace an occurrence of u' dx with du.
$\int f(u(x))\, u'(x)\,dx = \int f(u)\,du$.
Here is a more detailed description: replace a prominent part of the function with a new variable u, then use du = u'(x) dx to replace dx with du/u'. We aim to end up with an integral $\int g(u)\,du$ which does not involve x anymore. Finally, after integration of this integral, replace the variable u again with the function u(x). The last step is called back-substitution.
7 Find the anti-derivative
$\int \log(\log(x))/x\,dx$.
Solution: Replace log(x) with u and replace u' dx = 1/x dx with du. This gives $\int \log(u)\,du = u \log(u) - u = \log(x) \log(\log(x)) - \log(x)$.
8 Solve the integral
$\int x/(1 + x^4)\,dx$.
Solution: Substitute $u = x^2$, du = 2x dx to get $(1/2) \int du/(1 + u^2) = (1/2) \arctan(u) = (1/2) \arctan(x^2)$.
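A result obtained by substitution can be sanity-checked by comparing a Riemann sum of the original integrand with the difference of the anti-derivative at the end points. The sketch below does this for $x/(1 + x^4)$ on [0, 1]; the function names, the interval and the number of subintervals are assumptions made for this illustration.

```python
import math

def midpoint_sum(f, a, b, n=10000):
    """Midpoint Riemann sum of f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def integrand(x):
    return x / (1 + x**4)

def antiderivative(x):
    # Result of the substitution u = x^2
    return 0.5 * math.atan(x**2)

if __name__ == "__main__":
    numeric = midpoint_sum(integrand, 0.0, 1.0)
    exact = antiderivative(1.0) - antiderivative(0.0)
    print(numeric, exact)
```

Both numbers should agree to many digits, and the exact value is $\arctan(1)/2 = \pi/8$.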
9 $\int \sin(\sqrt{x})/\sqrt{x}\,dx$.
Here are some examples which are not so straightforward:
10 $\int \sin^3(x)\,dx$.
Solution. We replace $\sin^2(x)$ with $1 - \cos^2(x)$ to get
$\int \sin^3(x)\,dx = \int \sin(x)(1 - \cos^2(x))\,dx = -\cos(x) + \cos^3(x)/3$.
11 $\int \frac{x^2 + 1}{\sqrt{x + 1}}\,dx$.
Solution: Substitute $u = \sqrt{x + 1}$. This gives $x = u^2 - 1$, dx = 2u du and we get $\int 2(u^2 - 1)^2 + 2\,du$.
12 $\int \frac{x^3}{\sqrt{x^2 + 1}}\,dx$.
One could try $u = \sqrt{x^2 + 1}$. We try instead $u = x^2 + 1$, then du = 2x dx and $dx = du/(2\sqrt{u - 1})$. Substitute this in to get
$\int \frac{\sqrt{u - 1}^3}{2\sqrt{u - 1}\sqrt{u}}\,du = \int \frac{u - 1}{2\sqrt{u}}\,du = \int u^{1/2}/2 - u^{-1/2}/2\,du = u^{3/2}/3 - u^{1/2} = \frac{(x^2 + 1)^{3/2}}{3} - (x^2 + 1)^{1/2}$.
Definite integrals
When doing definite integrals, we could find the antiderivative as described and then fill in the boundary points. Substituting the boundaries directly accelerates the process since we do not have to substitute back to the original variables:
$\int_a^b g(u(x)) u'(x)\,dx = \int_{u(a)}^{u(b)} g(u)\,du$.
Proof. This identity follows from the fact that the right hand side is G(u(b)) − G(u(a)) by the fundamental theorem of calculus. The integrand on the left has the anti derivative G(u(x)). Again by the fundamental theorem of calculus the integral leads to G(u(b)) − G(u(a)).
Tip: To keep track which bounds we consider it can help to write $\int_{x=a}^{x=b} f(x)\,dx$.
13 $\int_0^2 \sin(x^3 - 1) x^2\,dx$. Solution: Write the integral as
$\int_{x=0}^{x=2} \sin(x^3 - 1) x^2\,dx$.
Use $u = x^3 - 1$ and get $du = 3x^2 dx$. We get
$\int_{u=-1}^{u=7} \sin(u)\,du/3 = -(1/3) \cos(u)|_{-1}^{7} = [-\cos(7) + \cos(1)]/3$.
Also here, we can see the integrals directly:
To integrate f(Ax + B) from a to b we get [F(Ab + B) − F(Aa + B)]/A, where F is the anti-derivative of f.
14 $\int_0^1 \frac{1}{5x + 1}\,dx = [\log(u)]/5|_1^6 = \log(6)/5$.
15 $\int_3^5 \exp(4x - 10)\,dx = [\exp(10) - \exp(2)]/4$.
Homework
1 Find the following anti derivatives.
a) $\int 5x \sin(x^2)\,dx$
b) $\int e^{x^5 + x} (x^4 + 1/5)\,dx$
c) $\int \cos(\cos(x)) \sin(x)\,dx$
d) $\int e^{\tan(x)}/\cos^2(x)\,dx$.
2 Find the following denite integrals.
a)
5
3
x
5
+ x(x
4
+ 1/5) dx
b)
0
sin(x
2
)x dx.
c)
e
1/e
log(x)
x
dx.
d)
1
0
x/
1 + x
2
dx.
3 a) Find the integral
1
0
3x
1 x
4
dx using a substitution and interpreting the result
using a known area.
b) Find the moment of inertia of a rod with density f(x) =
x
3
+ 1 between x = 0
and x = 4.
4 a) Integrate
1
0
arcsin(x)
1 x
2
dx .
b) Find the denite integral
6e
e
dx
log(x)x
.
5 a) Find the indenite integral
x
5
x
2
+ 1
dx .
b) Find the anti-derivative of
f(x) =
1
x(1 + log(x)
2
)
.
Substitution
1
sin(2x + 3) dx
2
1
(x + 8)
5
dx
3
log(5x)
x
dx
4
x
2
+ 1
dx
5
e
x
(e
x
+ 5)
2
; dx
Here is a situation where substitution appears in an application. Let's look at a probability density function f. The integral
$m = \int_{-\infty}^{\infty} x f(x)\,dx$
is called the mean of the distribution.
6 Find the mean of the probability density function
$f(x) = \frac{1}{\sqrt{2\pi}} e^{-(x - 3)^2/2}$.
Lecture 29: Integration by parts
If we integrate the product rule $(uv)' = u'v + uv'$ we obtain an integration rule called integration by parts. It is a powerful tool, which complements substitution. As a rule of thumb, always try first to simplify a function and integrate directly, then give substitution a first shot before trying integration by parts.
$\int u(x) v'(x)\,dx = u(x)v(x) - \int u'(x)v(x)\,dx$.
1 Find $\int x \sin(x)\,dx$. Solution. Let's identify the part which we want to differentiate and call it u and the part to integrate and call it v'. The integration by parts method now proceeds by writing down uv and subtracting a new integral which integrates u'v:
$\int x \sin(x)\,dx = x (-\cos(x)) - \int 1 \cdot (-\cos(x))\,dx = -x\cos(x) + \sin(x) + C$.
2 Find $\int x e^x\,dx$. Solution.
$\int x \exp(x)\,dx = x \exp(x) - \int 1 \cdot \exp(x)\,dx = x\exp(x) - \exp(x) + C$.
3 Find $\int \log(x)\,dx$. Solution. There is only one function here, but we can look at it as $\log(x) \cdot 1$:
$\int \log(x) \cdot 1\,dx = \log(x) x - \int \frac{1}{x} x\,dx = x\log(x) - x + C$.
4 Find $\int x \log(x)\,dx$. Solution. Since we know from the previous problem how to integrate log we could proceed like this. We would get through but what if we do not know? Let's differentiate log(x) and integrate x:
$\int \log(x)\, x\,dx = \log(x) \frac{x^2}{2} - \int \frac{1}{x} \frac{x^2}{2}\,dx$
which is $\log(x) x^2/2 - x^2/4$.
We see that it is better to differentiate log first.
5 Merry go round: Find $I = \int \sin(x) \exp(x)\,dx$. Solution. Let's integrate exp(x) and differentiate sin(x).
$I = \sin(x) \exp(x) - \int \cos(x) \exp(x)\,dx$.
Let's do it again:
$I = \sin(x) \exp(x) - \cos(x) \exp(x) - \int \sin(x) \exp(x)\,dx$.
We moved in circles and are stuck! Are we really? We have derived an identity
$I = \sin(x) \exp(x) - \cos(x) \exp(x) - I$
which we can solve for I and get
$I = [\sin(x) \exp(x) - \cos(x) \exp(x)]/2$.
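A result found by going in circles deserves a check. One way is to differentiate the candidate numerically and compare with the integrand sin(x) exp(x). This is a minimal sketch with hypothetical function names; the difference quotient is only an approximation of the derivative.

```python
import math

def candidate_antiderivative(x):
    # I(x) = [sin(x) exp(x) - cos(x) exp(x)] / 2, obtained by solving the identity for I
    return (math.sin(x) * math.exp(x) - math.cos(x) * math.exp(x)) / 2

def integrand(x):
    return math.sin(x) * math.exp(x)

def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient, an approximation of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.5, 2.0):
        print(x, central_difference(candidate_antiderivative, x), integrand(x))
```

At each sample point the two printed values should agree up to the small error of the difference quotient.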
Tic-Tac-Toe
Integration by parts can bog you down if you do it several times. Keeping the order of the signs can be daunting. This is why a tabular integration by parts method is so powerful. It has been called Tic-Tac-Toe in the movie Stand and deliver. Let's call it Tic-Tac-Toe therefore.
6 Find the anti-derivative of $x^2 \sin(x)$. Solution:
derivatives     integrals
$x^2$           $\sin(x)$
$2x$            $-\cos(x)$
$2$             $-\sin(x)$
$0$             $\cos(x)$
The antiderivative is
$-x^2 \cos(x) + 2x\sin(x) + 2\cos(x) + C$.
7 Find the anti-derivative of $(x - 1)^3 e^{2x}$. Solution:
derivatives     integrals
$(x - 1)^3$     $\exp(2x)$
$3(x - 1)^2$    $\exp(2x)/2$
$6(x - 1)$      $\exp(2x)/4$
$6$             $\exp(2x)/8$
$0$             $\exp(2x)/16$
The anti-derivative is
$(x - 1)^3 e^{2x}/2 - 3(x - 1)^2 e^{2x}/4 + 6(x - 1) e^{2x}/8 - 6 e^{2x}/16 + C$.
8 Find the anti-derivative of $x^2 \cos(x)$. Solution:
derivatives     integrals
$x^2$           $\cos(x)$
$2x$            $\sin(x)$
$2$             $-\cos(x)$
$0$             $-\sin(x)$
The anti-derivative is
$x^2 \sin(x) + 2x\cos(x) - 2\sin(x) + C$.
Ok, we are now ready for more extreme stuff.
9 Find the anti-derivative of $x^7 \cos(x)$. Solution:
derivatives     integrals
$x^7$           $\cos(x)$
$7x^6$          $\sin(x)$
$42x^5$         $-\cos(x)$
$210x^4$        $-\sin(x)$
$840x^3$        $\cos(x)$
$2520x^2$       $\sin(x)$
$5040x$         $-\cos(x)$
$5040$          $-\sin(x)$
$0$             $\cos(x)$
The anti-derivative is
$F(x) = x^7 \sin(x) + 7x^6 \cos(x) - 42x^5 \sin(x) - 210x^4 \cos(x) + 840x^3 \sin(x) + 2520x^2 \cos(x) - 5040x\sin(x) - 5040\cos(x) + C$.
Do this without this method and you see the value of the method.
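The table can also be produced by a small program. The sketch below handles a polynomial times cos(x): the left column is differentiated until it reaches 0, the right column cycles through the successive anti-derivatives of cos, and the rows are added with alternating signs. All names are hypothetical, and the polynomial is given as a coefficient list, which is an assumption of this sketch.

```python
import math

def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k)."""
    return sum(c * x**k for k, c in enumerate(coeffs))

def poly_derivative(coeffs):
    """Coefficients of the derivative; a constant differentiates to the empty list."""
    return [k * c for k, c in enumerate(coeffs)][1:]

# Successive anti-derivatives of cos(x): sin, -cos, -sin, cos, then the cycle repeats.
COS_INTEGRALS = [
    math.sin,
    lambda x: -math.cos(x),
    lambda x: -math.sin(x),
    math.cos,
]

def tabular_poly_times_cos(coeffs, x):
    """Value at x of an anti-derivative of p(x) cos(x), with p given by coeffs."""
    total, sign, row = 0.0, 1.0, 0
    while coeffs:                       # stop once the derivative column reaches 0
        total += sign * poly_eval(coeffs, x) * COS_INTEGRALS[row % 4](x)
        coeffs = poly_derivative(coeffs)
        sign, row = -sign, row + 1
    return total

if __name__ == "__main__":
    x7 = [0, 0, 0, 0, 0, 0, 0, 1]       # coefficients of x^7
    print(tabular_poly_times_cos(x7, 1.3))
```

For the coefficient list of $x^7$ the function reproduces the eight terms of F(x) above, evaluated at the given point.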
I myself learned the method from the movie Stand and Deliver, where Jaime Escalante of the Garfield High School in LA uses the method. It can be traced down to an article of V.N. Murty [1], [2], [3]. The method realizes in a clever way an iterated integration by parts method:
$\int f g\,dx = f g^{(-1)} - f^{(1)} g^{(-2)} + f^{(2)} g^{(-3)} - \dots - (-1)^n \int f^{(n+1)} g^{(-n-1)}\,dx$
where $f^{(k)}$ is the k-th derivative of f and $g^{(-k)}$ is the k-th anti-derivative of g. The formula can easily be shown to be true by induction and justifies the method: the f function is differentiated again and again and the g function is integrated again and again. You see where the alternating minus signs come from. You see that we always pair a k-th derivative with a (k+1)-th integral and take the sign $(-1)^k$.
[1] V.N. Murty, Integration by parts, Two-Year College Mathematics Journal 11, 1980, pages 90-94.
[2] David Horowitz, Tabular Integration by Parts, College Mathematics Journal 21, 1990, pages 307-311.
[3] K.W. Folley, Integration by parts, American Mathematical Monthly 54, 1947, pages 542-543.
Coffee or Tea?
When doing integration by parts, we want to try first to differentiate Logs, Inverse trig functions, Powers, Trig functions and Exponentials. This can be remembered as LIPTE which is close to "lipton" (the tea).
For coffee lovers, there is an equivalent one: Logs, Inverse trig functions, Algebraic functions, Trig functions and Exponentials which can be remembered as LIATE which is close to "latte" (the coffee).
Whether you prefer to remember it as a coffee latte or a lipton tea is up to you.
There is even a better method, the opportunistic method:
Just integrate what you can integrate and differentiate the rest.
And don't forget to consider integrating 1, if nothing else works.
Homework
1 Integrate $\int x^2 \log(x)\,dx$.
2 Integrate $\int x^5 \sin(x)\,dx$.
3 Integrate $\int x^6 \exp(x)\,dx$. (*)

xlog(x) dx.
sin(x) exp(x) dx.

(*) If you want to go for the record: let's see who can integrate the largest $x^n \exp(x)$! It has to be done by hand, though, not with a computer algebra system.
Integration by parts
1 Find the anti-derivative of log(2x)
x:
2 Stand and deliver!
Find the anti-derivative of $x^3 \sin(2x)$:
$x^3$           $\sin(2x)$
Lecture 30: Numerical integration
Before covering two more integration techniques, we briefly look at some numerical techniques. There are variations of basic Riemann sums which speed up the computation.
Riemann sum with nonuniform spacing
A more general Riemann sum is obtained by choosing n points in [a, b] and defining
$S_n = \sum_j f(y_j)(x_{j+1} - x_j) = \sum_j f(y_j) \Delta x_j$
where $y_j$ is in $(x_j, x_{j+1})$.
This generalization allows to use a small mesh size where the function fluctuates a lot. The sum $\sum_j f(x_j) \Delta x_j$ is called the left Riemann sum, the sum $\sum_j f(x_{j+1}) \Delta x_j$ the right Riemann sum.
[Figure: left and right Riemann sums]
If $x_0 = a$, $x_n = b$ and $\max_j \Delta x_j \to 0$ for $n \to \infty$ then $S_n$ converges to $\int_a^b f(x)\,dx$.
1 If $x_{j+1} - x_j = 1/n$ and $y_j = x_j$, then we have the Riemann sum as we defined it earlier.
2 You numerically integrate sin(x) on $[0, \pi/2]$ with a Riemann sum. What is better, the left Riemann sum or the right Riemann sum? Look also at the interval $[\pi/2, \pi]$. Solution: you see that in the first case, the left Riemann sum is smaller than the actual integral. In the second case, the left Riemann sum is larger than the actual integral.
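The claim about sin(x) can be tried out directly. The sketch below uses equal spacing for simplicity (an assumption, the definition above allows nonuniform spacing) and hypothetical function names.

```python
import math

def left_riemann_sum(f, a, b, n):
    """Sum of f(x_k) * h over the left end points x_k = a + k h."""
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))

def right_riemann_sum(f, a, b, n):
    """Sum of f(x_{k+1}) * h over the right end points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 1) * h) for k in range(n))

if __name__ == "__main__":
    n = 100
    # The exact integral of sin is 1 on each of the two intervals.
    print(left_riemann_sum(math.sin, 0, math.pi / 2, n),
          right_riemann_sum(math.sin, 0, math.pi / 2, n))
    print(left_riemann_sum(math.sin, math.pi / 2, math.pi, n),
          right_riemann_sum(math.sin, math.pi / 2, math.pi, n))
```

On the first interval sin is increasing, so the left sum stays below 1 and the right sum above. On the second interval the roles are exchanged.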
Trapezoid rule
The average between the left and right hand Riemann sum is called the Trapezoid rule. Geometrically, it sums up areas of trapezoids instead of rectangles.
[Figure: trapezoid rule]
The Trapezoid rule does not change things much in the case of equal spacing $x_k = a + (b - a)k/n$. For an interval of length b − a = 1 it gives
$\frac{1}{2n}[f(x_0) + f(x_n)] + \frac{1}{n} \sum_{k=1}^{n-1} f(x_k)$.
Simpson rule
The Simpson rule computes the sum
$S_n = \frac{1}{6n} \sum_{k=1}^{n} [f(x_k) + 4f(y_k) + f(x_{k+1})]$,
where $y_k$ are the midpoints between $x_k$ and $x_{k+1}$.
The Simpson rule is good because it is exact for quadratic functions: you can check for $f(x) = ax^2 + bx + c$ that the formula
$\frac{1}{v - u} \int_u^v f(x)\,dx = [f(u) + 4f((u + v)/2) + f(v)]/6$
holds. To prove it just run the following two lines in Mathematica: (== means "is equal")
f[x_] := a x^2 + b x + c;
Simplify[(f[u] + f[v] + 4 f[(u + v)/2])/6 == Integrate[f[x], {x, u, v}]/(v - u)]
This actually will imply (as you will see in Math 1b) that the numerical integration for functions which are 4 times differentiable gives numerical results which are $n^{-4}$ close to the actual integral. For 100 division points, this can give accuracy to $10^{-8}$ already.
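The same check can be done numerically instead of symbolically. The following sketch implements the Simpson sum for a general interval [a, b] by multiplying each bracket with the subinterval length; the function name and the test functions are assumptions made for this illustration.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals of equal length on [a, b].

    On each subinterval [u, v] it adds (v - u) * [f(u) + 4 f((u + v)/2) + f(v)] / 6.
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        u = a + k * h
        v = u + h
        total += h * (f(u) + 4 * f((u + v) / 2) + f(v)) / 6
    return total

if __name__ == "__main__":
    # A quadratic is integrated exactly even with a single interval.
    print(simpson(lambda x: 3 * x * x - 2 * x + 5, 1.0, 4.0, 1))
    # sin on [0, pi] with 100 subintervals; the exact integral is 2.
    print(simpson(math.sin, 0.0, math.pi, 100))
```

The first value agrees with the exact integral of the quadratic up to rounding, the second illustrates the high accuracy for 100 division points.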
There are other variants which are a bit better but need more function values. If $x_k, y_k, z_k, x_{k+1}$ are equally spaced, then the Simpson 3/8 rule computes
$\frac{1}{8n} \sum_{k=1}^{n} [f(x_k) + 3f(y_k) + 3f(z_k) + f(x_{k+1})]$.
This formula is again exact for quadratic functions: for $f(x) = ax^2 + bx + c$, the formula
$\frac{1}{v - u} \int_u^v f(x)\,dx = [f(u) + 3f((2u + v)/3) + 3f((u + 2v)/3) + f(v)]/8$
holds. If you are interested, run the two Mathematica lines:
f[x_] := a x^2 + b x + c; L = Integrate[f[x], {x, u, v}]/(v - u);
Simplify[(f[u] + f[v] + 3 f[(2 u + v)/3] + 3 f[(u + 2 v)/3])/8 == L]
This 3/8 method can be slightly better than the first Simpson rule.
Mean value method
The mean value theorem shows that for $x_k = k/n$, there are points $y_k \in [x_k, x_{k+1}]$ such that $f(y_k) = F'(y_k) = n (F(x_{k+1}) - F(x_k))$ and so
$\frac{1}{n} \sum_{k=0}^{n-1} f(y_k) = F(x_n) - F(x_0)$.
This is a version of the fundamental theorem of calculus which is exact in the sense that for every n, this is a correct formula. Let's call $y_k$ the Rolle points.
[Figure: The Rolle point is close to the interval midpoint.]
For any partition $x_k$ on [a, b] with $x_0 = a$, $x_n = b$, there is a choice of Rolle points $y_k \in [x_k, x_{k+1}]$ such that the Riemann sum $\sum_k f(y_k) (\Delta x)_k$ is equal to $\int_a^b f(x)\,dx$.
For linear functions the Rolle points are the midpoints. In general, the deviation g(t) from the midpoint is small if the interval is $[x_0 - t, x_0 + t]$. One can estimate g(t) to be of the order $t^2 \frac{f''(x_0)}{6 f'(x_0)}$. We could modify the trapezoid rule and replace the line through the points by a Taylor polynomial. The Rolle point method is useful for functions which can have poles. [1]
Monte Carlo Method
A powerful integration method is to choose n random points $x_k$ in [a, b] and look at the sum of the values $f(x_k)$ divided by n. Because it uses randomness, it is called Monte Carlo method.
The Monte Carlo integral is the limit of $S_n$ for n to infinity, where
$S_n = \frac{1}{n} \sum_{k=1}^{n} f(x_k)$,
and $x_k$ are n random values in [a, b]. If the interval does not have length 1, the sum is multiplied with b − a.
The law of large numbers in probability shows that the Monte Carlo integral is equivalent to the Lebesgue integral which is more powerful than the Riemann integral. Monte Carlo integration is interesting especially if the function is complicated.
3 Let's look at the salt and pepper function
f(x) = 1 if x is rational, f(x) = 0 if x is irrational.
The Riemann sum with equal spacing k/n is equal to 1 for every n. But this is only because we have evaluated the function at rational points, where it is 1.
The Monte Carlo integral gives zero because if we choose a random number in [0, 1] we hit an irrational number with probability 1.
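The Monte Carlo sum is short to implement. The sketch below is one possible version in Python; the function name, the fixed seed and the factor b − a for intervals of length different from 1 are choices made for this illustration.

```python
import random

def monte_carlo_integral(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] from n uniform random samples."""
    rng = random.Random(seed)          # fixed seed (an assumption) for repeatable runs
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

if __name__ == "__main__":
    # Compare with the exact value 1/3 of the integral of x^2 over [0, 1].
    print(monte_carlo_integral(lambda x: x * x, 0.0, 1.0, 100000))
```

The error of such an estimate typically shrinks like $1/\sqrt{n}$, so many points are needed for a few digits of accuracy.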
[1] I have not found the method yet in the literature. I used it myself when working on a problem in probability theory where functions with poles appeared.
[Figure: The Salt and Pepper function and the Boston Salt and Pepper bridge (Anne Heywood).]
4 The following two lines evaluate the area of the Mandelbrot fractal using Monte Carlo integration. The function F is equal to 1, if the parameter value c of the quadratic map $z \to z^2 + c$ is in the Mandelbrot set and 0 else. It shoots 100'000 random points and counts what fraction of the square of area 9 is covered by the set. Numerical experiments give values close to the actual value around 1.51.... One could use more points to get more accurate estimates.
F[c_] := Block[{z = c, u = 1}, Do[z = N[z^2 + c]; If[Abs[z] > 3, u = 0; z = 3], {99}]; u];
M = 10^5; Sum[F[-2.5 + 3 Random[] + I (-1.5 + 3 Random[])], {M}]*(9.0/M)
Homework
1 Use a computer to generate 20 random numbers $x_k$ in [0, 1]. Sum up the squares $x_k^2$ of these numbers and divide by 20. Compare your result with $\int_0^1 x^2\,dx$. Remark. If using a program, increase the value of n as large as you can. Here is a Mathematica code:
n = 20; Sum[Random[]^2, {n}]/n
Here is an implementation in Perl. It's still possible to cram the code into one line:
#!/usr/bin/perl
$n=20; $s=0; for($i=0; $i<$n; $i++){ $f=rand(); $s+=$f*$f; } print $s/$n;
2 a) Use the Simpson rule to compute $\int_0^\pi \sin(x)\,dx$ using n = 2 intervals $[0, \pi/2]$ and $[\pi/2, \pi]$. On each of these intervals [a, b] compute the Simpson sum [f(a) + 4f((a + b)/2) + f(b)]/6 with f(x) = sin(x). Compare with the actual integral.
b) Now use the 3/8 Simpson rule to estimate $\int_0^\pi f(x)\,dx$ using n = 1 intervals $[0, \pi]$. Again compare with the actual integral.
Instead of adding more numerical methods exercises, we want to practice a bit more integration. The challenge in the following problems is to find out which integration method is best suited. This is good preparation for the final, where we will not reveal which integration method is the best.
3 Integrate tan(x)/cos(x) from 0 to $\pi/6$.
4 Find the antiderivative of x sin(x) exp(x).
5 Find the antiderivative of $x/\sin(x)^2$.
Numerical methods
[Figure: graph of the velocity v as a function of time t, with a point marked "accident". Photo taken by Oliver (while being in grad school) from his paraglider near Lauterbrunnen in Switzerland.]
1 A paraglider starts a flight in the mountain. The velocity is given in the above graph. Find out whether the paraglider lands lower or higher than where it started.
Hint: To estimate integrals take the average of the number A of squares entirely below the graph and the number B of squares containing part of the region below the graph. The result (A + B)/2 is a good estimate for the area below the graph.
2 Review: Integrate $\int x^{1/3} \log(x)\,dx$
3 Review: Integrate $\int \log(x^5)(1/x)\,dx$
Lecture 31: Partial fractions
The partial fraction method will be covered in detail in follow up calculus courses like Math 1b. Here we just look at some samples to see what's out there. We have learned how to integrate polynomials like $x^4 + 5x + 3$. What about rational functions? We will see here that they are a piece of cake - if you have the right guide of course ...
What we know already
Let's see what we know already:
We know that integrating 1/x gives log(x). We can for example integrate
$\int \frac{1}{x - 6}\,dx = \log(x - 6) + C$.
We also have learned how to integrate $1/(1 + x^2)$. It was an important integral:
$\int \frac{1}{1 + x^2}\,dx = \arctan(x) + C$.
Using substitution, we can do more like
$\int \frac{dx}{1 + 4x^2} = \int \frac{du/2}{1 + u^2} = \arctan(u)/2 = \arctan(2x)/2$.
We also know how to integrate functions of the type $x/(x^2 + c)$ using substitution. We can write $u = x^2 + c$ and get du = 2x dx so that
$\int \frac{x}{x^2 + c}\,dx = \int \frac{1}{2u}\,du = \frac{\log(x^2 + c)}{2}$.
Also functions $1/(x + c)^2$ can be integrated using substitution. With x + c = u we get du = dx and
$\int \frac{1}{(x + c)^2}\,dx = \int \frac{1}{u^2}\,du = -\frac{1}{u} + C = -\frac{1}{x + c} + C$.
The partial fraction method
We would love to be able to integrate any rational function
$f(x) = \frac{p(x)}{q(x)}$,
where p, q are polynomials. This is where partial fractions come in. The idea is to write a rational function as a sum of fractions we know how to integrate. The above examples have shown that we can integrate $a/(x + c)$, $(ax + b)/(x^2 + c)$, $a/(x + c)^2$ and cases, which after substitution are of this type.
The partial fraction method writes p(x)/q(x) as a sum of functions of the above type which we can integrate.
This is an algebra problem. Here is an important special case:
In order to integrate $\int \frac{1}{(x - a)(x - b)}\,dx$, write
$\frac{1}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b}$
and solve for A, B.
In order to solve for A, B, write the right hand side as one fraction again
$\frac{1}{(x - a)(x - b)} = \frac{A(x - b) + B(x - a)}{(x - a)(x - b)}$.
We only need to look at the numerator:
$1 = Ax - Ab + Bx - Ba$.
In order that this is true we must have A + B = 0, −Ab − Ba = 1. This allows us to solve for A, B.
Examples
1 To integrate $\int \frac{2}{1 - x^2}\,dx$ we can write
$\frac{2}{1 - x^2} = \frac{1}{1 - x} + \frac{1}{1 + x}$
and integrate each term
$\int \frac{2}{1 - x^2}\,dx = \log(1 + x) - \log(1 - x)$.
2 Integrate $\int \frac{5 - 2x}{x^2 - 5x + 6}\,dx$. Solution. The denominator is factored as (x − 2)(x − 3). Write
$\frac{5 - 2x}{x^2 - 5x + 6} = \frac{A}{x - 3} + \frac{B}{x - 2}$.
Now multiply out and solve for A, B:
$A(x - 2) + B(x - 3) = 5 - 2x$.
This gives the equations A + B = −2, −2A − 3B = 5. From the first equation we get A = −B − 2 and from the second equation we get 2B + 4 − 3B = 5 so that B = −1 and so A = −1. We have now obtained
$\frac{5 - 2x}{x^2 - 5x + 6} = -\frac{1}{x - 3} - \frac{1}{x - 2}$
and can integrate:
$\int \frac{5 - 2x}{x^2 - 5x + 6}\,dx = -\log(x - 3) - \log(x - 2)$.
Actually, we could have got this one also with substitution. How?
3 Integrate $f(x) = \frac{1}{1 - 4x^2}$. Solution. The denominator is factored as (1 − 2x)(1 + 2x). Write
$\frac{A}{1 - 2x} + \frac{B}{1 + 2x} = \frac{1}{1 - 4x^2}$.
Comparing numerators gives A(1 + 2x) + B(1 − 2x) = 1, so that A = 1/2 and B = 1/2. We get the integral
$\int f(x)\,dx = -\frac{1}{4} \log(1 - 2x) + \frac{1}{4} \log(1 + 2x) + C$.
Hopital's method
There is a fast method to get the coefficients:
If a is different from b, then the coefficients A, B in
$f(x) = \frac{p(x)}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b}$
are
$A = \lim_{x \to a} (x - a) f(x) = p(a)/(a - b)$, $B = \lim_{x \to b} (x - b) f(x) = p(b)/(b - a)$.
Proof. If we multiply the identity with x − a we get
$\frac{p(x)}{x - b} = A + \frac{B(x - a)}{x - b}$.
Now we can take the limit $x \to a$ without peril and end up with A = p(a)/(a − b).
Cool, isn't it? This Hopital method can save you a lot of time! Especially when you deal with more factors and where sometimes complicated systems of linear equations would have to be solved. Remember:
Math is all about elegance and does not use complicated methods if simple ones are available.
Here is an example:
4 Find the anti-derivative of $f(x) = \frac{2x + 3}{(x - 4)(x + 8)}$. Solution. We write
$\frac{2x + 3}{(x - 4)(x + 8)} = \frac{A}{x - 4} + \frac{B}{x + 8}$.
Now $A = \frac{2 \cdot 4 + 3}{4 + 8} = 11/12$, and $B = \frac{2 \cdot (-8) + 3}{-8 - 4} = 13/12$. We have
$\frac{2x + 3}{(x - 4)(x + 8)} = \frac{11/12}{x - 4} + \frac{13/12}{x + 8}$.
The integral is
$\frac{11}{12} \log(x - 4) + \frac{13}{12} \log(x + 8)$.
Here is an example with three factors:
5 Find the anti-derivative of $f(x) = \frac{x^2 + x + 1}{(x - 1)(x - 2)(x - 3)}$. Solution. We write
$\frac{x^2 + x + 1}{(x - 1)(x - 2)(x - 3)} = \frac{A}{x - 1} + \frac{B}{x - 2} + \frac{C}{x - 3}$.
Now $A = \frac{1^2 + 1 + 1}{(1 - 2)(1 - 3)} = 3/2$ and $B = \frac{2^2 + 2 + 1}{(2 - 1)(2 - 3)} = -7$ and $C = \frac{3^2 + 3 + 1}{(3 - 1)(3 - 2)} = 13/2$. The integral is
$\frac{3}{2} \log(x - 1) - 7 \log(x - 2) + \frac{13}{2} \log(x - 3)$.
And because we like it extreme, here is a larger example:
6 Find the anti-derivative of
$$ f(x) = \frac{1}{x(x-1)(x-2)(x-3)(x-4)(x-5)(x-6)(x-7)(x-8)(x-9)}. $$
Ask your friends whether they have done a partial fraction example with a 10th degree
polynomial in the denominator. I bet they didn't do any. Since I have never seen such an
example in a text book, look at this example as a first:
Solution. We write
$$ f(x) = \frac{A_0}{x} + \frac{A_1}{x-1} + \frac{A_2}{x-2} + \frac{A_3}{x-3} + \frac{A_4}{x-4} + \frac{A_5}{x-5} + \frac{A_6}{x-6} + \frac{A_7}{x-7} + \frac{A_8}{x-8} + \frac{A_9}{x-9}. $$
Each constant A_k is 1 divided by the product of the nine factors (k − j) with j ≠ k, for example
$$ A_0 = \frac{1}{(0-1)(0-2)(0-3)(0-4)(0-5)(0-6)(0-7)(0-8)(0-9)} = -\frac{1}{362880}. $$
The constants are
$$ A_0 = -\frac{1}{362880}, \quad A_1 = \frac{1}{40320}, \quad A_2 = -\frac{1}{10080}, \quad A_3 = \frac{1}{4320}, \quad A_4 = -\frac{1}{2880}, $$
$$ A_5 = \frac{1}{2880}, \quad A_6 = -\frac{1}{4320}, \quad A_7 = \frac{1}{10080}, \quad A_8 = -\frac{1}{40320}, \quad A_9 = \frac{1}{362880}. $$
The integral is
$$ -\frac{\log(x)}{362880} + \frac{\log(x-1)}{40320} - \frac{\log(x-2)}{10080} + \frac{\log(x-3)}{4320} - \frac{\log(x-4)}{2880} + \frac{\log(x-5)}{2880} - \frac{\log(x-6)}{4320} + \frac{\log(x-7)}{10080} - \frac{\log(x-8)}{40320} + \frac{\log(x-9)}{362880}. $$
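The cover-up rule A = p(a)/(a − b) extends to any number of distinct linear factors: the coefficient belonging to the root a is p(a) divided by the product of the differences between a and all other roots. The following is a minimal sketch of that rule with exact fractions; the function name `cover_up` is an illustrative choice, and the sketch assumes that the degree of p is smaller than the number of roots.

```python
from fractions import Fraction

def cover_up(p, roots):
    """Coefficients A_k in p(x)/prod(x - r_k) = sum A_k/(x - r_k).

    Assumes the roots are distinct and deg(p) < len(roots).
    """
    coefficients = []
    for k, a in enumerate(roots):
        denominator = Fraction(1)
        for j, b in enumerate(roots):
            if j != k:
                denominator *= (a - b)   # product of (a - other root)
        coefficients.append(Fraction(p(a)) / denominator)
    return coefficients

# The three examples from the text
ex4 = cover_up(lambda x: 2 * x + 3, [4, -8])
ex5 = cover_up(lambda x: x * x + x + 1, [1, 2, 3])
ex6 = cover_up(lambda x: 1, list(range(10)))
print(ex4)
print(ex5)
print(ex6)
```

Each anti-derivative is then the sum of the terms A_k log(x − r_k).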
Homework
1 $\int \frac{2\,dx}{x^2-4}$.
2 $\int \frac{5\,dx}{4x^2+1}$.
3 $\int \frac{x^3-x+1}{x^2-1}\,dx$.
4 $\int \frac{3x^2}{(x^2+x+1)(x-1)}\,dx$.
5 $\int \frac{1}{(x+1)(x-1)(x+7)(x-3)}\,dx$. Use Hopital's method of course!
Hint for 3). Subtract first a polynomial.
Hint for 4). Find the numerator of (Ax + B)/(x² + x + 1) + C/(x − 1) and set it equal to 3x². To do so, multiply
out.
Partial fractions
1 Integrate 1/(1 + x).
2 Integrate 9/(x − 1)².
3 Integrate 7/(x² + 1).
4 Integrate 1/(1 − x⁴). Hint: write the last one first in the form
A/(x² − 1) + B/(1 + x²).
Lecture 32: Trig substitutions
Trig substitution is a special case of substitution, where x is a trigonometric function of u or
u is a trigonometric function of x. This topic is also covered in more depth in follow-up courses like Math
1b. This lecture gives us more practice with the substitution method.
Here is an important example:
1 The area of a half circle of radius 1 is given by the integral
$$ \int_{-1}^{1} \sqrt{1-x^2}\,dx. $$
Solution. Write x = sin(u) so that cos(u) = √(1 − x²) and dx = cos(u) du. Since
sin(−π/2) = −1 and sin(π/2) = 1, the answer is
$$ \int_{-\pi/2}^{\pi/2} \cos(u)\cos(u)\,du = \int_{-\pi/2}^{\pi/2} \frac{1+\cos(2u)}{2}\,du = \frac{\pi}{2}. $$
Let's generalize this a bit and do the same computation for a general radius r:
2 Compute the area of a half disc of radius r which is given by the integral
$$ \int_{-r}^{r} \sqrt{r^2-x^2}\,dx. $$
Solution. Write x = r sin(u) so that r cos(u) = √(r² − x²) and dx = r cos(u) du and
r sin(−π/2) = −r and r sin(π/2) = r. The answer is
$$ \int_{-\pi/2}^{\pi/2} r^2 \cos^2(u)\,du = r^2 \pi/2. $$
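As a numerical sanity check, one can compare the original integral with the substituted one. The sketch below uses a plain midpoint rule; the helper names and the number of subintervals are illustrative assumptions, not part of the lecture.

```python
import math

def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def half_disc_area(r):
    """Integral of sqrt(r^2 - x^2) over [-r, r]."""
    return midpoint_integral(lambda x: math.sqrt(r * r - x * x), -r, r)

def half_disc_area_substituted(r):
    """Same area after x = r sin(u): integral of r^2 cos^2(u) over [-pi/2, pi/2]."""
    return midpoint_integral(lambda u: (r * math.cos(u)) ** 2,
                             -math.pi / 2, math.pi / 2)

for r in (1.0, 2.0):
    print(r, half_disc_area(r), half_disc_area_substituted(r), r * r * math.pi / 2)
```

The substituted integrand is smooth, so it converges much faster than the original one, whose slope is infinite at the end points.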
Here is an example which we already know how to integrate:
3 Find the integral
$$ \int \frac{dx}{\sqrt{1-x^2}}. $$
We know the answer is arcsin(x). How can we do that without knowing? Solution. We
can do it also with a trig substitution. Try x = sin(u) to get dx = cos(u) du and so
$$ \int \frac{\cos(u)\,du}{\cos(u)} = u + C = \arcsin(x) + C. $$
Here is an example, where tan(u) is the right substitution. You have to be told that
first. It is hard to come up with the idea:
4 Find the following integral:
$$ \int \frac{dx}{x^2\sqrt{1+x^2}} $$
by using the substitution x = tan(u). Solution. Then 1 + x² = 1/cos²(u) and dx =
du/cos²(u). We get
$$ \int \frac{du}{\cos^2(u)\tan^2(u)(1/\cos(u))} = \int \frac{\cos(u)}{\sin^2(u)}\,du = -\frac{1}{\sin(u)} + C = -\frac{1}{\sin(\arctan(x))} + C. $$
Trig substitution is based on the trig identity:
$$ \cos^2(u) + \sin^2(u) = 1. $$
Depending on whether you divide this by cos²(u) or sin²(u) we get
$$ 1 + \tan^2(u) = 1/\cos^2(u), \qquad 1 + \cot^2(u) = 1/\sin^2(u). $$
These identities are worth remembering. Let's look at more examples:
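5 Evaluate the following integral
$$ \int \frac{x^2}{\sqrt{1-x^2}}\,dx. $$
Solution: Substitute x = cos(u), dx = −sin(u) du and get
$$ -\int \frac{\cos^2(u)}{\sin(u)} \sin(u)\,du = -\int \cos^2(u)\,du = -\frac{u}{2} - \frac{\sin(2u)}{4} + C = -\frac{\arccos(x)}{2} - \frac{\sin(2\arccos(x))}{4} + C. $$
Because arccos(x) = π/2 − arcsin(x) and sin(2 arccos(x)) = sin(2 arcsin(x)), this can also be written as
$$ \frac{\arcsin(x)}{2} - \frac{\sin(2\arcsin(x))}{4} + C. $$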
6 Evaluate the integral
$$ \int \frac{dx}{(1+x^2)^2}. $$
Solution: we make the substitution x = tan(u), dx = du/cos²(u). Since 1 + x² =
1/cos²(u) we have
$$ \int \frac{dx}{(1+x^2)^2} = \int \cos^2(u)\,du = \frac{u}{2} + \frac{\sin(2u)}{4} + C = \frac{\arctan(x)}{2} + \frac{\sin(2\arctan(x))}{4} + C. $$
Here comes another prototype problem:
7 Find the anti-derivative of 1/sin(x). Solution: We use the substitution u = tan(x/2)
which gives x = 2 arctan(u), dx = 2du/(1 + u²). Because 1 + u² = 1/cos²(x/2) we have
$$ \frac{2u}{1+u^2} = 2\tan(x/2)\cos^2(x/2) = 2\sin(x/2)\cos(x/2) = \sin(x). $$
Plug this into the integral
$$ \int \frac{1}{\sin(x)}\,dx = \int \frac{1+u^2}{2u} \cdot \frac{2\,du}{1+u^2} = \int \frac{1}{u}\,du = \log(u) + C = \log(\tan(x/2)) + C. $$
Unlike before, where x is a trig function of u, now u is a trig function of x. This example
shows that the substitution u = tan(x/2) is magic. Because of the following identities
$$ u = \tan(x/2), \qquad dx = \frac{2\,du}{1+u^2}, \qquad \sin(x) = \frac{2u}{1+u^2}, \qquad \cos(x) = \frac{1-u^2}{1+u^2} $$
it allows us to reduce any rational function involving trig functions to rational functions.
Any function p(x)/q(x) where p, q are trigonometric polynomials can be integrated
using elementary functions.
It is usually a lot of work but here is an example:
8 To find the integral
$$ \int \frac{\cos(x)+\tan(x)}{\sin(x)+\cot(x)}\,dx $$
for example, we replace dx, sin(x), cos(x), tan(x) = sin(x)/cos(x), cot(x) = cos(x)/sin(x)
using the above formulas and get a rational expression which involves u only. This gives us
an integral ∫ p(u)/q(u) du with polynomials p, q. In our case, this would simplify to
$$ \int \frac{4u\,(u^4+2u^3-2u^2+2u+1)}{(u-1)(u+1)(u^2+1)(u^4-4u^2-1)}\,du. $$
The method of partial fractions provides us then with the solution.
Homework
1 Find the antiderivative: $\int \sqrt{1-4x^2}\,dx$.
2 Find the antiderivative: $\int (1-x^2)^{3/2}\,dx$.
3 Find the antiderivative: $\int \frac{\sqrt{1-x^2}}{x^2}\,dx$.
4 Integrate 1/(1 + sin(x)) using the substitution x = tan(u).
Hint. Look at the example in this handout.
5 Compute $\int \frac{dx}{\cos(x)}$ using the substitution u = tan(x/2).
Hint. Look at the example in this handout text and use the identity (1 − u²)/(1 + u²) =
cos(x).
Trig Substitutions
1 Integrate √(1 + x²). Hint. Use x = tan(u).
2 Integrate √(1 − x²). Hint. Use x = cos(u).
3 Integrate √(x² − 1). Hint. Use x = 1/cos(u).
4 Integrate arccos(x)/√(1 − x²). Hint. Use x = cos(u).
Lecture 33: Calculus and Music
A music piece is a function
Calculus plays a role in music because every music piece is just a function. If you have a
loudspeaker with a membrane at position f(t) at time t, then you can listen to the music. The
pressure variations in the air are sound waves which reach your ear, where your eardrum oscillates
with the function f(t − T) + g(t), where g(t) is background noise and T is the time delay for the
sound to reach your ear. Plotting and playing works the same way. In Mathematica, we can play a
function with
Play[Sin[2 Pi 1000 x^2], {x, 0, 10}]
This function contains all the information about the music piece. A music .WAV file contains
sampled values of the function. A sample rate of 44100 per second is usual. In .MP3 files
essential values are encoded in a compressed way. We take this lecture as an opportunity to
review some facts about functions. We especially see that log, exp and trigonometric functions
play an important role in music.
The wave form and hull
A periodic signal is the building block of sound. Assume g(x) is a 2π periodic function. We can
generate a sound of 440 Hertz when playing the function f(x) = g(440 · 2πx). If the function does
not have a smaller period, then we hear the A tone with 440 Hertz.
A periodic function g is called a wave form.
[Figure: four examples of wave forms.]
The wave form makes up the timbre of a sound, which allows us to model music instruments with
macroscopic terms like attack, vibrato, coloration, noise, echo, reverberation and other
characteristics.
The upper hull function is defined as the interpolation of successive local maxima
of f. The lower hull function is the interpolation of the local minima.
For the function f(x) = sin(100x) for example, the upper hull function is g(x) = 1 and the lower
hull function is g(x) = −1. For f(x) = sin(x) sin(100x) the upper hull function is approximately
g(x) = |sin(x)| and the lower hull function is approximately g(x) = −|sin(x)|.
We can not hear the actual function because the function changes too fast for us to notice
individual vibrations. But we can hear the hull function. The simplest examples are
changes of dynamics in music like crescendi or diminuendi, or a vibrato. We can generate
a beautiful hull by playing two frequencies which are close. You hear interference.
The scale
Western music uses a discrete set of frequencies. This scale is based on the exponential function.
The frequency f is an exponential function of the scale s. On the other hand, if the frequency is
known then the scale number is a logarithm.
The Midi numbering of musical notes is
$$ s = 69 + 12 \log_2(f/440). $$
1 What is the frequency of the Midi tone 100? Solution. We have to solve the above
equation for f and get the piano scale function
$$ f(s) = 440 \cdot 2^{(s-69)/12}. $$
Evaluated at 100 we get 2637.02 Hz.
The piano scale function
$$ f(s) = 440 \cdot 2^{(s-69)/12} $$
is an exponential function f(s) = b e^{as} which satisfies f(s + 12) = 2f(s).
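The Midi formula and its inverse are easy to try out. Here is a small sketch; the function names are illustrative, and only the constants 69, 12 and 440 come from the text.

```python
import math

def midi_to_frequency(s):
    """Piano scale function f(s) = 440 * 2^((s - 69)/12), in Hertz."""
    return 440.0 * 2.0 ** ((s - 69) / 12)

def frequency_to_midi(f):
    """Midi number s = 69 + 12 log2(f / 440) of a frequency f in Hertz."""
    return 69 + 12 * math.log2(f / 440.0)

print(midi_to_frequency(100))                           # Midi tone 100
print(midi_to_frequency(21), midi_to_frequency(108))    # range of the piano
print(frequency_to_midi(880.0))                         # one octave above A 440
```

Raising s by 12 doubles the frequency, which is the octave relation f(s + 12) = 2f(s).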
2 Find the discrete derivative Df(x) = f(x + 1) − f(x) of the piano scale function. Solution:
The function is of the form f(x) = A·2^{ax}. We have f(x + 1) = 2^a f(x) and so Df(x) = (2^a − 1) f(x)
with a = 1/12. Let's get reminded that such discrete relations lead to the important
property d/dx exp(ax) = a exp(ax) for the exponential function.
midifrequency[m_] := N[440 2^((m - 69)/12)]
The classical piano covers the 88 Midi tone scale from 21 to 108. The lowest frequency
is 27.5 Hz, the sub-contra-octave A, the highest 4186.01 Hz, the 5-line octave C.
[Figure: the piano scale function f(s) for Midi numbers s between 21 and 108.]
Here are some mathematical operations which one can do with a piece of music. We will
demonstrate some during class.
Decomposition in overtones: low and high pass filter. It turns out that every
wave form can be written as a sum of sin and cos functions. Our ear does this Fourier
decomposition automatically. We can hear melodies. Here is an example of a decomposition:
f(x) = sin(x) + sin(2x)/2 + sin(3x)/3 + sin(4x)/4 + sin(5x)/5. With infinitely
many terms, one can also describe discontinuous functions.
Filtering and tuning: pitch and autotune. Another advantage of a decomposition of a
function into basic building blocks is that one can leave out frequencies which are not good.
Examples are low pass or high pass filters. A popular filter is autotune, which does not
filter but moves the frequencies around so that you can no longer sing wrong. If 440 Hertz (A)
and 523.2 Hertz (C) for example were the only allowed frequencies, the filter would change
a function f(x) = sin(2π·441x) + 4 cos(2π·521x) to g(x) = sin(2π·440x) + 4 cos(2π·523.2x).
This filtering is done on the wave form scale.
Mixing different functions: rip and remix. If f and g are two functions which represent
songs, we can look at (f + g)/2, which is the average of the two songs. In real life this
is done using tracks. Different instruments can be recorded independently for example
and then mixed together. One can for example get guitar g(t), voice v(t) and piano p(t)
and form f(t) = a g(t) + b v(t) + c p(t), where the constants a, b, c are chosen.
Differentiate functions: reverb and echo. If f is a song and h is some time interval, we
can look at g(x) = Df(x) = [f(x + h) − f(x)]/h. Such a differentiation is easy to achieve
with a real song. It turns out that for small h, like of the order of h = 1/1000, the song does
not change much. The reason is that hearing a frequency sin(kx) or hearing its derivative, a multiple of cos(kx),
produces the same song. However, if we allow h to be larger, then a reverb or echo effect
is produced.
Other relations with math
Symmetries. Symmetries play an important role in art and science. In geometry we
know rotational, translational symmetries or reflection symmetries. Like in geometry,
symmetries play a role both in Calculus as well as in Music. We see some examples in the
presentation.
Mathematics and music have a lot of overlap. Besides wave form analysis and music
manipulation operations and symmetry, there are encoding and compression problems,
Diophantine problems like how well frequency ratios are approximated by rationals:
Why is the chromatic scale based on the twelfth root of 2 so good? Indian music for
example uses microtones and a scale of 22. The 12 tone scale is good because many
powers 2^{k/12} are close to rational numbers. I once defined the scale fitness function
$$ M(n) = \sum_{k=1}^{n} \min_{p,q} \left| 2^{k/n} - \frac{p}{q} \right| G(p,q) $$
which is a measure of how good a music scale is. It uses Euler's gradus suavis (= degree
of pleasure) function G(n, m) of a fraction n/m, which is G(n, m) = 1 + E(nm/gcd(n, m)²),
where the Euler gradus function is E(n) = ∏_{p|n} (p − 1) and p runs over all prime factors
p of n. The picture to the left shows Euler's function G(n, m), the right hand side the
scale fitness function in dependence on n. You see that n = 12 is clearly the winner.
This analysis could be refined to include scales like Stockhausen's 5^{k/25} scale. You can
listen to the Stockhausen scale with f(t) = sin(2πt · 100 · 5^{[t]/25}), where [t] is the largest
integer smaller than t. Our familiar 12-tone scale can be admired by listening to f(t) =
sin(2πt · 100 · 2^{[t]/12}).
[Figure: the scale fitness function M(n) for n up to 30.]
3 The perfect fifth 3/2 has the gradus suavis 1 + E(6) = 1 + 2 = 3, which is the same as for
the perfect fourth 4/3, for which 1 + E(12) = 1 + (2 − 1)(3 − 1) = 3. You can listen to the
perfect fifth f(x) = sin(1000x) + sin(1500x) or the perfect fourth sin(1000x) + sin(1333x),
and here is a function representing a chord with four notes: sin(1000x) + sin(1333x) +
sin(1500x) + sin(2000x).
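The gradus values in this example can be reproduced with a few lines of code. The sketch below assumes the reading of E(n) that matches the worked values 1 + E(6) = 3 and 1 + E(12) = 3, namely the product of (p − 1) over the distinct prime factors p of n. Euler's classical gradus function is usually stated as a sum over the prime factors counted with multiplicity, so this is only an illustration of the convention used here. All function names are illustrative.

```python
from math import gcd

def prime_factors(n):
    """Set of distinct prime factors of a positive integer n."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def euler_E(n):
    """Product of (p - 1) over the distinct primes p dividing n (assumed reading)."""
    result = 1
    for p in prime_factors(n):
        result *= (p - 1)
    return result

def gradus(n, m):
    """G(n, m) = 1 + E(n m / gcd(n, m)^2) for the fraction n/m."""
    d = gcd(n, m)
    return 1 + euler_E(n * m // (d * d))

print(gradus(3, 2))   # perfect fifth
print(gradus(4, 3))   # perfect fourth
```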
Homework
1 Modulation. How do the following functions sound? Listen to them for 10 seconds, then
draw the hull function.
a) f(x) = sin(1000x) sin(1001x)
b) f(x) = sin(x) + cos(tan(1000x))
c) f(x) = √x cos(10000x)
d) f(x) = cos(x) sin(e^{2x})/2
Here is how to play a function with Mathematica. It will play for 10 seconds:
Play[Cos[x] Sin[Exp[2 x]]/x, {x, 0, 10}]
Hint. You can play functions online with Wolfram Alpha. Here is an example:
play sin(1000 x)
2 Amplitude modulation (AM): If you listen to f(x) = sin(x) sin(1000x) you hear an
amplitude change. Draw the hull function. How many increases in amplitude do you hear
in 10 seconds?
3 Frequency Modulation (FM): If we play f(x) = x sin(1000 sin(x)), there are points
where the frequency is low. This is a frequency change. Draw the hull function.
4 Smoothness: If we play the function f(x) = tan(sin(3000 sin(x))), the sound sounds
pretty nice. If we change that to f(x) = tan(2 sin(3000 sin(x))), the sound is awful. Can
you see why? To answer this, you might want to plot a similar function where 3000 is
replaced by 3.
5 A mystery sound: How would you describe the sound f(x) = sin(1/sin(2π·3x))? Our
ear can not hear frequencies below 20 Hertz. Why can one still hear something? To answer
this, you might want to plot the function from x = 0 to x = 10.
Calculus in Music
1 How do you think the function
f(x) = sin(10000
x)
sounds?
2 What about
f(x) = sin(10000x²)?
3 And how about
f(x) = arctan(x) sin(tan(x)1000x))?
4 And finally
sin(x) sin(1000x)?
Lecture 34: Calculus and Statistics
In this lecture, we look at an application of calculus to statistics. We have already defined the
probability density function f and its anti-derivative, the cumulative distribution function.
This lecture is given by Brian Lukoff. This document is what I had prepared for this lecture.
Brian's handout will be the official course note.
Functions
In statistics, functions appear at many places. First of all for random variables. Then for
probability density functions and cumulative probability density functions. In order to compute quantities
like expectation and variance, we have to integrate.
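The expectation of a probability density function f is
$$ m = \int_{-\infty}^{\infty} x f(x)\,dx. $$
The variance of a probability density function f is
$$ \int_{-\infty}^{\infty} (x-m)^2 f(x)\,dx $$
where m is the expectation. The square root of the variance is the standard
deviation.
The expectation of the normal distribution
$$ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-m)^2/(2\sigma^2)} $$
is equal to m. The standard deviation is σ.
The expectation of the geometric distribution f(x) = a e^{−ax} on x ≥ 0 is
$$ \int_0^{\infty} x e^{-ax} a\,dx = 1/a. $$
The variance of the geometric distribution f(x) = a e^{−ax} is 1/a² and the standard
deviation is 1/a.
To see this, compute (remember Tic Tac Toe!)
$$ \int_0^{\infty} x^2 e^{-ax}\,dx = 2/a^3, $$
so that the second moment is ∫ x² a e^{−ax} dx = 2/a² and the variance is 2/a² − (1/a)² = 1/a².
The Tic Tac Toe table (derivatives on the left, anti-derivatives on the right) is
x²     e^{−ax}
2x     −e^{−ax}/a
2      e^{−ax}/a²
0      −e^{−ax}/a³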
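The mean 1/a and variance 1/a² of the density f(x) = a e^{−ax} can be checked numerically. This is a rough sketch with a midpoint rule on a truncated interval; the cutoff of 60/a and the number of subintervals are assumptions chosen so that the neglected tail is far below the tolerance.

```python
import math

def integrate(f, a, b, n=200_000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def exponential_moments(a, cutoff=60.0):
    """Return (mean, variance) of the density a*exp(-a*x) on [0, infinity).

    The integral is truncated at cutoff/a, where the density is about e^-60.
    """
    upper = cutoff / a
    density = lambda x: a * math.exp(-a * x)
    mean = integrate(lambda x: x * density(x), 0.0, upper)
    second_moment = integrate(lambda x: x * x * density(x), 0.0, upper)
    return mean, second_moment - mean * mean

mean, variance = exponential_moments(2.0)
print(mean, variance)   # compare with 1/a = 0.5 and 1/a^2 = 0.25
```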
More Problems
1 The uniform distribution on [a, b] is the probability density function which is zero for x
outside the interval [a, b] and equal to 1/(b − a) for x ∈ [a, b]. Find the mean and standard
deviation of this density.
2 The Laplace distribution is also called the double exponential distribution. It has the
density (a/2) e^{−a|x|}. Find the mean and standard deviation of the Laplace distribution.
3 The Logarithmic distribution on [1, 2] has the density C log(x), where C is a constant
which makes it a density. What is the constant C?
4 The Rayleigh distribution is the probability density which is 0 for x < 0 and
x e^{−x²/(2s²)}/s² for x > 0. Verify that its mean is s√(π/2) and its variance is (4 − π)s²/2.
5 The Maxwell distribution is the probability density which is zero for x < 0 and
√(2/π) x² e^{−x²/(2a²)}/a³ for x ≥ 0. Verify that its mean is 4a/√(2π) and its variance is
a²(3π − 8)/π.
Math 1A: introduction to functions and calculus April 20, 2011, Brian Lukoff
A random variable X is a variable that can take one of many values depending
on the outcome of some random process.
For example, C could be the random variable that represents the number of heads after flipping
a coin twice. Then the probability of getting 0 for C is P(C = 0) = 1/4. Likewise, P(C = 1) = 1/2
and P(C = 2) = 1/4. If a random variable X is discrete, taking on integer values, it must always
be true that
$$ \sum_k P(X = k) = 1; $$
in other words, the sum of the probabilities of all possible outcomes is 1 (100%).
The expected value of a random variable X, denoted E[X], is the mean, or our
single best guess (expectation) for what any given value of X might be. For a
discrete random variable X, the expectation is
$$ \sum_k k\,P(X = k). $$
1 For our variable C this evaluates to 0·P(C = 0) + 1·P(C = 1) + 2·P(C = 2) = 0·(1/4) + 1·(1/2) + 2·(1/4) = 1;
in other words, if we flip a coin twice many times, on average we'll get 1 head for each pair
of flips.
The variance of a random variable X, denoted Var[X], is a measure of how spread
apart a distribution is (how likely we are to get values of X that are far from the
mean). It is
$$ \sum_k (k - E[X])^2\,P(X = k), $$
which for our variable C evaluates to (0 − 1)² P(C = 0) + (1 − 1)² P(C = 1) + (2 − 1)² P(C = 2) = 1/4 + 1/4 = 1/2.
Note that we are basically just adding up (k − E[X])² (which is 0 when k is at the mean of X
and gets larger when k gets further away from the mean) and weighting it by the probability of
having X = k.
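The coin example is small enough to enumerate. The following sketch lists all outcomes of two fair flips and evaluates the two sums above with exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def heads_distribution(flips=2):
    """Map k -> P(C = k), where C counts heads in the given number of fair flips."""
    distribution = {}
    for outcome in product("HT", repeat=flips):
        k = outcome.count("H")
        distribution[k] = distribution.get(k, Fraction(0)) + Fraction(1, 2 ** flips)
    return distribution

def expectation(distribution):
    """E[X] = sum of k * P(X = k)."""
    return sum(k * p for k, p in distribution.items())

def variance(distribution):
    """Var[X] = sum of (k - E[X])^2 * P(X = k)."""
    m = expectation(distribution)
    return sum((k - m) ** 2 * p for k, p in distribution.items())

C = heads_distribution(2)
print(C, expectation(C), variance(C))
```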
Now let's consider a continuous random variable X that could take on any real number.
2 As an example, H might be the height, in inches, of a randomly chosen Harvard student.
Note that since H is continuous, P(H = h) = 0 for any particular height h, so it makes
more sense to talk about P(h ≤ H < h + ε) for some small ε > 0.
For a random variable, the probability density function for that random variable
is a function f(x) such that
$$ P(a \le X \le b) = \int_a^b f(x)\,dx. $$
You can probably see where we are going with this: instead of the sum of all probabilities
being 1, it will instead be the case that
$$ \int_{-\infty}^{\infty} f(x)\,dx = 1. $$
Similarly, the expected value of a continuous distribution X is
$$ E[X] = \int_{-\infty}^{\infty} x f(x)\,dx $$
and the variance is
$$ Var[X] = \int_{-\infty}^{\infty} (x - E[X])^2 f(x)\,dx. $$
1 Our height distribution H would have what is called a normal probability density
function. (Many natural quantities follow a normal distribution.)
The normal probability density function is
$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}. $$
If X has a normal distribution, then
$$ E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = \mu \quad \text{and} \quad Var[X] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = \sigma^2. $$
2 Not every distribution is normal, though. For example, incomes are not normally distributed:
most people have relatively moderate incomes, but no one has a negative income and there are a
few people that have very high incomes. Some people (see below) argue that income follows an
exponential distribution, a distribution with probability density function f(x) = λe^{−λx} for x ≥ 0
(where λ > 0 is a constant).
Exponential distributions have mean
$$ E[X] = \int_0^{\infty} x \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda} $$
and variance
$$ Var[X] = \int_0^{\infty} \left(x - \frac{1}{\lambda}\right)^2 \lambda e^{-\lambda x}\,dx = \frac{1}{\lambda^2}. $$
Another quantity of interest is the cumulative distribution function, which for a
random variable taking only nonnegative values is
$$ F(x) = P(X \le x) = \int_0^x f(t)\,dt. $$
For our income example, the household income in the US in 1997 is a random variable I
that is exponentially distributed with mean $35,200 [1], so λ = 1/35200. The probability that a
randomly selected person makes $100,000 or less is
$$ P(I \le 100000) = \int_0^{100000} \frac{1}{35200} e^{-t/35200}\,dt = 1 - e^{-100000/35200} \approx 94\%. $$
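The 94% figure follows directly from the closed form 1 − e^{−x/mean} of the exponential cumulative distribution function. A short sketch, where the function name is an illustrative choice and the numbers are the ones quoted in the text:

```python
import math

def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean (lambda = 1/mean)."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x / mean)

probability = exponential_cdf(100_000, 35_200)
print(round(100 * probability, 1), "percent")
```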
3 The probability density function f(x) = (a − 1)/x^a represents a power law distribution,
where a > 1 is a constant parameter that changes the shape of the distribution.
[1] Dragulescu & Yakovenko (2001). Evidence for the exponential distribution of income in the USA. The
European Physical Journal B, 20, 585-589.
Homework
1 The uniform distribution on [a, b] is a distribution where any real number between a
and b is equally likely to occur. The probability density function is f(x) = 1/(b − a) for
a ≤ x ≤ b and 0 elsewhere. Verify that f(x) is a valid probability density function (i.e.,
check that it integrates to 1).
2 Find the mean of the uniform distribution on [a, b].
3 Explain why the mean you found in problem 2 makes sense intuitively.
4 The Cauchy distribution is important in physics. It has a probability density function
of
$$ f(x) = \frac{1}{\pi} \frac{b}{(x-m)^2 + b^2}. $$
Verify that f(x) is a valid probability density function.
5 Find the cumulative distribution function F(x) for the Cauchy distribution.
Lecture 35: Calculus and Economics
In this lecture we look more at applications of calculus and focus mostly on economics. This is
an opportunity to review extrema problems.
Marginal and total cost
Recall that the marginal cost was defined as the derivative of the total cost. Both the marginal
cost and the total cost are functions of the quantity of goods produced.
1 Assume the total cost function is C(x) = 10x + 0.01x². Find the marginal cost and the
place where the total cost is maximal. Solution. Differentiate.
2 You sell spring water. The marginal cost to produce depends on the season and is given by
f(x) = 10 − 10 sin(2x). For which x is the total cost maximal?
3 The following example is adapted from the book Dominik Heckner and Tobias Kretschmer:
"Don't worry about Micro", 2008, where the following strawberry story appears (verbatim
citation in italics):
Suppose you have all sizes of strawberries, from very
large to very small. Each size of strawberry exists twice
except for the smallest, of which you only have one.
Let us also say that you line these strawberries up from
very large to very small, then to very large again. You
take one strawberry after another and place them on a
scale that sells you the average weight of all strawberries.
The rst strawberry that you place in the bucket is very
large, while every subsequent one will be smaller until
you reach the smallest one. Because of the literal weight
of the heavier ones, average weight is larger than mar-
ginal weight. Average weight still decreases, although
less steeply than marginal weight. Once you reach the
smallest strawberry, every subsequent strawberry will be
larger which means that the rate of decease of the aver-
age weight becomes smaller and smaller until eventually,
it stands still. At this point the marginal weight is just
equal to the average weight.
Let's recall that if F(x) is the total cost function in dependence of the quantity x, then F' = f
is called the marginal cost.
The function g(x) = F(x)/x is called the average cost.
A point where f = g is called a break even point.
4 If f(x) = 4x³ − 3x² + 1, then F(x) = x⁴ − x³ + x and g(x) = x³ − x² + 1. Find the break even
point and the points where the average costs are extremal. Solution: To get the break
even point, we solve f − g = 0. We get f − g = x²(3x − 2) and see that x = 0 and x = 2/3
are two break even points. The critical points of g are the points where g'(x) = 3x² − 2x = x(3x − 2)
vanishes. They agree.
[Figure: the graphs of the marginal cost f and the average cost g.]
The following theorem tells us that the marginal cost is equal to the average cost if and only if the
average cost has a critical point. Since total costs are typically concave up, break
even points are usually minima of the average cost. Since the strawberry story illustrates it well, let's
call it the strawberry theorem:
Strawberry theorem: We have g'(x) = 0 if and only if f = g.
Proof.
$$ g' = (F(x)/x)' = F'/x - F/x^2 = (1/x)(F' - F/x) = (1/x)(f - g). $$
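The identity g' = (f − g)/x can be checked with exact arithmetic for the total cost F(x) = x⁴ − x³ + x, for which f(x) = 4x³ − 3x² + 1 and g(x) = x³ − x² + 1. The sketch below differentiates g by hand and compares both sides at a few rational points; the sample points are arbitrary choices.

```python
from fractions import Fraction

def F(x):
    """Total cost."""
    return x**4 - x**3 + x

def f(x):
    """Marginal cost F'(x)."""
    return 4 * x**3 - 3 * x**2 + 1

def g(x):
    """Average cost F(x)/x."""
    return F(x) / x

def g_prime(x):
    """Derivative of g(x) = x^3 - x^2 + 1, computed by hand."""
    return 3 * x**2 - 2 * x

samples = [Fraction(1, 3), Fraction(2, 3), Fraction(1), Fraction(5, 2)]
for x in samples:
    # Strawberry identity: g'(x) = (f(x) - g(x)) / x
    print(x, g_prime(x), (f(x) - g(x)) / x)

break_even = Fraction(2, 3)
print(f(break_even) == g(break_even), g_prime(break_even) == 0)
```

At the positive break even point x = 2/3 both sides vanish, as the theorem predicts.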
Volume extremization
1 Assume the cost to heat a room is V(x) + A(x) − L(x) where V is its volume, A is the
surface area and L(x) = πx is proportional to the length x. A conference center hall is an
eighth of a sphere. Its volume, surface area and length are
$$ V(x) = \frac{4\pi x^3}{3} \cdot \frac{1}{8}, \qquad A(x) = \left(\frac{4\pi}{8} + \frac{3\pi}{4}\right) x^2, \qquad L(x) = \pi x. $$
The costs are πx³/6 + (3π/4 + 4π/8)x² − πx. To extremize the cost, we can minimize
$$ f(x) = x^3/6 + 5x^2/4 - x. $$
Setting f'(x) = x²/2 + 5x/2 − 1 = 0 shows that the minimum is achieved at x = (−5 + √33)/2.
2 A cone shaped solar loudspeaker has to be a cone of volume π. For optimal charging
features, the sum of vertical and horizontal shadow areas hr + πr² needs to be extremized.
Can you get a minimum or maximum? Solution. Let's first compute the volume of a cone
with maximal radius r and height h. At height z, the radius is rz/h. At z the cross section area
is A(z) = π(rz/h)² so that the volume is
$$ V = \int_0^h \pi (r^2 z^2/h^2)\,dz = \pi r^2 h/3 = \pi. $$
This means h = 3/r² and hr = 3/r. The sum of the shadow areas is f(r) = πr² + 3/r. Setting
f'(r) = 0, we get the critical point r = (3/(2π))^{1/3}.
Source: Grady Klein and Yoram Bauman, The Cartoon Introduction to
Economics: Volume One Microeconomics, published by Hill and Wang.
Homework
1 Verify the Strawberry theorem in the case f(x) = cos(x).
2 The production function in an office gives the production Q(L) in dependence of labor
L. Assume Q(L) = 500L³ − 3L⁵. Find L which gives the maximal production.
This can be typical: For smaller groups, production usually
increases when adding more workforce. After some
point, bottle necks occur, not all resources can be used at
the same time, management and bureaucracy is added,
each individual has less impact and feels less responsible,
meetings slow down production etc. In this range,
adding more people will decrease the productivity.
3 Marginal revenue f is the rate of change in total revenue F. Like total and marginal
cost, these are functions of the quantity x. Assume the total revenue is F(x) = 5x − x⁵ + 9x³.
Find the point where the total revenue has a local maximum.
4 To find the line y = mx through the points (3, 4), (6, 3), (2, 5), we have to minimize the
function
$$ f(m) = (3m-4)^2 + (6m-3)^2 + (2m-5)^2. $$
5 For any a we look at the solid obtained by rotating the graph of the function f(r) =
a sin(r/a) around the axes over the interval [0, π/a]. For which a is the volume locally
maximal?
P.S. You can see the graph of the volume V(a) in dependence of a below. There are many
local maxima. The problem is to find them.
[Figure: the volume V(a) as a function of a.]
Calculus in Economics
The fact that extremisation is a big deal in economics is already in
the word. We look at more examples. Assume we have a couple of
data points and we want to find the best line y = mx through this
in the sense that the sum of the squares of the distances of the points to the line is
minimal. This leads to an extremal problem which is a special case of
a data fitting problem. It would be more adequate to fit with lines
y = mx + b or more generally with functions but then we have more
variables and run into multivariable calculus or linear algebra
problems much outside the scope of this course. But if we have only
one parameter, we get a single variable calculus problem.
1 Find the best line y = mx through the points (1, 1), (3, 2), (2, 5).
We have to minimize the function
$$ f(m) = (m-1)^2 + (3m-2)^2 + (2m-5)^2. $$
Find the minimum. Solution. 17/14
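For a line y = mx through the origin, setting the derivative of f(m) = Σ (m x_i − y_i)² to zero gives m = Σ x_i y_i / Σ x_i². The following minimal sketch evaluates that formula exactly; the function name `best_slope` is an illustrative choice.

```python
from fractions import Fraction

def best_slope(points):
    """Slope m minimizing the sum of (m*x - y)^2 over the given points.

    From f'(m) = 2 * sum(x * (m*x - y)) = 0 we get m = sum(x*y) / sum(x*x).
    """
    sum_xy = sum(Fraction(x) * y for x, y in points)
    sum_xx = sum(Fraction(x) * x for x, y in points)
    return sum_xy / sum_xx

print(best_slope([(1, 1), (3, 2), (2, 5)]))   # the worked example above
```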
Let's take a different set of data points and look at the problem to
fit functions of the form y = x + b.
2 Find the best line y = x+b through the points (1, 2), (2, 5), (1, 2), (4
Source: Dominik Heckner and Tobias Kretschmer: Don't worry
about Micro, 2008, pages 271-274.
Lecture 36: Artificial intelligence
Today, we study the intriguing question in AI:
What does it take to build an artificial calculus teacher?
Machines assist us already in many domains: heavy work is done by machines and robots,
accounting by computers and fighting by drones. Lawyers and doctors are assisted by artificial
intelligence. There is no reason why teaching is different. The web has become a gigantic brain
to which virtually any question can be asked or googled: "Dr Know" in Spielberg's movie AI
is humbled: enter symptoms for an illness and get a diagnosis, enter a legal question and find
previous cases. Enter a calculus problem and get an answer. Building an artificial calculus
teacher involves calculus itself: such a bot must connect dots on various levels: understand
questions, read and grade papers and exams, write good and original exam questions, know about
learning and pedagogy. Ideally, it should also have ideas like to make a lecture on artificial
intelligence. But first of all, our AI friend needs to know calculus and be able to generate and
solve calculus problems. [1]
Generating calculus problems
Having been involved in a linear algebra book project once, helping to generate solutions to
problems, I know that some calculus books are written with the help of computer algebra systems.
They generate problems and solutions. This applies mostly to drill problems. In order to generate
problems, we first must build random functions. Our AI engine Sofia knew how to generate
random problems with solutions. Random functions are involved when asked "give me an example
of a function". This is easy: the system would generate functions of reasonable complexity:
Call the 10 functions {sin, cos, log, exp, tan, sqt, pow, inv, sca, tra} basic functions.
Here sqt(x) = √x and inv(x) = 1/x^k for a random integer k between 1 and 3, pow(x) = x^k for
a random integer k between 2 and 5. sca(x) = kx is a scalar multiplication for a random nonzero
integer k between −3 and 3 and tra(x) = x + k translates for a random integer k between −4 and 4.
Second, we use addition, subtraction, multiplication, division and composition to build more
complicated functions:
A basic operation is an operation from the list {f ∘ g, f + g, f − g, f/g, f · g}.
The operation x^y is not included because it is equivalent to exp(y log(x)) = exp ∘ (y · log). We can
now build functions of various complexities:
[1] In the academic year of 2003/2004, thanks to a grant from the Harvard Provost, I could work with
undergraduates Johnny Carlsson, Andrew Chi and Mark Lezama on a calculus chat bot. We spent a couple of
hours per week entering mathematics and general knowledge, building interfaces to various computer algebra
systems like Pari, Mathematica and Macsyma, and building a web interface. We fed our knowledge to already
known chat bots and newly built ones and even had various bots chat with each other. We conceptually explored
the question of automated learning of the bots from the conversations as well as adding context to the
conversation, since bots need to remember previous topics mentioned to understand some questions. We learned
how immense the task is. In the meantime it has become a business. Companies like Wolfram Research have teams
of mathematicians and computer scientists working on content for the Wolfram Alpha engine. Having recently
seen a group at work here in Cambridge on Mass Ave, I guess they probably generate as much content in one day
as our Sofia group could in a week for our pet project.
A random function of complexity n is obtained by taking n random basic functions
$f_1, \dots, f_n$ and n random basic operators $\oplus_1, \dots, \oplus_n$ and forming
$f_n \oplus_n f_{n-1} \oplus_{n-1} \cdots \oplus_2 f_1 \oplus_1 f_0$, where $f_0(x) = x$ and where we
start forming the function from the right.
1 Visitor: "Give me an easy function": Sofia looks for a function of complexity one, like
$x \tan(x)$, or $x + \log(x)$, or $3x^2$, or $x/(x - 3)$.
2 Visitor: "Give me a function": Sofia returns a random function of complexity two:
$x \sin(x) - \tan(x)$, or $e^x + \sqrt{x}$, or $x \sin(x)/\log(x)$, or $\tan(x)/x^4$.
3 Visitor: "Give me a difficult function": Sofia builds a random function of complexity four
like $x^4 e^{\cos(x)} \cos(x) + \tan(x)$, or $x - x e^x + \log(x) + \cos(x)$, or $(1 + x)(x \cot(x) - \log(x))/x^2$,
or $(x + \sin(x + 3) - 3) \csc(x)$.
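The construction of a random function of complexity n can be sketched in a few lines. This is only an illustration under stated assumptions, not Sofia's actual code: the names `random_basic` and `random_function` are made up here, functions are represented as plain expression strings in the variable x, and the letter "o" stands for composition.

```python
import random

def random_basic(rng):
    """Return a template for one random basic function; "{}" is the argument slot."""
    name = rng.choice(["sin", "cos", "log", "exp", "tan",
                       "sqt", "pow", "inv", "sca", "tra"])
    if name == "sqt":
        return "sqrt({})"
    if name == "pow":                       # x^k with k between 2 and 5
        return "({})**%d" % rng.randint(2, 5)
    if name == "inv":                       # 1/x^k with k between 1 and 3
        return "1/({})**%d" % rng.randint(1, 3)
    if name == "sca":                       # k*x with nonzero k between -3 and 3
        return "%d*({})" % rng.choice([-3, -2, -1, 1, 2, 3])
    if name == "tra":                       # x + k with k between -4 and 4
        return "(({}) + %d)" % rng.randint(-4, 4)
    return name + "({})"

def random_function(n, rng=None):
    """Build f_n (op_n) ... f_1 (op_1) f_0 from the right, starting with f_0(x) = x."""
    rng = rng or random.Random()
    expr = "x"
    for _ in range(n):
        f = random_basic(rng)
        op = rng.choice(["o", "+", "*", "/", "-"])
        if op == "o":
            expr = f.format(expr)           # composition: f applied to what we have
        else:
            expr = "(%s) %s (%s)" % (f.format("x"), op, expr)
    return expr

print(random_function(1))   # an "easy" function
print(random_function(4))   # a "difficult" function
```

The output is not simplified, so it looks clumsier than the examples above, which presumably went through a computer algebra system first.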
Now, we can build a random calculus problem. To give you an idea, here are some templates for
integration problems:
A random integration problem of complexity n is a sentence from the sentence
list { "Integrate f(x) = F(x)", "Find the anti derivative of F(x)", "What is the
integral of f(x) = F(x)?", "You know the derivative of a function is f'(x) = F(x).
Find f(x)." }, where F is a random function of complexity n.
4 Visitor: "Give me a differentiation problem." Sofia: Differentiate $f(x) = x \sin(x) - \frac{1}{x^2}$.
The answer is $\frac{2}{x^3} + \sin(x) + x \cos(x)$.
5 Visitor: "Give me a difficult integration problem." Sofia: Find f if
$f'(x) = \frac{1}{x} + (3 \sin^2(x) + \sin(\sin(x))) \cos(x)$. The answer is $\log(x) + \sin^3(x) - \cos(\sin(x))$.
6 Visitor: "Give me an easy extremization problem." Sofia: Find the extrema of f(x) =
x/log(x). The answer is x = e.
7 Visitor: "Give me an extremization problem." Sofia: Find the maxima and minima of
$f(x) = x - x^4 + \log(x)$. The extrema are given by an expression several lines long, built from
nested square roots and cube roots of terms like $9 + \sqrt{3153}$.
The last example shows the perils of random generation. Even though the function had decent
complexity, the solution was difficult. Solutions can even be transcendental. This is not a big deal:
just generate a new problem. By the way, all the above problems and solutions have been
generated by Sofia. The dirty secret of calculus books is that there are maybe a thousand different
types of questions which are usually asked. This is a reason why textbooks have become boring
clones of each other and why companies like Aleks exist, which constantly mine the web, course
sites like this one and homework databases like WeBWorK, which contain thousands of pre-compiled
problems in which randomness is already built in.
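The answers given for the differentiation and integration examples above can be spot-checked numerically. The following sketch compares a central difference quotient with the claimed derivative at a few sample points. The helper names `derivative` and `max_error` are made up for this illustration, and the signs of the formulas are read as $x \sin(x) - 1/x^2$ and $\log(x) + \sin^3(x) - \cos(\sin(x))$.

```python
import math

def derivative(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def max_error(f, fprime, points):
    """Largest gap between the numerical and the claimed derivative."""
    return max(abs(derivative(f, x) - fprime(x)) for x in points)

# Example 4: differentiate f(x) = x sin(x) - 1/x^2
f4 = lambda x: x * math.sin(x) - 1 / x**2
answer4 = lambda x: 2 / x**3 + math.sin(x) + x * math.cos(x)

# Example 5: find f if f'(x) = 1/x + (3 sin^2(x) + sin(sin(x))) cos(x)
rhs5 = lambda x: 1 / x + (3 * math.sin(x)**2 + math.sin(math.sin(x))) * math.cos(x)
answer5 = lambda x: math.log(x) + math.sin(x)**3 - math.cos(math.sin(x))

points = [0.5, 1.0, 2.0, 3.0]
print(max_error(f4, answer4, points))    # tiny if the answer is right
print(max_error(answer5, rhs5, points))  # tiny if the answer is right
```

A check like this does not prove the answer, but it catches sign errors quickly.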
Automated problem generation is the "fast food" of teaching and usually not healthy.
But like fast food has evolved, we can expect more and more computer assistance
in calculus teaching.
Be assured that for this course, the problems have been written by hand (I sometimes use
Mathematica to see whether answers are reasonable). Handmade problems can sometimes be a bit rough
but hopefully some were more interesting. I feel that it is not fair to feed computer generated
problems to humans. It is possible to write a program giving an answer to "Write me a final
exam", but the exam would be uninspiring.
Corner detection
How do we detect corners in pictures? This is necessary to understand pictures and drawings. It
might also be needed to see whether a given function is reasonably shaped. There should not be
too many wiggles, for example. There are various techniques to measure that. One of the best
methods in computer vision uses the notion of curvature:
Given a function f(x), define the curvature as
$\kappa(x) = \frac{f''(x)}{(1 + f'(x)^2)^{3/2}}$.
It is a measure of how much the curve is bent at the point (x, f(x)). Positive curvature means
the curve is concave up, negative curvature means it is concave down.
8 For the quadratic function $f(x) = x^2$, we have $f'(x) = 2x$, $f''(x) = 2$ and so
$\kappa(x) = 2/(1 + 4x^2)^{3/2}$. We see that the curvature
is maximal at the lowest part of the parabola.
9 For the function $f(x) = \sqrt{1 - x^2}$, we have $f'(x) = -x/\sqrt{1 - x^2}$ and $f''(x) = -(1 - x^2)^{-3/2}$.
We have $1 + f'(x)^2 = 1/(1 - x^2)$ and $\kappa(x) = -1$.
10 Problem: Find the curvature for the graph of $f(x) = x^5/5 - x$. Where is the curvature
maximal?
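The curvature formula is easy to evaluate numerically. Here is a minimal sketch which replaces f' and f'' by central difference quotients; the function name `curvature` and the step size h are choices made for this illustration only.

```python
import math

def curvature(f, x, h=1e-4):
    """Approximate f''(x) / (1 + f'(x)^2)^(3/2) with difference quotients."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / h**2
    return d2 / (1 + d1**2) ** 1.5

parabola = lambda x: x**2
half_circle = lambda x: math.sqrt(1 - x**2)

print(curvature(parabola, 0.0))      # close to 2, the maximum, at the vertex
print(curvature(parabola, 1.0))      # smaller: 2 / 5^(3/2)
print(curvature(half_circle, 0.3))   # close to -1 everywhere on the half circle
```

The same function can be used to explore the problem in example 10 by scanning x over an interval.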
Connecting the dots
We want to connect points $P_1, \dots, P_n$ by a smooth graph. This "connecting the dots" problem
is quite frequent. Our brain does this automatically. We need only a few glances to see the
motion of an object and predict where it will end. We need to connect dots when we drive a car, when we
interpret a picture etc. On a more abstract level, we need to connect dots in the landscape of ideas
whenever we solve a problem. We want to go from A to B and need to construct intermediate
steps.
Here is a simple method found by G. Chaikin in 1974 [2] which generates a smooth curve from
a few points.
Given a sequence of n points $P_1, \dots, P_n$, define a new sequence of $2n - 2$ points
$R_2, \dots, R_{2n-1}$ by
$R_{2i} = \frac{3}{4} P_i + \frac{1}{4} P_{i+1}$, $R_{2i+1} = \frac{1}{4} P_i + \frac{3}{4} P_{i+1}$
for $i = 1, \dots, n - 1$.
[2] G. Chaikin, An algorithm for high speed curve generation. Computer Graphics and Image Processing 3 (1974),
346-349.
One such step defines a Chaikin step. The limiting curve is called the Chaikin curve defined
by the original points. The picture should explain how we get the new points from the old ones:
divide each segment into 4 pieces and use the points at 1/4 and 3/4 of the segment as new points.
The Chaikin steps produce a smooth curve approximating a given set of points.
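A Chaikin step translates directly into code. The sketch below works with points in the plane given as (x, y) tuples; the function names `chaikin_step` and `chaikin` are chosen for this illustration.

```python
def chaikin_step(points):
    """Replace n points by 2n - 2 points at 1/4 and 3/4 of each segment."""
    new_points = []
    for (px, py), (qx, qy) in zip(points, points[1:]):
        new_points.append((0.75 * px + 0.25 * qx, 0.75 * py + 0.25 * qy))
        new_points.append((0.25 * px + 0.75 * qx, 0.25 * py + 0.75 * qy))
    return new_points

def chaikin(points, steps):
    """Apply several Chaikin steps in a row."""
    for _ in range(steps):
        points = chaikin_step(points)
    return points

corner = [(0, 0), (4, 0), (4, 4)]
print(chaikin_step(corner))      # the corner at (4, 0) is cut off
print(len(chaikin(corner, 3)))   # 3 -> 4 -> 6 -> 10 points
```

Each step cuts the corners of the previous polygon, which is why the limiting curve looks smooth.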
The pictures show curves in two and three dimensions after applying the method a few times.
The method can be used for example to study the complexity of random knots. To answer the
question stated initially: like artists have become better using computers, it could well be that
AI will assist teachers in the future and help them to be more efficient.
In any case, the AI dragon breathing down our necks will force us all to stay creative.
Homework
1 A calculus bot wants to build a differentiation problem by combining log and sin and exp.
Differentiate all of the 6 combinations log(sin(exp(x))), log(exp(sin(x))), exp(log(sin(x))),
exp(sin(log(x))), sin(log(exp(x))) and sin(exp(log(x))).
2 Four of the 6 combinations of log and sin and exp can be integrated as elementary functions.
Do these integrals.
3 Find the curvature of the sin curve at $x = 0$, $x = \pi/2$ and $x = 3\pi/2$.
4 Draw the points $(0, \sin(0))$, $(\pi/2, \sin(\pi/2))$, $(\pi, \sin(\pi))$, $(3\pi/2, \sin(3\pi/2))$, $(2\pi, \sin(2\pi))$ and
connect them with lines. Now do Chaikin iteration for at least 2 generations on paper.
5 Answer each of the following 5 human questions in one sentence:
a) What is calculus for you?
b) What is the nicest application of calculus?
c) Who invented calculus and why?
d) What is the fundamental theorem and why is it useful?
Math 1A: introduction to functions and calculus Sofia, 2011
This worksheet was authored by Sofia [1], an artificial intelligence
calculus teacher and student! The bot could also learn, even if only in a
primitive way. It had to be told "learn: ....". This entire LaTeX file was
generated automatically (except for this introduction section which
has, thanks to this parenthesis, become self-aware and so artificially
intelligent).
Derivatives
Differentiate the following functions: Level 1
1 a) f(x) = x tan(x)
b) f(x) = x + tan(x)
c) f(x) = x log(x)
d) $f(x) = e^x x$
e) f(x) = cos(x)
[1] Written in the academic year 2003/2004, thanks to a grant from the Harvard Provost, together with Johnny
Carlsson, Andrew Chi and Mark Lezama. Sofia was a chat bot which would use computer algebra systems
to solve calculus problems while chatting, similar to Wolfram Alpha now. The latter is of course much more
sophisticated. Ours was maybe a 25 weeks * 4 people * 15 hours = half a person-year project.
Integrals
Integrate the following functions: Level 1
1 a) $f(x) = \frac{1}{x^2 + 1}$
b) $f(x) = \sec^2(x)$
c) $f(x) = 1 - \sin(x)$
d) $f(x) = \sec^2(x) + 1$
e) $f(x) = \frac{1}{2\sqrt{x}}$
Derivatives
1 a) $f(x) = x^{3/2} \sec(x)$
b) $f(x) = e^x (x + \sin(x))$
c) f(x) = 0
d) $f(x) = e^x x$
e) $f(x) = \frac{e^x}{\log(e^x)}$
Integrals
1 a) $f(x) = \frac{x^3 - 3}{x^4}$
b) $f(x) = \frac{3\sqrt{x - 3}}{2}$
c) $f(x) = e^x (\cos(x) - \sin(x))$
d) $f(x) = x - x \tan^2(x) - \tan(x)$
e) $f(x) = e^x (\sin(e^x) + e^x \cos(e^x))$
Derivatives
1 a) f(x) = x
2
(x tan(x))
b) f(x) =
x log(x) + xtan(x)
c) f(x) =
e
x
(x1)x
2
d) $f(x) = \frac{e^x (x + \log(x))}{x}$
e) $f(x) = x^3 (x + \log(x))$
Integrals
1 a) $f(x) = \frac{4x^4 + 1}{x}$
b) $f(x) = e^x ((x \log(x) + \log(x) + 1) \sin(x) + x \log(x) \cos(x))$
c) f(x) = e
2x
(1 2x) sec
2
(x)
d) $f(x) = e^x + \sec^2(x)$
e) f(x) =
(log(x)2) cos(
x)
xlog(x) sin (
x)
xlog
2
(x)
Derivatives
1 a) $f(x) = \frac{\sin^3(x) + \sin(x) + \sin(\sin(x))}{x + 1}$
b) f(x) =
x
3
x
3
x 4
csc(x)
c) f(x) =
e
3x
9x
cot(x)
d) f(x) = e
x
((x 3)(x 2) cos(x))
e) f(x) =
e
x
3x
3x
Integrals
1 a) f(x) =

x(x6)+2(x2)x sin(x)+2(x4) cos(x)

2x
3
b) f(x) = x
4
+(x2)
4x
3
cos(x) sec
2
(x)
sin(x) tan(x)
c) f(x) =
e
1
x
(3 sec
2
(
1
x
3
)+(3x+1)x
2
tan(
1
x
3
))
x
7
d) f(x) =
2 tan(log(x))+
sin(log(x))
cos(log(x))
2x
e) $f(x) = 6 \tan^5(x) \sec^2(x)$
Derivatives
1 a) f(x) =
1
x
3/2
+
(
x+log(
x)) cos (
x) cot(
x)
x+3
b) f(x) = (x + 3) sin
3
2
(x)(tan(sin(x)) log(tan(sin(x))))
c) f(x) =
1
x
2

2x(
x
log(x)
log(x))
x+1
+ sin (x)
d) f(x) = x
5
x
4
x + e
x
cos(x)
x
3
e) f(x) = sec(x)
sin
6
(x) + sin(sin(x)) + tan(sin(x))
e
x
Integrals
1 a) f(x) =
sec(x)
sec(x) log(tan(x))
tan(x)sin
tan(x)
+3 sin
tan(x)
8
8
tan (x) log

2
(ta
b) f(x) =
1
2
e
x
6
log
4
(x)
+
1
(log(x)2)
2
x
2e
x
1
log
3
(x)
+
1
2(log(x)2)
c) f(x) =
2x
5
log(x) sec
2
(x)+2x
4
(3 log(x)1) tan(x)+log
2
(x)
x
2
log
2
(x)
d) f(x) =
5x
7/2
+20x
6
24x
4
+2
2x
2
e) f(x) =
2x
2
cos(x)+xlog(x) sin(x)+(6 log(x)2) cos(x)

2 x
4
cos(x)
Lecture 37: The lighter side of Calculus
First some serious final thoughts:
The holographic picture
Any knowledge can be organized in a holographic way, where the amount of detail is a parameter.
A 1 second version is
Calculus is great.
Calculus in 10 seconds would be
Calculus establishes that two operations on functions f are related: first the
derivative of f, which is the rate of change of f, second the integral of f, which is the
area under the graph of f.
In 1 minute, I would say something similar to the synopsis provided by the Harvard registrar
for this single variable calculus course:
The development of calculus by Newton and Leibniz is a major achievement of
the past millennium. The core of this course introduces differential and integral
calculus. Differential calculus studies the rate of change $f'$ of a function, integral
calculus treats accumulation $\int_0^x f(t) \, dt$ which can be interpreted as area under
the curve. The fundamental theorem of calculus links the two: it tells that
$\int_0^x f'(t) \, dt = f(x) - f(0)$, $\frac{d}{dx} \int_0^x f(t) \, dt = f(x)$.
The subject can be applied to problems from other scientific disciplines like
economics (the strawberry theorem for total, average and marginal costs), reasoning
for relating quantities (like estimating the speed of an airplane from angle change),
psychology (catastrophes explaining revolutionary changes or "flips" in human
perception), geometry (volume and area computation), statistics (distribution and
cumulative distribution functions) or everyday life (the wobbly table theorem).
Here is an attempt to summarize the most important points of this course in 3 minutes:
1. Calculus relates two fundamental operations, the derivative measuring the rate
of change of a function and the integral which measures the area under the graph.
2. Taking derivatives is done with the chain, product and quotient rules, taking
integrals with substitution including trig substitution, integration by parts and
partial fraction rules.
3. Basic functions are polynomials and exp, log, sin, cos, tan. We can add, subtract,
multiply, divide and compose functions, we can differentiate and integrate them.
4. A function is continuous at p if $f(x) \to f(p)$ for $x \to p$. It is differentiable at p
if $(f(p + h) - f(p))/h$ has a limit for $h \to 0$. Limits from the right and left should
agree.
5. To extremize a function, we look at points where $f'(x) = 0$. If $f''(x) > 0$ we
have a local minimum, if $f''(x) < 0$ we have a local maximum. There can be critical
points $f'(x) = 0$ which are not extrema, like x = 0 for $x^3$.
6. To relate how different quantities change in time, we differentiate the formula
relating the quantities using the chain rule. If there is a third time variable, then
this is the story of related rates, if one of the variables is the parameter, then this
is implicit differentiation.
We will have a 90 minute review of all the material before the midterm. It would not fit on
the 4 pages which I strictly allocated as a handout for each lecture. We have seen a 37+13 hour
version of the material in the lectures and problem sessions for this course. It would be possible
to teach this course using 100 hours. One could explore the material more from a historical
perspective for example and read original sources. One could do projects, use more computer
algebra systems and practice visualization. You have studied maybe 300 hours
for this course including homework, reading, and discussing the material. Years would be needed
to study it more on a research level. New calculus is constantly developed. I myself have been
working mostly on more probabilistic versions of calculus which allow to bypass some of the
difficulties when discretizing calculus. The loss of symmetries obtained by discretization can be
compensated differently.
The future of calculus
Calculus will undoubtedly look different in 50 years. Many changes have already started, not only
on the context level, but also from outside: calculus books will be gone; electronic paper, which will
be almost indistinguishable from real paper, will have replaced them. Text, computations and graphics are all
fluid in that we can at any point adjust the amount of detail. Similarly as we can zoom into a
map or picture by pinching the screen, we can "triple pinch" a text or proof or picture. As we do so,
more details are added, more steps of a calculation added, more information included into a graph
etc. Every picture is interactive and can turn into a movie or an animation, parameters can be changed,
functions deformed with the finger. Every picture is a little laboratory. Questions can be asked
directly to the text and answers provided. The text can at any time be set back to an official
textbook version of the course. The teacher has the possibility to set global preferences and toss
around topics. Examples, homework problems and exam problems will be adjusted automatically,
disallowing for example to treat integration by parts before the product rule. Much of this is not
science fiction: there are electronic interactive books already available now for tablet computers
which have impressive experimental and animation features. Impossible because it is too difficult
to achieve? Remember the last lecture 36. We will have AI on our side and much of this grunt
work to compress and expand knowledge can be done computer assisted.
Calculus courses after 1a
To prepare for this course, I set myself the task to formulate the main topic in one short sentence
and then single out 4 major goals for the course, then build titles for each lecture etc. Here are 4
calculus courses at Harvard drawn out at the level of a 4 point summary. At other schools of
higher education, there are similar courses.
The course 1A: from extremization to the fundamental theorem
  functions            polynomials, exp, log, trig functions
  limits               velocity, tangents, infinite limits
  derivatives          product, chain rule with related rates, extremization
  integrals            techniques, area, volume, fundamental theorem
The course 1B: from series and integration to differential equations
  integration          integration: parts, trig substitution, partial fractions, indefinite
  series               convergence, Power, Taylor and Dirichlet series
  diff equations       separation of variables, systems like exponential and logistic equations
  systems diff eq      equilibria, nullclines, analysis
The course 21A: geometry, extremization and integral theorems in space
  geometry             analytic geometry of space, geometric objects, distances
  differentiation      curves and surfaces, gradient, curl, divergence
  integration          double and triple integrals, other coordinate systems
  integral theorems    line and flux integrals, Green, Stokes and Gauss
The course 21B: matrix algebra, eigensystems, dynamical systems and Fourier
  equations and maps   Gauss-Jordan elimination, kernel, image, linear maps
  matrix algebra       determinants, eigenvalues, eigenspaces, diagonalization
  dynamical systems    difference and differential equations with various techniques
  Fourier theory       Fourier series and dynamical systems on function spaces
There is also a 19a/19b track. The 19a course focuses on models and applications in biology, the
19b course replaces differential equations from 21b with probability theory. The Math 20 course
covers linear algebra and multivariable calculus for economists in one semester but covers less
material than the 21a/21b track.
The lighter side of calculus
Sofia, our bot, also had to know a lot of jokes, especially about math. Here are some relevant to
calculus in some way. I left out the inappropriate ones.
1 Why do you rarely find mathematicians at the beach? Because they use sine and cosine
to get a tan.
2 Theorem: The less you know, the more you make. Proof: We know Power = Work/Time.
Since Knowledge = Power and Time = Money we know Knowledge = Work/Money. Solve
for Money to get Money = Work/Knowledge. If Knowledge goes to zero, money approaches
infinity.
3 Why do they never serve beer in a calculus class? Because you can't drink and derive.
4 Descartes comes to a bar. Barman: "Another beer?" Descartes: "I think not." And
disappears.
5 It's zero degrees outside today. Tomorrow it will be twice as cold. How cold will it be?
6 There are three types of calculus teachers: those who can count and those who can not.
7 Calculus is like love; a simple idea, but it can be complicated.
8 A mathematician and an engineer are on a desert island with two palm trees and coconuts.
The engineer climbs up one tree, gets a coconut, climbs down and eats. The mathematician climbs
up the other, gets the coconut, climbs the first tree and deposits it: "I've reduced the
problem to a solved one."
9 Pickup line: You are so $x^2$. Can I be $x^3/3$, the area under your curves?
10 The evolution of calculus teaching:
1960s: A peasant sells a bag of potatoes for 10 dollars. His costs are 4/5 of his selling
price. What is his profit?
1970s: A farmer sells a bag of potatoes for 10 dollars. His costs are 4/5 of his selling
price, that is, 8 dollars. What is his profit?
1980s: A farmer exchanges a set P of potatoes with a set M of money. The cardinality
of the set M is equal to 10, and each element of M is worth one dollar. Draw ten big dots
representing the elements of M. The set C of production costs is composed of two big dots
less than the set M. Represent C as a subset of M and give the answer to the question:
What is the cardinality of the set of profits?
1990s: A farmer sells a bag of potatoes for 10 dollars. His production costs are 8
dollars, and his profit is 2 dollars. Underline the word "potatoes" and discuss it with your
classmates.
2000s: A farmer sells a bag of potatoes for 10 dollars. His or her production costs
are 0.80 of his or her revenue. On your calculator, graph revenue vs. costs and run the
program POTATO to determine the profit. Discuss the result with other students and
start a blog about other examples in economics.
2010s: A farmer sells a bag of potatoes for 10 dollars. His costs are 8 dollars. Use the
Potato theorem to find the profit. Then watch the wobbling potato movie.
11 Q: What is the first derivative of a cow? A: Prime Rib!
12 Q: What does the zero say to the eight? A: Nice belt!
13 Theorem. A cat has nine tails. Proof. No cat has eight tails. Since one cat has one
more tail than no cat, it must have nine tails.
14 Q: How can you tell that a mathematician is extroverted? A: When talking to you, he
looks at your shoes instead of at his.
15 Q: What does the little mermaid wear? A: An algae-bra.
16 In a dark, narrow alley, a function and a differential operator meet: "Get out of my way -
or I'll differentiate you till you're zero!" "Try it - I'm $e^x$..." Same alley, same function, but
a different operator: "Get out of my way - or I'll differentiate you till you're zero!" "Try
it - I'm $e^x$..." "Too bad... I'm d/dy."
17 Q: How do you make 1 burn? A: Fire differentiation at a log.
18 An investment firm hires. In the last round, a mathematician, an engineer, and a business
guy are asked what starting salary expectations they had. Mathematician: "Would 30,000
be too much?" Engineer: "I think 60,000 would be OK." Business guy: "What about
300,000?" Officer: "A mathematician will do the same work for a tenth!" Business guy:
"I thought of 135,000 for me, 135,000 for you and 30,000 for the mathematician to do the
work."
19 Theorem. Every natural number is interesting. Proof. Assume there is an uninteresting
one. Then there is a smallest one. But as the smallest, it is interesting. Contradiction!
Lecture 37: Calculus and the world
Last Lecture
Calculus and the world
There would be more to tell. One could make an entire course
filled with applications of calculus. We have seen lectures on music,
statistics, economics and computer science. Here are more
ideas. It would be nice to have a few weeks to work them out. The
number of applications explodes even more when doing multivariable
calculus or linear algebra.
Calculus and Sports
Optimization and analysis of motion. Which path needs least energy?
Calculus of motion in various sports.
Calculus and Biology
Exponential growth and decay. Populations grow exponentially. Ra-
dioactive particles decay fast.
Calculus and Physics
Chaos theory. How far into the future can we predict a system? Take
a map and iterate it. Take a calculator and iterate.
Calculus and Art
We can use functions to generate new art forms.
Calculus and Cosmology
How did the universe evolve? The Lorentz contraction. Is it realistic
that we will ever meet another civilisation?
Calculus and Medicine
Catastrophes happen also in our body. An example is the story of
Period doubling in the heart.
Calculus and Finance
The mathematics of finance is complex and is done with stochastic
differential equations, chaos theory and power law heuristics.
Calculus and Romance
When is the optimal time to marry? If you choose too early, you don't
know what is out there. If you choose too late, you will have to compare
with too many previous cases.
Calculus and Friendship
Book by Strogatz:
Calculus and Psychology
We have seen the catastrophic change of perception. Psychology needs
a lot of statistics.
Calculus and Politics
Game theory and equilibria. The calculus of conflict.
Calculus and Philosophy
Is calculus consistent? Can calculus be built in different ways? What
is truth? Can we take limits?
Calculus and architecture
The topics are so closely linked that most calculus books feature architecture
on their book covers.
Calculus and History
The calculus wars between Newton and Leibniz.
Lecture 38: Review since second midterm
Related rates
Implicit differentiation and related rates are manifestations of the chain rule.
A) related rates: we have an equation F(x, y) = c relating two variables x, y which depend on
time t. Differentiate the equation with respect to t using the chain rule and solve for $y'$.
B) implicit differentiation: we have an equation F(x, y(x)) = c relating y with x. Differentiate
the equation with respect to x using the chain rule and solve for $y'$.
Examples:
A) $x^3 + y^3 = 1$, $x(t) = \sin(t)$, then $3x^2 x' + 3y^2 y' = 0$ so that
$y' = -x^2 x'/y^2 = -\sin^2(t) \cos(t)/(1 - \sin^3(t))^{2/3}$.
B) Same example but with x itself as the variable: $y' = -x^2/y^2 = -x^2/(1 - x^3)^{2/3}$.
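The related rates example can be checked numerically. Solving $x^3 + y^3 = 1$ for y gives $y = (1 - x^3)^{1/3}$, so $y^2 = (1 - x^3)^{2/3}$ appears in the denominator. The sketch below compares the formula for y' with a difference quotient; the function names are made up for this illustration.

```python
import math

def y(t):
    """y(t) solving x^3 + y^3 = 1 with x(t) = sin(t)."""
    return (1 - math.sin(t) ** 3) ** (1 / 3)

def y_prime_formula(t):
    """y' = -x^2 x' / y^2 from differentiating the relation with respect to t."""
    return -math.sin(t) ** 2 * math.cos(t) / (1 - math.sin(t) ** 3) ** (2 / 3)

def y_prime_numeric(t, h=1e-6):
    """Central difference quotient of y at t."""
    return (y(t + h) - y(t - h)) / (2 * h)

for t in (0.3, 0.5, 1.0):
    print(t, y_prime_formula(t), y_prime_numeric(t))   # the two columns agree
```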
Substitution
Substitution replaces $\int f(x) \, dx$ with $\int g(u) \, du$ with u = u(x), du = u'(x) dx. Special cases:
A) The antiderivative of f(x) = g(u(x)) u'(x) is G(u(x)) where G is the anti derivative of g.
B) $\int f(ax + b) \, dx = F(ax + b)/a$ where F is the anti derivative of f.
Examples:
A) $\int \sin(x^5) x^4 \, dx = \int \sin(u) \, du/5 = -\cos(u)/5 + C = -\cos(x^5)/5 + C$.
B) $\int \log(5x + 7) \, dx = \int \log(u) \, du/5 = (u \log(u) - u)/5 + C = ((5x + 7) \log(5x + 7) - (5x + 7))/5 + C$.
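Both substitution examples can be tested with a Riemann sum: the sum of f over [0, 1] should match F(1) - F(0) for the antiderivative F found above. This is a minimal sketch using midpoints; the names `riemann`, `F_A` and `F_B` are chosen here for illustration.

```python
import math

def riemann(f, a, b, n=20000):
    """Midpoint Riemann sum of f over [a, b] with n intervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def F_A(x):
    """Antiderivative of sin(x^5) x^4 obtained with u = x^5."""
    return -math.cos(x ** 5) / 5

def F_B(x):
    """Antiderivative of log(5x + 7) obtained with u = 5x + 7."""
    u = 5 * x + 7
    return (u * math.log(u) - u) / 5

err_A = abs(riemann(lambda x: math.sin(x ** 5) * x ** 4, 0, 1) - (F_A(1) - F_A(0)))
err_B = abs(riemann(lambda x: math.log(5 * x + 7), 0, 1) - (F_B(1) - F_B(0)))
print(err_A, err_B)   # both are tiny
```

Forgetting the factor 1/5 in example B would show up immediately as a large error.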
Integration by parts
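A) Direct:
$\int x \sin(x) \, dx = x (-\cos(x)) - \int 1 \cdot (-\cos(x)) \, dx = -x \cos(x) + \sin(x) + C$.
B) Tic-Tac-Toe: To integrate $x^2 \sin(x)$, differentiate down the left column, integrate down the
right column and alternate signs:
  derivative   integral    sign
  x^2          sin(x)
  2x           -cos(x)     +
  2            -sin(x)     -
  0            cos(x)      +
The result is $-x^2 \cos(x) + 2x \sin(x) + 2 \cos(x) + C$.
C) Merry go round: Example $I = \int \sin(x) e^x \, dx$. Use parts twice and solve for I.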
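For the merry go round example, the two integration by parts steps can be written out as follows (always differentiating the trig function and integrating $e^x$):

```latex
\begin{align*}
I &= \int \sin(x) e^x \, dx
   = \sin(x) e^x - \int \cos(x) e^x \, dx \\
  &= \sin(x) e^x - \Big( \cos(x) e^x + \int \sin(x) e^x \, dx \Big)
   = \sin(x) e^x - \cos(x) e^x - I , \\
2I &= e^x (\sin(x) - \cos(x)) , \qquad
I = \frac{e^x (\sin(x) - \cos(x))}{2} + C .
\end{align*}
```

The original integral reappears after two steps, which is why one can solve for I.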
Partial fractions
A) Make a common denominator on the right hand side of
$\frac{1}{(x-a)(x-b)} = \frac{A}{x-a} + \frac{B}{x-b} = \frac{A(x-b) + B(x-a)}{(x-a)(x-b)}$
and compare coefficients $1 = Ax - Ab + Bx - Ba$ to get $A + B = 0$, $-Ab - Ba = 1$ and solve for
A, B.
B) If $f(x) = p(x)/((x - a)(x - b))$ with different a, b, the coefficients A, B in
$\frac{p(x)}{(x-a)(x-b)} = \frac{A}{x-a} + \frac{B}{x-b}$
can be obtained from
$A = \lim_{x \to a} (x - a) f(x) = p(a)/(a - b)$, $B = \lim_{x \to b} (x - b) f(x) = p(b)/(b - a)$.
Examples:
A) $\int \frac{1}{(x+1)(x+2)} \, dx = \int \frac{A}{x+1} \, dx + \int \frac{B}{x+2} \, dx$. Find A, B by multiplying out and comparing coefficients
in the numerator.
B) Directly write down A = 1 and B = -1, by plugging in x = -1 after multiplying with x + 1,
or plugging in x = -2 after multiplying with x + 2.
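The limit formulas in B) amount to a two-line computation. Here is a small sketch; the name `cover_up` is chosen for this illustration, and p is assumed to be a constant or linear function so that the decomposition has exactly these two terms.

```python
def cover_up(p, a, b):
    """Coefficients A, B in p(x)/((x-a)(x-b)) = A/(x-a) + B/(x-b), for a != b."""
    A = p(a) / (a - b)
    B = p(b) / (b - a)
    return A, B

# Example: 1/((x+1)(x+2)) has a = -1 and b = -2
A, B = cover_up(lambda x: 1, -1, -2)
print(A, B)   # 1.0 -1.0

# Spot check of the decomposition at x = 3
x = 3
print(1 / ((x + 1) * (x + 2)), A / (x + 1) + B / (x + 2))
```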
Improper integrals
A) Integrate over infinite domain.
B) Integrate over singularity.
Examples:
A) $\int_0^\infty 1/(1 + x^2) \, dx = \arctan(\infty) - \arctan(0) = \pi/2 - 0 = \pi/2$.
B) $\int_0^1 1/x^{2/3} \, dx = (3/1) x^{1/3} |_0^1 = 3$.
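Written out with limits, both examples are ordinary integrals over a smaller interval followed by a limit:

```latex
\begin{align*}
\int_0^\infty \frac{1}{1+x^2} \, dx
  &= \lim_{R \to \infty} \int_0^R \frac{1}{1+x^2} \, dx
   = \lim_{R \to \infty} \arctan(R) = \frac{\pi}{2} , \\
\int_0^1 \frac{1}{x^{2/3}} \, dx
  &= \lim_{\epsilon \to 0^+} \int_\epsilon^1 x^{-2/3} \, dx
   = \lim_{\epsilon \to 0^+} 3 \left( 1 - \epsilon^{1/3} \right) = 3 .
\end{align*}
```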
Trig substitutions
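A) In places like $\sqrt{1 - x^2}$, replace x by sin(u) or cos(u).
B) Use $u = \tan(x/2)$, $dx = \frac{2 \, du}{1 + u^2}$, $\sin(x) = \frac{2u}{1 + u^2}$, $\cos(x) = \frac{1 - u^2}{1 + u^2}$ to replace trig functions by
rational functions.
Examples:
A) With x = sin(u): $\int_{-1}^{1} \sqrt{1 - x^2} \, dx = \int_{-\pi/2}^{\pi/2} \cos(u) \cos(u) \, du = \int_{-\pi/2}^{\pi/2} (1 + \cos(2u))/2 \, du = \frac{\pi}{2}$.
B) $\int \frac{1}{\sin(x)} \, dx = \int \frac{1 + u^2}{2u} \frac{2 \, du}{1 + u^2} = \int \frac{1}{u} \, du = \log(u) + C = \log(\tan(\frac{x}{2})) + C$.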
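Both trig substitution results can be confirmed numerically. The sketch below approximates the area under the half circle with a midpoint sum and checks that log(tan(x/2)) has derivative 1/sin(x) at a sample point. The helper names are made up for this illustration.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def slope(f, x, h=1e-6):
    """Central difference quotient of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def antiderivative(x):
    """Result of the substitution u = tan(x/2) for the integral of 1/sin(x)."""
    return math.log(math.tan(x / 2))

half_disc = midpoint(lambda x: math.sqrt(1 - x * x), -1, 1)
print(half_disc, math.pi / 2)                           # area of half the unit disc
print(slope(antiderivative, 1.0), 1 / math.sin(1.0))    # the two values agree
```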
Applications, keywords to know
Music: hull function, piano function
Economics: average cost, marginal cost and total cost. Strawberry theorem, t points
Computer science: curvature and Chaikin steps
Statistics: probability density function, cumulative distribution function, expectation, variance.
Geometry: area between two curves, volume of solid
Numerical methods: trapezoid rule, Simpson rule, Newton Method
Psychology: critical points and Catastrophes.
Physics: position, velocity and acceleration.
Gastronomy: turn table to prevent wobbling, bottle calibration.
Lecture 39: Checklists
Integrals to know well
sin(x)
cos(x)
tan(x)
log(x)
exp(x)
1/x
$x^n$
$1/\cos^2(x)$
$1/\sin^2(x)$
$1/(1 + x^2)$
$1/\sqrt{1 - x^2}$
Applications to know
Since there were a few questions on what has to be known about applications and definitions, here
is a list (it only covers the application parts):
Derivative: Limit of differences $D_h f = [f(x + h) - f(x)]/h$.
Integral: Limit of Riemann sums $S_h f = [f(0) + f(h) + \dots + f(kh)] h$.
Newton step: $T(x) = x - f(x)/f'(x)$.
Marginal cost: the derivative $F'$ of the total cost F.
Average cost: F/x where F is the total cost.
Velocity: Derivative of the position.
Acceleration: Derivative of the velocity.
Curvature: $f''(x)/(1 + f'(x)^2)^{3/2}$.
Probability distribution function: nonnegative function with total $\int f(x) \, dx = 1$.
Cumulative distribution function: anti-derivative of the probability distribution
function.
Expectation: $\int x f(x) \, dx$, where f is the probability density function.
Piano function: frequencies $f(k) = 440 \cdot 2^{k/12}$ for integer k.
Hull function: The interpolation of local maxima.
Catastrophe: A parameter c at which a local minimum disappears.
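As a reminder of how the Newton step from this list is used: iterating T(x) = x - f(x)/f'(x) from a reasonable starting point approaches a root of f. This is a minimal sketch; the function name `newton` and the fixed number of steps are choices made for this illustration.

```python
def newton(f, fprime, x, steps=10):
    """Apply the Newton step T(x) = x - f(x)/f'(x) a fixed number of times."""
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# Root of f(x) = x^2 - 2, starting at x = 1
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
print(root)   # close to the square root of 2
```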
Not on your fingertips
The following concepts have appeared but do not need to be learned by heart:
Entropy: $-\int f(x) \log(f(x)) \, dx$.
Moment of inertia: $\int x^2 f(x) \, dx$.
Monte Carlo integration: $S_n = \frac{b-a}{n} \sum_{k=1}^{n} f(x_k)$, where the $x_k$ are random in [a, b].
Simpson rule: $S_n = \frac{1}{6n} \sum_{k=1}^{n} [f(x_k) + 4 f(y_k) + f(x_{k+1})]$.
Chaikin step: $R_{2i} = \frac{3}{4} P_i + \frac{1}{4} P_{i+1}$, $R_{2i+1} = \frac{1}{4} P_i + \frac{3}{4} P_{i+1}$.
Cocktail party stuff: The story of exp.
Bottles: How to calibrate bottles. The calibration formula.
Sofia: The name of the calculus bot.
Wobbly chair theorem: One can turn a chair on any lawn to stop it from wobbling.
Warthog: The name of the warthog which appears in practice exam 2.
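The two numerical integration rules in this list can be sketched as follows. The assumptions made here: the interval [a, b] is split into n equal pieces of length h = (b - a)/n, the $y_k$ in the Simpson rule are taken to be the midpoints of these pieces, and the Monte Carlo average is scaled by (b - a). The function names are chosen for this illustration.

```python
import random

def monte_carlo(f, a, b, n, rng=None):
    """Average f at n random points in [a, b], scaled by the interval length."""
    rng = rng or random.Random()
    return (b - a) * sum(f(rng.uniform(a, b)) for _ in range(n)) / n

def simpson(f, a, b, n):
    """Simpson rule: weights 1, 4, 1 at left end, midpoint, right end of each piece."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0 = a + k * h
        total += f(x0) + 4 * f(x0 + h / 2) + f(x0 + h)
    return total * h / 6

square = lambda x: x * x
print(simpson(square, 0, 1, 4))           # 1/3 up to rounding
print(monte_carlo(square, 0, 1, 20000))   # roughly 1/3, changes from run to run
```

On [0, 1] the factor h/6 becomes 1/(6n), as in the formula above.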
