Unit 3. Calculus

Calculus
Tushar B. Kute,
http://tusharkute.com
Let’s go with an example...
• Sameer and Ajay are traveling in the car ... but the
speedometer is broken.
• Ajay: "Hey Sameer! How fast are we going now?"
• Sameer: "Wait a minute ... Well, in the last minute we
went 1.2 km, so we are going ... 1.2 km per minute ×
60 minutes in an hour = 72 km/h."
• Ajay: "No, Sameer! Not our average for the last
minute, or even the last second, I want to know our
speed RIGHT NOW."
• Sameer: "OK, let us measure it up here ... at this
road sign... NOW!"
• "OK, we were AT the sign for zero seconds, and the
distance was ... zero meters!"
• The speed is 0 m / 0 s = 0/0 = I Don't Know!
• "I can't calculate it, Ajay! I need to know some distance
over some time, and you are saying the time should be
zero? Can't be done."
• That is pretty amazing ... you'd think it is easy to
work out the speed of a car at any point in time,
but it isn't.
• Even the speedometer of a car just shows us an
average of how fast we were going for the last
(very short) amount of time.
How About Getting Real Close
• But our story is not finished yet!

• Sameer and Ajay get out of the car,
because they have arrived on
location. Sameer is about to do a
stunt:
• Sameer will do a jump off a 20 m
building.
• Ajay, as photographer, asks: "How
fast will you be falling after 1
second?"
• Sam uses this simplified formula to find the
distance fallen:
d = 5t²
d = distance fallen, in meters
t = time from jump, in seconds
• (Note: the formula is a simplified version of how far
things fall under gravity: d = ½gt², with g ≈ 10 m/s²)
• Example: at 1 second Sameer has fallen
d = 5t² = 5 × 1² = 5 m
• But how fast is that? Speed is distance over time:
Speed = distance / time
• So at 1 second:
Speed = 5 m / 1 s = 5 m/s
• "BUT", says Ajay, "again that is an average speed,

since you started the jump, ... I want to know the
speed at exactly 1 second, so I can set up the
camera properly."
• Well ... at exactly 1 second the speed is:
Speed = 0 m / 0 s = ???
• So again Sameer has a problem.
• Think about it ... how do we figure out a speed at
an exact instant in time?
• What is the distance? What is the time difference?
• They are both zero, giving us nothing to calculate
with!
• But Sam has an idea ... invent a time so short it
won't matter.
• Sam won't even give it a value, and will just call it
"Δt" (called "delta t").
• So Sam works out the difference in distance
between t and t+Δt
• At 1 second Sam has fallen
5t² = 5 × (1)² = 5 m
• At (1+Δt) seconds Sam has fallen
5t² = 5 × (1+Δt)² m
• We can expand (1+Δt)²:
(1+Δt)² = (1+Δt)(1+Δt)
= 1 + 2Δt + (Δt)²
• So at (1+Δt) seconds Sam has fallen
d = 5 × (1 + 2Δt + (Δt)²) m
d = 5 + 10Δt + 5(Δt)² m
• In Summary:
At 1 second: d = 5 m
At (1+Δt) seconds: d = 5 + 10Δt + 5(Δt)² m
• So between 1 second and
(1+Δt) seconds we get:
Change in d = 5 + 10Δt + 5(Δt)² − 5 m
= 10Δt + 5(Δt)² m
• Change in distance over time:
Speed = (10Δt + 5(Δt)²) m / Δt s
= 10 + 5Δt m/s
• So the speed is 10 + 5Δt m/s, and Sam thinks about
that Δt value ... he wants Δt to be so small it won't
matter ... so he imagines it shrinking towards zero
and he gets:
Speed = 10 m/s
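Sam's shrinking Δt can also be tried numerically. The following is a minimal Python sketch, not part of the original slides; the function names `distance_fallen` and `average_speed` are made up for illustration. It measures the average speed between 1 second and 1 + Δt seconds for smaller and smaller Δt.

```python
def distance_fallen(t):
    """Distance fallen in metres after t seconds, using d = 5 t^2."""
    return 5 * t ** 2


def average_speed(t, dt):
    """Average speed in m/s between time t and time t + dt."""
    return (distance_fallen(t + dt) - distance_fallen(t)) / dt


if __name__ == "__main__":
    # Each value follows 10 + 5*dt, so it approaches 10 m/s as dt shrinks.
    for dt in (1.0, 0.1, 0.01, 0.001):
        print(f"dt = {dt:<6} average speed = {average_speed(1.0, dt):.4f} m/s")
```

The printed values move towards 10 m/s, which matches the 10 + 5Δt expression: the smaller Δt gets, the less the 5Δt term matters.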
Finally
• Wow! Sam got an answer!
• Sameer: "I will be falling at exactly 10 m/s"

• Ajay: "I thought you said you couldn't calculate it?"
• Sameer: "That was before I used Calculus!"
What is calculus?
• Calculus is a branch of mathematics that involves the
study of rates of change.
• Before calculus was invented, mathematics was largely
static: it was best suited to describing objects that
were perfectly still.
• But the universe is constantly moving and changing.
No objects—from the stars in space to subatomic
particles or cells in the body—are always at rest.
Indeed, just about everything in the universe is
constantly moving.
• Calculus helped to determine how particles, stars, and
matter actually move and change in real time.
What is calculus?
• Calculus is the study of rates of change.

• Gottfried Leibniz and Isaac Newton, 17th-century
mathematicians, both invented calculus
independently. Newton invented it first, but
Leibniz created the notations that mathematicians
use today.
• There are two types of calculus: Differential
calculus determines the rate of change of a
quantity, while integral calculus finds the quantity
where the rate of change is known.
What is calculus?
• The word Calculus comes from Latin meaning
"small stone".
– Differential Calculus cuts something into small
pieces to find how it changes.
– Integral Calculus joins (integrates) the small
pieces together to find how much there is.
• Differential Calculus and Integral Calculus are
like inverses of each other, similar to how
multiplication and division are inverses.
Derivative
• Let’s say that we increase x by a small amount, which
we denote dx. And assume that this change in x
increases y by the quantity dy.
• We can visualise this as a right-angled triangle in which
x forms the base and y the height.
Derivative
• If we add dx to x, it will cause an increase in y by dy.
This corresponding change can be described as the
ratio dy/dx.
• The expression dy describes the rise in y, whereas dx
describes the run (the change in x), which is why
this ratio is known as the “rise over run”. In geometric
terms, “rise over run” is the slope of our hypotenuse.
Derivative
• Well, in this case, our hypotenuse is just a straight line.
The ratio between dy and dx is the same at every
point. For a line through the origin, it is also equal
to the ratio of y to x.
• But what do we do if instead of the hypotenuse in the
previous image, we need to find the slope of a
non-linear graph like this one?
Derivative
• Now the slope, and therefore our “rise over run”,
changes constantly as you move along the curve. In
other words, the slope itself becomes a non-linear
graph (plotted in blue).
Derivative
• Following the origin, the green graph slowly picks up

steepness, which also results in a gradual increase in
steepness in the blue graph.
• In the real world, the green graph could model the
deceleration of a car until it comes to a halt at a traffic
light. After the traffic light has turned green, it slowly
starts to accelerate again.
• So the green graph captures the actual change in
speed while the blue graph records the rate of change
in speed at every point in time.
What is a Derivative?
• It should be obvious by now that we cannot capture
the slope of the green graph in a simple ratio of y over
x.
• To determine the slope of the green graph, we would
have to create an infinite number of infinitely small
right-angled triangles at every point along the line.
• Now the ratio you obtain from each of these infinitely
small triangles is known as the differential quotient,
and it is expressed as
dy/dx
What is a Derivative?
• Here d stands for delta, an infinitely small change.
• The ratio dy/dx is known as the differential quotient, and
the blue graph that captures the rate of change at every
point on the green graph is known as the derivative of the
green graph.
• Of course, x and y have to be somehow related to be
able to obtain this ratio. We can therefore express the
derivative as a function itself.
How do we get this function?
• Remember in a graph y is usually expressed as a
function of x. Our green graph can be expressed as
y = x³
• More formally we would say:
f(x) = x³
• So f(x) is basically a different expression for y in this
case.
• You could create a right-angled triangle along the
graph of f(x), from any point (x, f(x)) to another point
(x + dx, f(x + dx)).
Rise over run formula
• Accordingly, you would arrive at the slope of this
triangle by the following calculation.
slope = (f(x + dx) − f(x)) / dx
• Since the slope is changing continuously in a
non-linear graph, we need to make the triangle as small
as possible, in fact infinitely small, to arrive at the
correct slope at that point.
• As we cannot really calculate the slope of an
infinitely small triangle, we can only approximate it
by getting dx as close to zero as possible.
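The rise-over-run idea can be checked numerically for f(x) = x³. The Python sketch below is only an illustration (the helper name `slope` and the sample point x = 2 are assumptions, not from the slides). It computes the slope of the triangle between (x, f(x)) and (x + dx, f(x + dx)) for a shrinking run dx and compares it with the known derivative 3x².

```python
def f(x):
    """The green graph from the slides: f(x) = x^3."""
    return x ** 3


def slope(func, x, dx):
    """Rise over run of the triangle from (x, func(x)) to (x + dx, func(x + dx))."""
    return (func(x + dx) - func(x)) / dx


if __name__ == "__main__":
    x = 2.0
    exact = 3 * x ** 2  # derivative of x^3 evaluated at x = 2
    for dx in (1.0, 0.1, 0.001, 1e-6):
        estimate = slope(f, x, dx)
        print(f"dx = {dx:<8} slope = {estimate:.6f} error = {abs(estimate - exact):.6f}")
```

As dx shrinks, the estimated slope settles towards 12, the value of 3x² at x = 2.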
Partial Derivative
• Many real-life problems in areas such as
physics, mechanical engineering, data
science, etc., can be described as
functions of more than one
independent variable.
• How can you know how the change in
one particular variable affects the
system described by your function?
• This is where partial derivatives come
in.
• For instance, the volume of a cylinder
can be described as a function of its
height h and its radius r.
V = πr²h
Partial Derivative
• Suppose you need to describe how the volume
changes in response to varying just the height
while keeping the radius constant.
• To achieve this, you would differentiate the
function describing the volume just with respect to
h, treating everything else, including r, as a
constant. This gives you the following expression.
V′ = πr²
Partial Derivative
• But wait, isn’t this the formula for the area of a

circle? Indeed, it is.
• That makes perfect sense. If you increase h by an
infinitesimally small amount, it is like stacking a
circle on top of the cylinder.
Partial Derivative
• The area of the circle is equivalent to the partial derivative
of V with respect to h. Formally we would say:
∂V/∂h = πr²
• Note that ∂ is the partial derivative symbol. You use it
instead of d when you are differentiating a multivariate
function with respect to one variable.
• If we wanted to find out how V changes if we only increased
or decreased r, we would take the partial derivative of V
with respect to r.
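Both partial derivatives of the cylinder volume can be estimated by nudging one variable while holding the other fixed. The Python sketch below is a rough illustration: the helper `partial`, the step size and the sample values r = 2, h = 5 are assumptions made for this example. For comparison, differentiating V = πr²h by hand gives ∂V/∂h = πr² and ∂V/∂r = 2πrh.

```python
import math


def volume(r, h):
    """Volume of a cylinder with radius r and height h."""
    return math.pi * r ** 2 * h


def partial(func, args, index, step=1e-6):
    """Central-difference estimate of the partial derivative of func
    with respect to args[index], holding all other arguments fixed."""
    up = list(args)
    down = list(args)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)


r, h = 2.0, 5.0
dV_dh = partial(volume, (r, h), index=1)  # vary only the height
dV_dr = partial(volume, (r, h), index=0)  # vary only the radius

if __name__ == "__main__":
    print(f"dV/dh estimate: {dV_dh:.5f}  (pi * r^2     = {math.pi * r ** 2:.5f})")
    print(f"dV/dr estimate: {dV_dr:.5f}  (2 * pi * r * h = {2 * math.pi * r * h:.5f})")
```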
What we know?
• The derivative is nothing but the slope of a
function at a particular point. Take the
multivariate function
f(x,y) = x² + 3y
• The partial derivative with respect to one variable x
will give us the slope along the x dimension.
What we know?
Vector
• Since x² is a quadratic term, ∂f/∂x = 2x and the slope
becomes steeper as we move away from the
origin.
• Taking the partial derivative with respect to y
gives us the slope along the y dimension.
• 3y is a linear term, so ∂f/∂y = 3 and the slope
along the y axis remains constant.
Jacobian Row Vector
• What about the total derivative with respect to x and y?
Since we are differentiating with respect to x and y, it is the
slope along both dimensions. We can express this as a row
vector.
J = [∂f/∂x ∂f/∂y] = [2x 3]
• This is known as the Jacobian matrix. In this simple case with
a scalar-valued function, the Jacobian is a vector of partial
derivatives with respect to the variables of that function.
• The length of the vector is equivalent to the number of
independent variables in the function.
Jacobian Matrix
• The Jacobian of a scalar-valued function is a vector. We
can expand the definition of the Jacobian to vector-valued
functions.
• Our function vector has m entries. The resulting Jacobian
will be an m×n matrix, where n is the number of
independent variables.
• Row i of the matrix contains the partial derivatives
of entry i of the function
vector.
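To make the m×n layout concrete, here is a small Python sketch that estimates a Jacobian by finite differences. The vector-valued function `F` is a hypothetical example chosen for illustration (its first entry reuses x² + 3y from the earlier slide), and the helper name `jacobian` is likewise an assumption.

```python
def F(x, y):
    """Hypothetical vector-valued function with m = 2 entries and n = 2 variables."""
    return [x ** 2 + 3 * y, x * y]


def jacobian(func, point, step=1e-6):
    """Estimate the m x n Jacobian of func at point with central differences.
    Row i holds the partial derivatives of entry i; column j belongs to variable j."""
    m = len(func(*point))
    n = len(point)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up = list(point)
        down = list(point)
        up[j] += step
        down[j] -= step
        f_up = func(*up)
        f_down = func(*down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * step)
    return J


J = jacobian(F, (2.0, 1.0))

if __name__ == "__main__":
    # Analytically the Jacobian is [[2x, 3], [y, x]], i.e. [[4, 3], [1, 2]] at (2, 1).
    for row in J:
        print([round(value, 4) for value in row])
```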
What is the use of the Jacobian?
• We can use the Jacobian matrix to describe how a
function transforms one vector space into another.
• Furthermore, if the matrix is square, we can
obtain the determinant.
• The value of the Jacobian determinant gives us
the factor by which the area or volume
described by our function changes when we
perform the transformation.
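One familiar case of the determinant as a scaling factor is the change from polar to Cartesian coordinates, x = r·cos(θ), y = r·sin(θ). This example is not from the slides and is only a sketch; the function names are made up. The Jacobian of this map has determinant r, which is why an area element dx dy becomes r dr dθ.

```python
import math


def polar_jacobian(r, theta):
    """Jacobian of the map (r, theta) -> (r*cos(theta), r*sin(theta))."""
    return [
        [math.cos(theta), -r * math.sin(theta)],
        [math.sin(theta), r * math.cos(theta)],
    ]


def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


if __name__ == "__main__":
    theta = math.pi / 3
    for r in (0.5, 1.0, 2.0):
        # The determinant equals r regardless of theta.
        print(f"r = {r}: det(J) = {det2(polar_jacobian(r, theta)):.6f}")
```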
Hessian Matrix
• Suppose you are walking around in the hills at
night and you would like to find the highest peak.
• You can’t see further than a few meters because
it is dark. If you followed the direction of the
steepest slope, you’d eventually end up on a
saddle or on a hill, but it might not be the highest
point.
• The Hessian gives you a way to determine
whether the point you are standing on is, in fact,
the top of a hill (a local maximum) rather than a saddle.
Hessian Matrix
• The Hessian matrix is a matrix of the second-order
partial derivatives of a function.
• The easiest way to get to a Hessian is to first calculate
the Jacobian and take the derivative of each entry of
the Jacobian with respect to each variable.
• This implies that if you take a function of n variables,
the Jacobian will be a row vector of n entries. The
Hessian will be an n×n matrix.
Hessian Matrix
• If you have a vector-valued function with n
variables and m vector entries, the Jacobian will
be m×n, while the Hessian will be m×n×n.
• Let’s do an example to clarify this, starting with the
following function.
f(x,y) = 3x² + y²
Hessian Matrix
• We first calculate the Jacobian.
J = [6x 2y]
• Now we calculate the terms of the Hessian.
∂²f/∂x² = 6, ∂²f/∂x∂y = 0
∂²f/∂y∂x = 0, ∂²f/∂y² = 2
H = [ 6 0 ]
    [ 0 2 ]
Hessian Matrix
• Our Hessian is a diagonal matrix of constants.
That makes sense since we had to differentiate
twice and therefore got rid of all the
exponents.
• We can easily calculate the determinant of the
Hessian.
det(H) = 6×2 − 0×0 = 12
Hessian Matrix
• What can we infer from this information at a point
where the Jacobian is zero?
– If the first term in the upper left corner of our
Hessian matrix is a positive number, we are
dealing with a minimum.
– If the first term in the upper left corner of our
Hessian matrix is negative, we are dealing with a
maximum.
– In both cases, the determinant has to be positive.
– If the determinant is negative, the matrix is
indefinite. In this case, we have arrived at a
saddle point.
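The rules in this list can be turned into a short check for functions of two variables. The Python sketch below is an illustration under stated assumptions: the helpers `hessian` and `classify` are hypothetical names, the Hessian is estimated with finite differences rather than derived by hand, and the comparison function `saddle` (x² − y²) is an example added here, not one from the slides.

```python
def f(x, y):
    """The function from the example above: f(x, y) = 3x^2 + y^2."""
    return 3 * x ** 2 + y ** 2


def saddle(x, y):
    """Hypothetical comparison function with a saddle at the origin."""
    return x ** 2 - y ** 2


def hessian(func, x, y, step=1e-4):
    """Finite-difference estimate of the 2x2 Hessian of func at (x, y)."""
    fxx = (func(x + step, y) - 2 * func(x, y) + func(x - step, y)) / step ** 2
    fyy = (func(x, y + step) - 2 * func(x, y) + func(x, y - step)) / step ** 2
    fxy = (func(x + step, y + step) - func(x + step, y - step)
           - func(x - step, y + step) + func(x - step, y - step)) / (4 * step ** 2)
    return [[fxx, fxy], [fxy, fyy]]


def classify(H, tol=1e-6):
    """Apply the upper-left-entry and determinant test to a 2x2 Hessian.
    Only meaningful at a point where the first derivatives are zero."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    if det > tol:
        return "minimum" if H[0][0] > 0 else "maximum"
    if det < -tol:
        return "saddle point"
    return "inconclusive"


if __name__ == "__main__":
    print("3x^2 + y^2 at (0, 0):", classify(hessian(f, 0.0, 0.0)))
    print("x^2 - y^2 at (0, 0):", classify(hessian(saddle, 0.0, 0.0)))
```

A determinant of zero is reported as inconclusive, since the test in the list says nothing about that case.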
Multivariable Chain Rule
• Remember that the chain rule helps us
differentiate nested functions.
• If we have a function f of multiple variables x
and y, which are themselves functions of
another variable r, we can calculate the total
derivative of f with respect to r.
df/dr = (∂f/∂x)(dx/dr) + (∂f/∂y)(dy/dr)
• As we’ve seen when constructing the Jacobian
matrix, we can treat x(r) and y(r) as separate
functions and write them together in a
vector.
v⃗ = [x(r), y(r)]
• Note that I am writing v with this tiny arrow on
top to distinguish it from the other non-vector
variables.
• Accordingly, we can also write the derivatives as vectors.
∂f/∂v⃗ = [∂f/∂x, ∂f/∂y]
dv⃗/dr = [dx/dr, dy/dr]
• Now we can write the total derivative of f with respect to
the nested variable r as a dot product of the two vectors.
df/dr = ∂f/∂v⃗ · dv⃗/dr
Example:
• Let’s first calculate the partial derivatives of f with
respect to x, y, and the derivatives for x, y with
respect to r.
Example:
• Let’s write them in vector format as a dot product and
multiply out.
• Alternatively, we can eliminate x and y from the start
by substituting the appropriate terms of r.
• Now we can simply differentiate, which gives us the
following.
Summary
• It resolves to the same term as when we
applied the chain rule.
• In this simple case, it is probably faster to use
the second method.
• But once you are dealing with many nested
variables, the chain rule is a much better and
more scalable approach.
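The agreement between the two methods can be checked on a small case. The functions in this Python sketch are hypothetical and chosen only for illustration, not taken from the slides: f(x, y) = x² + y² with x(r) = 2r and y(r) = r². The chain rule takes the dot product of [∂f/∂x, ∂f/∂y] with [dx/dr, dy/dr], while substitution first rewrites f as 4r² + r⁴ and then differentiates to 8r + 4r³.

```python
def x_of(r):
    """Assumed inner function x(r) = 2r."""
    return 2 * r


def y_of(r):
    """Assumed inner function y(r) = r^2."""
    return r ** 2


def grad_f(x, y):
    """Partial derivatives of the assumed f(x, y) = x^2 + y^2."""
    return (2 * x, 2 * y)


def dv_dr(r):
    """Derivatives of the inner functions: dx/dr = 2, dy/dr = 2r."""
    return (2.0, 2 * r)


def chain_rule(r):
    """Total derivative df/dr as the dot product of the two vectors."""
    gradient = grad_f(x_of(r), y_of(r))
    inner = dv_dr(r)
    return sum(g * d for g, d in zip(gradient, inner))


def by_substitution(r):
    """Derivative of f(r) = 4r^2 + r^4 obtained by substituting first."""
    return 8 * r + 4 * r ** 3


if __name__ == "__main__":
    for r in (0.0, 1.0, 1.5):
        print(f"r = {r}: chain rule = {chain_rule(r)}, substitution = {by_substitution(r)}")
```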
Thank you
This presentation is created using LibreOffice Impress 5.1.6.2, can be used freely as per GNU General Public License
/mITuSkillologies @mitu_group /company/mitu-skillologies MITUSkillologies
Web Resources
https://mitu.co.in
http://tusharkute.com
