Optimization in Practice

OPTIMIZATION
IN PRACTICE
• We say that 𝑓 𝑥, 𝑦 has a local
maximum/minimum at (a,b) if the
Maxima value of 𝑓 at (a,b) is at least high/low

as the value of 𝑓 at all nearby points in
the domain.
and • We say that 𝑓 𝑥, 𝑦 has a global

Minima
value of 𝑓 at (a,b) is at least high/low
as the value of 𝑓 at all other points in the
domain.
• We say that 𝑓 𝑥, 𝑦 has a local
as the value of 𝑓 at all nearby points in
Maxima
the domain.
and
• We say that 𝑓 𝑥, 𝑦 has a global
as the value of 𝑓 at all other points in the
Minima
domain.
• Clearly, being a global extremum is more

demanding than being a local extremum.
Hence, a global extremum must also be a
local extremum.
Extreme Value
Theorem
• If f is a continuous function and D is a
domain that’s both closed (containing all
boundary points) and bounded (not going
on forever in any direction), then f must
attain a global max and min on D.
Extreme Value
Theorem
• If f is a continuous function and D is a
domain that’s both closed (containing all
boundary points) and bounded (not going
on forever in any direction), then f must
attain a global max and min on D.
• Note that violating the theorem does not

necessarily mean that there definitely is no
global maximum or minimum. Examples:
𝑓 𝑥, 𝑦 = 𝑐𝑜𝑛𝑠𝑡𝑎𝑛𝑡; 𝑓 𝑥, 𝑦 = sin(𝑥 2 +
𝑦2)
Identify Global
Extrema on the
Domain
• Say, we have a single-variable
function 𝑓 𝑥 defined on
2,3 ∪ [4,5].
Identify Global
Extrema on the • Say, we have a single-variable function
𝑓 𝑥 defined on 2,3 ∪ [4,5].
Domain • We should consider the possibility that

global extrema are attained on (2,3) ∪
(4,5). Hence, we look for (critical)
points where 𝑓 ′ = 0 and compute the
function value at those points. Of
course, not every critical point is an
extremum, but that’s okay; it's better
to have extras than to miss one. Also, if
the points fall outside (2,3) ∪ (4,5), we
simply disregard them.
Identify Global • Say, we have a single-variable function
𝑓 𝑥 defined on 2,3 ∪ [4,5].
Extrema on the • We should consider the possibility that

global extrema are attained on (2,3) ∪
Domain
(4,5). Hence, we look for (critical) points
where 𝑓 ′ = 0 and compute the function
value at those points. Of course, not
every critical point is an extremum, but
that’s okay; it's better to have extras than
to miss one. Also, if the points fall outside
(2,3) ∪ (4,5), we simply disregard them.
• We should also consider the possibility

that the global extrema are attained at
the endpoints. Therefore, we need to
compute the function values at 𝑥 = 2, 3, 4,
and 5, and compare them with the values
obtained in the previous step. compute
the function values at the endpoints 𝑥 = 2,
3, 4, and 5, and compare them with the
values obtained in the previous step. If the
domain is unbounded, we take the limit
instead (or in addition).
Identify Global
Extrema on the • The domain of a function of two variables, say
Domain
f(x,y), is often a two-dimensional region, and if
the region has an end, it is usually defined by an
equation that represents a curve.
• We can make an analogy and identify critical

points where ∇𝑓 = 0 to consider the possibility
of global extrema within the interior of the
region. Then, we compute the function values at
those points.
Identify Global • The domain of a function of two variables, say
f(x,y), is often a two-dimensional region, and if the
Extrema on the
region has an end, it is usually defined by an
equation that represents a curve.
Domain • We can make an analogy and identify critical points

where ∇𝑓 = 0 to consider the possibility of global
extrema within the interior of the region. Then,
we compute the function values at those points.
• However, since the boundary curve comprises

infinitely many points, evaluating the function at all
these points is impractical. Instead, we treat the
curve equation as a constraint, search for
constrained critical points, and determine global
extrema on the boundary. Note that if the
boundary curve has endpoints, we must find the
function values at those points in addition to
searching for constrained critical points. If the
region is unbounded in any direction, we need to
consider the limit of the function.
Constrained Critical Points and Lagrangian
• Geometric Intuition: Let's assume the constraint equation is 𝑔 𝑥, 𝑦 = 0. The curve representing this
equation is a level set of 𝑔(𝑥, 𝑦). At extrema points, the level set of our target function 𝑓(𝑥, 𝑦) is tangent to
the level set 𝑔 𝑥, 𝑦 = 0. Therefore, ∇𝑓 = 𝜆∇𝑔. Consequently, seeking points where ∇𝑓 = 𝜆∇𝑔 will guide us
to the local extrema on the constraint curve; these points are called constrained critical points.
• Algebraic Derivation (Lagrange Function): Note that 𝑔 𝑥, 𝑦 = 0. Therefore, optimizing the Lagrange
function 𝐿 𝑥, 𝑦, λ = 𝑓 𝑥, 𝑦 + λ𝑔 𝑥, 𝑦 is equivalent to optimizing 𝑓(𝑥, 𝑦). Seeking critical points of
𝐿 𝑥, 𝑦, λ leads to ∇𝑓 = 𝜆∇𝑔 and 𝑔 𝑥, 𝑦 = 0 , which are the same equations used to find the constrained
critical points for 𝑓(𝑥, 𝑦) subject to 𝑔 𝑥, 𝑦 = 0.
Constrained Critical Points and Lagrangian
• Geometric Intuition: Let's assume the constraint equation is 𝑔 𝑥, 𝑦 = 0. The curve representing this equation is a
level set of 𝑔(𝑥, 𝑦). At extrema points, the level set of our target function 𝑓(𝑥, 𝑦) is tangent to the level set
𝑔 𝑥, 𝑦 = 0. Therefore, ∇𝑓 = 𝜆∇𝑔. Consequently, seeking points where ∇𝑓 = 𝜆∇𝑔 will guide us to the local
extrema on the constraint curve; these points are called constrained critical points.
• Algebraic Derivation (Lagrange Function): Note that 𝑔 𝑥, 𝑦 = 0. Therefore, optimizing the Lagrange function
𝐿 𝑥, 𝑦, λ = 𝑓 𝑥, 𝑦 + λ𝑔 𝑥, 𝑦 is equivalent to optimizing 𝑓(𝑥, 𝑦). Seeking critical points of 𝐿 𝑥, 𝑦, λ leads to
∇𝑓 = 𝜆∇𝑔 and 𝑔 𝑥, 𝑦 = 0 , which are the same equations used to find the constrained critical points for
𝑓(𝑥, 𝑦) subject to 𝑔 𝑥, 𝑦 = 0.
• Note that the Lagrangian changes a constrained optimization problem in 2 variables to an unconstrained problem
in 3 variables.
• Check whether the constraint curve has ends. You may still need to examine the end or limit behavior of 𝑓(𝑥, 𝑦).
Worksheet
#1-(a)
• GeoGebra:
https://www.geogebra.org/m/f2bzkjqz
Worksheet
#1-(a)
• GeoGebra:
https://www.geogebra.org/m/f2bzkjqz
Worksheet #1-part (b) and part (c)
• Part (b) • Part (c)
Worksheet • GeoGebra:
https://www.geogebra.org/m/myug2afc
#2
Worksheet #2
• GeoGebra:
https://www.geogebra.org/m/myug2afc
Methodology
● Gathered data on confirmed COVID-19 deaths by day from different states and provinces around
the world
(Hubei)
Methodology
● Considered 2 possible models:

Methodology
● The second model was a better fit:
● So, they fit this model to the U.S. data (finding values of p, α, β)
Actual US curve
Newton’s Method (Newton–Raphson method)
Finding critical points involves solving a set of
equations. However, if the equations are
complicated, solving them precisely can be very
difficult. In such cases, people often resort to
using Newton's method to approximate the
solution. In this illustration, we demonstrate
Newton's method using a single-variable
function, but it's important to note that this
method can be extended to a set of
multivariable equations. Also, please be aware
that the function denoted as f(x) in this context
is an arbitrary function and not our target
function.
𝑓 𝑥𝑛
𝑥𝑛+1 = 𝑥𝑛 −
𝑓 ′ 𝑥𝑛
Newton’s Method (Newton–Raphson method)
𝑓 𝑥𝑛
𝑥𝑛+1 = 𝑥𝑛 −
𝑓′ 𝑥𝑛
● What we do here is update 𝑥𝑛 with the x-intercept

of the tangent line to the curve at (𝑥𝑛 , 𝑓(𝑥𝑛 )), and
repeat the steps until we obtain a sequence of x-
intercepts (of the tangent lines) that are close
enough.
● Newton's method is uncommon in machine

learning due to its requirement for iteratively
solving a large system of linear equations.
Instead, gradient descent, commonly used, entails
adjusting input points in the direction of the
gradient or its negation to locate a local maximum.
Note that this turkey is not part of Newton’s method.
Happy Thanksgiving!

Optimization in Practice

Uploaded by

Copyright:

Available Formats

Optimization in Practice

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Optimization in Practice

Uploaded by

Copyright:

Available Formats

OPTIMIZATION

Maxima value of 𝑓 at (a,b) is at least high/low

and • We say that 𝑓 𝑥, 𝑦 has a global

• Clearly, being a global extremum is more

• Note that violating the theorem does not

Domain • We should consider the possibility that

Extrema on the • We should consider the possibility that

• We should also consider the possibility

• We can make an analogy and identify critical

Domain • We can make an analogy and identify critical points

• However, since the boundary curve comprises

● Considered 2 possible models:

● The second model was a better fit:

● What we do here is update 𝑥𝑛 with the x-intercept

● Newton's method is uncommon in machine

You might also like