Lecture 9
• Idea: Instead of using the true Hessian ∇²f(x_k) in the Newton direction, use an approximation B_k (symmetric positive definite):

    f(x_k + p) \approx f(x_k) + \nabla f(x_k)^T p + \tfrac{1}{2}\, p^T B_k p

• B_k is constructed at each step from B_{k-1} and the gradient information ∇f(x_k) and ∇f(x_{k-1})
• It is common to develop the update formula directly for the inverse, A_k = B_k^{-1}
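For reference, minimizing this quadratic model over p (a standard step that the slide leaves implicit) gives the quasi-Newton search direction and iteration:

    p_k = -B_k^{-1} \nabla f(x_k) = -A_k \nabla f(x_k), \qquad x_{k+1} = x_k + \alpha_k p_k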
Quasi-Newton methods
• DFP (Davidon-Fletcher-Powell) method [1, 2]:

    A_{k+1} = \frac{\Delta x_k\, \Delta x_k^T}{\Delta x_k^T \Delta g_k}
              + \left( I - \frac{\Delta x_k\, \Delta g_k^T}{\Delta x_k^T \Delta g_k} \right) A_k
                \left( I - \frac{\Delta g_k\, \Delta x_k^T}{\Delta x_k^T \Delta g_k} \right)

  where Δx_k = x_{k+1} - x_k and Δg_k = ∇f(x_{k+1}) - ∇f(x_k).
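A minimal NumPy sketch of this inverse-Hessian update as written above (function and variable names are mine, not from the lecture):

```python
import numpy as np

def dfp_update(A, dx, dg):
    """One update of the inverse-Hessian approximation A = B^{-1} per the formula above.

    dx = x_{k+1} - x_k, dg = grad f(x_{k+1}) - grad f(x_k); assumes dx^T dg > 0
    so the updated matrix stays positive definite.
    """
    dx = dx.reshape(-1, 1)              # column vectors
    dg = dg.reshape(-1, 1)
    denom = float(dx.T @ dg)            # Δx^T Δg
    I = np.eye(len(dx))
    left = I - (dx @ dg.T) / denom      # I - Δx Δg^T / (Δx^T Δg)
    return (dx @ dx.T) / denom + left @ A @ left.T
```

Writing the bracketed factor once and reusing its transpose keeps the result symmetric whenever A is symmetric.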
Application to motivating example
DFP method

1. Initial guess x_0 = [4; 4]
2. s_1 = -∇f(x_0) = [0.65; 0.39]
3. α_1 = 13.16
4. x_1 = [12.58; 9.14]
5. A_1 = [10.12 5.31; 5.31 4.04]
6. s_2 = [0.09; -0.10]
7. α_2 = 7.00
9. A_2 = [9.11 -2.14; -2.14 4.37]
10. s_3 = [0.27; 0.05]
11. α_3 = 3.35
12. x_3 = [14.11; 8.58]

After 3 iterations: x* = [14.11; 8.58], f* = -7.36
Analytical solution: x* = [14.14; 8.58], f* = -7.36

[Figure: contour plot of the objective (contour labels near -3.16, -5.5, and -7.28) with the DFP iterates marked; horizontal axis 0 to 20.]
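A compact sketch of the loop behind steps 1-12, assuming an objective f and gradient grad are supplied by the user (placeholders here, since the motivating example's formula is not restated on this slide); the exact line search via scipy stands in for whatever α-selection the lecture uses:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def quasi_newton(f, grad, x0, update, n_iter=3):
    """Generic quasi-Newton loop matching steps 1-12: search direction s = -A grad f(x),
    line search for alpha, take the step, then update the inverse-Hessian approximation A."""
    x = np.asarray(x0, dtype=float)
    A = np.eye(len(x))                                     # A_0 = I, so s_1 = -grad f(x_0)
    for _ in range(n_iter):
        s = -A @ grad(x)
        alpha = minimize_scalar(lambda a: f(x + a * s)).x  # exact line search over alpha
        x_new = x + alpha * s
        A = update(A, x_new - x, grad(x_new) - grad(x))    # e.g. the dfp_update sketched earlier
        x = x_new
    return x

# Hypothetical usage: quasi_newton(f, grad, x0=[4.0, 4.0], update=dfp_update)
```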
Application to motivating example
BFGS method

1. Initial guess x_0 = [4; 4]
2. s_1 = -∇f(x_0) = [0.65; 0.39]
3. α_1 = 13.16
4. x_1 = [12.58; 9.14]
5. A_1 = [10.13 5.29; 5.29 4.06]
6. s_2 = [0.09; -0.105]
7. α_2 = 6.79
9. A_2 = [18.62 0.07; 0.07 4.89]
10. s_3 = [0.68; 0.15]
11. α_3 = 1.33
12. x_3 = [14.11; 8.62]

After 3 iterations: x* = [14.11; 8.62], f* = -7.36
Analytical solution: x* = [14.14; 8.58], f* = -7.36

[Figure: contour plot of the objective with the BFGS iterates marked; same contour levels and axes as the DFP figure.]
Selection of step length α_k
• Exact line search: minimize f(x_k + α p_k) over α using univariate optimization (a sketch follows this list)
  • Can be computationally expensive
• Inexact line search: select α_k so that it achieves a 'reasonable' reduction in f along the search direction p_k
• Trade-off: substantial decrease in f versus the computational effort involved
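A minimal sketch of exact line search by univariate minimization (the objective f, point x_k, direction p_k, and the alpha_max bound are placeholders of mine; scipy's bounded scalar minimizer is one convenient choice):

```python
from scipy.optimize import minimize_scalar

def exact_line_search(f, x_k, p_k, alpha_max=20.0):
    """Exact line search: minimize phi(alpha) = f(x_k + alpha * p_k) over [0, alpha_max]."""
    phi = lambda alpha: f(x_k + alpha * p_k)
    return minimize_scalar(phi, bounds=(0.0, alpha_max), method="bounded").x
```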
Conditions on step length
• Simple condition: require a reduction, f(x_k + α_k p_k) < f(x_k)
• Armijo condition (AC): if p_k is a descent direction, α_k should give a sufficient decrease in f, i.e.

    f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f(x_k)^T p_k
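A one-function sketch of this test (the c_1 default is a common textbook value, not a value from the slides):

```python
import numpy as np

def armijo_ok(f, grad, x_k, p_k, alpha, c1=1e-4):
    """Sufficient-decrease (Armijo) test for a trial step length alpha."""
    slope = float(np.dot(grad(x_k), p_k))   # grad f(x_k)^T p_k, negative for a descent direction
    return f(x_k + alpha * p_k) <= f(x_k) + c1 * alpha * slope
```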
Armijo condition – graphical interpretation
• f(x) = (x_1 - 2)^4 + (x_1 - 2)^2 x_2^2 + (x_2 + 1)^2,  x_k = [1; 1],  c_1 = 0.2
• p_k = -∇f(x_k) = [6; -6]

[Figure: f(x_k + α p_k) versus α with the sufficient-decrease line. The Armijo condition gives α ≤ 0.32; the curvature condition (slope c_2 ∇f(x_k)^T p_k) gives α ≥ 0.07, so the admissible range is 0.07 ≤ α ≤ 0.32. Exact line search: α* = 0.25.]
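A short numerical check of the quoted Armijo bound for this example (the grid resolution and the 0.5 upper limit are arbitrary choices of mine):

```python
import numpy as np

# Data from the slide: f, x_k = [1, 1], p_k = -grad f(x_k) = [6, -6], c1 = 0.2.
f = lambda x: (x[0] - 2)**4 + (x[0] - 2)**2 * x[1]**2 + (x[1] + 1)**2
x_k = np.array([1.0, 1.0])
p_k = np.array([6.0, -6.0])
c1 = 0.2
slope = -float(p_k @ p_k)             # grad f(x_k)^T p_k = -||p_k||^2 = -72, since p_k = -grad f(x_k)

alphas = np.linspace(0.0, 0.5, 5001)
ok = [a for a in alphas if f(x_k + a * p_k) <= f(x_k) + c1 * a * slope]
print(max(ok))                        # about 0.32, the bound quoted on the slide
```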
Wolfe conditions
• Armijo condition and curvature condition (a combined check is sketched below):

    f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f(x_k)^T p_k

    \nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f(x_k)^T p_k
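A small check of both conditions, assuming the usual convention 0 < c_1 < c_2 < 1 (the defaults below are common textbook choices, not values from the slides):

```python
import numpy as np

def wolfe_ok(f, grad, x_k, p_k, alpha, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for a trial step length alpha."""
    slope0 = float(np.dot(grad(x_k), p_k))            # grad f(x_k)^T p_k (should be < 0)
    x_new = x_k + alpha * p_k
    armijo    = f(x_new) <= f(x_k) + c1 * alpha * slope0
    curvature = float(np.dot(grad(x_new), p_k)) >= c2 * slope0
    return armijo and curvature
```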
Strong Wolfe conditions
• The Armijo condition together with the two-sided curvature condition

    \left| \nabla f(x_k + \alpha_k p_k)^T p_k \right| \le c_2 \left| \nabla f(x_k)^T p_k \right|

[Figure: for the same example, the admissible range under the strong Wolfe conditions tightens to 0.07 ≤ α ≤ 0.31.]
Why c_1 < c_2
• With c_1 = 0.5 and c_2 = 0.2 (so c_1 > c_2): the Armijo condition requires α ≤ 0.12 while the curvature condition requires α ≥ 0.14, so there is no feasible α.
Backtracking algorithm
• Based on the Armijo condition (a runnable sketch follows the steps below)
1. Choose ᾱ > 0 and ρ, c_1 ∈ (0, 1)
2. Set α = ᾱ
3. Is f(x_k + α p_k) ≤ f(x_k) + c_1 α ∇f(x_k)^T p_k satisfied?
   a. Yes: set α_k = α and stop
   b. No: set α = ρα and go back to step 3
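A minimal Python sketch of these steps (names and the default ρ, c_1 values are illustrative, not prescribed by the slides):

```python
import numpy as np

def backtracking(f, grad, x_k, p_k, alpha_bar=1.0, rho=0.5, c1=1e-4, max_iter=50):
    """Backtracking line search based on the Armijo condition (steps 1-3 above)."""
    slope = float(np.dot(grad(x_k), p_k))    # grad f(x_k)^T p_k, negative for a descent direction
    alpha = alpha_bar                        # step 2: start from the initial guess
    for _ in range(max_iter):
        if f(x_k + alpha * p_k) <= f(x_k) + c1 * alpha * slope:   # step 3: Armijo test
            return alpha                     # 3a: accept
        alpha *= rho                         # 3b: shrink and retry
    return alpha                             # give up after max_iter shrinks
```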
How to select ᾱ
• For Newton-type methods, ᾱ = 1
• Choice 1: assume that the first-order change in the function across two iterates is the same:

    \alpha_k \nabla f(x_k)^T p_k = \alpha_{k-1} \nabla f(x_{k-1})^T p_{k-1}

    \bar{\alpha} = \frac{2 \left( f(x_k) - f(x_{k-1}) \right)}{\nabla f(x_k)^T p_k}
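A small helper implementing the ᾱ formula above (function and argument names are mine):

```python
import numpy as np

def initial_step(f_k, f_prev, grad_k, p_k):
    """Initial trial step from the previous function decrease:
    alpha_bar = 2 (f(x_k) - f(x_{k-1})) / (grad f(x_k)^T p_k)."""
    return 2.0 * (f_k - f_prev) / float(np.dot(grad_k, p_k))
```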
Newton with inexact line search
Newton's method with inexact line search

1. Initial guess x_0 = [4; 4]
2. α_0 = 0.25
3. x_1 = [9.64; 9.09]
4. α_1 = 1.00
5. x_2 = [12.39; 8.71]
6. α_2 = 1.00
7. x_3 = [13.83; 8.61]
8. α_3 = 1.00
9. x_4 = [14.13; 8.58]

After 4 iterations: x* = [14.13; 8.58], f* = -7.36
Analytical solution: x* = [14.14; 8.58], f* = -7.36
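For completeness, a sketch of how a Newton step could be combined with the backtracking search above (the objective f, gradient, and Hessian are placeholders; ᾱ = 1 as recommended for Newton-type methods):

```python
import numpy as np

def newton_inexact(f, grad, hess, x0, n_iter=4, rho=0.5, c1=1e-4):
    """Newton direction with a backtracking (Armijo) line search, starting from alpha_bar = 1."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        p = -np.linalg.solve(hess(x), grad(x))           # Newton direction
        alpha, slope = 1.0, float(np.dot(grad(x), p))    # alpha_bar = 1
        while f(x + alpha * p) > f(x) + c1 * alpha * slope and alpha > 1e-12:
            alpha *= rho                                 # backtrack until Armijo holds
        x = x + alpha * p
    return x
```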