E1 251 Linear and Nonlinear Optimization
11.1. Standard form of a linear program

Minimize cT x
subject to Ax = b and x ≥ 0
Transforming problems with inequality: case I
Minimize c1 x1 + c2 x2 + ⋯ + cn xn
subject to
a11 x1 + a12 x2 + ⋯ + a1n xn ≤ b1
a21 x1 + a22 x2 + ⋯ + a2n xn ≤ b2
⋮
am1 x1 + am2 x2 + ⋯ + amn xn ≤ bm
x1 ≥ 0, x2 ≥ 0, …, xn ≥ 0

The above problem in matrix form:
Minimize cT x subject to Ax ≤ b and x ≥ 0,
where c = [c1 ⋯ cn]T, x = [x1 ⋯ xn]T, b = [b1 ⋯ bm]T, {A}i,j = aij.
Introducing slack variables y1, …, ym gives the equivalent problem:

Minimize c1 x1 + c2 x2 + ⋯ + cn xn
subject to a11 x1 + a12 x2 + ⋯ + a1n xn + y1 = b1
a21 x1 + a22 x2 + ⋯ + a2n xn + y2 = b2
⋮
am1 x1 + am2 x2 + ⋯ + amn xn + ym = bm
x1 ≥ 0, x2 ≥ 0, …, xn ≥ 0
y1 ≥ 0, y2 ≥ 0, …, ym ≥ 0

The above problem in matrix form:
Minimize c′T x′ subject to A′x′ = b and x′ ≥ 0, where
c′ = [c1 ⋯ cn 0 ⋯ 0]T, x′ = [x1 ⋯ xn y1 ⋯ ym]T, A′ = [A I].
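The slack-variable construction can be checked numerically. The sketch below (all data made up, numpy assumed) builds A′ = [A I] and c′ and verifies that any feasible point of the inequality problem extends, via its slacks, to a feasible point of the standard-form problem with the same cost:

```python
import numpy as np

# Hypothetical data for the inequality-form LP: minimize c^T x s.t. Ax <= b, x >= 0.
A = np.array([[1.0, 2.0],
              [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([1.0, 1.0])
m, n = A.shape

# Standard-form data: A' = [A I], c' = [c 0], x' = [x y].
A_std = np.hstack([A, np.eye(m)])
c_std = np.concatenate([c, np.zeros(m)])

# Any x feasible for Ax <= b, x >= 0 extends to x' = [x, b - Ax] with A'x' = b.
x = np.array([1.0, 1.0])          # satisfies Ax <= b for this data
y = b - A @ x                     # slack values, nonnegative by feasibility
x_std = np.concatenate([x, y])

assert np.allclose(A_std @ x_std, b)
assert np.all(x_std >= 0)
assert np.isclose(c_std @ x_std, c @ x)   # slacks carry zero cost
```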
Transforming problems with inequality: case II
Minimize c1 x1 + c2 x2 + ⋯ + cn xn
subject to
a11 x1 + a12 x2 + ⋯ + a1n xn ≥ b1
a21 x1 + a22 x2 + ⋯ + a2n xn ≥ b2
⋮
am1 x1 + am2 x2 + ⋯ + amn xn ≥ bm
x1 ≥ 0, x2 ≥ 0, …, xn ≥ 0
Introducing surplus variables y1, …, ym gives the equivalent problem:

Minimize c1 x1 + c2 x2 + ⋯ + cn xn
subject to a11 x1 + a12 x2 + ⋯ + a1n xn − y1 = b1
a21 x1 + a22 x2 + ⋯ + a2n xn − y2 = b2
⋮
am1 x1 + am2 x2 + ⋯ + amn xn − ym = bm
x1 ≥ 0, x2 ≥ 0, …, xn ≥ 0
y1 ≥ 0, y2 ≥ 0, …, ym ≥ 0

The above problem in matrix form:
Minimize c′T x′ subject to A′x′ = b and x′ ≥ 0, where
c′ = [c1 ⋯ cn 0 ⋯ 0]T, x′ = [x1 ⋯ xn y1 ⋯ ym]T, A′ = [A −I].
11.2. Application of optimality conditions

The problem:
Minimize cT x subject to Ax = b and x ≥ 0
x: n × 1 vector; b: m × 1 vector; A: m × n matrix.
Regularity is not required: T(x*) and M(x*) are obviously equivalent subspaces.
KKT conditions:
AT λ* + µ* = c, µ* ≥ 0, µ*T x* = 0, Ax* = b, x* ≥ 0
Sufficiency of KKT conditions:
For a point (x*, λ*, µ*) satisfying the KKT conditions, we have
cT x* = (AT λ* + µ*)T x* = (Ax*)T λ* + µ*T x* = bT λ* + 0 = bT λ*
Equivalent problem:
minimize − bT λ subject to AT λ − c ≤ 0.
Optimality (KKT)

The problem:
minimize f(λ) = −bT λ subject to g(λ) = AT λ − c ≤ 0.
The KKT equations are
∇f(λ*) + Jg(λ*)T x* = 0 ⇒ −b + Ax* = 0
x* ≥ 0
AT λ* − c ≤ 0
x*T (AT λ* − c) = 0
Sufficiency of KKT conditions for the dual problem
Given the solution (x*, λ*, µ*) to the KKT conditions, we have for any
dual-feasible point λ that
bT λ = (x*)T AT λ
     = (x*)T (AT λ − c) + cT x*
     = −(x*)T µ + cT x*, where µ = c − AT λ ≥ 0.
Since (x*)T µ ≥ 0, bT λ ≤ cT x* = bT λ*.
An observation:
cT x − bT λ = (c − AT λ)T x = µT x ≥ 0
⇒ when the primal and dual variables are both feasible, the dual cost
lower-bounds the primal cost and the primal cost upper-bounds the dual cost
(weak duality theorem).
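Weak duality is easy to check on a toy instance. In the sketch below (hypothetical A, b, c and hand-picked feasible points), the gap cT x − bT λ is computed and shown to equal µT x ≥ 0:

```python
import numpy as np

# Hypothetical equality-form LP: minimize c^T x s.t. Ax = b, x >= 0.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])

x = np.array([0.2, 0.3, 0.5])     # primal feasible: Ax = b, x >= 0
lam = np.array([0.5])             # dual feasible: A^T lam <= c

mu = c - A.T @ lam                # dual slack, nonnegative by dual feasibility
gap = c @ x - b @ lam             # duality gap

assert np.all(mu >= 0)
assert np.isclose(gap, mu @ x)    # c^T x - b^T lam = mu^T x
assert gap >= 0                   # weak duality: b^T lam <= c^T x
```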
The strong duality theorem (Th. 11.3A):
Definition. A solution to the problem (E3) is called an optimal feasible
solution; if this solution is basic, it is called a basic optimal feasible
solution.

Through the fundamental theorem of linear programming, we establish the
primary importance of basic feasible solutions in solving linear programs.
The method of proof of the theorem is in many respects as important as the
result itself, since it represents the beginning of the development of the
simplex method.
Theorem 11.4A (fundamental theorem of linear programming)

Minimize cT x
subject to Ax = b ....(E1)
and x ≥ 0 .....(E2)
The whole problem is referred to as (E3).
Geometric interpretation of linear programs

The reader is referred to Appendix B for a more complete summary of concepts
related to convexity, but the definition of an extreme point is stated here.

Definition. A point x in a convex set C is said to be an extreme point of C
if there are no two distinct points x1 and x2 in C such that
x = αx1 + (1 − α)x2 for some α, 0 < α < 1.

An extreme point is thus a point that does not lie strictly within a line
segment connecting two other points of the set. The extreme points of a
triangle, for example, are its three vertices. The correspondence between
extreme points and basic feasible solutions is established as follows.
Theorem 11.4B (Equivalence of extreme points and basic solutions). Let A be
an m × n matrix of rank m and b an m-vector. Let K be the convex polytope
consisting of all n-vectors x satisfying
Ax = b, x ≥ 0. ....(E1)
A vector x is an extreme point of K if and only if x is a basic feasible
solution to (E1).
Proof of 11.4A.(i):
(1) Suppose there exists a feasible solution with the number of positive
elements equal to p and, WLOG, let the first p elements of the solution
vector, x1, x2, …, xp, be the positive elements.
Then x1a1 + x2a2 + ⋯ + xp ap = b, where
a1, a2, …, ap are the first p columns of A.
(2) If p > m, then a1, a2, …, ap are necessarily dependent since the rank of A
is m. Hence there exist y1, y2, …, yp, not all zero, such that
y1a1 + y2a2 + ⋯ + yp ap = 0
(WLOG some yi > 0; otherwise replace y by −y).
Hence, for any ε > 0,
(x1 − εy1)a1 + (x2 − εy2)a2 + ⋯ + (xp − εyp)ap = b,
which shows that x1 − εy1, x2 − εy2, …, xp − εyp is also a feasible solution
for sufficiently small ε.
(3) If we choose ε such that ε = min{xi / yi : yi > 0}, then x1 − εy1, …, xp − εyp
will have at most p − 1 positive elements, with the other elements equal to zero.
(4) This shows that if there exists a feasible solution with p positive
elements, p > m, then it is also possible to find a feasible solution with
p − 1 positive elements. Repeat this until we get at most m positive elements
in the solution.
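The reduction in steps (2)-(3) can be carried out directly in code. In this sketch (data made up), a null-space vector y is obtained from the SVD of A, and one ratio-test step removes at least one positive component while preserving feasibility:

```python
import numpy as np

# Hypothetical data: x is feasible for Ax = b, x >= 0, with p = 3 > m = 2
# strictly positive components.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])
x = np.array([1.0, 1.0, 1.0])
assert np.allclose(A @ x, b)

# The 3 columns of A live in R^2, so they are dependent: a vector y with
# Ay = 0 is the last right-singular vector of A.
_, _, Vt = np.linalg.svd(A)
y = Vt[-1]
if not np.any(y > 0):             # ensure some y_i > 0 so the ratio test works
    y = -y

# Ratio test: eps = min over {i : y_i > 0} of x_i / y_i drives a component to 0.
pos = y > 1e-12
eps = np.min(x[pos] / y[pos])
x_new = x - eps * y

assert np.allclose(A @ x_new, b)                  # still satisfies Ax = b
assert np.all(x_new >= -1e-12)                    # still (numerically) nonnegative
assert np.sum(x_new > 1e-9) < np.sum(x > 1e-9)    # strictly fewer positives
```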
Proof of 11.4A.(ii):
(1) Suppose there exists an optimal feasible solution with the number of
positive elements equal to p and, WLOG, let the first p elements of the
solution vector, x1, x2, …, xp, be the positive elements.
Then x1a1 + x2a2 + ⋯ + xp ap = b, where a1, a2, …, ap are the first p columns
of A.
(2) If p > m, then a1, a2, …, ap are necessarily dependent since the rank of A
is m. Hence there exist y1, y2, …, yp, not all zero, such that
y1a1 + y2a2 + ⋯ + yp ap = 0.
Hence, for any ε > 0,
(x1 − εy1)a1 + (x2 − εy2)a2 + ⋯ + (xp − εyp)ap = b,
which shows that x1 − εy1, x2 − εy2, …, xp − εyp is also a feasible solution
for sufficiently small ε.
(3) Next, note that both v and −v are feasible directions at x′, where
v = [y1 ⋯ yp 0 ⋯ 0]T and x′ = [x1 ⋯ xp 0 ⋯ 0]T, since Av = 0.
Optimality of x′ then forces cT v = 0, so the cost is unchanged along v.
(4) If we choose ε such that ε = min{xi / yi : yi > 0}, then x1 − εy1, …, xp − εyp
will have at most p − 1 positive elements, with the other elements equal to zero.
(5) This shows that if there exists an optimal feasible solution with p
positive elements, p > m, then it is also possible to find an optimal feasible
solution with p − 1 positive elements. Repeat this until we get at most m
positive elements in the solution.
Proof of 11.4B:
To show that a basic feasible solution is an extreme point, let
x = [x1 x2 ⋯ xm 0 ⋯ 0]T be the basic solution, so that
x1a1 + x2a2 + ⋯ + xm am = b.
Suppose x = αy + (1 − α)z for feasible points y, z and some 0 < α < 1. Since
x ≥ 0, y ≥ 0, z ≥ 0, the last n − m components of y and z must also be zero.
Hence we have
x1a1 + x2a2 + ⋯ + xm am = b
y1a1 + y2a2 + ⋯ + ym am = b
z1a1 + z2a2 + ⋯ + zm am = b
Since a1, …, am are linearly independent, these equations have a unique
solution, so x = y = z and x is an extreme point.
To show that, for an extreme point x with non-zero elements given by
{x1, x2, …, xk}, the corresponding columns {a1, a2, …, ak} of A are linearly
independent (which will mean that k ≤ m, i.e., x is a basic solution):
If the vectors {a1, a2, …, ak} were linearly dependent, then we would have
scalars {y1, y2, …, yk} such that y1a1 + y2a2 + ⋯ + yk ak = 0, and the two
distinct feasible points x ± εy (for small ε > 0) would average to x,
contradicting extremality.
11.5. The Simplex method

11.5.1 Solving the KKT conditions with a chosen set of basic variables
Minimize cT x
subject to Ax = b ....(E1)
and x ≥ 0 .....(E2)
The whole problem is referred to as (E3).
Assumption: solving (E1) for the first m variables, with the other variables
forced to zero, leads to a non-degenerate solution satisfying (E2).
Selected KKT equations with the partition A = [B N]:
BT λ* = cB,  NT λ* + µN* = cN,  BxB* = b
Candidate solution:
xB* = B−1b, xN* = 0
λ* = B−T cB                                        .........(BFS1)
µB* = 0, µN* = cN − NT λ* = cN − NT B−T cB
(BFS1) solves all KKT equations except the condition µN* ≥ 0,
provided xB* > 0.
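The candidate solution (BFS1) amounts to two linear solves. A minimal numpy sketch with hypothetical data, taking the basis as the first m columns, computes it and checks the partitioned KKT equations:

```python
import numpy as np

# Hypothetical data with basis = first m columns: A = [B N].
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])
c = np.array([1.0, 2.0, 4.0])
m = 2
B, N = A[:, :m], A[:, m:]
c_B, c_N = c[:m], c[m:]

# Candidate solution (BFS1).
x_B = np.linalg.solve(B, b)           # x_B* = B^{-1} b
lam = np.linalg.solve(B.T, c_B)       # lam* = B^{-T} c_B
mu_N = c_N - N.T @ lam                # mu_N* = c_N - N^T lam*

# Check the partitioned KKT equations B^T lam = c_B and N^T lam + mu_N = c_N.
assert np.allclose(B.T @ lam, c_B)
assert np.allclose(N.T @ lam + mu_N, c_N)
assert np.all(x_B > 0)                # non-degenerate BFS
assert np.all(mu_N >= 0)              # for this data, the basis is optimal
```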
11.5.2 Interpretation of negative values of µN* in BFS1 when xB* > 0

d = [−(B−1aq)T 0 ⋯ 0 1 0 ⋯ 0]T (with the 1 in position q) is a feasible
direction with µq* as the rate of change in cost.
11.5.3 How far can you go along the feasible direction?
11.5.4 The simplex algorithm

Given the partition A = [B N] and B−1 such that B−1b ≥ 0:
step 1) Compute xB* = B−1b, µN* = cN − NT B−T cB.
If µN* ≥ 0, stop ("optimal BFS found"). Else go to step 2.
step 2) Select q ∈ [m + 1, n] such that µq* < 0.
Compute dq = B−1aq.
If dq ≤ 0, stop ("problem is unbounded"). Otherwise go to step 3.
step 3) Compute the step length
xq′ = min over {i : {dq}i > 0} of {xB*}i / {dq}i.
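The three steps can be assembled into a small revised-simplex loop. This is only a sketch, not a robust solver (no anti-cycling rule, dense linear solves every iteration), and the example LP and starting basis are made up:

```python
import numpy as np

def revised_simplex(A, b, c, basis):
    """Sketch of the simplex steps above for min c^T x s.t. Ax = b, x >= 0.

    `basis` is a starting list of column indices with B^{-1} b >= 0.
    Returns (x, status).
    """
    m, n = A.shape
    basis = list(basis)
    while True:
        nonbasis = [j for j in range(n) if j not in basis]
        B, N = A[:, basis], A[:, nonbasis]
        x_B = np.linalg.solve(B, b)                 # step 1: x_B* = B^{-1} b
        lam = np.linalg.solve(B.T, c[basis])
        mu_N = c[nonbasis] - N.T @ lam              # mu_N* = c_N - N^T B^{-T} c_B
        if np.all(mu_N >= -1e-9):                   # optimal BFS found
            x = np.zeros(n)
            x[basis] = x_B
            return x, "optimal"
        q = nonbasis[int(np.argmin(mu_N))]          # step 2: pick mu_q* < 0
        d_q = np.linalg.solve(B, A[:, q])           # d_q = B^{-1} a_q
        if np.all(d_q <= 1e-9):
            return None, "unbounded"
        ratios = np.full(m, np.inf)                 # step 3: ratio test
        pos = d_q > 1e-9
        ratios[pos] = x_B[pos] / d_q[pos]
        leave = int(np.argmin(ratios))
        basis[leave] = q                            # pivot: swap basic variable

# Usage: minimize -x1 - x2 over the slacked box [0,1]^2
# (x1 + x3 = 1, x2 + x4 = 1), starting from the slack basis {x3, x4}.
A = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0]])
b = np.array([1.0, 1.0])
c = np.array([-1.0, -1.0, 0.0, 0.0])
x, status = revised_simplex(A, b, c, basis=[2, 3])
```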
Case I: If the problem has inequality constraints,

Minimize c1 x1 + c2 x2 + ⋯ + cn′ xn′
subject to
a11 x1 + a12 x2 + ⋯ + a1n′ xn′ ≤ b1
a21 x1 + a22 x2 + ⋯ + a2n′ xn′ ≤ b2
⋮
am1 x1 + am2 x2 + ⋯ + amn′ xn′ ≤ bm
x1 ≥ 0, x2 ≥ 0, …, xn′ ≥ 0

add slack variables to obtain the equivalent problem

Minimize c1 x1 + c2 x2 + ⋯ + cn′ xn′
subject to
a11 x1 + a12 x2 + ⋯ + a1n′ xn′ + y1 = b1
a21 x1 + a22 x2 + ⋯ + a2n′ xn′ + y2 = b2
⋮
am1 x1 + am2 x2 + ⋯ + amn′ xn′ + ym = bm
x1 ≥ 0, x2 ≥ 0, …, xn′ ≥ 0
y1 ≥ 0, y2 ≥ 0, …, ym ≥ 0

Select y1, y2, …, ym as the basic variables with the solution
y1* = b1, y2* = b2, …, ym* = bm.
Case II: If the problem was originally in standard form, construct an
auxiliary linear program.
Original problem:
Minimize cT x subject to Ax = b and x ≥ 0
Auxiliary problem:
Minimize [1 1 ⋯ 1]y subject to Ax + y = b and x ≥ 0, y ≥ 0.
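The auxiliary construction can be checked numerically. With the hypothetical data below (b ≥ 0 is assumed; rows with negative bi would be negated first), y = b, x = 0 is an immediate BFS of the auxiliary problem, and any feasible x of the original problem yields auxiliary cost 0:

```python
import numpy as np

# Original standard-form data (hypothetical numbers).
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([3.0, 2.0])          # assumed >= 0 (negate rows first if needed)
m, n = A.shape

# Auxiliary problem: minimize 1^T y  s.t.  Ax + y = b, x >= 0, y >= 0.
A_aux = np.hstack([A, np.eye(m)])
c_aux = np.concatenate([np.zeros(n), np.ones(m)])

# y = b, x = 0 is an immediate BFS for the auxiliary problem ...
x_aux = np.concatenate([np.zeros(n), b])
assert np.allclose(A_aux @ x_aux, b)
assert np.all(x_aux >= 0)

# ... and the auxiliary optimum is 0 exactly when the original problem is
# feasible: any x with Ax = b gives an auxiliary point with zero cost.
x_feas = np.array([3.0, 0.0, 2.0])         # satisfies A @ x_feas = b
z_feas = np.concatenate([x_feas, b - A @ x_feas])
assert np.isclose(c_aux @ z_feas, 0.0)
```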
11.7.2 The tableau form

Add a variable z which is equal to the value of the cost function and add
the constraint c1 x1 + c2 x2 + ⋯ + cn xn − z = 0.

Define the tableau:
T = [ B    N    0   b ]
    [ cBT  cNT  −1  0 ]
step 1) If µN* ≥ 0, stop ("optimal BFS found"); xB* = B−1b. Else go to step 2.
step 2) Select q ∈ [m + 1, n] such that µq* < 0.
Get dq from N′ = B−1N (its (q − m)th column).
If dq ≤ 0, stop ("problem is unbounded"). Otherwise go to step 3.
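The tableau bookkeeping can be verified in numpy: row-reducing T so that the B-block becomes the identity leaves µN*T in the bottom row and xB* in the last column. A sketch with hypothetical data:

```python
import numpy as np

# Hypothetical data, basis = first m columns of A.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])
c = np.array([1.0, 2.0, 4.0])
m, n = A.shape
B, N = A[:, :m], A[:, m:]
c_B, c_N = c[:m], c[m:]

# Tableau T = [[B, N, 0, b], [c_B^T, c_N^T, -1, 0]].
top = np.hstack([B, N, np.zeros((m, 1)), b.reshape(-1, 1)])
bottom = np.concatenate([c_B, c_N, [-1.0], [0.0]]).reshape(1, -1)
T = np.vstack([top, bottom])

# Row-reduce: scale the top block by B^{-1}, then clear c_B from the bottom row.
top_red = np.linalg.solve(B, top)
bottom_red = bottom - c_B @ top_red

# The reduced bottom row now reads [0, mu_N^T, -1, -c_B^T B^{-1} b].
mu_N = c_N - N.T @ np.linalg.solve(B.T, c_B)
assert np.allclose(bottom_red[0, :m], 0.0)
assert np.allclose(bottom_red[0, m:n], mu_N)
assert np.allclose(top_red[:, -1], np.linalg.solve(B, b))   # x_B* in last column
```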
Primal problem: Minimize cT x subject to Ax = b and x ≥ 0
Dual problem: Maximize bT λ subject to c − AT λ ≤ 0

KKT conditions:
AT λ* + µ* = c, µ* ≥ 0, µ*T x* = 0, Ax* = b, x* ≥ 0
In the simplex method, xB* ≥ 0 holds but µN* does not satisfy µN* ≥ 0, i.e.,
the solution is feasible for the primal problem but not optimal. Hence we
hunt for a basis such that µN* ≥ 0.
In the dual simplex method, µN* ≥ 0 holds but xB* does not satisfy xB* ≥ 0,
i.e., the solution is feasible for the dual problem but not optimal. Hence we
hunt for a basis such that xB* ≥ 0.
Interpretation of negative components of basic variables
Suppose xq* is negative and we want to make it non-basic. Hence we want to
make the qth component of µB equal to some α > 0: in the KKT equation
AT λ* + µ* = c, substitute
µ* = [0T µN′T]T + α eq,
and let α v be the corresponding adjustment in λ. We get
BT (λ* + α v) + α eq = cB ---------- (E1)
NT (λ* + α v) + µN′ = cN ----------- (E2)
Since BT λ* = cB, from (E1) we get
−BT v = eq.
The new dual cost is
bT (λ* + α v) = bT λ* + α bT v = bT λ* − α bT B−T eq
             = bT λ* − α xB*T eq = bT λ* − α xq*
Keeping µN′ ≥ 0 in (E2), with w = NT v, gives the step size and leaving index:
α = min over {j : wj > 0} of {µN*}j / wj
p = argmin over {j ∈ [m + 1, n] : wj > 0} of {µN*}j / wj
For the problem: Minimize cT x subject to Ax ≥ b and x ≥ 0, the
KKT conditions are:
AT λ* + µ* = c, µ* ≥ 0, λ* ≥ 0,
x* ≥ 0, Ax* − b ≥ 0,
µ*T x* = 0, λ*T (Ax* − b) = 0.
Equivalent problem:
minimize − bT λ subject to AT λ − c ≤ 0, λ ≥ 0
Dual Optimality (KKT)

The problem:
minimize f(λ) = −bT λ subject to g(λ) = AT λ − c ≤ 0, λ ≥ 0
The KKT equations are
∇f(λ*) + Jg(λ*)T x* − y* = 0 ⇒ −b + Ax* = y*
x* ≥ 0
y* ≥ 0
AT λ* − c ≤ 0
λ* ≥ 0
x*T (AT λ* − c) = 0
y*T λ* = 0
Eliminating y* = Ax* − b, these reduce to
Ax* − b ≥ 0
x* ≥ 0
AT λ* − c ≤ 0
λ* ≥ 0
x*T (AT λ* − c) = 0
(Ax* − b)T λ* = 0
An observation:
cT x − bT λ = (c − AT λ)T x = µT x ≥ 0
⇒ when the primal and dual variables are both feasible, the dual cost
lower-bounds the primal cost and the primal cost upper-bounds the dual cost
(weak duality theorem).
The strong duality theorem (Th. 11.9A):