Decompositions in Interior Point Algorithms

Download as pdf or txt
Download as pdf or txt
You are on page 1of 20

CHAPTER 8

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


Decompositions in Interior Point
Algorithms

In 1984 N. Karmarkar of AT&T Bell Laboratories introduced an interior


point algorithm for linear programming. The algorithm has a polynomial
time complexity bound and has been established, following extensive com­
putational experimentation, as a viable competitor to the classic simplex
algorithm for the solution of large-scale linear programs. A flurry of re­
search activities followed Karmarkar’s work and several interior point al­
gorithms were developed for linear programming, convex quadratic pro­
gramming, convex programming, linear complementarity problems, and
nonlinear complementarity problems.
Different mathematical tools have been employed to develop and ana­
lyze these algorithms. Depending on the theory underlying the mathemati­
cal analysis the algorithms are classified as potential reduction algorithms,
path following algorithms, barrier function algorithms, affine scaling algo­
rithms, and projective scaling algorithms.
It has been gradually realized that some of the algorithms can be ob­
tained as special cases of a unified interior point method. It is beyond the
scope of this chapter to cover all these developments, and relevant refer­
ences are given in Section 8.4.
One common feature of the interior point algorithms is that they com­
pute a search direction by solving a system of equations that involves the
matrix AAT, where A is the constraint matrix. For problems with struc­
tured constraints matrices it is possible to solve the system of equations
using parallel matrix factorization procedures. Hence, interior point algo­
rithms are well suited for parallel computations. In this respect they draw
heavily from developments in parallel numerical linear algebra.
In this chapter we discuss one particular interior point algorithm—the
primal-dual path following algorithm—for both linear programs and for
quadratic programming problems. This is the algorithm implemented in
some of the most efficient and widely used software systems such as OBI
(Optimization with Barrier 1) and its quadratic programming extension,
OBN, the linear programming code ALPO, and LOQO. We show that, for
structured problems, it is possible to develop parallel matrix factorization

Parallel Optimization, Yair Censor, Oxford University Press (1997), © 1997 by


Oxford University Press, Inc., DOI: 10.1093/9780195100624.003.0008
218 Decompositions in Interior Point Algorithms Chap. 8

procedures for solving the required systems of equations. This results in


improvement in the performance of an already efficient algorithm via the
use of parallelism. Sections 8.1 and 8.2 develop the algorithms for linear
and quadratic programs, respectively. The parallel matrix factorization

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


techniques are the topic of Section 8.3. Notes and references in Section 8.4
conclude this chapter.

8.1 The Primal-Dual Path Following Algorithm for Linear


Programming
Consider the primal linear programming problem:

Minimize (c, x) (8-1)


s.t. Ax = by (8-2)
x > 0. (8-3)

The m x n constraint matrix A that is assumed to have full row rank,


the cost vector c e IRn, and the vector of resource coefficients b 6 IRm
are given. Associated with this problem is its dual linear programming
problem, defined as follows:

Maximize (b,y) (8-4)


s.t. ATy + z = c, (8-5)
z > 0. (8-6)

The dual formulation uses the vector of slack variables z C IRn in order to
express its constraints as equalities.
The primal-dual path following algorithm applies first a logarithmic
barrier function (see Example 4.2.2) to formulate the following pair of
problems: a logarithmic barrier problem for the primal program, given
by,
n
Minimize (c, x) - p l°g xj (8.7)

s.t. Ax ~ by (8.8)

where log is the natural logarithmic function and a barrier problem for
the dual program
n
Maximize (b, y) + p log Zj (8.9)

s.t. ATy + z = c. (8.10)


Sect. 8.1 Primal-Dual Path Following Algorithm for Linear Programming 219

The algorithm then solves these two barrier problems for different values
of the parameter /i, as /z —> 0, and obtains sequences of feasible points
that converge to the solutions of the original primal and dual problems,
respectively (see Section 4.2).

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


Let us now fix a value of p > 0, which is the same for the primal and
dual barrier problems, at a point that satisfies x > 0 and z > 0. We can
then obtain the first-order conditions for simultaneous optimality of both
the primal and the dual barrier problems. In particular, the solution to
the primal barrier problem is the unique point where the gradient of the
associated Lagrangian,
n
Lp(x,y) = (c,x) - /i^Plogxy + {y,b- Ax),
j=i
vanishes. This is the critical point of the Lagrangian, and it is given by
the solution of the following system:

c — pX~1l — ATy = 0, (8.11)


Ax = b. (8.12)

Here X-1 denotes the inverse of the n x n diagonal matrix X defined as


X = diag (371,^2, • • • and 1 E lRn is the all ones vector. The solution
to the dual barrier problem is the unique point where the gradient of the
Lagrangian
n

LD(y,z,x) = (b,y) + lo8 zi + (p,c- ATy - z)

vanishes. This point is obtained by solving:

Ax = b, (8.13)
pZ~ll - x = 0, (8-14)
ATy + z = c, (8.15)

where Z-1 is the inverse of the diagonal matrix Z = diag (z^, Z2,..., zn).
Combining the first-order conditions for the primal and dual problems,
and using the substitution /zZ-1l = XI from (8.14) we obtain the following
system of equations that characterizes, simultaneously, the optimum of
both the primal and the dual barrier problems:

Ax - b, (8.16)
AAy + z = c, (8-17)
XZ1 = yl. (8.18)
220 Decompositions in Interior Point Algorithms Chap. 8

Equation (8.16) is part of the primal feasibility requirement and (8.17) is


part of the dual feasibility requirement. Equation (8.18), relating the dual
slack variables z with the primal vector x, is called p-complementarity. The
duality theory of mathematical programming guarantees that if either the

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


primal or the dual polytope, defined by the feasibility sets of (8.2)-(8.3)
and (8.5)-(8.6) respectively, is bounded and has a nonempty interior, then
there exists a unique solution to (8.16)—(8.18) for which x > 0, z > 0,
for all p > 0. See, e.g., Goldfarb and Todd (1989, p. 156), Fiacco and
McCormick (1968, Chapter 3), or Kojima et al. (1991).
For a given p > 0 the solution to system (8.16)-(8.18) is denoted
by (#M, Zp) and the one-parameter family of solutions {(#M,z^)} is
called the central path. As p tends to zero this path converges to (x
*
,/,?),
where x* is optimal for the original primal problem (8.1)~(8.3), and * )(y
,z
is optimal for the dual problem (8.4)-(8.6). The objective values (c, x^)
and (6, converge, as p tends to zero, to the common objective value of
the primal and dual problems.
A primal-dual path following algorithm iteratively tracks the central
path. It starts from some x > 0, z > 0, and a positive p, and takes a
single step of Newton’s method on the system (8.16)—(8.18) to find a point
close to the central path. It then reduces p and takes another Newton step
starting from the previous iterate, and so forth.
To solve the system (8.16)-(8.18) we start from a point (x,y, z) with
x > 0, z > 0, and move to a new point (z-l-A#, ?/4-A?/, z+Az). Substituting
the new point in (8.16)—(8.18) we obtain a system of equations that is
complicated by the fact that (8.18) is nonlinear. The nonlinear equation
(X 4- AX)(Z + AZ)1 = pl is then linearized by ignoring the cross-product
term AX AZ. The resulting system of linear equations is written as

AXx = p, (8.19)
ATA?/ + Xz = a, (8.20)
ZXx + XXz = (8.21)

where p = b- Ax, a = c- ATy - z, and </> = pl - XZ1.


To solve this system we use first (8.21) to obtain

Xz = X-1(0 - ZAx), (8.22)

and substitute it into (8.20) to obtain

-ZX'1 Xx + ATA?/ = a - X'1^. (8.23)

Extracting Xx from (8.23) we obtain

Xx = -XZ~1(ct - X"^ - ATXy), (8.24)


Sect. 8.1 Primal-Dual Path Following Algorithm for Linear Programming 221

which, upon substitution into (8.19), yields

{AXZ~xAv)Xy = (p + AXZ~\(j - X’M). (8.25)

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


Let 0 = XZ 1 and = p 4- AQ(a — X 1</>). The dual step direction Xy
is then obtained by solving the system

(AQAT)Xy = 'ip. (8.26)

This is a symmetric positive definite system. (Recall that A has full row
rank, and 0 is positive definite since X and Z are positive definite.) It
can be solved using any direct or iterative method for solving systems of
equations. The solution of this system, using parallel matrix factorization
techniques, is the topic of Section 8.3. Having obtained the dual step di­
rection Xy we can substitute it in (8.24) to obtain the primal step direction
Xx, and finally substitute the primal step direction in (8.22) to obtain the
step direction for the slack variables.
The primal-dual path following algorithm is thus summarized as follows.

Algorithm 8.1.1 The Primal-Dual Path Following Algorithm for Linear


Programs

Step 0: (Initialization.) Start with a triplet (a;0, z/°, zQ) satisfying xQ >
0, z® > 0, and any /zq > 0- Let z/ = 0.
Step 1: (Iterative Step.) Given (xy, yy, zy) and a barrier parameter /z^,
let
Xy = diag (xi,X2,... ,x^),
Zv = diag ^,^,...,2"),
e. = xvz~\
pv = b- Ax",
=c — ATy1' -zu,
= Pyl - XyZyl,
r = Pv + AQv(av - X-V),

and solve for the dual step direction Xyv:

(AQvAr)W = (8.27)

Compute the primal step direction Az":

Az" = - Qy^ - - ATXyu\ (8.28)

and the slack variable direction Xzv from


222 Decompositions in Interior Point Algorithms Chap. 8

Xz" = - Z^x"). (8.29)

Update:

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


x"+l = xy 4- ap.iAz1', (8.30)
^+1 = yy + aD,v&y\ (8.31)
z"+1 — zy 4- a^AZ, (8.32)

where Qp^. are primal and dual step lengths, respectively, re­
stricted to the interval (0,1]. Reduce y>y to /z^i, update v <— v 4- 1,
and repeat the iterative step.
We discuss now the choices of the step lengths apy and and of
the barrier parameter Other features of the algorithm that need to
be properly addressed for an efficient implementation are the treatment of
upper bounds, the treatment of free variables, and the computation of an
initial feasible solution. These technical issues are not discussed here, as
they are not essential to understanding the algorithm and do not alter its
parallel implementation. References to the relevant literature are given in
Section 8.4.

8.1.1 Choosing the step lengths


The primal and dual step lengths are chosen in such a way that both the
primal and the slack variables remain positive. To determine the maximum
step length that will keep the iterates positive define first

apj = max{a > 0 | xy 4- qAzJ > 0}, for all j = 1,2,..., n,

and

aDj = max{a > 0 | zy 4- aAzy > 0}, for all j = 1,2,..., n,


= mini<j<n^Dj-

Then the step lengths are defined by ap^ = 6a p^ and ajj^ = 6ap^,
where 0 < 6 < 1. A typical value for <5, used in practice, is 0.9995.

8.1.2 Choosing the barrier parameter


The barrier parameter determines the severity of the barrier term. For
large the search direction points away from the boundary of the feasible
region, and therefore large step lengths are allowed. As a solution is ap­
proached the value of the barrier parameter is reduced so that the iterates
can approach the boundary, since it is known that the solutions to a linear
program are on the boundary.
Sect. 8.2 Primal-Dual Path Following Algorithm for Quadratic Programming 223

An acceptable (theoretically) value for the initial barrier parameter /z0


is where O(L) is the “order of greatness” of L which, in turn, is a
measure of the size of the problem at hand. K = 0(L) means that there
exists a constant M such that | K | < ML, for all problems above a certain

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


size L. The penalty parameter is reduced at every step by the recursion

Mi = (1 -

This choice of the barrier parameter assures polynomial convergence of the


algorithm, but the algorithm is very slow in practice. An updating formula
for setting the barrier parameter, which works well in practice, is given by

(c,^) - (b,yy)

Thus is large when far from the optimum (as measured by a large duality
gap (c, xy) - (&, yy}\ and is reduced rapidly as the optimum is approached.

8.2 The Primal-Dual Path Following Algorithm for


Quadratic Programming
A primal-dual path following algorithm for the quadratic programming
problem is derived using the ideas discussed in the previous section for
the linear program. Namely, a barrier function is employed to eliminate
the inequality constraints and then Newton’s method is used to solve the
optimality conditions of the barrier problem. For separable quadratic func­
tions the algorithms for the linear and the quadratic problems are virtually
identical.
Consider the primal quadratic programming problem

Minimize (c, x) 4- | {x, Qx) (8.33)

s.t. Ax = b, (8.34)
x > 0. (8.35)

The mxn constraint matrix A has full row rank, c G IRn is the cost vector,
and b e IRm is the vector of resource coefficients. The n x n matrix Q is
positive definite.
Associated with this problem is a dual quadratic program, defined by

Maximize (6, y) - - (x, Qx) (8.36)


s.t. ATy 4- z — Qx = c, (8.37)
z > 0. (8.38)
224 Decompositions in Interior Point Algorithms Chap. 8

The dual formulation uses slack variables z G lRn in order to express con­
straints as equalities.
The logarithmic barrier problem for the primal program is given by

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


1 n
Minimize (c, x) + - (x, Qx) — p ^logXj (8.39)
j=i
s.t. Ax = b. (8.40)

Similarly, the barrier problem for the dual program is given by

1 n
Maximize (b, y) - - (x, Qx) + p (8-41)
>=1
s.t. ATy T z — Qx = c. (8.42)

We can solve the barrier problems (8.39)-(8.40) or (8.41)-(8.42) for various


values of the parameter /z, as p 0, and obtain sequences of feasible points
that converge to the solutions of the original primal and dual quadratic
programs, respectively.
Let us fix a value of p > 0 that is the same for the primal and dual
barrier problems at a point that satisfies x > 0 and z > 0. We can then
obtain the first-order conditions for simultaneous optimality of the primal
and the dual barrier problems. In particular, the solution to the primal
barrier problem is the unique critical point of the associated Lagrangian

1 n
Lp(x,y) = (c,x) + ~{x,Qx) - /j, Y2 log Xj + (y,b- Ax).
j=i

The critical point is obtained by setting the gradient of the associated


Lagrangian equal to zero, i.e., it is the solution of the following system:

c + Qx-nX~1l-ATy = 0, (8.43)
Ax = b. (8.44)

Similarily, the solution to the dual barrier problem is the unique critical
point of the Lagrangian

1 n
LD(y, z,x) = (b,y) - ~{x,Qx) + //^jPlogzj + (x,c + Qx - ATy - z).
j=i

The first-order conditions are:

Ax = b, (8.45)
Sect. 8.2 Primal-Dual Path Following Algorithm for Quadratic Programming 225

pZ !1 — x = 0, (8.46)
-Qx 4- ATy + z = c. (8.47)

Doing some simple algebraic manipulations on the first-order conditions

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


for the primal and dual barrier problems we obtain the following system
of equations that characterizes, simultaneously, the optimum of both prob­
lems (8.39)-(8.40) and (8.41)-(8.42):

Ax — b, (8.48)
~Qx + ATy 4- z = c, (8.49)
XZX = Ml. (8.50)

Similar to the linear programming case, it is known that if either the


primal or the dual polytope that describes the feasible region is bounded
and has a nonempty interior, then there exists a unique solution to (8.48)-
(8.50) in the domain given by x > 0, z > 0 for all /z > 0.
As in the linear programming case, the primal-dual path following al­
gorithm for quadratic programs tracks the central path, defined by solving
the system (8.48)-(8.50), for decreasing values of /z. To solve this sys­
tem we start from {x,y,z} with x > 0, z > 0, and move to a new point
(x 4- Arc, y 4- At/, z 4- Xz). Substituting the new point in (8.48)-(8.50) we
obtain a system of equations that is complicated by the nonlinear equation
(8.50) . This equation, i.e., (X 4- AX)(Z 4- AZ)1 = /zl, is again linearized
by ignoring the cross-product term AX AZ. The resulting system of linear
equations is written as

AXx —p, (8.51)


-QXx 4- ATAi/ 4- A2 =a, (8.52)
ZAz4-XAz =0, (8.53)

where p = b - Ax, a = c 4- Qx - ATy ~ z, andcf> = pl - XZ1. To solve


this system we use first (8.53) to obtain

Xz = - ZXx), (8.54)

and then we eliminate Xz in (8.52) to obtain

-(Q + ZX~l)Xx + ATAy = & - X-1^. (8.55)

Solving this system for Xx we get

Xx = - ©(or - X-1^ - ATAi/), (8.56)

where © — (Q 4- ZX x) U Using now (8.51) to eliminate Xx we obtain


the system of equations for the dual step direction Xy :
226 Decompositions in Interior Point Algorithms Chap. 8

(A0At)At/ = (8.57)

where ip = p + AQ(a — X-1^).


The only difference between the dual step direction calculation for the

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


linear program (equation (8.26)) and for the quadratic program (equa­
tion (8.57)) is the definition of the matrix 0: for linear programs it is
0 = XZ~l while for quadratic programs 0 — (Q 4- ZX~1)~1. When Q is
diagonal the computation of 0 is easily obtained as the inverse of a pos­
itive diagonal matrix. The complexity of the computation of Xy is then
the same for linear and quadratic programs. When Q is not diagonal then
0 is more difficult to compute and may even be dense, in which case the
matrix A04T will also be dense.
The primal-dual path following algorithm is thus summarized as follows.

Algorithm 8.2.1 The Primal-Dual Path Following Algorithm for


Quadratic Programs

Step 0: (Initialization.) Start with a triplet (z°,?/0,z0) satisfying x° >


0, z° > 0, and any po > 0. Let v = 0.
Step 1: (Iterative Step.) Given (zp, yy, zy) and a barrier parameter py,
let

Xv = diag (x^xg,..
Zv = diag (zj(,z^,...,z^),
0P = (Q + ZyXj1)-1,
pv = b — Ax1',
a" = c + Qx" — A'ryu -zu,
<t>U = flyl ~ XyZyl,
r =PU + AQy^ - Xyl<n,

and solve for the dual step direction Xyy:

(AQAT)Xyy = (8.58)

Compute the primal step direction Xxy:

Az" = - Qy^ - Xy1^ - ATXyt/'), (8.59)

and the slack step direction Az" from

Xzv = X^{<I>V-ZyXx''). (8.60)

Update:

xl'+1 = xv + a.Q,yXxu, (8.61)


Sect. 8.3 Parallel Matrix Factorization Procedures 227

y,/+1 = y" +aQ,v^y1', (8.62)


zu+l = zv + aQ.pAZ, (8.63)

where uq^ is a step length restricted to the interval (0,1]. Reduce

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


to jz^+i, update p <— p + 1, and repeat the iterative step.
There are two differences between Algorithm 8.1.1 for linear program­
ming and the quadratic programming Algorithm 8.2.1: one is in the defi­
nition of the matrix 0; the other is in the definition of the step length. In
particular, for linear programming it is possible to estimate different step
lengths and that will preserve positivity of the primal and dual
variables, respectively, because the primal problem is a problem only in x,
while the dual problem is a problem only in y, z. For the quadratic pro­
gramming problem, however, the primal variables appear in the constraints
of the dual problem. Hence the step length that will preserve positivity of
both primal and dual variables is computed as = rnin(ap!/y, aj^),
where ocp^ and are determined as in Section 8.1.1.
In the next section we discuss the parallel solution of systems of the
form given by (8.27) and (8.58) when the constraint matrix A is structured
and when the objective function is either linear or separable quadratic (i.e.,
when 0 is diagonal). The solutions of these systems are the computation­
ally intensive parts of Algorithms 8.1.1 and 8.2.1. They usually account for
more than 90 percent of the execution time of the algorithm. The calcu­
lations of Axy and Az17 involve simple matrix-vector products and vector
additions, and can be implemented efficiently on parallel machines; they
are also well suited for vector processing.

8.3 Parallel Matrix Factorization Procedures for the


Interior Point Algorithm
We consider now the solution of the linear systems (8.27) or (8.58) for
structured problems. (To simplify the presentation we discuss the solution
of these systems during a generic iteration of the interior point algorithm
and, therefore, we drop the iteration index />.) In particular, we assume
that the constraint matrix A has the following block-angular structure:

(8.64)

The matrix Ao is of dimension mo x no- The matrices Wi are of dimensions


mi x ni, for I = 1,2,..., N. The matrices 7} are of dimensions mi x no-
It is assumed that Aq and Wi have full row rank, with mi < ni for all
I = 1,2,...,TV, and that no < ni-
228 Decompositions in Interior Point Algorithms Chap. 8

The block-angular structure arises in the deterministic equivalent for­


mulation of two-stage stochastic programs with recourse (see Section 13.3.3).
This is also the constraint matrix of the dual formulation of the multicom­
modity network flow problem of Section 12.2.2. Stochastic programming

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


problems and multicommodity network flow problems encountered in prac­
tical applications are of extremely large size. Hence, matrices of this form
occasionally have millions of columns and hundreds of thousands of rows.
Furthermore, while these matrices are sparse—as illustrated above—the
product matrix A0AT could be completely dense due to the presence of
the matrices 7}, I = 1,2,..., N.
This section develops a matrix factorization procedure for solving the
linear systems of equations for the calculation of At/, that exploits the spe­
cial structure of the constraint matrix A. The vectors At/ and in equa­
tions (8.27) or (8.58) are written as concatenations of subvectors, i.e., as
((At/°)t, (At/1)t, ..., (At/n)t)t and ((t/>°)t, (t/>1)t,..., (V,N)T)T> respecti­
vely, with At/, if1 G for I = 0,1,..., N. 0 is assumed to be diagonal,
as is the case for linear programs and for separable quadratic programs.

8.3.1 The matrix factorization procedure for the dual step direction
calculation
The procedure for solving (8.27) or (8.58) is based on the use of a general­
ized version of the Sherman-Morrison-Woodbury formula. This formula is
stated in the following lemma, where I denotes the identity matrix.
Lemma 8.3.1 For matrices A e IRnxn, U G IRnx\ and V G lRnx/< such
that both A and (I 4- VTA~1U) are invertible,

(A 4- UVT)~1 = X-1 - A~XU(I 4- VTA~1U)~1 VTA~1.

Proof We verify the formula by multiplying its two sides by A 4- UVT:

(A + UVT)~\A + UVT)
= 14- A-'UVV - A~XU(I 4- VTA~1U)~1 VT
-A~1U(I + VTA~1U)~1VTA~1UVT
= I + A~1UVT - A^UIJ 4- VTA~1U)~\I + VTA'1U)VT
= I + A~1UVT - A~1UVT = I.

Having obtained the identity matrix completes the proof. ■


The following theorem provides the foundation for the matrix factor­
ization routine that solves the systems (8.27) and (8.58) for A$/.
Theorem 8.3.1 Let M = A0AT, where 0 is a diagonal matrix, and let
S = diag (S0,Si,S2,...,Sn), Si = WiQiWf G = 1,2,...,7V,
where the matrix Sq = I is an mo x mo identity matrix, and for I =
Sect. 8.3 Parallel Matrix Factorization Procedures 229

0,1,2,..., N, Qi e JRniXni is the diagonal submatrix of 0 corresponding


to the Ith block. Also, denoting (0g1)2 by 0g 2, let

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


G^e^ + A^Ao + ^T^S^Tt, (8.65)
1=1

Gi A-q\
-Aq 0

If Aq and Wi, I = 1,2,..., N, have full row rank then the matrices M and
G2 == —AgGf Mg are invertible, and

M-1 = S'1 - S-1UG~1VTS~1. (8.66)

Proof The matrix 0 is invertible for both the linear and the quadratic
programs. By assumption Wi has full row rank and hence Si is invertible
for I = 1,2,..., N. It follows that S is also invertible. Let

<> =

U= and V = Then M can be written as M — S 4- U VT.


In order to employ Lemma 8.3.1 to calculate the inverse of M, the
matrix I + V S^U must be invertible. We prove that this is indeed the
case. I + VTS~XU — $(4>~2 4- VTS~1U)^ and hence it is invertible if and
only if and (<£~2 4- VTS-1t7) are invertible. is invertible, so we check
if ($“2 4- VTS~1U) is invertible. We can write

/ N

= &02+ a+£ T?sr1 T‘ a


\ -A i=i
I -1

= f 0o 0 V. (A^o + £2L T^S^Ti Aft


\
\0I + 1=1

\-A -l)
= 4>~2 4- VTS~XU.

Therefore, I 4- VTS~1U is invertible if and only if G is invertible.


By construction 0q 2 and AgAg are positive definite and symmetric,
jyrg-i is positive definite and symmetric for all I = 1,2,..., N, since Si
230 Decompositions in Interior Point Algorithms Chap. 8

are positive definite. Hence G± is positive definite and symmetric, being the
sum of positive definite symmetric matrices, and it is invertible. Its inverse
matrix Gf1 is symmetric, has rank no, and can be written as Gf1 =
_i _i _i
G12G12 where Gx 2 is also symmetric. By assumption Aq has full row

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


rank, hence AqG! 2 has full row rank and G2 = -AoGf invertible.
The rank of G is mo + no so that G is invertible. Hence I + V S'-1 U is
invertible and Lemma 8.3.1 can be applied to invert M:

M'1 = (S +WV1 = S'"1 - S'-1[/G“1VTS-1.

This completes the proof. ■


It is easy to verify, using equation (8.66), that the solution of the linear
system (X0AT)A?/ = ty is given by At/ = p - r where p solves the system
Sp = ty, and r is obtained from the system

Gq = VTp and Sr = Uq. (8.67)

The vector p can be computed componentwise by solving Sip1 = , for


I = 1,2,..., N. In order to solve for q we exploit the block structure of G
and write:

Gq = ( _5q 1° ) Q2 ) = 02 ) where 02 ) = VTp. (8.68)-

Hence, we get

Q2 — — X(p2 + AqGx (8.69)


91 = - ^o<72)- (8.70)

Once q is known, r can be computed componentwise from (8.67). The


procedure for calculating A?/ can be summarized as follows. (We adopt
here the notation Ai. for the zth row of A, and A.j for the jth column of
A)

Procedure 8.3.1 Matrix factorization for dual step direction


calculation

Step 1: Solve Sp = ty.


Step 2: (Solve Gq = VTp.)
a. For all 1 = 1,2,..., N, solve Si(u1)1 = for (u1)1, i — 1,2,..., no
thus computing the columns of the matrix SylTi.
b. For all I = 1,2,..., N multiply Tf (ul)\ for i = 1,2,..., no to form
T^Sy1'!}. FormGi (eq. (8.65)). Compute p\p2 (eq. (8.68)).
Sect. 8.3 Parallel Matrix Factorization Procedures 231

c. Solve G\u — p1 for u and set v = p2 + Aqu (eq. (8.69)).


d. Form G2 by solving G\wi — (AJ).i for wi, for i = 1,2,..., mo, and
setting G2 = -Aotw1 w2 • • • wm°].
e. Solve G2q2 = ~v for q2, and solve Giq1 = p1 - AJg2 for q1,

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


(eq. (8.70)).
Step 3: (Solve Sr = Uq.) Set r° = Aog1 4- q2 and for all indices
Z = 1,2, ...,7V solve Sir1 = Tiq1 forr1.
Step 4: (Form Ay.) For I = 1,2,..., N, set Ay1 = pl — rl.

Decompositions for Parallel Computing


Procedure 8.3.1 is well suited for parallel implementation because it relies
largely on matrix (sub)block computations that can be performed indepen­
dent of one another. The computation begins with the submatrices 7), Wi
and Qi and the vector segment located at the Zth processor. Processor I
can compute Si and proceed independently with all computations involving
only local data.
Interprocessor data communication is necessary at only three instances
in the parallel algorithm. After forming the terms Tf S^1!} in Steps 2a-
2b of Procedure 8.3.1 in parallel, the processors communicate to form the
matrix Gi and the vectors p1 and p2 in Step 2b. The results can be accu­
mulated at a single processor, designated as the master. The computations
involving the dense matrix Gi can be done serially on the master processor.
Steps 2d-2e, which involve the dense matrix G2, are also done serially on
the master processor.
The master processor must then broadcast the computed vector q to
all other nodes. The remaining Steps 3 and 4 require only the distributed
data Si, Ti, and pl on the Zth processor and may be carried out with full
parallelism. A final communication step accumulates the partial vectors
Ay1 at the master processor. This vector can then be made available to
all processors for use in the calculation of the directions Ax and Az of an
interior point algorithm.
An alternative parallel implementation of Procedure 8.3.1 would dis­
tribute the matrices Gi and G2 to all processors and let the processors
proceed locally (and redundantly) with all calculations involving these ma­
trices. The approach based on a master processor described above uses
an all-to-one communication step to accumulate the dense matrices at the
master, followed by a one-to-all communication step to distribute the re­
sults to the processors. The alternative approach suggested here combines
these two communication steps into a single all-to-all communication step
that distributes the dense matrices to all processors. Both of these alterna­
tives are very efficient on present day distributed memory machines with
high-bandwidth communication networks (see the results in Section 15.5.1).
Yet another alternative approach is to distribute the dense matrices
232 Decompositions in Interior Point Algorithms Chap. 8

across processors and use parallel dense linear algebra techniques for all
calculations that involve these matrices. This approach is more suitable
to shared-memory, tightly coupled multiprocessors, or when the dense ma­
trices Gi and G2 are large. The parallel procedure that uses the master

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


processor is summarized as follows.

Procedure 8.3.2 Parallel matrix factorization for dual step direction


calculation
Begin with the following data distribution: Processor I holds Si, Ti, and'll)1,
I = 1,2,..., N. A designated master processor also holds Ao, So, ©0, and
if0.
Step 1: (Solve Sp = in parallel.) Solve SoP° — V>0 on the master proce­
ssor. Solve in parallel, on processors I = 1,2,..., N, the system
Sip1 = for pl.
Step 2: (Solve Gq = VTp in parallel.)
a. Solve in parallel, on processors I = 1,2,... ,N, the linear systems
Si(uly = (Ti).i for (uly, 2 = 1,2,..., nQ.
b. Multiply Tf (ulY for i = 1,2,..., no in parallel, on processors I =
1,2,... ,N. Communicate to form Gi and p1 on the master pro­
cessor and compute p2 serially on the master processor.
c. Solve Giu = p1 for u serially on the master processor, and set
v = p2 -J- Aqu.
d. Form G2 by solving serially on the master processor the linear sys­
tems (Gi)w'1 = (A^fi for w1, for 2 = 1,2,..., mo, and by setting
G2 — — Aq[wx w2 • • • wm°].
e. Solve serially, on the master processor, the system G2Q2 = — v for
q2 and the system Giq1 = p1 — A^q2 forq1. Communicate to
distribute q1 to all processors.
Step 3: (Solve Sr — Uq in parallel.) Set r° = AqQ1 4- q2 on the master
processor. Solve the systems Sir1 = Tiq1 for rl in parallel, on all
processors I = 1,2,... ,N.
Step 4: (Form Ap in parallel.) Set A2/0 = p° — r° on the master processor.
Set Ay1 = pl — rl on processors I = 1,2,...,7V. Communicate to
gather the vector Ay on the master processor.
Sections 14.6 and 15.5 give details on the implementation of this pro­
cedure on different parallel architectures and report numerical results.

8.4 Notes and References


The seminal contribution in interior point methods was made by Kar-
markar (1984), where the projective scaling algorithm was proposed and
its polynomial complexity was established. His paper and claims about
Sect. 8.4 Notes and References 233

superior computational performance of this algorithm over the classic sim­


plex algorithm for large-scale problems, fueled the extensive research that
followed. It is now recognized, though, that the closely related affine scal­
ing algorithm was proposed earlier by Dikin (1967, 1974). For a general

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


introduction see Goldfarb and Todd (1989), Gonzaga (1992), Kojima et
al. (1991), and Marsten et al. (1990).
The linear programming software system OBI (Optimization with Bar­
rier 1) is described in Lustig, Marsten, and Shanno (1991). OBN is the
quadratic programming optimization version of OBI. ALPO (Another Lin­
ear Programming Optimizer) is described in Vanderbei (1993), and LOQO
(Linear or Quadratic Optimizer) is described in Vanderbei (1992) and Van­
derbei and Carpenter (1993).

8.1 The primal-dual path following algorithm for linear programs was de­
veloped, independently, by Monteiro and Adler (1989a) and Kojima,
Mizuno, and Yoshise (1989). Earlier work on the primal-dual cen­
tral path was developed by Meggido (1989). Numerical issues for the
efficient implementation of the algorithm were resolved by McShane,
Monma, and Shanno (1989), Lustig, Marsten, and Shanno (1991) and
Vanderbei (1993). See also Choi, Monma, and Shanno (1990) and
Mehrotra (1992).
8.2 The primal-dual path following algorithm for quadratic programming
problems is described in Monteiro and Adler (1989b, 1990), Vanderbei
and Carpenter (1993), and Carpenter et al. (1993).
8.3 For a general discussion of direct matrix inversion methods see House­
holder (1975). Parallel implementation of interior point algorithms
for dense problems is discussed in Qi and Zenios (1994) and Eckstein
et al. (1992). The exploitation of matrix structure in computing the
dual step direction for interior point algorithms has been addressed
in Loute and Vial (1992), Choi and Goldfarb (1993), Czyzyk, Fourer,
and Mehrotra (1995), and De Silva and Abramson (1996). The use
of the Sherman-Morrison-Woodbury updating formula for comput­
ing dual steps for stochastic programs was suggested by Birge and
Qi (1988), and was further extended by Birge and Holmes (1992)
who also performed extensive numerical experiments with this method.
The extension to multistage stochastic programming problems is dis­
cussed in Holmes (1993). Its use for solving multicommodity network
flow problems has been suggested by Choi and Goldfarb (1990, 1993)
and Lustig and Li (1992). The parallel matrix factorization proce­
dure and its implementation on hypercubes and other parallel ma­
chines, was developed by Jessup, Yang, and Zenios (1994a). Yang and
Zenios (1996) discuss the application of the parallel matrix factoriza­
tion procedure within an interior point algorithm for the solution of
large-scale stochastic programs. Ruszczynski (1993) provides a survey
234 Decompositions in Interior Point Algorithms Chap. 8

on the use of interior point methods for solving stochastic program­


ming problems, and Vladimirou and Zenios (1996) provide a survey of
parallel methods for stochastic programming, including a discussion
on the parallelization of interior point methods.

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


Part III

Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024


APPLICATIONS

The sciences do not try to explain, they hardly even try to


interpret, they mainly make models. By a model is meant a
mathematical construct which, with the addition of certain
verbal interpretations, describes observed phenomena. The
justification of such a mathematical construct is solely and
precisely that it is expected to work.
John Von Neumann
Downloaded from https://academic.oup.com/book/53915/chapter/422193749 by OUP site access user on 12 May 2024

You might also like