Presolving in Linear Programming
Erling D. Andersen a,*, Knud D. Andersen b
a Department of Management, Odense University, Campusvej 55, DK-5230 Odense M, Denmark
b Department of Mathematics and Computer Science, Odense University, Campusvej 55, DK-5230 Odense M, Denmark
Received 1 December 1993
Abstract
Most modern linear programming solvers analyze the LP problem before submitting it to optimization. Some examples are the solvers WHIZARD (Tomlin and Welch, 1983), OB1 (Lustig et al., 1994), OSL (Forrest and Tomlin, 1992), Sciconic (1990) and CPLEX (Bixby, 1994). The purpose of the presolve phase is to reduce the problem size and to discover whether the problem is unbounded or infeasible.
In this paper we present a comprehensive survey of presolve methods. Moreover, we discuss the restoration procedure in detail, i.e., the procedure that undoes the presolve. Computational results on the NETLIB problems (Gay, 1985) are reported to illustrate the benefits of presolving.
1. Introduction
Even though computers and LP algorithms have been improved a lot during the last ten
years, it is still important to present an LP problem to a solver in a properly formulated
form. There are some rules a properly formulated LP problem must satisfy: it should contain as few variables, constraints and nonzeros as possible, and it must be well-scaled.
* Corresponding author. E-mail: [email protected]. This author was supported by a Danish SNF Research studentship.
Most presolve procedures apply fast inspection type procedures to detect various forms of redundancy and then repeat these procedures until the problem cannot be reduced further. We use the same strategy.
Another type of presolve procedure has been proposed by Adler et al. [2] and Chang and McCormick [11]. They propose using general linear transformations on the constraints to reduce the number of nonzeros in the problem. This option is not
discussed here. Papers [8,9,22-24] discuss a presolve procedure in combination with the simplex algorithm. Here we use the presolve procedure in combination with an interior-point algorithm. Papers [1,15] already document that a presolve procedure in combination with an interior-point algorithm is beneficial.
The paper is organized as follows. In Section 2 we present our notation. In Section 3 the presolve procedure is described. In Section 4 we discuss how to recover an optimal primal and dual feasible solution to the original problem. In Section 5 we discuss the implementation of the presolve procedure, and in Section 6 computational results are reported.
2. Notation

Any LP problem can be formulated in the following form:

    min  c^T x
    s.t. Ax = b,
         l ≤ x ≤ u,                                            (1)

where x, c, l, u ∈ R^n, b ∈ R^m and A ∈ R^{m×n}. Some of the simple bounds l_j and u_j may be -∞ and +∞, respectively. Define L = {j: l_j > -∞} and U = {j: u_j < +∞}; then the optimality conditions for (1) can be stated as follows:

    Ax* = b,   l ≤ x* ≤ u,
    A^T y* + z* = c,
    z_j*(x_j* - l_j) = 0,  ∀j ∈ L,     z_j*(x_j* - u_j) = 0,  ∀j ∈ U,
    z_j* ≥ 0,  ∀j ∈ L\U,               z_j* ≤ 0,  ∀j ∈ U\L,
    z_j* = 0,  ∀j ∉ L ∪ U,                                     (2)
Table 1
Conclusions that can be reached from the optimality conditions

    l_j      u_j      z_j*     x_j*     Problem status
    > -∞     ?        > 0      l_j
    -∞       ?        > 0      -        unbounded
    ?        < +∞     < 0      u_j
    ?        +∞       < 0      -        unbounded
where y* ∈ R^m and z* ∈ R^n are the Lagrange multipliers corresponding to the linear constraints and the simple bounds in (1), respectively. Note that the superscript "*" denotes optimal values. In Table 1 we have listed the conclusions that can be deduced from the optimality conditions (2). For example, if it can be established that z_j* > 0, then x_j is fixed at its lower bound if it is finite; otherwise the problem is unbounded.
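The conclusions of Table 1 are mechanical enough to state as code. The following C sketch (helper and enum names are ours, and INFINITY from <math.h> is assumed to encode infinite bounds) shows how a presolve routine might act once the sign of z_j* has been established.

```c
#include <math.h>   /* INFINITY */

enum var_action { FIX_AT_LOWER, FIX_AT_UPPER, PROBLEM_UNBOUNDED, NO_ACTION };

/* Conclusions of Table 1: a proven sign of z_j* either fixes x_j at a finite
   bound or shows that the problem is unbounded.                             */
static enum var_action apply_zj_sign(double lj, double uj, int zj_sign)
{
    if (zj_sign > 0)                           /* z_j* > 0  =>  x_j* = l_j   */
        return (lj > -INFINITY) ? FIX_AT_LOWER : PROBLEM_UNBOUNDED;
    if (zj_sign < 0)                           /* z_j* < 0  =>  x_j* = u_j   */
        return (uj <  INFINITY) ? FIX_AT_UPPER : PROBLEM_UNBOUNDED;
    return NO_ACTION;                          /* sign unknown: do nothing   */
}
```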
3. A presolve procedure The computational complexity of simplex and interior-point type LP algorithms is
dependent on the number of constraints and variables in (1). Even more important in
practice is the sparsity of A, because only nonzeros are stored and the work depends on the number of nonzeros. Therefore, a presolve procedure should reduce the size of A without creating any new nonzeros in A. This has the advantage that the storage cost cannot be increased by the presolve procedure. The disadvantage of not allowing the creation of new fill-ins is that general linear transformations on A are not allowed, even though they might be beneficial in some circumstances. However, we allow the presolve procedure to modify the objective function c, the right-hand side b and the simple bounds l and u.
In the following, we assume that A is sparse, i.e., contains less than, say, 30% nonzeros. The presolve procedure might not be advantageous if this assumption is not fulfilled. Furthermore, we assume that all reductions on (1) are applied immediately and therefore we always work on the reduced problem. Whenever we write ∀i and ∀j, it means for all rows and for all columns in the current reduced problem.
The idea of the presolve procedure is to make several passes through A, and in each pass various forms of redundancy are detected and removed. In the following sections we discuss the procedure in detail.
Depending on the bounds on variable x_j and its objective coefficient c_j, variable x_j is fixed at one of its bounds or the problem is unbounded.
A fixed variable:

    ∃j: l_j = u_j.                                             (6)

Variable x_j can be fixed at this common value and removed from the problem.
A singleton row:

    ∃(i,k): a_ij = 0, ∀j ≠ k,  a_ik ≠ 0.                       (7)
The ith constraint fixes variable x_k at level b_i/a_ik. It should be noted that one reduction may lead to further reductions. For example, if A
is a permuted lower triangular matrix, repeated use of (7) solves the problem. To exploit
this possibility, our presolve procedure first counts the number of nonzeros in each constraint. Then, a list containing all singleton constraints is created. As the constraints in this list are used to fix variables, the number of nonzeros for each constraint is updated and new singleton constraints are appended to the list.
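The work-list idea just described might be realized as in the sketch below; the three helper routines are hypothetical stand-ins for lookups in the sparse row-wise/column-wise storage of A, and only the queue discipline is the point.

```c
/* Sketch of the singleton-row work list.  All helpers are assumptions.       */
extern int  row_single_col(int i);          /* the one remaining column of row i      */
extern void fix_variable(int i, int j);     /* x_j := b_i/a_ij; update rhs; drop i, j */
extern int  rows_of_col(int j, int *rows);  /* rows containing column j               */

void eliminate_singleton_rows(int *rowcnt, int *queue, int nq)
{
    int rows[1024];                         /* scratch; real code sizes this properly */
    while (nq > 0) {
        int i = queue[--nq];
        if (rowcnt[i] != 1) continue;       /* the row changed since it was enqueued  */
        int j = row_single_col(i);
        fix_variable(i, j);                 /* the ith constraint fixes x_j           */
        rowcnt[i] = 0;
        int nr = rows_of_col(j, rows);
        for (int t = 0; t < nr; t++) {
            int r = rows[t];
            if (r == i || rowcnt[r] == 0) continue;
            if (--rowcnt[r] == 1)
                queue[nq++] = r;            /* a new singleton row appears            */
        }
    }
}
```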
A column singleton:

    ∃(j,k): a_ij = 0, ∀i ≠ k,  a_kj ≠ 0.                       (8)
If one of the bounds on the column singleton is infinite, then the column singleton imposes a bound on one of the optimal Lagrange multipliers. All such bounds are listed in Table 2. Such bounds will be useful later.
A free column singleton:

    ∃(j,k): a_ij = 0, ∀i ≠ k,  a_kj ≠ 0,  l_j = -∞,  u_j = +∞.     (9)
The substitution

    x_j = (b_k − Σ_{p≠j} a_kp x_p)/a_kj                        (10)

can then be used to eliminate variable x_j and the kth constraint from the problem.
Table 2
Bounds on the optimal Lagrange multiplier y_k* implied by the column singleton a_kj

    l_j      u_j       a_kj > 0             a_kj < 0
    -∞       +∞        y_k* = c_j/a_kj      y_k* = c_j/a_kj
    > -∞     +∞        y_k* ≤ c_j/a_kj      y_k* ≥ c_j/a_kj
    -∞       < +∞      y_k* ≥ c_j/a_kj      y_k* ≤ c_j/a_kj
The free column singleton reduction is very advantageous, because one constraint and one variable are removed from the problem without generating any fill-ins in A, although the objective function is modified. Next we will discuss two methods for generating more free column singletons.
(vii) A doubleton equation combined with a column singleton:

    ∃(i,j,k): a_ij x_j + a_ik x_k = b_i,  j ≠ k,  a_ij ≠ 0,  a_ik ≠ 0.      (11)
If variable x_k is a column singleton, then the bounds on the variable x_j are modified so that the feasible region is unchanged even if the bounds on x_k are removed. Afterwards variable x_k can be substituted out of the problem.
It is also possible in some cases to establish that the bounds on a column singleton are redundant. An example is

    x_j − Σ_{k≠j} x_k = 0,   x_j, x_k ≥ 0,                     (12)

where the bound x_j ≥ 0 is implied by the constraint and the bounds on the other variables.
In general, an equality constraint i induces the implied bounds

    l_j' = (b_i − Σ_{k∈P_ij} a_ik u_k − Σ_{k∈M_ij} a_ik l_k)/a_ij,   a_ij > 0,
           (b_i − Σ_{k∈P_ij} a_ik l_k − Σ_{k∈M_ij} a_ik u_k)/a_ij,   a_ij < 0,      (13)

and

    u_j' = (b_i − Σ_{k∈P_ij} a_ik l_k − Σ_{k∈M_ij} a_ik u_k)/a_ij,   a_ij > 0,
           (b_i − Σ_{k∈P_ij} a_ik u_k − Σ_{k∈M_ij} a_ik l_k)/a_ij,   a_ij < 0,      (14)
where M_ij = {k: a_ik < 0, k ≠ j} and P_ij = {k: a_ik > 0, k ≠ j}. It can be verified that l_j' ≤ x_j ≤ u_j' for any feasible solution x to (1). If these new bounds are at least as tight as the original ones, the variable is said to be implied free.
(viii) An implied free column singleton:
    ∃(j,k): a_ij = 0, ∀i ≠ k,  a_kj ≠ 0,  l_j ≤ l_j',  u_j' ≤ u_j.      (15)
It is assumed that l_j' and u_j' are computed according to (13) and (14). The variable x_j can then be treated as a free column singleton (see (9)). Detection of column singletons has been used in other presolve procedures [8, pp. 149-152], but, at least to our knowledge, a systematic check whether a column singleton is implied free has not previously been proposed. In the case of a doubleton equation, Bradley et al. [8] propose always modifying the simple bounds on one of the variables such that the other one becomes free. Afterwards, the free variable is substituted out of the problem. We have not implemented this more general procedure, because it might cause fill-ins in A.
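As an illustration, the test (13)-(15) for a single nonzero a_ij of an equality row could be coded as follows; the flat row arrays and the dense bound vectors are our assumptions, not the authors' data structures.

```c
#include <math.h>   /* INFINITY */

/* Implied bounds (13)-(14) on x_j from equality row i, and the implied-free
   test (15).  cols/vals hold the nonzeros of row i, aij is the entry of
   column j, and l, u are the dense bound vectors.                           */
static int implied_free(double bi, double aij, int j, const int *cols,
                        const double *vals, int nz,
                        const double *l, const double *u)
{
    double lo = bi, hi = bi;                  /* b_i minus the other terms   */
    for (int t = 0; t < nz; t++) {
        int k = cols[t];
        double a = vals[t];
        if (k == j) continue;
        if (a > 0) { lo -= a * u[k]; hi -= a * l[k]; }
        else       { lo -= a * l[k]; hi -= a * u[k]; }
    }
    /* lo <= a_ij x_j <= hi; divide by a_ij, minding its sign.               */
    double lprime = (aij > 0) ? lo / aij : hi / aij;
    double uprime = (aij > 0) ? hi / aij : lo / aij;
    /* (15): x_j is implied free if the implied bounds are at least as tight. */
    return (l[j] <= lprime) && (uprime <= u[j]);
}
```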
For each constraint i define

    g_i = Σ_{j∈P_i} a_ij l_j + Σ_{j∈M_i} a_ij u_j   and   h_i = Σ_{j∈P_i} a_ij u_j + Σ_{j∈M_i} a_ij l_j,      (16)

where P_i = {j: a_ij > 0} and M_i = {j: a_ij < 0}. Clearly,

    g_i ≤ Σ_j a_ij x_j ≤ h_i                                   (17)

for any solution x that satisfies the simple bounds. The following possibilities can occur.
An infeasible constraint:

    ∃i: g_i > b_i ∨ h_i < b_i.                                 (18)

A forcing constraint:

    ∃i: g_i = b_i ∨ h_i = b_i.                                 (19)
If g_i = b_i, then due to linearity the only feasible value of x_j is l_j (u_j) if a_ij > 0 (a_ij < 0). Therefore, we can fix all variables in the ith constraint. A similar statement can be made if h_i = b_i. It is highly advantageous to get rid of forcing constraints, because all variables in them are structurally degenerate.
In the previous section, (13) and (14) were developed to compute implied bounds on variable x_j. These bounds in combination with the original bounds can be used to detect more free column singletons. A procedure using this method is called a dominated constraint procedure. (The term "dominated" is proposed in [19].)
In order to limit the computational cost of the dominated constraint procedure, we only compute l_j' and u_j' for each a_ij ≠ 0 if either g_i or h_i is finite. Furthermore, g_i and h_i are computed during the forcing constraint check and they can be reused to compute
    u_j' = (b_i − g_i)/a_ij + l_j,   a_ij > 0,
           (b_i − h_i)/a_ij + l_j,   a_ij < 0,                 (20)

and

    l_j' = (b_i − h_i)/a_ij + u_j,   a_ij > 0,
           (b_i − g_i)/a_ij + u_j,   a_ij < 0.                 (21)
It should be noted that the implied bounds are only used to detect more free column singletons. Such a procedure is also described in [9] (see also [24]) and our procedure is similar to that, although in [9] the procedure is executed more often than we propose. We limit the number of times the procedure is executed, because we have found it to be computationally expensive for some dense problems.
The only other reference to a dominated constraint procedure we know of is [19], i.e., it is implemented in the LP solver OB1. However, OB1's dominated constraint procedure seems not to be as general as the procedure proposed in this section.
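For concreteness, the row-activity computation behind (16)-(19) might look as follows in C; the tolerance and the array layout are assumptions. The infeasibility test is the immediate consequence of (17).

```c
#include <math.h>   /* fabs */

enum row_check { ROW_OK, ROW_INFEASIBLE, ROW_FORCING_LOW, ROW_FORCING_UP };

/* g_i and h_i of (16) for one equality row and the tests based on (17)-(19).
   cols/vals hold the nonzeros of the row; l, u are the dense bound vectors. */
static enum row_check check_row_activity(double bi, const int *cols,
                                          const double *vals, int nz,
                                          const double *l, const double *u,
                                          double tol)
{
    double g = 0.0, h = 0.0;
    for (int t = 0; t < nz; t++) {
        double a = vals[t];
        int k = cols[t];
        if (a > 0) { g += a * l[k]; h += a * u[k]; }
        else       { g += a * u[k]; h += a * l[k]; }
    }
    if (g > bi + tol || h < bi - tol)
        return ROW_INFEASIBLE;          /* b_i outside [g_i, h_i], cf. (17) */
    if (fabs(g - bi) <= tol)
        return ROW_FORCING_LOW;         /* g_i = b_i: forcing constraint    */
    if (fabs(h - bi) <= tol)
        return ROW_FORCING_UP;          /* h_i = b_i: forcing constraint    */
    return ROW_OK;
}
```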
The first step of the dominated column procedure is to find all column singletons and use them to compute implied lower and upper bounds on the optimal Lagrange multipliers y*. How this is done has been treated in Section 3.2. Let y̲_i and ȳ_i be the maximum and minimum, respectively, of all such lower and upper bounds, i.e., we have

    y̲_i ≤ y_i* ≤ ȳ_i,   ∀i.                                   (22)

These simple bounds on the Lagrange multipliers can be used to fix some of the primal variables. Define
    e_j = Σ_{i∈P_j} a_ij y̲_i + Σ_{i∈M_j} a_ij ȳ_i   and   d_j = Σ_{i∈P_j} a_ij ȳ_i + Σ_{i∈M_j} a_ij y̲_i,      (23)

where P_j = {i: a_ij > 0} and M_j = {i: a_ij < 0}. It follows that
    e_j ≤ Σ_i a_ij y_i* ≤ d_j.                                 (24)
From the optimality conditions (2) and (24) it can be deduced that
    c_j − d_j ≤ z_j* = c_j − Σ_i a_ij y_i* ≤ c_j − e_j.        (25)
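The quantities (23)-(25) amount to a single scan of column j, as the C sketch below illustrates; the bound arrays correspond to y̲ and ȳ of (22), with unknown bounds stored as infinities, and all names are ours.

```c
#include <math.h>   /* INFINITY */

/* e_j and d_j of (23) for one column, and the resulting range (25) on z_j*.
   ylo[i] <= y_i* <= yup[i] are the multiplier bounds of (22).               */
static void zj_range(double cj, const int *rows, const double *vals, int nz,
                     const double *ylo, const double *yup,
                     double *zlo, double *zup)
{
    double e = 0.0, d = 0.0;
    for (int t = 0; t < nz; t++) {
        double a = vals[t];
        int i = rows[t];
        if (a > 0) { e += a * ylo[i]; d += a * yup[i]; }
        else       { e += a * yup[i]; d += a * ylo[i]; }
    }
    *zlo = cj - d;                /* z_j* >= c_j - d_j                       */
    *zup = cj - e;                /* z_j* <= c_j - e_j                       */
    /* A strictly positive *zlo fixes x_j at l_j (or shows the problem is
       unbounded if l_j = -inf); a strictly negative *zup fixes x_j at u_j.  */
}
```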
(xi) A dominated column:

    ∃j: c_j − d_j > 0 ∨ c_j − e_j < 0.                         (26)

If c_j − d_j > 0, then by (25) z_j* > 0 and variable x_j can be fixed at its lower bound. However, if l_j = −∞, the problem is unbounded. An analogous conclusion can be reached if c_j − e_j < 0.
(xii) A weakly dominated column:

    ∃j ∉ S: c_j − d_j = 0 ∧ l_j > −∞,                          (27)

or

    ∃j ∉ S: c_j − e_j = 0 ∧ u_j < +∞,                          (28)
where S = {j: a_ij is a column singleton for some i}. Note that the column singletons are used to generate the bounds d_j and e_j and, therefore, they cannot be dropped with this test. In the following proposition we prove that in the cases (27) and (28) the variable x_j can be fixed at its lower and upper bound, respectively.

Proposition 3.1. Assume (1) has an optimal solution. If (27) or (28) holds, then there exists an optimal solution such that x_j* = l_j or x_j* = u_j, respectively.
Proof. Assume (27) holds. Furthermore assume without loss of generality that l = 0, u = ∞ and j = n. Then the dual problem of (1) is

    max  b^T y   s.t.  A^T y + s = c,   s ≥ 0.                 (29)

Given the assumptions of the proposition, (29) has an optimal solution. Therefore there also exists an optimal solution x̄ ∈ R^{n−1} to (1) with the last variable dropped, which implies

    A(x̄, 0) = b   and   x̄ ≥ 0.                                (30)
    ∃j ∉ S: (c_j − e_j = 0 ∧ u_j = +∞) ∨ (c_j − d_j = 0 ∧ l_j = −∞).      (31)
In this case z_j* = 0, because, for example, z_j* ≤ c_j − e_j = 0 while the sign conditions in (2) give z_j* ≥ 0. By (23) and (25) this forces every multiplier appearing in column j to be at one of its bounds, i.e.,

    y_k* = y̲_k,  a_kj > 0,      y_k* = ȳ_k,  a_kj < 0,      ∀k: a_kj ≠ 0.      (32)

A similar conclusion holds in the other case of (31).
If −y_k* times the kth constraint is subtracted from the objective function when a_kj ≠ 0, the kth constraint can be removed from the problem. We have not implemented this method because we believe it occurs rarely.
Finally, the bounds e_j and d_j are used to compute new bounds on the optimal Lagrange multipliers y* in the following manner. Assume that l_j > −∞ and u_j = +∞. It follows that
    c_j ≥ Σ_i a_ij y_i* ≥ a_kj y_k* + e_j − a_kj y̲_k,   a_kj > 0.          (33)

If a_kj ≠ 0, we have

    y_k* ≤ (c_j − e_j)/a_kj + y̲_k,   a_kj > 0,                (34)

or

    y_k* ≥ (c_j − e_j)/a_kj + ȳ_k,   a_kj < 0.                 (35)
Similarly, if l_j = −∞ and u_j < +∞, then z_j* ≤ 0 implies

    y_k* ≥ (c_j − d_j)/a_kj + ȳ_k,   a_kj > 0,                 (36)

or

    y_k* ≤ (c_j − d_j)/a_kj + y̲_k,   a_kj < 0.                 (37)
Note that we can assume c_j − e_j ≥ 0; otherwise (26) is the case. Observing that (c_j − e_j)/a_kj ≥ 0 for a_kj > 0, it follows that

    y̲_k ≤ y_k* ≤ (c_j − e_j)/a_kj + y̲_k,                      (38)
so the new bounds on the Lagrange multiplier cannot be inconsistent. Similarly all the
In each iteration of an interior-point algorithm a certain linear system must be solved. This linear system has a singular matrix if A contains linearly dependent rows.
A simple form of linearly dependent rows is when two rows are identical except for a scalar multiplier. Two such rows are called duplicate by Tomlin and Welch [23]. However, they treat slack variables separately and therefore we use a generalized version of the definition:

    ∃(i,k,v): i ≠ k,  a_ij = v a_kj,  ∀j ∉ S,                  (39)

where S = {j: a_ij is a column singleton for some i} and v is a scalar multiplier. Assume the ith and the kth row are duplicate; by definition we have

    Σ_{j∉S} a_ij x_j + Σ_{j∈S} a_ij x_j = b_i,
    Σ_{j∉S} a_kj x_j + Σ_{j∈S} a_kj x_j = b_k.                 (40)

If 1/v times the ith row is subtracted from the kth row, we obtain

    Σ_{j∈S} (a_kj − a_ij/v) x_j = b_k − b_i/v,                 (41)

so the transformed kth row contains nonzeros only in columns belonging to S. The following possibilities may then occur.
(xiv) (a) The kth row is empty, (b) the kth row contains an implied free variable or (c) the kth row is a doubleton equation containing a singleton column.
All these cases have already been treated.
In this case we have two independent problems which can be solved separately. We believe this is a rare occurrence and it is not implemented in our presolve procedure. These checks should be done twice: first with the ith row subtracted from the kth and then vice versa.
The main objection against finding all duplicate rows is, of course, that it may be computationally too expensive. However, if the problem is sparse, we have found that not to be the case. We find all the duplicate rows with a method similar to that proposed by Tomlin and Welch [23, p. 8]. The method consists of two phases. First, rows with the same sparsity pattern in nonsingleton columns are found, because if two rows are duplicate, their sparsity patterns must be identical. This phase is completed in O(nz(A)) operations and in practice it is very fast. In the second phase the duplicate rows are detected within each group of rows with the same sparsity pattern. The second phase is typically more expensive than the first one, because it involves floating-point operations. In [23] the two phases are combined, but we have separated them because on a sparse problem, typically, very few rows have the same sparsity pattern. Thereby the computational cost of the second phase is minimized.
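A C sketch of the two-phase test: the first phase hashes the column-index pattern of each row (restricted to nonsingleton columns) so that only rows in the same bucket are compared, and the second phase checks numerically for a common multiplier v. The hash constant and the tolerance are assumptions.

```c
#include <math.h>     /* fabs */

/* Phase 1: hash the sparsity pattern (column indices) of a row.  Rows whose
   hashes differ cannot be duplicate; equal hashes still require comparing
   the index lists.                                                          */
static unsigned long pattern_hash(const int *cols, int nz)
{
    unsigned long h = 0;
    for (int t = 0; t < nz; t++)
        h = h * 2654435761UL + (unsigned long)cols[t] + 1UL;  /* Knuth-style mix */
    return h;
}

/* Phase 2: rows with identical patterns (same cols[], same nz, same order)
   are duplicate iff row2 = v * row1 for a single scalar v.                  */
static int rows_duplicate(const double *v1, const double *v2, int nz,
                          double tol, double *v_out)
{
    double v = v2[0] / v1[0];
    for (int t = 1; t < nz; t++)
        if (fabs(v2[t] - v * v1[t]) > tol * (1.0 + fabs(v2[t])))
            return 0;
    *v_out = v;
    return 1;
}
```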
Table 3
Bounds on the Lagrange multipliers when two columns are duplicate

    l_k      u_k       z_k*     v > 0                    v < 0
    -∞       +∞        = 0      z_j* = c_j − v c_k        z_j* = c_j − v c_k
    > -∞     +∞        ≥ 0      z_j* ≥ c_j − v c_k        z_j* ≤ c_j − v c_k
    -∞       < +∞      ≤ 0      z_j* ≤ c_j − v c_k        z_j* ≥ c_j − v c_k
3.6. Duplicate columns

The dual of duplicate rows is duplicate columns, i.e., columns that are identical except for a scalar multiplier. Even though the number of duplicate columns is expected to be modest, there exist classes of LP problems which contain a large number of them.
Duplicate columns are defined by

    ∃(j,k,v): a_ij = v a_ik,  ∀i,  j ≠ k,                      (43)

where v is a scalar multiplier. From the optimality conditions (2) we obtain

    z_j* = c_j − Σ_i a_ij y_i* = c_j − v Σ_i a_ik y_i* = c_j − v c_k + v z_k*.      (44)
Using (44), it is possible to generate bounds on the optimal Lagrange multipliers. All such bounds are listed in Table 3.
    ∃(j,k,v): a_ij = v a_ik, ∀i,  c_j − v c_k ≠ 0.             (45)
Using Table 3, it is possible to establish a bound on z_j*. This bound is used to fix variable x_j or to detect that the problem is dual infeasible.
Let us study the part of the problem (1) which amounts to the two duplicate columns. We have

    min  ... + c_j x_j + ... + c_k x_k + ...
    s.t. ... + a_·j x_j + ... + a_·k x_k + ... = b,
         l_j ≤ x_j ≤ u_j,   l_k ≤ x_k ≤ u_k,

and assume

    c_j − v c_k = 0.                                           (46)

Define

    x_k' = x_k + v x_j.                                        (47)

Using a_·j = v a_·k and (46), the objective and constraint contributions satisfy

    c_j x_j + c_k x_k = c_k x_k'   and   a_·j x_j + a_·k x_k = a_·k x_k',      (48)

so the two columns can be replaced by the single variable x_k', whose value must satisfy

    l_k ≤ x_k' − v x_j ≤ u_k,   l_j ≤ x_j ≤ u_j.               (49)
Table 4
Replacement bounds

             v > 0              v < 0
    l_k'     l_k + v l_j        l_k + v u_j
    u_k'     u_k + v u_j        u_k + v l_j
Variables x_j and x_k can be replaced by the variable x_k' if its bounds are specified as shown in Table 4. In summary, whenever two columns are duplicate and (46) holds, the duplicate column procedure modifies the bounds on variable x_k according to Table 4 and removes variable x_j from the problem.
Note that the duplicate column procedure will detect all split free variables, i.e., free variables modeled as the difference of two nonnegative variables. Such variables cause serious numerical difficulties in a primal-dual interior-point algorithm. The reason is that the set of optimal primal solutions is unbounded and the interior of the dual feasible region is empty. If the split variables are combined into free variables, the problem can normally be solved.
4. Postsolve
After problem (1) has been presolved, and if it is not detected to be infeasible or unbounded, the reduced problem must be solved by an LP algorithm. In general, the optimal solution to the reduced problem is not feasible for the original problem. Therefore, a restoration procedure is needed to "undo" the presolving. There are two kinds of restoration procedures. One of them solves the original problem by the simplex algorithm, using the optimal solution to the reduced problem as a hot start. The other type of restoration procedure is a reverse presolve procedure. The first method has been used by Williams [24], but can be very inefficient. Observing this, Tomlin and Welch [21] proposed the second procedure, calling it a "postsolve". The first type of restoration procedure is excluded in our case because we use an interior-point based LP algorithm. We will therefore not discuss this approach further.
The approach of the postsolve procedure is first to undo the last reduction made by the presolve procedure in such a manner that if the primal and dual solutions to the reduced problem are optimal and feasible, then the restored solution is also optimal and feasible. This process is then repeated until the optimal solution to the original problem is recovered. To do the postsolving correctly and efficiently, it is necessary that the presolve procedure store some information on a presolve "stack". In the following we assume that all the necessary information is available from the presolve stack. It should be noted that whenever we write Σ_i y_i* a_ij, it means that the sum is over the rows present in the current reduced problem.
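One possible realization of such a stack is a tagged record per reduction that is popped in reverse order; the layout below is only a guess at what the stack might store, not the authors' actual data structure.

```c
#include <stddef.h>

enum reduction_kind {
    RED_EMPTY_ROW, RED_SINGLETON_ROW, RED_FREE_COL_SINGLETON,
    RED_DOUBLETON, RED_FORCING_ROW, RED_DUPLICATE_ROW, RED_DUPLICATE_COL
};

/* One presolve-stack entry: just enough data to undo one reduction.        */
struct reduction {
    enum reduction_kind kind;
    int    i, j, k;          /* row/column indices involved                 */
    double a, b, c, v;       /* saved coefficient, rhs, cost, multiplier    */
};

/* Postsolve: undo the reductions in reverse (LIFO) order, so that each undo
   already sees the primal/dual values restored by the later reductions.    */
void postsolve(const struct reduction *stack, size_t top,
               double *x, double *y, double *z)
{
    while (top > 0) {
        const struct reduction *r = &stack[--top];
        switch (r->kind) {
        case RED_SINGLETON_ROW:
            /* x_j already fixed; recover y_i from (50) and set z_j = 0.    */
            break;
        case RED_FREE_COL_SINGLETON:
            /* recover x_j from (10); y_i = c_j / a_ij, z_j = 0.            */
            break;
        default:
            /* the remaining cases follow the rules derived in this section. */
            break;
        }
        (void)x; (void)y; (void)z;               /* bodies sketched out only */
    }
}
```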
An empty row. No primal postsolving is necessary and y_i* can be set to any value, for example 0.
A singleton row. Assume a_ij is the only element in the ith row. No primal postsolving is necessary, because x_j is fixed at its optimal value. The values

    y_i* = (c_j − Σ_{k≠i} y_k* a_kj)/a_ij   and   z_j* = 0     (50)

are dual feasible.
A free column singleton or an implied free column singleton. Assume the last reduction made by the presolve procedure is to use the column singleton a_ij to eliminate variable x_j and the ith constraint. The optimal value x_j* can be computed using (10). The dual optimal values are given by y_i* = c_j/a_ij and z_j* = 0.
A doubleton equation combined with a column singleton. In this case the relations
    a_ij x_j + a_ik x_k = b_i,   l_j ≤ x_j ≤ u_j,   l_k ≤ x_k ≤ u_k      (51)
are replaced by
    a_ij x_j + a_ik x_k = b_i   and   l̂_k ≤ x_k ≤ û_k,        (52)
where a_ij is a column singleton, and l̂_k and û_k are chosen so that the sets of primal feasible solutions to (51) and (52) are identical; variable x_j is then substituted out of the problem. The optimal x_j* is given by
    x_j* = (b_i − a_ik x_k*)/a_ij.                             (53)
Let z̄_k* be the optimal Lagrange multiplier corresponding to x_k obtained from the reduced problem. The optimal Lagrange multiplier y_i* is not known. However, we have

    z̄_k* = c_k − a_ik y_i* − Σ_{p≠i} y_p* a_pk = c_k − Σ_p y_p* a_pk,      (54)

where y_i* = c_j/a_ij. Eq. (54) follows as a result of x_j being substituted out of the problem. This value for y_i* is not necessarily optimal and feasible if x_k* = l̂_k > l_k or x_k* = û_k < u_k. In these cases the dual optimal values are
    y_i* = (c_k − Σ_{p≠i} y_p* a_pk)/a_ik   and   z_k* = 0.    (55)
A forcing constraint. A forcing constraint forces its variables to be at one of their bounds. Assume, for simplicity, that the ith constraint forces all its variables to be at their lower bound. This implies that either a_ij ≥ 0 for all j or a_ij ≤ 0 for all j. The Lagrange multiplier y_i* must be chosen such that

    z_j* = c_j − Σ_{p≠i} y_p* a_pj − y_i* a_ij ≥ 0             (56)
for all j where a_ij ≠ 0. If a_ij > 0 for all j, then a feasible y_i* is given by

    y_i* = min_{j∈{k: a_ik>0}} (c_j − Σ_{p≠i} y_p* a_pj)/a_ij,      (57)
and if a_ij < 0 for all j, by

    y_i* = max_{j∈{k: a_ik<0}} (c_j − Σ_{p≠i} y_p* a_pj)/a_ij.      (58)
Note that there always exists a feasible value for y_i*. Indeed, there is an infinite number of feasible values which can be assigned to y_i*. We have treated a special case, but it is easy to generalize it to arbitrary bounds and constraints. Postsolving of a forcing constraint has been treated in [21].
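The ratio test (57)-(58) for the special case treated above might be coded as follows, assuming the reduced costs excluding row i have already been computed; the array layout is an assumption.

```c
/* Ratio test (57)-(58).  a[t] is the t-th nonzero a_ij of the forcing row i
   (all of one sign) and rc[t] = c_j - sum_{p != i} y_p* a_pj.              */
static double forcing_row_multiplier(const double *a, const double *rc, int nz)
{
    double yi = rc[0] / a[0];
    for (int t = 1; t < nz; t++) {
        double r = rc[t] / a[t];
        if (a[0] > 0 ? (r < yi) : (r > yi))   /* min for a_ij > 0, max for < 0 */
            yi = r;
    }
    return yi;
}
```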
Removing a duplicate row. The duplicate row procedure replaces the constraints Ax = b by the equivalent constraints MAx = Mb, where M is an elementary matrix. For the Lagrange multipliers this means

    A^T y = A^T M^T ŷ,   i.e.,   y = M^T ŷ,                    (59)

where y and ŷ are the Lagrange multipliers in the original and transformed problem, respectively. When the optimal ŷ* to the reduced problem is known, y* can be computed by multiplying ŷ* with the transpose of M. Note that M can be represented by two indices and a real, because it is an elementary matrix (one row is added to another).
Fixing a duplicate column. Same as a fixed variable.
Replacement of two duplicate columns by one. Assume that the columns a_·j and a_·k are duplicate and that they have been replaced by the variable x_k' in the reduced problem. Then the optimal values x_j* and x_k* must satisfy

    x_k'* = x_k* + v x_j*,   l_j ≤ x_j* ≤ u_j,   l_k ≤ x_k* ≤ u_k,      (60)

where x_k'* is the optimal value obtained from the reduced problem. Any values for x_j* and x_k* which satisfy (60) are primal feasible and optimal, but the values should be chosen such that the complementary slackness conditions are satisfied.
We have now outlined how to recover both primal and dual optimal feasible solutions when an optimal solution to the reduced problem is known. It should be noted that the amount of work spent in the postsolve phase is in all cases modest.
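A primal split satisfying (60) can be computed directly, as in the sketch below (x_k'* is called xbar); the sketch assumes the bound used as the starting point is finite, and the complementarity-aware choice of the split discussed above is omitted.

```c
/* Split the aggregate optimal value xbar = x_k* + v*x_j* of (60) back into
   x_j* and x_k*, keeping both inside their bounds.  Feasibility of some
   split is guaranteed by the presolve reduction.                            */
static void split_duplicate_columns(double xbar, double v,
                                    double lj, double uj,
                                    double lk, double uk,
                                    double *xj, double *xk)
{
    double t = (v > 0.0) ? lj : uj;      /* x_j value giving the largest x_k */
    double rest = xbar - v * t;          /* implied value of x_k             */
    if (rest > uk)                       /* too large: move the excess to x_j */
        t += (rest - uk) / v;
    *xj = t;
    *xk = xbar - v * t;
    (void)lk;                            /* lower bound met by feasibility   */
}
```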
5. Implementation

In this section we discuss our implementation of the presolve procedure. The presolve procedure assumes that A is stored sparse in a row-wise and a column-wise format and that the objective function c and the simple bounds l and u are stored in dense vectors. Furthermore, a Boolean flag is attached to each constraint and variable. This flag has the value "false" if the constraint or the variable has been eliminated from the problem. Whenever a constraint or a variable is flagged off, it is treated as nonexistent by the presolve procedure. Therefore, during the presolve phase, it is not necessary to change the data structure of A. The presolve procedure can be outlined as follows.
Procedure Presolve
Remove all fixed variables.
repeat
    Check rows: remove empty and singleton rows, detect infeasible and forcing constraints
    Remove all free and implied free column singletons and
        all column singletons in combination with a doubleton equation
    Dominated columns
    Duplicate rows
    Duplicate columns
until (no reductions in last pass)
Remove all empty rows and columns
end Presolve.

The interpretation of the above presolve procedure is that each subprocedure makes a pass through A. During the pass the redundancy is discovered and removed. The presolve procedure is terminated when it is not possible to reduce the problem further. Finally, empty rows and columns are removed. Furthermore, the procedure is terminated if it discovers that the problem is either primal or dual infeasible.
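In C, the working representation described at the beginning of this section might look roughly as follows; all field names are illustrative only.

```c
/* Sketch of the working representation used during presolve: A is stored both
   row-wise and column-wise, c, b, l, u are dense, and a flag per row/column
   marks whether it is still part of the reduced problem.                     */
typedef struct {
    int     m, n;              /* number of rows and columns                 */
    int    *rbeg, *rcols;      /* row-wise storage: row starts, column idx   */
    double *rvals;
    int    *cbeg, *crows;      /* column-wise storage: col starts, row idx   */
    double *cvals;
    double *c, *b, *l, *u;     /* objective, right-hand side, simple bounds  */
    char   *row_alive;         /* 0 = constraint eliminated by presolve      */
    char   *col_alive;         /* 0 = variable eliminated by presolve        */
} presolve_problem;
```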
During the presolve phase the procedure informs the user of the kind of reductions it
makes. (This is optional.) This information may be useful for the user to improve the
model formulation.
6. Computational results
Can the proposed presolve procedure really reduce the size of an LP problem and is the presolve phase worthwhile? These are the main questions we will answer with a
test. Let us first outline the test environment. The presolve procedure is used as a front-end procedure to Mehrotra's predictor-corrector algorithm, see [17,20]. The predictor-corrector algorithm is extended with the global convergence ideas of [14] (see also [18]). The most expensive operation in the primal-dual algorithm is the solution of a least-squares problem in each iteration. This is solved using a supernodal column Cholesky decomposition and the ordering is obtained with the multiple minimum-degree heuristic (see [16] for a discussion of details). The source code is written in ANSI C and all time results are obtained with the C procedure "clock", which measures the CPU time. The benefit of the presolve phase also depends on the computer architecture, because the presolve phase consists of operations which cannot be vectorized. Therefore, the relative benefit of the presolve phase is expected to be greater in a workstation environment than in a vector computer environment.
In the first part of the test, all the small NETLIB problems [13] were solved on the IBM RISC 6000/230 computer. By "small" we mean that they can be solved fast on
the workstation.
In Table 5 the optimal primal objective value and the quantities XFEAS and ZFEAS for each problem are shown, where

    XFEAS = max_j { max(l_j − x_j*, 0), max(x_j* − u_j, 0) },                (61)

    ZFEAS = max { max_{j∈L\U} max(−z_j*, 0), max_{j∈U\L} max(z_j*, 0), max_{j∉L∪U} |z_j*| }.      (62)
XFEAS and ZFEAS measure the maximal violation by the primal and dual variables of their simple bounds. In the case of an equality constraint in the primal or dual problem, it is assumed that the constraint is formulated with a fixed slack variable. Moreover, Table 5 shows the number of significant figures in the optimal objective value in the column SIGF. SIGF is computed as follows:

    SIGF = −lg( |primal obj. − dual obj.| / (1 + |dual obj.|) ).             (63)
We want to point out that the results are reported after the postsolve has been done, i.e., for the original problem.