I am trying to parallelize an optimization routine using MPI. The structure of the program is roughly as in the block diagram at the end of the text. Data is fed to the optimization routine, which calls an objective function subroutine and another subroutine that calculates a matrix called the "Jacobian". The optimization routine iterates as many times as needed to reach a minimum of the objective function and exits with a result. The Jacobian is used to decide in which direction the minimum might lie and to take a step in that direction.

I don't have control over the optimization routine; I only supply the objective function and the function calculating the Jacobian. Most of the time is spent calculating the Jacobian. Since each matrix element of the Jacobian is independent of the rest, it looks like a good candidate for parallelization. However, I haven't been able to accomplish this.

Initially I thought I could distribute the calculation of the Jacobian over a large number of nodes, each of which would calculate only some of the matrix elements. I did that, but after just one iteration all the threads on the worker nodes exit and the program stalls; the sketch below shows the structure I tried. I am starting to think that without the source code of the optimization routine this might not be possible: distributing the code over multiple nodes and instructing them to calculate only a fraction of the Jacobian messes up the optimization on all of them except the master.

Is there a way around this, using MPI and without touching the code in the optimization routine? Can the function calculating the Jacobian be executed on all nodes except the master? How would you do this?
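Schematically, what I tried looks like the fragment below (C + MPI; run_optimizer() and compute_jacobian_slice() are placeholders for my real routines):

    /* Simplified sketch of my current structure; run_optimizer() and
       compute_jacobian_slice() stand in for the real routines. */
    #include <mpi.h>

    static void run_optimizer(void)
    {
        /* black box: iterates and calls the Jacobian routine each time */
    }

    static void compute_jacobian_slice(int rank, int size)
    {
        /* this rank's subset of the matrix elements */
    }

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            run_optimizer();                     /* keeps iterating */
        } else {
            compute_jacobian_slice(rank, size);  /* runs only once! */
        }
        /* after one iteration the workers reach MPI_Finalize, so the
           master has nobody left to compute with and the program stalls */
        MPI_Finalize();
        return 0;
    }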
-
Are you computing the Jacobian by symbolically taking partial derivatives of your objective function, which are then evaluated at your current guess for the optimal input vector, or are you computing it numerically by evaluating the objective function for small deviations around your current guess? – lockcmpxchg8b, Dec 14, 2017 at 1:49
-
@lockcmpxchg8b There is no analytic expression for the objective function, so I calculate the derivatives numerically. The calculations are lengthy and time-consuming, which is why I am looking to parallelize them. MPI looked like a good candidate, but I am not sure it can work in this case, since I don't have access to the source code of the Optimization Routine. Perhaps I'll have to write my own OR. – Plamenk, Dec 14, 2017 at 14:59
-
@High Performance Mark Thank you for the suggestion. I might post the question there, but it looks like the focus there is on the algorithms, not on the performance. – Plamenk, Dec 14, 2017 at 15:03
-
If you do end up implementing your own OR, I found web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf to be very good. I implemented sequential quadratic programming with equality and inequality constraints based on this book, for a mildly non-linear optimization problem. – lockcmpxchg8b, Dec 14, 2017 at 18:40
-
I would think that MPI could be effective for computing each element of the Jacobian, particularly if evaluating the objective function is itself very costly. There are functions in MPI for managing updates to a shared 2D array, which could serve as the storage for your Jacobian matrix. A lot of ORs let you provide the routine that computes the Jacobian, so you might be able to hook that without seeing the rest of the OR. – lockcmpxchg8b, Dec 14, 2017 at 18:43
1 Answer
It turned out to be easier than I thought. As explained in the question, the worker threads were exiting after just one iteration. The solution is to enclose the workers' part of the Jacobian calculation in an infinite while loop, and to break out of it with a message sent from the main thread (master) once the optimization routine exits with the answer.
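A minimal self-contained sketch of that scheme in C + MPI follows. The objective function, the problem size N, the message tags, and the round-robin split of the matrix elements are placeholders of mine, not part of the original code; in the real program the master branch would call the black-box optimizer with jacobian() registered as its callback.

    /* Workers loop forever, serving Jacobian requests, until the
       master sends TAG_STOP after the optimizer has finished. */
    #include <mpi.h>

    #define N        4       /* number of parameters (placeholder)    */
    #define TAG_WORK 1       /* master -> workers: compute a Jacobian */
    #define TAG_STOP 2       /* master -> workers: optimizer is done  */

    /* Stand-in for the expensive objective function. */
    static double objective(const double *x)
    {
        double s = 0.0;
        for (int i = 0; i < N; ++i)
            s += (x[i] - 1.0) * (x[i] - 1.0);
        return s;
    }

    /* Forward differences for this rank's share of the elements. */
    static void local_slice(const double *x, double *local, int rank, int size)
    {
        const double h = 1e-7, f0 = objective(x);
        double xp[N];
        for (int i = 0; i < N; ++i)
            local[i] = 0.0;
        for (int i = rank; i < N; i += size) {   /* round-robin split */
            for (int k = 0; k < N; ++k)
                xp[k] = x[k];
            xp[i] += h;
            local[i] = (objective(xp) - f0) / h;
        }
    }

    /* Runs on the master only: the routine the optimizer calls.  It
       wakes the workers, hands them the current guess, and combines
       everybody's partial results. */
    static void jacobian(const double *x, double *jac, int size)
    {
        double local[N];
        for (int r = 1; r < size; ++r)
            MPI_Send(x, N, MPI_DOUBLE, r, TAG_WORK, MPI_COMM_WORLD);
        local_slice(x, local, 0, size);
        MPI_Reduce(local, jac, N, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    }

    /* Runs on every worker: the infinite loop described above.  It
       exits only when the master sends TAG_STOP. */
    static void worker_loop(int rank, int size)
    {
        double x[N], local[N];
        MPI_Status st;
        for (;;) {
            MPI_Recv(x, N, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;                           /* optimization finished */
            local_slice(x, local, rank, size);
            MPI_Reduce(local, NULL, N, MPI_DOUBLE, MPI_SUM, 0,
                       MPI_COMM_WORLD);
        }
    }

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            double x[N] = {0.0}, jac[N];
            /* the real black-box optimizer would sit here; a fixed
               loop stands in for its iterations */
            for (int it = 0; it < 5; ++it)
                jacobian(x, jac, size);
            for (int r = 1; r < size; ++r)       /* release the workers */
                MPI_Send(x, N, MPI_DOUBLE, r, TAG_STOP, MPI_COMM_WORLD);
        } else {
            worker_loop(rank, size);
        }

        MPI_Finalize();
        return 0;
    }

Built and launched in the usual way (e.g. mpicc jacobian.c followed by mpirun -np 4 ./a.out), every worker keeps serving Jacobian requests until the master's TAG_STOP message releases it, so no rank reaches MPI_Finalize early.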