
I want to compute a Gaussian process. For this I need to evaluate the following expression: f = X @ Z^-1 @ y

To avoid forming the inverse explicitly, I compute c = Z^-1 @ y as the solution of the linear system Z @ c = y, using a least-squares solver on my GPU (with cupy). The problem is that I run out of GPU memory (Z is a 16000x16000 matrix). Is it possible to compute c in batches? If not, how else can I solve the out-of-memory issue?

I searched the internet but found nothing.

1 Answer


A GPU will help you with the arithmetic, but it penalizes data access.

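Solving a dense n x n linear system with a direct method takes O(n^3) arithmetic operations on O(n^2) data. The gain from the GPU therefore depends on the whole problem, matrix plus solver workspace, fitting in device memory. If it does not fit and data has to be moved back and forth, you are probably not going to see a big gain.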

You will find efficient CPU implementations of linear solvers in PyTorch or SciPy, for instance. If you can use float32 instead of float64, even better, since that halves the memory.
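As a minimal sketch of the CPU route in float32, assuming Z is symmetric positive definite (this is an assumption about your kernel matrix, not something stated in the question), a Cholesky-based solve with SciPy could look like this. The helper name `solve_spd` and the synthetic data are hypothetical and only there to make the example runnable.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve


def solve_spd(Z, y):
    """Solve Z @ c = y via a Cholesky factorization.

    ASSUMPTION: Z is symmetric positive definite. If it is not,
    use a general solver such as scipy.linalg.solve instead.
    """
    factor = cho_factor(Z, lower=True)
    return cho_solve(factor, y)


# Synthetic, well-conditioned SPD matrix standing in for the real Z.
rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)).astype(np.float32)
Z = A @ A.T + n * np.eye(n, dtype=np.float32)
y = rng.standard_normal(n).astype(np.float32)
X = rng.standard_normal((10, n)).astype(np.float32)

c = solve_spd(Z, y)  # c = Z^-1 @ y without forming the inverse
f = X @ c            # f = X @ Z^-1 @ y

# Relative residual as a sanity check on the float32 solve.
residual = np.linalg.norm(Z @ c - y) / np.linalg.norm(y)
print(f"relative residual: {residual:.2e}")
```

Checking the residual like this is a cheap way to see whether float32 is accurate enough for your matrix before committing to it.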

A 16000 x 16000 matrix fits on a reasonably recent GPU: it takes about 2 GB with float64 elements. Make sure to release all other allocated memory before allocating Z, and try not to allocate more data while solving. If the matrix is well conditioned, you could try float16 on the GPU, which would store a 16000 x 16000 matrix in 512 MB of memory.
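The storage figures can be checked with a few lines of arithmetic. This sketch only counts the bytes of the matrix itself. Any workspace the solver allocates comes on top of that and is not modeled here. The function name `dense_matrix_bytes` is made up for illustration.

```python
def dense_matrix_bytes(n: int, itemsize: int) -> int:
    """Bytes needed to store one dense n x n matrix."""
    return n * n * itemsize


n = 16_000
itemsizes = {"float64": 8, "float32": 4, "float16": 2}  # bytes per element

for name, itemsize in itemsizes.items():
    gb = dense_matrix_bytes(n, itemsize) / 1e9
    print(f"{name}: {gb:.3f} GB")
# float64: 2.048 GB
# float32: 1.024 GB
# float16: 0.512 GB
```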

If you want to use the GPU, PyTorch is my preferred option: if you have an algorithm running on the CPU, you can run it on the GPU with very few changes.
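To illustrate how little changes between CPU and GPU, here is a rough sketch where only the `device` variable decides where the solve runs. The function name `gp_mean`, the sizes and the random data are assumptions for the example, and it uses `torch.linalg.solve` rather than a least-squares routine.

```python
import torch


def gp_mean(X, Z, y):
    """Compute f = X @ Z^-1 @ y without forming the inverse."""
    c = torch.linalg.solve(Z, y)  # solves Z @ c = y
    return X @ c


# The only device-specific line: fall back to the CPU if no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Synthetic, well-conditioned data standing in for the real matrices.
torch.manual_seed(0)
n = 500
A = torch.randn(n, n, dtype=torch.float32, device=device)
Z = A @ A.T + n * torch.eye(n, dtype=torch.float32, device=device)
y = torch.randn(n, dtype=torch.float32, device=device)
X = torch.randn(10, n, dtype=torch.float32, device=device)

f = gp_mean(X, Z, y)
print(f.shape, f.device)
```

Creating the tensors directly on the target device avoids holding a second copy on the host, which matters once Z approaches the size in the question.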
