Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beamformer loading of CSM is inefficient #281

Open
gherold opened this issue Jun 27, 2024 · 1 comment
Open

Beamformer loading of CSM is inefficient #281

gherold opened this issue Jun 27, 2024 · 1 comment

Comments

@gherold
Copy link
Member

gherold commented Jun 27, 2024

Feature Description

The frequency domain beamformers load the frequency-specific CSM from the respective freq_data object in a loop through all frequencies and allocate a new array each time using the numpy.array() function.

csm = array(self.freq_data.csm[i], dtype='complex128') (1)

If caching is used, it is necessary for the array to be loaded into memory from the hdf file. Doing this via

csm = self.freq_data.csm[i] (2)

would be slightly faster (about +10% speed for loading even though a new array is allocated).

If caching is deactivated, loading is 5 times faster with the current implementation anyway, but option (2) is then about 8 times as fast as option (1), as no allocation needs to take place.

The casting to complex128 is only needed for the numba implementations downstream if the CSM was intentionally calculated with only complex64 precision (not sure anybody ever uses this). Default for the CSM is complex128 (despite the docstring stating otherwise!)

It may be worth considering dropping the forced new allocation+casting in the loop. If complex64 support is necessary, a different numba function could be set to be called on a more general level.

Use Case

Every time frequency domain bf is done with a lot of frequencies, this would make calculations a little quicker.

Importance

Low: Nice to have, but not critical.

@esarradj
Copy link
Member

The main problem here is that we need a contiguous C-ordered csm in order for the numba fastfuncs to work (or to be fast at least).
(2) returns a view, which might not be contiguous depending on which type is actually returned by self.freq_data.csm.

However, there are some beamformers that do not use the csm directly as a fastfunc argument. At least for these it would be no problem at all to implement (2). One example is BeamformerFunctional, where this is already implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants