pyCUDA version of transient F-stat
This adds a pyCUDA implementation of lalpulsar's ComputeTransientFstatMap function, to be used for now only from the TransientGridSearch() class but in principle portable to other search classes too.
There are two separate kernel files (.cu) installed as 'package_data', and a source file tcw_fstat_map_funcs.py that includes both the direct wrappers to those GPU kernels and some setup acrobatics.
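For readers unfamiliar with this layout, here is a minimal sketch of how a wrapper might pick up a .cu file shipped next to the Python sources and compile it at runtime. It is an illustration only, not the code in this MR: the helper names read_kernel_source and load_kernel are hypothetical, and the sketch assumes the __global__ function inside each file has the same name as the file.

```python
from pathlib import Path


def read_kernel_source(package_dir, kernel_name):
    """Return the CUDA source text stored in <package_dir>/<kernel_name>.cu."""
    kernel_path = Path(package_dir) / (kernel_name + ".cu")
    return kernel_path.read_text()


def load_kernel(package_dir, kernel_name):
    """Compile a kernel file and return the GPU function of the same name.

    Assumption of this sketch: the __global__ function in the file is
    called exactly like the file (without the .cu suffix).
    Needs pyCUDA, a CUDA toolkit and an active context when called.
    """
    # Imported lazily so that CPU-only installs can still import the module.
    from pycuda.compiler import SourceModule

    module = SourceModule(read_kernel_source(package_dir, kernel_name))
    return module.get_function(kernel_name)
```

Inside an installed package, package_dir could be something like os.path.dirname(__file__), so that the files listed under 'package_data' are found wherever the package ends up.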
@GregAshton Could you have a preliminary look and first let me know if you think this is all in scope for PyFstat in the first place, or if you'd prefer the kernels and wrappers to live in an external package?
Then here's a (non-exhaustive) list of hacks or possibly controversial implementation details:
- __del__ destructor to the ComputeFstat class. pyCUDA has a nice autoinit feature which would do the cleanup too, but for multi-GPU hosts (e.g. CIT head nodes) I need to manually work with what it calls "contexts" and then clean those up at the end, too.
- __init__ from the child...?

I'm also still chasing some larger-than-expected differences between the CPU and GPU versions for exponential windows, though those could just be due to Reinhard's "FastExp" functions. Which is why this is WIP: and self-assigned for now. Feedback already much appreciated, though!
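To make the context point from the list above more concrete, here is a rough sketch of creating a context on one chosen device and releasing it explicitly instead of relying on pycuda.autoinit. This is not the code in this MR: the class name GpuContextOwner and the driver argument are hypothetical (the latter only exists so the logic can be exercised with a stand-in on a machine without a GPU). The pyCUDA calls used are init(), Device.count(), Device(n).make_context() and Context.pop().

```python
class GpuContextOwner:
    """Create a CUDA context on one chosen device and release it explicitly."""

    def __init__(self, device_id=0, driver=None):
        # Set first, so close() and __del__ are safe even if setup fails below.
        self.context = None
        if driver is None:
            # Lazy import: only needed when a GPU is actually requested.
            import pycuda.driver as driver
        driver.init()
        n_devices = driver.Device.count()
        if not 0 <= device_id < n_devices:
            raise ValueError(
                "device_id {} not available, found {} device(s)".format(
                    device_id, n_devices
                )
            )
        # make_context() creates the context and makes it current.
        self.context = driver.Device(device_id).make_context()

    def close(self):
        """Pop the context and drop our reference; safe to call repeatedly."""
        if getattr(self, "context", None) is not None:
            self.context.pop()
            self.context = None

    def __del__(self):
        # Fallback only: an explicit close() is more predictable, since
        # Python does not guarantee when (or whether) __del__ runs.
        self.close()
```

An explicit close() (or wrapping this in a context manager) might be an alternative to relying on __del__ alone, since the destructor timing at interpreter shutdown is not guaranteed.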