    Bug #1608: clFFT use of native_sin, native_cos can cause validation problems · 48a3c019
    Heinz-Bernd Eggenstein authored
    Still experimental: replace calls to native_sin in clFFT
    This change explores the performance impact of using a set of LUTs, precomputed on the CPU,
    to evaluate sin(x_i) and cos(x_i) on a grid x_i = +/- 2*pi*i/N, with N fixed.
    
    On a 6770M, this code is still ca. 3% slower than the original native_sin/native_cos variant
    for a BRP4-like transform.
    
    This variant should have very high accuracy; versions with lower accuracy but
    higher performance should be explored next. Eventually the method should be selectable
    by a parameter to the plan creator, as suggested by Bernd.
    
    TODO: - remove some diagnostic code,
          - optimize the total size of the LUTs, perhaps by using
            cos(x) = sin(x+pi/2), so there is no need to keep separate LUTs for sin and cos, just one slightly longer LUT with
            an additional alias pointer
          - try caching the LUTs in shared memory (using constant memory didn't help)
fft_base_kernels.h 19.1 KB