
test.single.2

  • Forked from einsteinathome / libclfft
    Source project has a limited visibility.
      20314512
      Bug #1608: clFFT use of native_sin, native_cos can cause validation problems
      Heinz-Bernd Eggenstein authored
      experimental: - added an alternative method for twiddle factor calculation, using a smaller LUT (256 * float2)
                      via a Taylor series to 3rd order; it seems to be almost as accurate as the method with 2 bigger LUTs, but faster.
                    - improved the method with 2 bigger LUTs to use LUTs of float2
                    - improved the method using the slow sin/cos functions (now uses the combined sincos function); still slow
                    - prepared the plan struct so that the method is switchable at plan creation time.
      
                    TODO: load smaller LUT for Taylor series approx into shared mem.
    pykat_output.kat 150 B
    l l1 1 0 0 n1
    s s1 10 1 n1 n2
    m m1 0.5 0.5 0 n2 n3
    s s2 10 1 n3 n4
    m m2 0.5 0.5 0 n4 dump
    
    pd PD1 n2
    
    xaxis m1 phi lin 0 360 360
    gnuterm no
    pyterm no