  1. Apr 25, 2023
  2. Aug 10, 2022
  3. Mar 04, 2021
  4. Mar 02, 2021
  5. Dec 03, 2019
  6. Aug 21, 2019
  7. Aug 12, 2019
  8. Jun 17, 2019
  9. Jun 14, 2019
  10. Jun 13, 2019
  11. Feb 21, 2019
  12. Feb 20, 2019
  13. Feb 18, 2019
  14. Apr 23, 2018
  15. Jun 07, 2016
  16. Sep 21, 2012
  17. Jul 26, 2012
  18. Jul 25, 2012
  19. Jul 24, 2012
  20. Jul 23, 2012
  21. Jul 13, 2012
  22. Jul 07, 2012
  23. Jun 26, 2012
  24. Jun 25, 2012
  25. Jun 22, 2012
    • Bug #1608: clFFT use of native_sin, native_cos can cause validation problems · 20314512
      Heinz-Bernd Eggenstein authored
      Experimental:
        - Added an alternative method for twiddle factor calculation, using a smaller LUT (256 * float2)
          and a Taylor series to 3rd order. It seems to be almost as accurate as the method with 2 bigger LUTs, but faster.
        - Improved the method with 2 bigger LUTs to use LUTs of float2.
        - Improved the method using the slow sin/cos functions (it now uses the combined sincos function); still slow.
        - Prepared the plan struct so that the method is switchable at plan creation time.

      TODO: load the smaller LUT for the Taylor series approximation into shared memory.
  26. Jun 08, 2012
    • Bug #1608: clFFT use of native_sin, native_cos can cause validation problems · 48a3c019
      Heinz-Bernd Eggenstein authored
      Still experimental: replace calls to native_sin in clFFT.
      This change explores the performance impact of using a set of LUTs, precomputed on the CPU,
      to look up sin(x_i) and cos(x_i) on a grid x_i = +/- 2*pi*i/N, with N fixed.
      
      On a 6770M, this code is still about 3% slower than the original native_sin/native_cos variant
      for a BRP4-like transform.
      
      This variant should have very high accuracy. Versions with lower accuracy but
      higher performance should be explored next. Eventually the method should be selectable
      by a parameter to the plan creator, as suggested by Bernd.
      
      TODO: - remove some diagnostic code
            - optimize the total size of the LUTs, perhaps by using
              cos(x) = sin(x+pi/2), so there is no need to keep separate LUTs for sin and cos, just one slightly longer LUT with
              an additional alias pointer
            - try caching the LUTs in shared memory (using constant memory didn't help)
  27. Oct 20, 2011
  28. Oct 17, 2011
  29. Sep 13, 2011
  30. May 20, 2011