- 08 Jun, 2012 1 commit
-
-
Heinz-Bernd Eggenstein authored
Still experimental: replace calls to native_sin in clFFT

This change explores the performance impact of using a set of LUTs, precomputed on the CPU, to evaluate sin(x_i) and cos(x_i) on a grid x_i = +/- 2*pi*i/N, with N fixed. On a 6770M, this code is still ca. 3% slower than the original native_sin/native_cos variant for a BRP4-like transform. This variant should have very high accuracy; versions with lower accuracy but higher performance should be explored next. Eventually the method should be selectable by a parameter to the plan creator, as suggested by Bernd.

TODO:
- remove some diagnostic code
- optimize the total size of the LUTs, perhaps by using cos(x) = sin(x+pi/2), so that there is no need to keep separate LUTs for sin and cos, just one slightly longer LUT with an additional alias pointer
- try caching the LUTs in shared memory (using constant memory didn't help)
-
- 20 Oct, 2011 1 commit
-
-
Oliver Bock authored
-
- 17 Oct, 2011 1 commit
-
-
Oliver Bock authored
-
- 13 Sep, 2011 1 commit
-
-
Oliver Bock authored
-
- 20 May, 2011 6 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
* Supported targets: linux (default), macos, win32, clean
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
- 23 Mar, 2011 1 commit
-
-
Oliver Bock authored
-
- 21 Mar, 2011 5 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
- 18 Mar, 2011 6 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
* NULL platform unsupported (Apple doesn't care)
* Use first FULL_PROFILE platform found
-
Oliver Bock authored
-
Oliver Bock authored
-
- 17 Mar, 2011 2 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
-
- 16 Mar, 2011 8 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
- 09 Mar, 2011 3 commits
-
-
Oliver Bock authored
Added crude feature to pass device ID as second command line argument (using lots of redundant code)
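
A less redundant way to read the device ID could look like the sketch below. It is an assumption-laden illustration, not the code from this commit: `parse_device_id` is a hypothetical helper, the device ID is taken from `argv[2]` as the message describes, and the fallback value and the -1 error convention are choices made here for the example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: read the device ID from the second command line
 * argument (argv[2]). Returns `fallback` if the argument is absent and
 * -1 if it is present but not a valid non-negative integer. */
static int parse_device_id(int argc, char **argv, int fallback)
{
    char *end = NULL;
    long value;

    if (argc < 3)
        return fallback;

    errno = 0;
    value = strtol(argv[2], &end, 10);

    /* Reject empty input, trailing garbage, overflow and negative IDs. */
    if (errno != 0 || end == argv[2] || *end != '\0' ||
        value < 0 || value > INT_MAX)
        return -1;

    return (int)value;
}
```

Keeping the parsing in one function means every call site shares the same validation instead of repeating it.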
-
Oliver Bock authored
-
Oliver Bock authored
-
- 07 Mar, 2011 3 commits
-
-
Oliver Bock authored
-
Oliver Bock authored
-
Oliver Bock authored
-
- 03 Mar, 2011 1 commit
-
-
Oliver Bock authored
-
- 12 Jan, 2011 1 commit
-
-
Oliver Bock authored
-