Henning Fehrmann
NVidia_AMD_Bench
Repository graph
9a07a73b6fc794e24b1f00e9df1aaaab8095df1d
Commit dates on the graph axis (newest first): 8 Mar, 5 Mar, 12 Feb, 9 Feb, 8 Feb, 5 Feb, 29 Jan, 28 Jan, 27 Jan
removed some merge obstacles
master
loop over parameters and measure the run time in the test routine
index swapping
some tensor core code compilation instruction. not in the git yet
testing quality of results
removed a library
need to do the data copying after the data generation. Required for later tests
test different precisions for tensor core
tensor core benchmark
header
working example
try half precision mm
need hw specific flags
tensor core code
removed unused variable
generic sync
add synchronization
test tensor_core
some polishing
added license info
need rocfft header file
needed header files
make fftw platform agnostic
add cufft library
removed spaces
stdio is needed in profiler.h
fftw code and outsourced code lines
added some FFTW handling
outsource parts which are also needed for FFTWs
add fftw compilation
cleaned some leftover
some spelling
obsolete makefiles
platform agnostic code
disclaimer
Initial commit, not platform agnostic yet