Nvidia's visual profiler (nvvp) can be used to profile OpenCL programs, but it is more of a pain than profiling in CUDA directly.
Simon McIntosh-Smith's High Performance Computing group at the University of Bristol came up with the original solution (here), and I can verify that it works.
I'll summarise the basics:
- Firstly, the environment variable COMPUTE_PROFILE must be set. This is done with:

      COMPUTE_PROFILE=1

- Secondly, a COMPUTE_PROFILE_CONFIG must be provided. A sample I use (called nvvp.cfg) contains:

      profilelogformat CSV
      streamid
      gpustarttimestamp
      gpuendtimestamp

- Next, to perform the actual profiling (in this case I'll profile an OpenCL application called HuffFramework), run:

      COMPUTE_PROFILE=1 COMPUTE_PROFILE_CONFIG=nvvp.cfg ./HuffFramework

  This then generates a series of opencl_profile_*.log files, where * is the number of threads.
- These log files can't be loaded by nvvp just yet, as all kernel function symbols have a leading OPENCL_ instead of the expected CUDA_. Replace these symbols with a quick script like so:

      sed 's/OPENCL_/CUDA_/g' opencl_profile_0.log > cuda_profile_0.log

- Finally, cuda_profile_0.log can now be imported by nvvp: start nvvp, go to File->Import...->Command-line Profiler, point it to cuda_profile_0.log, and presto!
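If the run produces several log files, renaming them one at a time gets tedious. Here is a rough sketch of how the sed step could be wrapped in a loop. The function name convert_profile_logs is my own invention, and I'm assuming the logs follow the opencl_profile_*.log naming described above and sit together in one directory; adjust it if your setup differs.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA tool): rewrite every
# opencl_profile_*.log in a directory into a matching cuda_profile_*.log,
# applying the same OPENCL_ -> CUDA_ substitution as the one-off sed command.
convert_profile_logs() {
    dir="${1:-.}"    # directory holding the logs; defaults to the current one
    for src in "$dir"/opencl_profile_*.log; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -e "$src" ] || continue
        base=$(basename "$src")
        # Keep the numeric suffix, swap only the filename prefix.
        dst="$dir/cuda_profile_${base#opencl_profile_}"
        sed 's/OPENCL_/CUDA_/g' "$src" > "$dst"
        echo "wrote $dst"
    done
}
```

Source the file (or paste the function into your shell) and call convert_profile_logs with the directory containing the logs, or with no argument to use the current directory. The original opencl_profile_*.log files are left untouched, so you can always redo the conversion.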

nvvp can only profile CUDA applications.