Ubuntu 20.04 ships an old nvprof tool: nvidia-profiler (10.1.243-3)

Ubuntu 20.10 ships a newer one: nvidia-profiler (11.0.3-1ubuntu1)

Unfortunately, neither of these can profile on a 3000-series card.

Even the 11.2 profiler, available from the NVIDIA server that serves deb archives, does not support these cards.

Instead, you are expected to profile your kernels with nvidia-nsight-compute. Command-line profiling with nvprof no longer seems to be an option on these GPUs.
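
As a rough way to encode the support cut-offs quoted on this page, here is a minimal Python sketch. The function name `recommended_profilers` is hypothetical, `nsys` and `ncu` are the command names of Nsight Systems and Nsight Compute, and the thresholds come from the nvprof warnings and the Stack Overflow answer quoted here, not from an official support matrix.

```python
from typing import List, Tuple


def recommended_profilers(compute_capability: Tuple[int, int]) -> List[str]:
    """Suggest profiler commands for a GPU given its (major, minor) compute capability.

    Assumption: the cut-offs mirror the messages quoted on this page:
      * above 7.2, nvprof skips metric profiling, and from 8.0 it is unsupported;
      * the Nsight tools are recommended for 7.0 and higher.
    """
    if compute_capability > (7, 2):
        # Turing (7.5) and newer, including 3000-series cards (8.6).
        return ["nsys", "ncu"]
    if compute_capability >= (7, 0):
        # Volta: nvprof still works, and the Nsight tools are available too.
        return ["nvprof", "nsys", "ncu"]
    # Older architectures such as Pascal (for example a GTX 1070 at 6.1).
    return ["nvprof"]


if __name__ == "__main__":
    for cc in [(6, 1), (7, 0), (7, 5), (8, 6)]:
        print(cc, recommended_profilers(cc))
```

Tuples compare element by element in Python, so `(7, 5) > (7, 2)` behaves the way a version comparison should.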

Answer from Bram on Stack Exchange
CSDN
blog.csdn.net › Zero_Muzi › article › details › 121431789
Warning: nvprof is not supported on devices with compute capability 8.0 and higher. - CSDN Blog
November 19, 2021 - MathWorks - Bug Reports 1. Problem description: when running a convolutional neural network such as alexnet that needs GPU acceleration, MATLAB may show the following warning: GPUs of compute capability 6.0 and above are not supported due to a problem with this ve... ... Problem 1: Warning: The user does not have permission to profile on the target device. See the following link for instructions to enable permissions and get more information: https://developer.nvidia.com/NVSOLN1000 Cause: this problem occurs because NVIDIA's nvvp and nvprof no longer support non.
GitHub
github.com › NVIDIA-NeMo › NeMo › discussions › 8754
How to profile model training? · NVIDIA-NeMo/NeMo · Discussion #8754
nvprof --profile-from-start off ... with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling....
Author   NVIDIA-NeMo
Discussions

nvprof --analysis-metrics not working for RTX 2070 (CUDA 10.0)
On my Pascal GPU (GTX 1070) I can do the following command: nvprof --kernels stereo_flow --analysis-metrics -f -o analysis.nvprof ./my_application And then I can import the file in Visual Profiler. However on my RTX 2070 nvprof gives this output: ======== Warning: Skipping profiling on device ... More on forums.developer.nvidia.com
forums.developer.nvidia.com
December 11, 2018
How to profile in CUDA application with compute capability 7.x? Is metric "dram_read_throughput" valid in Nsight Compute? - Stack Overflow
My setup environment: CUDA 10.2 Device: RTX 2080 OS: Ubuntu 16.04 When I try to use nvprof, I find that it doesn't support devices with compute capability 7.2 and higher. It is recommended that I s... More on stackoverflow.com
stackoverflow.com
An Even Easier Introduction to CUDA
If anyone with a newer GPU runs the profiler command as mentioned in the tutorial like this: nvprof ./add_cuda And you get this warning: ======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and ... More on forums.developer.nvidia.com
forums.developer.nvidia.com
July 23, 2023
PyTorchProfiler crashes when emit_nvtx=True
🐛 Bug When training with PyTorchProfiler(emit_nvtx=True), the training stops with the following error : AttributeError: 'emit_nvtx' object has no attribute 'function_events' Please ... More on github.com
github.com
February 23, 2021
GitHub
github.com › PaddleJitLab › CUDATutorial › issues › 5
Nvprof Cannot be used with compute capability 8.0 and higher · Issue #5 · PaddleJitLab/CUDATutorial
D:\Codes\CUDATutorial\02_first_kernel>nvprof add.exe ======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling.
Author   PaddleJitLab
Zhihu
zhuanlan.zhihu.com › p › 362796754
CUDA Programming Exercises (5) - Zhihu
nvprof ./exercise01 0 0 ======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling.
NVIDIA Developer Forums
forums.developer.nvidia.com › developer tools › other tools › visual profiler and nvprof
nvprof --analysis-metrics not working for RTX 2070 (CUDA 10.0) - Visual Profiler and nvprof - NVIDIA Developer Forums
December 11, 2018 - On my Pascal GPU (GTX 1070) I can do the following command: nvprof --kernels stereo_flow --analysis-metrics -f -o analysis.nvprof ./my_application And then I can import the file in Visual Profiler. However on my RTX 2070 nvprof gives this output: ======== Warning: Skipping profiling on device 0 since profiling is not supported on devices with compute capability greater than 7.2 What are the plans for Turing profiling support and/or are there alternatives? nvprof: NVIDIA (R) Cuda command li...
CSDN
blog.csdn.net › u013378687 › article › details › 130114918
Meaning of the nvprof --metrics parameters_nvprof is not supported on devices with compute ca - CSDN Blog
======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling.
Top answer
1 of 1
5

How to profile in CUDA application with compute capability 7.x?

For compute capability 7.5 and higher, the recommended tools are Nsight Compute and Nsight Systems. The documentation for Nsight Compute is here, and the documentation for Nsight Systems is here. There is an introductory blog describing these "new" CUDA profiler tools here, a tutorial blog on Nsight Systems here, and a tutorial blog on Nsight Compute here. The introductory blog describes why there are two tools and how they relate to each other.

Is metric “dram_read_throughput” valid in Nsight Compute?

It is not. The naming format of that metric indicates it is an nvprof metric. nvprof metric names generally cannot be used directly in Nsight Compute. To find out whether there is an "equivalent" metric in Nsight Compute for a given nvprof metric, use the nvprof transition guide, in particular the metric comparison table. By studying that table, you'll note that there is an Nsight Compute metric equivalent to dram_read_throughput, and it is named dram__bytes_read.sum.per_second. For instructions on how to capture this metric in Nsight Compute, please refer to the blog I already mentioned here, or refer to the documentation here.
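
To make the renaming concrete, here is a minimal Python sketch that translates nvprof metric names and builds an Nsight Compute command line. The table holds only the one pair named in this answer; the helper name `ncu_metrics_command` and the application path are hypothetical, and the sketch assumes the Nsight Compute CLI is installed as `ncu`. Any other metric should be looked up in the transition guide rather than guessed.

```python
from typing import Dict, List

# Only the mapping stated in the answer above; extend it from the
# metric comparison table in the nvprof transition guide.
NVPROF_TO_NCU: Dict[str, str] = {
    "dram_read_throughput": "dram__bytes_read.sum.per_second",
}


def ncu_metrics_command(nvprof_metrics: List[str], app: List[str]) -> List[str]:
    """Build an `ncu --metrics ...` argument list from nvprof metric names.

    Raises KeyError for metrics with no known equivalent instead of
    passing an nvprof name through, since Nsight Compute would not accept it.
    """
    unknown = [m for m in nvprof_metrics if m not in NVPROF_TO_NCU]
    if unknown:
        raise KeyError(f"no known Nsight Compute equivalent for: {unknown}")
    translated = ",".join(NVPROF_TO_NCU[m] for m in nvprof_metrics)
    return ["ncu", "--metrics", translated, *app]


if __name__ == "__main__":
    # Hypothetical application path, used only for illustration.
    print(" ".join(ncu_metrics_command(["dram_read_throughput"], ["./my_application"])))
```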

But I can not launch the above two software because of the lack of graphical interface. How could I use Nsight Compute in remote server?

If you have the CUDA toolkit installed on the remote server, you should be able to run Nsight Compute in CLI (command-line-interface) mode. That is described both in the documentation already linked, and the blog article already linked. Alternatively, you may be able to run the GUI in remote mode, as described here.
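
A headless workflow along those lines might look like the following Python sketch: build the CLI invocation, run it only if the tool is on the PATH, then copy the report file to a machine with a GUI. The helper names, the report name and the application path are hypothetical; the sketch assumes the Nsight Compute CLI is installed as `ncu` and uses its `-o` (write a report file) and `--kernel-name` (restrict profiling to one kernel) options.

```python
import shutil
import subprocess
from typing import List, Optional


def ncu_report_command(app: List[str], report: str,
                       kernel: Optional[str] = None) -> List[str]:
    """Build an `ncu` argument list that saves results to a report file."""
    cmd = ["ncu", "-o", report]
    if kernel is not None:
        # Profile only kernels with this name to keep the run short.
        cmd += ["--kernel-name", kernel]
    return cmd + app


def run_if_available(cmd: List[str]) -> Optional[int]:
    """Run cmd and return its exit code, or None if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    command = ncu_report_command(["./my_application"], "analysis", kernel="stereo_flow")
    print(" ".join(command))
    print("exit code:", run_if_available(command))
```

The saved report can then be opened in the Nsight Compute GUI on a workstation, so the server itself never needs a display.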

By the way, is it possible to profile metrics in Nsight Compute?

Yes, we have already covered that.

I won't be able to use this question/answer to debug remote connection details or any other follow-up questions about specific access cases or usage scenarios of Nsight tools. There are documentation and tutorials already available. If you have another specific question, please ask a new question. To locate resources for Nsight Compute and Nsight Systems, I suggest simply googling those names. Usually the first hits will be landing pages here and here which link to all of the above resources, plus additional resources such as video tutorials describing specific cases and advanced usage.

All of these tools are available on Windows as well, with similar user interfaces. Furthermore, these tools can/should be used for any GPU of compute capability 7.0 or higher.

NVIDIA
docs.nvidia.com › cuda › profiler-users-guide › changelog.html
Changelog — Profiler 12.9 documentation
May 31, 2025 - Visual Profiler and nvprof don’t support devices with compute capability 8.0 and higher.
GitHub
github.com › PyTorchLightning › pytorch-lightning › issues › 6153
PyTorchProfiler crashes when emit_nvtx=True · Issue #6153 · Lightning-AI/pytorch-lightning
February 23, 2021 - Additional docs issue : It says ... have to use nvprof to collect the profiler traces, but it is no longer supported for devices with compute capability 8.0 and higher....
Author   Lightning-AI
GitHub
github.com › puttsk › cuda-tutorial › issues › 11
nvprof is deprecated · Issue #11 · puttsk/cuda-tutorial
April 8, 2024 - ======== Warning: This version of nvprof doesn't support the underlying device, GPU profiling skipped ======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. So I suggest an update of using another pr...
Author   puttsk
cnblogs
cnblogs.com › wxkang › p › 17325879.html
CUDA Tutorial (3): Introduction to CUDA C Programming - CV技术指南 (WeChat official account) - cnblogs
April 17, 2023 - ======== Warning: nvprof is not supported on devices with compute capability 8.0 and higher. Use NVIDIA Nsight Systems for GPU tracing and CPU sampling and NVIDIA Nsight Compute for GPU profiling.
Wityx
wityx.com › post › 243090_1_1.html
nvprof is not supported on devices with compute capability 8.0 ...
January 5, 2023
UCSD
cseweb.ucsd.edu › classes › wi15 › cse262-a › static › cuda-5.5-doc › pdf › CUDA_Profiler_Users_Guide.pdf
PROFILER USER'S GUIDE DU-05982-001_v5.5 | July 2013
2.2.2.1. Import nvprof Session · 2.2.2.2. Import Command-Line Profiler Session · 2.3. Application Requirements · 2.4. Profiling Limitations
Course Hero
coursehero.com › file › p2bbc9vs › Note-Visual-Profiler-and-nvprof-allow-tracing-features-for-non-root-and-non
Note visual profiler and nvprof allow tracing
September 27, 2023
NVIDIA
docs.nvidia.com › cuda › profiler-users-guide
1. Preparing An Application For Profiling — Profiler 12.9 documentation
May 31, 2025 - Note that Visual Profiler and nvprof are deprecated and will be removed in a future CUDA release. The NVIDIA Volta platform is the last architecture on which these tools are fully supported.
Stack Overflow
stackoverflow.com › questions › tagged › nsight-compute
Recently Active 'nsight-compute' Questions - Stack Overflow
April 9, 2024 - I'm trying to multiply blocks of size 8x8 using Tensor Cores on a GPU designed with the Turing architecture. For that I'm using the WMMA API and fragments of size 16x16. My assumption was that shared ... ... On device with compute capability <= 7.2 , I always use nvprof --events shared_st_bank_conflict but when i run it on RTX2080ti with CUDA10 , it returns Warning: Skipping profiling on device 0 ...
NVIDIA Developer Forums
forums.developer.nvidia.com › accelerated computing › cuda › cuda programming and performance
How to use nvprof --metrics gld_efficiency on RTX2080ti - CUDA Programming and Performance - NVIDIA Developer Forums
August 21, 2019 - I want to measure global memory load efficiency,but when I run nvprof --metrics gld_efficiency ./HellowWorld.exe ,it shows nvprof --metrics gld_throughput .\HellowWorld.exe 32 32 ======== Warning: Skipping profiling on device 0 since profiling is not supported on devices with compute capability greater than 7.2 ==12656== NVPROF is profiling process 12656, command: .\HellowWorld.exe 32 32 SumOnGPU Time Cost 6.138 ms sumMatrixOnGPU2D ==12656== Profiling application: .\...
GitHub
github.com › undertherain › benchmarker › issues › 168
nvprof defpreciated · Issue #168 · undertherain/benchmarker
April 30, 2021 - Warning: Skipping profiling on device 0 since profiling is not supported on devices with compute capability 7.5 and higher. Use NVIDIA Nsight Compute for GPU profiling and NVIDIA Nsight Systems for GPU tracing and CPU sampling.
Author   undertherain