🌐
AMD GPUOpen
gpuopen.com › rgp
AMD Radeon™ GPU Profiler - AMD GPUOpen
AMD RGP gives you unprecedented, in-depth access to a GPU. Easily analyze graphics, async compute usage, event timing, pipeline stalls, barriers, bottlenecks, and other performance inefficiencies.
🌐
NVIDIA Developer
developer.nvidia.com › nvidia-visual-profiler
NVIDIA Visual Profiler | NVIDIA Developer
The NVIDIA Visual Profiler is a cross-platform performance profiling tool that delivers developers vital feedback for optimizing CUDA C/C++ applications. First introduced in 2008, Visual Profiler supports all 350 million+ CUDA capable NVIDIA ...
🌐
GitHub
github.com › JeremyMain › GPUProfiler
GitHub - JeremyMain/GPUProfiler: GPUProfiler - Understand your application and workflow resource requirements · GitHub
GPUProfiler is not a source code profiler but a resource and utilization profiler that can provide a snapshot of a system and select resource utilization metrics over a period of time.
Starred by 313 users
Forked by 20 users
🌐
NVIDIA Developer
developer.nvidia.com › performance-analysis-tools
Nsight Developer Tools | NVIDIA Developer
Nsight Graphics is a standalone application for the debugging, profiling, and analysis of graphics applications on Microsoft Windows and Linux. It allows you to optimize the performance of applications based on Direct3D 11, Direct3D 12, DirectX ...
🌐
Intel
intel.com › content › www › us › en › docs › oneapi › optimization-guide-gpu › 2023-0 › gpu-analysis-with-vtunetm-profiler.html
GPU Analysis with VTune™ Profiler
VTune™ Profiler is a performance analysis tool for serial and multi-threaded applications. It helps you analyze algorithm choices and identify where and how your application can benefit from available hardware resources.
🌐
Microsoft Learn
learn.microsoft.com › en-us › visualstudio › profiling › gpu-usage
Use the GPU Usage tool in the Performance Profiler - Visual Studio (Windows) | Microsoft Learn
October 30, 2025 - Use the GPU Usage tool in the Performance Profiler to better understand the high-level hardware usage of your Direct3D app. It helps you see whether the performance of your app is CPU-bound or GPU-bound, and gain insight into how you can use ...
🌐
NERSC
docs.nersc.gov › tools › performance › nvidiaproftools
NVIDIA Profiling Tools - NERSC Documentation
NVIDIA Data Center GPU Manager (DCGM) is a light weight tool to measure and monitor GPU utilization and comprehensive diagnostics of GPU nodes on a cluster. NERSC will be using this tool to measure application utilization and monitor the status of the machine.
🌐
ROCm Blogs
rocm.blogs.amd.com › software-tools-optimization › profilers › README.html
Introduction to profiling tools for AMD hardware - ROCm™ Blogs
April 10, 2026 - For HPC workloads, we recommend programming with HIP in a Linux® environment on AMD Instinct™ GPUs and using rocprof-compute, rocprof-sys or rocprofv3 profiling tools on them. ... uProf (AMD MICRO-prof) is a software profiling analysis tool ...
🌐
NVIDIA Developer
developer.nvidia.com › nsight-systems
Nsight Systems | NVIDIA Developer
Nsight Systems latches on to target applications with low overhead to expose CPU and GPU activity in a timeline, correlating events to remedy performance blockers. For compute tasks, it supports investigating CUDA, cuBLAS, cuDNN, and NVIDIA TensorRT™. For graphics, it profiles Vulkan, OpenGL, DirectX 11, DirectX 12, DXR, and NVIDIA OptiX™ APIs.
🌐
Intel
intel.com › developers › tools › intel® graphics performance analyzers
Intel® Graphics Performance Analyzers
... A command line and scripting interface allows users to access frameworks that expose capture and playback functionalities in Intel® Graphics Performance Analyzers (Intel® GPA).
🌐
Intel
intel.com › developers › tools › intel® vtune™ profiler
Fix Performance Bottlenecks with Intel® VTune™ Profiler
Intel® VTune™ Profiler optimizes application performance, system performance, and system configuration for AI, HPC, cloud, IoT, media, storage, and more. CPU, GPU, and NPU: Tune the entire application’s performance―not just the accelerated ...
🌐
Qualcomm
qualcomm.com › developer › software › qualcomm-profiler
Qualcomm® Profiler | Qualcomm
The Qualcomm Profiler GUI tool is a system-wide performance profiling tool designed to visualize system performance and identify optimization and application scaling improvement opportunities across Qualcomm SoC CPUs, GPUs, DSPs and other IP blocks.
🌐
Do more with less.
polarsignals.com › blog › posts › 2025 › 04 › 01 › introducing-continuous-gpu-profiling
Maximize GPU Efficiency: Visualize, Analyze, and Optimize with Precision
April 1, 2025 - Now you can see exactly how your GPUs are being utilized millisecond by millisecond. Our solution helps you move from guesswork to data-driven optimization. Polar Signals Continuous Profiling for GPUs empowers you to take control of your GPU ...
🌐
GitHub
github.com › JeremyMain › GPUProfiler › releases
Releases · JeremyMain/GPUProfiler
June 27, 2021 - GPUProfiler - Understand your application and workflow resource requirements - Releases · JeremyMain/GPUProfiler
Author: JeremyMain
🌐
TechSpot
techspot.com › downloads › 7563-radeon-gpu-profiler.html
AMD Radeon GPU Profiler Download Free - 1.15.1 | TechSpot
August 29, 2023 - Download AMD Radeon GPU Profiler - AMD's Radeon GPU Profiler is a low-level optimization tool that provides detailed information on Radeon GPUs.
Rating: 5 (1 vote)
🌐
Medium
aditya-sunjava.medium.com › optimizing-gpu-performance-a-comprehensive-guide-to-profiling-tools-and-techniques-90cac941a088
Optimizing GPU Performance: A Comprehensive Guide to Profiling Tools and Techniques | by Aditya Bhuyan | Medium
September 17, 2025 - ... AMD Radeon Developer Tool Suite (GPU PerfAPI and GPU PerfStudio): A set of profiling tools for AMD GPUs, including GPU PerfAPI for low-level performance counter access and GPU PerfStudio for a graphical profiling interface.
🌐
Harvard
handbook.eng.kempnerinstitute.harvard.edu › s5_ai_scaling_and_engineering › scalability › gpu_profiling.html
19.3. GPU Profiling — Kempner Institute Computing Handbook
October 30, 2025 - Nsight Systems (High-Level Profiling) checks the code overall for problems (e.g., with host-device communication or GPU kernels) and identifies the non-performant or top kernel(s). Nsight Compute (Kernel-Specific Profiling) then dives into the details of the identified kernel(s) to help optimize, debug, and fix the issue.
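The triage step in that workflow (find the top kernels first, then inspect only those in depth) can be sketched in a few lines of Python. This is a minimal illustration, not output from or an API of either Nsight tool: the `top_kernels` helper, the kernel names, and the timings are all hypothetical, and the sketch assumes per-launch kernel durations have already been exported from a system-level profile.

```python
from collections import defaultdict


def top_kernels(records, n=1):
    """Rank kernels by total GPU time, largest first.

    records: iterable of (kernel_name, duration_ms) pairs, one per launch.
    """
    totals = defaultdict(float)
    for name, duration_ms in records:
        totals[name] += duration_ms
    # Sort by accumulated time so the most expensive kernels come first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical per-launch timings, made up for illustration only.
records = [
    ("matmul_kernel", 4.0),
    ("softmax_kernel", 1.5),
    ("matmul_kernel", 4.5),
    ("copy_kernel", 0.5),
]

# The kernels returned here are the candidates for kernel-specific profiling.
print(top_kernels(records, n=2))
```

Ranking by accumulated time rather than by a single launch keeps a short kernel that is launched thousands of times from being overlooked.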
🌐
Reddit
reddit.com › r/cuda › using nvidia tools for profiling
r/CUDA on Reddit: Using Nvidia tools for profiling
March 7, 2025 - I've written a guide on using NVIDIA tools (Nsight Systems, Nsight Compute, ...) from zero to hero. Here is the content:
Fix-Bug
Chapter01: Introduction to Nsight Systems - Nsight Compute
Chapter02: CUDA Toolkit - CUDA driver
Chapter03: NVIDIA Compute Sanitizer Part 1
Chapter04: NVIDIA Compute Sanitizer Part 2
Chapter05: Global Memory Coalescing
Chapter06: Warp Scheduler
Chapter07: Occupancy Part 1
Chapter08: Occupancy Part 2
Chapter09: Bandwidth - Throughput - Latency
Chapter10: Compute Bound - Memory Bound

🌐
AMD GPUOpen
gpuopen.com › learn › amd-lab-notes › amd-lab-notes-profilers-readme
Introduction to profiling tools for AMD hardware - AMD GPUOpen
April 12, 2023 - For HPC workloads, we recommend programming with HIP in a Linux® environment on AMD Instinct™ GPUs and using Omniperf, Omnitrace or ROC-profiler profiling tools on them. AMD uProf (AMD MICRO-prof) is a software profiling analysis tool for x86 applications running on Windows, Linux® and FreeBSD operating systems and provides event information unique to the AMD “Zen”-based processors and AMD Instinct™ MI Series accelerators.