Brave Search

reddit.com › r › unrealengine › comments › 1g96vok › how_to_read_gpu_profiler

Reposting: Use the "stat unitgraph" console command to check whether you are CPU/RHI bound or GPU bound (the counter with the highest ms). If you're GPU bound, start with the GPU Visualiser: run the "ProfileGPU" console command or press the hotkey "Ctrl + Shift + ," to capture a frame's information. Sort the GPU Visualiser list by highest ms, open the top couple of parent headings, then sort by highest ms again so the child headings are sorted properly, and use this information to investigate further. (Also consider using RenderDoc to analyse a frame in more detail.) If you're CPU bound, use Unreal Insights instead: take a capture and look through it to find where the largest chunks of time are being spent, then try to optimise the code related to them. Answer from CloudShannen on reddit.com
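The triage described in this answer can be sketched offline in Python. This is only an illustration of the reasoning (pick the counter with the highest ms, then sort parent and child headings by cost); it does not talk to Unreal Engine. The counter names, pass names, and timings below are hypothetical, and `find_bottleneck`, `GpuPass`, and `sort_by_cost` are made-up helpers, not part of any engine API.

```python
from dataclasses import dataclass, field


def find_bottleneck(counters_ms):
    """Return the name of the counter with the highest time in ms."""
    return max(counters_ms, key=counters_ms.get)


@dataclass
class GpuPass:
    """One heading in a captured frame (hypothetical structure)."""
    name: str
    ms: float
    children: list = field(default_factory=list)


def sort_by_cost(passes, depth=2):
    """Sort headings by descending ms, repeating for children down to `depth` levels."""
    ordered = sorted(passes, key=lambda p: p.ms, reverse=True)
    if depth > 1:
        for p in ordered:
            p.children = sort_by_cost(p.children, depth - 1)
    return ordered


# Hypothetical values, as one might read them off a unit graph.
counters = {"Game": 6.1, "Draw": 4.8, "RHI": 3.2, "GPU": 14.7}
bound = find_bottleneck(counters)

# Hypothetical frame capture with parent and child headings.
frame = [
    GpuPass("BasePass", 3.0, [GpuPass("StaticMeshes", 1.2), GpuPass("SkeletalMeshes", 1.8)]),
    GpuPass("Lights", 6.5, [GpuPass("ShadowedLights", 4.9), GpuPass("UnshadowedLights", 1.6)]),
    GpuPass("PostProcessing", 2.2),
]
sorted_frame = sort_by_cost(frame)

print(f"Bound by: {bound}")
for parent in sorted_frame:
    print(f"{parent.name}: {parent.ms} ms")
    for child in parent.children:
        print(f"  {child.name}: {child.ms} ms")
```

The most expensive parent heading and its most expensive child end up first, which is the order in which the answer suggests investigating.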

GitHub

github.com › JeremyMain › GPUProfiler

GitHub - JeremyMain/GPUProfiler: GPUProfiler - Understand your application and workflow resource requirements · GitHub

GPUProfiler is not a source code profiler but a resource and utilization profiler that can provide a snapshot of a system and selected resource utilization metrics over a period of time.

Starred by 313 users

Forked by 20 users

NVIDIA Developer

developer.nvidia.com › nvidia-visual-profiler

NVIDIA Visual Profiler | NVIDIA Developer

The NVIDIA Visual Profiler is a cross-platform performance profiling tool that delivers developers vital feedback for optimizing CUDA C/C++ applications. First introduced in 2008, Visual Profiler supports all 350 million+ CUDA capable NVIDIA ...

Discussions

How to read GPU profiler

r/unrealengine

5

1

October 22, 2024

How to use GPU profiler stats?

Hello @MartinTilo (tagging you because of this very useful post, and I was wondering what might have changed since July), I’m using HDRP and trying to get some GPU profiler stats at runtime, but haven’t had any luck yet. (I’m using 2020.2.4, HDRP 10.3.1 on Windows.) More on discussions.unity.com

discussions.unity.com

12

1

February 16, 2021

[P] seeking feedback on a gpu profiler I made as a Python package

Took a look at the PyPI page. Cool that you're tackling this because GPU profiling tooling is genuinely painful and most teams just stare at nvidia-smi and hope for the best. Few questions and thoughts. What's the overhead like when profiling is active? Our clients doing ML work are always paranoid about profilers that distort the thing they're measuring. If your instrumentation adds latency or memory pressure it can shift where bottlenecks appear. Would be worth documenting expected overhead as a percentage. The auto-calibration for any GPU is ambitious. How are you handling the differences between consumer cards versus datacenter stuff like A100s and H100s? Memory bandwidth characteristics and compute unit architectures vary a ton. Curious whether calibration actually captures those differences or if it's approximating. The compute/memory/overhead classification is the useful part imo. Most people don't realize their "slow kernel" is actually just waiting on memory transfers. If your tool makes that obvious you're solving a real problem. One thing that would make this way more useful is integration examples with common training loops. PyTorch Lightning, HuggingFace Trainer, that kind of thing. People are lazy and if they can't drop it into their existing workflow with minimal changes they won't bother. Also no README on PyPI is rough. I see you have a GitHub link but the project description is basically empty. You're asking people to pip install something with almost no context. Throw some example output and a quick usage snippet on there, it'll dramatically increase adoption. More on reddit.com
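The suggestion to document overhead as a percentage can be sketched as follows. This is a minimal illustration, not the package's actual method: `time_call`, `overhead_percent`, `workload`, and `profiled_workload` are hypothetical names, and the "profiler" here is a stand-in that just does a little extra bookkeeping on the CPU.

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over several runs of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def overhead_percent(baseline_s, profiled_s):
    """Overhead of the profiled run relative to the baseline, in percent."""
    return (profiled_s - baseline_s) / baseline_s * 100.0


def workload():
    """Stand-in for the code being measured (hypothetical)."""
    return sum(i * i for i in range(100_000))


def profiled_workload():
    """Same workload with a stand-in for instrumentation cost (hypothetical)."""
    events = []
    events.append(time.perf_counter())  # pretend "start" hook
    result = workload()
    events.append(time.perf_counter())  # pretend "end" hook
    return result


baseline = time_call(workload)
profiled = time_call(profiled_workload)
print(f"Overhead: {overhead_percent(baseline, profiled):.2f}%")
```

Taking the best of several runs reduces noise from unrelated system activity. For real GPU work the timed function would also need to wait for the device to finish before the clock stops, which this CPU-only sketch does not model.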

r/MachineLearning

2

3

January 3, 2026

GPU Profiling tools

Hi there, I am looking for some tools to profile GPU workloads running on a physical desktop. I am aware of GPUProfiler (Releases · JeremyMain/GPUProfiler, github.com), but are there alternatives? More on community.omnissa.com

community.omnissa.com

2

October 24, 2024

Videos