Hi, sys.getsizeof() will return the size of the Python object. It will be the same for all tensors, as every tensor is a Python object containing a tensor. For each tensor, you have a method element_size() that will give you the size of one element in bytes, and a function nelement() that returns the n… Answer from albanD on discuss.pytorch.org
Top answer
1 of 2
70

The answer is in two parts. From the documentation of sys.getsizeof, firstly

All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

so it could be that for tensors __sizeof__ is undefined or defined differently than you would expect - this function is not something you can rely on. Secondly

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

which means that if the torch.Tensor object merely holds a reference to the actual memory, this won't show in sys.getsizeof. This is indeed the case: if you check the size of the underlying storage instead, you will see the expected number

import sys
import torch

b = torch.randn(1, 1, 128, 256, dtype=torch.float64)
print(sys.getsizeof(b))            # reported: 72
print(sys.getsizeof(b.storage()))  # reported: 262208
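
The shallow accounting described in the sys.getsizeof documentation can be reproduced without PyTorch. The following is a minimal sketch using only the standard library. The names `payload` and `outer` are illustrative, and a plain list stands in, by analogy only, for a tensor object that merely references its data buffer.

```python
import sys

# A 1 MB buffer, standing in for the memory a tensor's storage would own.
payload = bytearray(1_000_000)

# A small container that only holds a reference to the buffer,
# standing in (by analogy) for the Python-level tensor object.
outer = [payload]

shallow = sys.getsizeof(outer)   # the list object and its single pointer slot
deep = sys.getsizeof(payload)    # the buffer object, including its 1 MB of data

print(shallow, deep)
```

The first number stays small no matter how large the referenced buffer is, which is the same effect as the constant result of sys.getsizeof on a tensor.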

Note: I am setting dtype to float64 explicitly, because that is the default dtype in numpy, whereas torch uses float32 by default.
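
The reported storage size can be checked by hand. Below is a sketch of the arithmetic, assuming 8 bytes per float64 element and 4 bytes per float32 element. `tensor_nbytes` is a hypothetical helper, not a PyTorch function; on a real tensor the same product is `b.element_size() * b.nelement()`.

```python
import math

def tensor_nbytes(shape, itemsize):
    """Bytes needed for the raw elements of a dense tensor (hypothetical helper)."""
    return math.prod(shape) * itemsize

shape = (1, 1, 128, 256)

float64_bytes = tensor_nbytes(shape, 8)   # float64: 8 bytes per element
float32_bytes = tensor_nbytes(shape, 4)   # float32: 4 bytes per element

# Gap between the reported sys.getsizeof(b.storage()) and the raw data.
# Presumably this is per-object overhead of the storage wrapper (an assumption).
overhead = 262208 - float64_bytes

print(float64_bytes, float32_bytes, overhead)  # 262144 131072 64
```

With the default float32 dtype the same shape would need half the memory, which is why the dtype is pinned in the example above.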

2 of 2
0

If you want to get the size of a tensor or network on CUDA, you can use this code to calculate its size:

import torch

device = 'cuda:0'

# before
torch._C._cuda_clearCublasWorkspaces()
memory_before = torch.cuda.memory_allocated(device)

# your tensor or network
data5 = torch.randn((10000,100),device=device)

# after
memory_after = torch.cuda.memory_allocated(device)
latent_size = memory_after - memory_before

latent_size
# 4000256

The idea comes from: https://github.com/pytorch/pytorch/blob/ee28b865ee9c87cce4db0011987baf8d125cc857/torch/distributed/pipeline/sync/_balance/profile.py#L102
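
The measured 4000256 bytes is slightly more than the raw data size of 10000 * 100 * 4 = 4000000 bytes. One plausible explanation is that the allocator hands out memory in fixed-size multiples. The sketch below assumes a 512-byte granularity, which is an assumption chosen because it reproduces the measured figure. `round_up` is a hypothetical helper, not a PyTorch API.

```python
def round_up(nbytes, granularity=512):
    """Round nbytes up to the next multiple of granularity (assumed block size)."""
    return -(-nbytes // granularity) * granularity

raw = 10000 * 100 * 4        # float32 elements, 4 bytes each
allocated = round_up(raw)    # what an allocator with 512-byte blocks would reserve

print(raw, allocated)        # 4000000 4000256
```

This means the difference in torch.cuda.memory_allocated() is an upper bound on the tensor's data size, not necessarily the exact value of element_size() * nelement().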

Discussions

Is there a way to get the underlying memory array size of a tensor?
E.g. I have this code snippet:
x = torch.rand(1000, 1000)  # this will allocate a memory block of 4M in bytes
x1 = x[0:1]  # only selecting the first row. x1 itself contains only 1K floats, but for now it shares the underlying memory array with x
torch.save(x1, 'some/disk/file')  # after this, ...
More on discuss.pytorch.org
🌐 discuss.pytorch.org
4
0
June 22, 2023
python - How to find the size of a tensor in bytes? - Data Science Stack Exchange
Supposing I have a tensor named encoding which is of the shape (1,32) Now I would like to find out the size of the tensor in bytes not the shape of it. So basically I want to know how much memory it More on datascience.stackexchange.com
🌐 datascience.stackexchange.com
April 23, 2020
Get the memory needed to store a tensor in PyTorch - Stack Overflow
For each tensor, you have a method element_size() that will give you the size of one element in bytes, and a function nelement() that returns the number of elements. So the size of a tensor a in memory (CPU memory for a CPU tensor and GPU memory for a GPU tensor) is a.element_size() * a.nelement(). More on stackoverflow.com
🌐 stackoverflow.com
Tensor type memory usage
I would like to know how much memory (on GPU) do the different torch types allocate, but I was not able to find it anywhere in the docs. At the following link all the possible Tensor types are specified, but nothing is said about memory usage. https://pytorch.org/docs/stable/tensors.html For ... More on discuss.pytorch.org
🌐 discuss.pytorch.org
3
0
May 6, 2022
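
For the question of how much memory each type takes, the per-element size is fixed by the dtype's bit width. The sketch below is plain Python, not a PyTorch API: the `ITEMSIZE` table and the `nbytes` helper are illustrative names, and the (1, 32) shape is borrowed from the Stack Exchange question listed above. On a real tensor, `element_size()` reports the same per-element value.

```python
import math

# Bytes per element for common dtypes (their standard widths).
ITEMSIZE = {
    "bool": 1, "uint8": 1, "int8": 1,
    "float16": 2, "bfloat16": 2, "int16": 2,
    "float32": 4, "int32": 4,
    "float64": 8, "int64": 8,
}

def nbytes(shape, dtype):
    """Raw data size in bytes for a dense tensor of this shape and dtype."""
    return math.prod(shape) * ITEMSIZE[dtype]

sizes = {dtype: nbytes((1, 32), dtype) for dtype in ITEMSIZE}
print(sizes["float32"], sizes["float16"], sizes["int64"])  # 128 64 256
```

These figures cover only the element data; allocator rounding and per-object overhead come on top.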
🌐
Sling Academy
slingacademy.com › article › pytorch-determine-the-memory-usage-of-a-tensor-in-bytes
PyTorch: Determine the memory usage of a tensor (in bytes) - Sling Academy
April 18, 2023 - In order to calculate the memory used by a PyTorch tensor in bytes, you can use the following formula: memory_in_bytes = tensor.element_size() * tensor.nelement(). This will give you the size of the tensor data in memory, whether...
🌐
PyTorch Forums
discuss.pytorch.org › t › is-there-a-way-to-get-the-underlying-memory-array-size-of-a-tensor › 182626
Is there a way to get the underlying memory array size of a tensor? - PyTorch Forums
June 22, 2023 - E.g. I have this code snippet:
x = torch.rand(1000, 1000)  # this will allocate a memory block of 4M in bytes
x1 = x[0:1]  # only selecting the first row. x1 itself contains only 1K floats, but for now it shares the underlying memory array with x
torch.save(x1, 'some/disk/file')  # after this, `some/disk/file` will be of size roughly 4M instead of 4K.
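
The gap between what a slice spans and what it keeps alive can be illustrated without PyTorch. The sketch below is a standard-library analogy, not PyTorch code: an `array` of 4-byte floats stands in for the full tensor, and a `memoryview` slice stands in for `x1`. The names `base` and `view` are illustrative.

```python
from array import array

# One million 4-byte floats, analogous to torch.rand(1000, 1000).
base = array("f", bytes(4 * 1_000_000))

# A view of the first 1000 elements, analogous to x[0:1]; no data is copied.
view = memoryview(base)[:1000]

view_bytes = view.nbytes                # bytes the view itself spans
base_bytes = memoryview(base).nbytes    # bytes in the buffer it keeps alive
shares_buffer = view.obj is base        # the view still references the full buffer

print(view_bytes, base_bytes, shares_buffer)
```

In PyTorch terms, the first number corresponds to x1.element_size() * x1.nelement() and the second to the size of the shared underlying storage. Copying the slice first, for example with x1.clone(), is a common way to save only the selected row.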
🌐
PyTorch Forums
discuss.pytorch.org › t › confusion-on-tensors-memory-usage › 162753
Confusion on tensor's memory usage - PyTorch Forums
October 4, 2022 - I have a float tensor containing 3 elements. Upon moving this tensor to GPU, the nvidia-smi reports 1089MiB memory consumption. Please see below ipython notebook more details: $ ipython Python 3.9.13 (main, Aug 25 ...
🌐
PyTorch Forums
discuss.pytorch.org › t › list-all-the-tensors-and-their-memory-allocation › 144108
List all the tensors and their memory allocation - PyTorch Forums
February 14, 2022 - I run out of GPU memory when I start to infer a trained model (not training at all in this code). So I would like to know where is the bottleneck. What uses up all the memory. Edit: My saved models are more than 700MB in size on disk
🌐
PyTorch
pytorch.org › blog › tensor-memory-format-matters
Efficient PyTorch: Tensor Memory Format Matters – PyTorch
December 15, 2021 - ChannelsLast3d: For 3d tensors (video tensors), the memory is laid out in THWC (Time, Height, Width, Channels) or NTHWC (N: batch, T: time, H: height, W: width, C: channels) format. The dimensions could be permuted in any order. The reason that ChannelsLast is preferred for vision models is because XNNPACK (kernel acceleration library) used by PyTorch expects all inputs to be in Channels Last format, so if the input to the model isn’t channels last, then it must first be converted to channels last, which is an additional operation.
🌐
PyPI
pypi.org › project › pytorch-memlab
pytorch-memlab · PyPI
Element type Size Used MEM ... import MemReporter linear = torch.nn.Linear(1024, 1024).cuda() inp = torch.Tensor(512, 1024).cuda() # pass in a model to automatically infer the tensor names reporter = MemReporter(linear) ...
» pip install pytorch-memlab
Published   Jul 29, 2023
Version   0.3.0
🌐
GitHub
github.com › Oldpan › Pytorch-Memory-Utils
GitHub - Oldpan/Pytorch-Memory-Utils: pytorch memory track code
... Model Sequential : params: 0.450304M Model Sequential : intermedite variables: 336.089600 M (without backward) Model Sequential : intermedite variables: 672.179200 M (with backward) ...
Starred by 1K users
Forked by 153 users
Languages   Python 100.0%
🌐
GitHub
gist.github.com › Stonesjtu › 368ddf5d9eb56669269ecdf9b0d21cbe
A simple Pytorch memory usages profiler · GitHub
@kkonevets, thanks, I didn't pay much attention to the scenario where the number of tensors gets high. But I think the in operation won't take much time in this script; what do you think? I wrote a more powerful and pip installable tool recently, you can check it out: https://github.com/stonesjtu/pytorch_memlab
🌐
PyTorch Forums
discuss.pytorch.org › t › scope-and-memory-consumption-of-tensors-created-using-self-new-api › 19667
Scope and memory consumption of tensors created using self.new_* API - PyTorch Forums
June 13, 2018 - What is even more intriguing is that the model runs fine for roughly first 2000 steps and the memory allocated as per nvidia-smi starts increasing from 14GB to 16GB gradually and then crash finally.
🌐
Stack Overflow
stackoverflow.com › questions › 62547072 › why-does-pytorch-use-so-much-gpu-memory-to-store-tensors
Why Does PyTorch Use So Much GPU Memory to Store Tensors? - Stack Overflow
I made a simple test using PyTorch which involves measuring the current GPU memory use, creating a tensor of a certain size, moving it to the GPU, and measuring the GPU memory again. According to my calculation, around 6.5k bytes were required ...
🌐
PyTorch Forums
discuss.pytorch.org › vision
How do I create Torch Tensor without any wasted storage space/baggage? - vision - PyTorch Forums
September 4, 2021 - From testing experience, the first Tensor push to GPU will roughly take up to 700-800 MiB of the GPU VRAM. You can then allocate more tensor to GPU without a shift in the VRAM until you have exceeded the pre-allocated space given from the first ...
🌐
APXML
apxml.com › courses › advanced-pytorch › chapter-1-pytorch-internals-autograd › memory-management
PyTorch Memory Management Strategies
Check model size and complexity. Insert print statements for torch.cuda.memory_allocated() at different points in your training loop to pinpoint where memory usage spikes. Use the PyTorch Profiler (covered in Chapter 4) to get a detailed breakdown of memory usage per operator. Memory Leaks: If memory usage continuously grows over training iterations without stabilizing, you might have a memory leak. This often happens when tensors with computation history are unintentionally accumulated in lists or dictionaries outside the torch.no_grad() context.
🌐
PyTorch
docs.pytorch.org › reference api › torch.tensor
torch.Tensor — PyTorch 2.12 documentation
Current implementation of torch.Tensor introduces memory overhead, thus it might lead to unexpectedly high memory usage in the applications with many tiny tensors. If this is your case, consider using one large structure. ... There are a few main ways to create a tensor, depending on your use case. To create a tensor with pre-existing data, use torch.tensor(). To create a tensor with specific size, use torch.* tensor creation ops (see Creation Ops).
🌐
Hugging Face
huggingface.co › blog › train_memory
Visualize and understand GPU memory in PyTorch
December 24, 2024 - For the Qwen2.5-1.5B model, this gives 5,065,216 activations per input token. To estimate the total activation memory for an input tensor, use: ... A is the number of activations per token. B is the batch size.