pytorch tensor memory size

How to know the memory allocated for a tensor on gpu?

discuss.pytorch.org › t › how-to-know-the-memory-allocated-for-a-tensor-on-gpu › 28537

Hi, sys.getsizeof() will return the size of the Python object. It will be the same for all tensors, as all tensors are a Python object containing a tensor. For each tensor, you have a method element_size() that will give you the size of one element in bytes. And a function nelement() that returns the n… Answer from albanD on discuss.pytorch.org

How to know the memory allocated for a tensor on gpu? - PyTorch Forums

Top answer

2 of 15

Thank you for the detailed reply @albanD! Regarding the other objects on the GPU: are you saying that the functions which track the graph for each tensor, etc., are stored on the CPU?

Stack Overflow

stackoverflow.com › questions › 54361763 › pytorch-why-is-the-memory-occupied-by-the-tensor-variable-so-small

python - Pytorch: Why is the memory occupied by the `tensor` variable so small? - Stack Overflow

Top answer

1 of 2

The answer is in two parts. From the documentation of sys.getsizeof, firstly

All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

so it could be that for tensors __sizeof__ is undefined or defined differently than you would expect - this function is not something you can rely on. Secondly

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

which means that if the torch.Tensor object merely holds a reference to the actual memory, this won't show in sys.getsizeof. This is indeed the case: if you check the size of the underlying storage instead, you will see the expected number

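import sys
import torch

b = torch.randn(1, 1, 128, 256, dtype=torch.float64)
# Size of the Python wrapper object only (value reported in the original answer)
print(sys.getsizeof(b))
# 72
# Size of the underlying storage: 32768 elements * 8 bytes, plus object overhead
print(sys.getsizeof(b.storage()))
# 262208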

Note: I am setting dtype to float64 explicitly, because that is the default dtype in numpy, whereas torch uses float32 by default.

2 of 2

If you want to get the size of a tensor or network in CUDA, you can use this code to calculate its size:

import torch

device = 'cuda:0'

# Before: release cuBLAS workspaces (private API) so they do not skew the reading
torch._C._cuda_clearCublasWorkspaces()
memory_before = torch.cuda.memory_allocated(device)

# Your tensor or network
data5 = torch.randn((10000, 100), device=device)

# After: the difference is what the new tensor occupies
memory_after = torch.cuda.memory_allocated(device)
latent_size = memory_after - memory_before

print(latent_size)
# 4000256

The idea comes from: https://github.com/pytorch/pytorch/blob/ee28b865ee9c87cce4db0011987baf8d125cc857/torch/distributed/pipeline/sync/_balance/profile.py#L102
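The measured 4000256 bytes is slightly more than the 4,000,000 bytes of raw data (10000 × 100 float32 values). One plausible explanation, assuming the CUDA caching allocator rounds each request up to a multiple of 512 bytes, is sketched below. `round_up_to_block` is a hypothetical helper written for illustration, not a PyTorch function.

```python
BLOCK_BYTES = 512  # assumed allocation granularity of the CUDA caching allocator

def round_up_to_block(nbytes: int, block: int = BLOCK_BYTES) -> int:
    """Round a byte count up to the next multiple of `block`."""
    return ((nbytes + block - 1) // block) * block

raw_bytes = 10000 * 100 * 4               # elements * bytes per float32
allocated = round_up_to_block(raw_bytes)
print(raw_bytes, allocated)               # 4000000 4000256
```

Under that assumption, small differences between `element_size() * nelement()` and `torch.cuda.memory_allocated()` are expected rather than a sign of a leak.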

Discussions

python - How to find the size of a tensor in bytes? - Data Science Stack Exchange

Supposing I have a tensor named encoding which is of the shape (1,32) Now I would like to find out the size of the tensor in bytes not the shape of it. So basically I want to know how much memory it More on datascience.stackexchange.com

datascience.stackexchange.com

April 23, 2020

Get the memory needed to store a tensor in PyTorch - Stack Overflow

For each tensor, you have a method element_size() that will give you the size of one element in byte. And a function nelement() that returns the number of elements. So the size of a tensor a in memory (cpu memory for a cpu tensor and gpu memory for a gpu tensor) is a.element_size() * a.nelement(). More on stackoverflow.com

stackoverflow.com

Tensor type memory usage

I would like to know how much memory (on GPU) do the different torch types allocate, but I was not able to find it anywhere in the docs. At the following link all the possible Tensor types are specified, but nothing is said about memory usage. https://pytorch.org/docs/stable/tensors.html For ... More on discuss.pytorch.org

discuss.pytorch.org

May 6, 2022

Sling Academy

slingacademy.com › article › pytorch-determine-the-memory-usage-of-a-tensor-in-bytes

PyTorch: Determine the memory usage of a tensor (in bytes) - Sling Academy

April 18, 2023 - In order to calculate the memory used by a PyTorch tensor in bytes, you can use the following formula: memory_in_bytes = tensor.element_size() * tensor.nelement(). This will give you the size of the tensor data in memory, whether...

PyTorch Forums

discuss.pytorch.org › t › is-there-a-way-to-get-the-underlying-memory-array-size-of-a-tensor › 182626

Is there a way to get the underlying memory array size of a tensor? - PyTorch Forums

June 22, 2023 - E.g. I have this code snippet x = torch.rand(1000, 1000) # this will allocate a memory block of 4M in bytes x1 = x[0:1] # only selecting the first row. x1 itself contains only 1K floats, but for now it shares the underlying memory array with x torch.save(x1, 'some/disk/file') # after this, `some/disk/file` will be of size roughly 4M instead of 4K.
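A minimal sketch of the situation described in this question, assuming PyTorch 2.0 or later (where `Tensor.untyped_storage()` is available). It compares the bytes addressed by the view with the bytes in the shared buffer, and shows that `clone()` produces a tensor with its own compact storage.

```python
import torch

x = torch.rand(1000, 1000)   # 1,000,000 float32 values
x1 = x[0:1]                  # a view of the first row; shares x's storage

view_bytes = x1.nelement() * x1.element_size()    # bytes addressed by the view
storage_bytes = x1.untyped_storage().nbytes()     # bytes in the shared buffer

compact = x1.clone()         # copies only the 1000 selected values
compact_bytes = compact.untyped_storage().nbytes()

print(view_bytes, storage_bytes, compact_bytes)
```

Saving `compact` instead of `x1` should then write only the selected row, since the clone no longer references the original buffer.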

Stack Exchange

datascience.stackexchange.com › questions › 72870 › how-to-find-the-size-of-a-tensor-in-bytes

python - How to find the size of a tensor in bytes? - Data Science Stack Exchange

Top answer

1 of 4

Step 1:
Get the dtype of the tensor.
This tells you the number of bytes per element, e.g. float64 is 64 bits = 8 bytes.

Step 2:
Get the shape of the tensor.
This gives you the number of placeholders of that dtype.
Let's assume shape = m x n x p.
The count of placeholders is C = m * n * p.
Memory = 8 * C, so Memory = 8 * m * n * p bytes.

See this excerpt from Deep Learning with Python, by Francois Chollet

A 60-second, 144 × 256 YouTube video clip sampled at 4 frames per second would have 240 frames. A batch of four such video clips would be stored in a tensor of shape (4, 240, 144, 256, 3). That’s a total of 106,168,320 values! If the dtype of the tensor was float32, then each value would be stored in 32 bits, so the tensor would represent 405 MB
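The arithmetic in the excerpt can be checked with a few lines of plain Python. This is only a sketch of the calculation. Note that the 405 figure corresponds to units of 1024 × 1024 bytes.

```python
import math

shape = (4, 240, 144, 256, 3)   # clips, frames, height, width, color channels
bytes_per_value = 4             # float32 is 32 bits

num_values = math.prod(shape)
total_bytes = num_values * bytes_per_value

print(num_values)               # 106168320
print(total_bytes / 2**20)      # 405.0
```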

2 of 4

To do this in a line of code, use:

size_in_bytes = encoding.nelement() * encoding.element_size()

This multiplies the number of elements in your tensor by the size of each element in bytes, to get the total memory usage of the tensor's data. It does not include the Python object overhead, which can be found with sys.getsizeof().
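As a small illustration, the one-liner can be wrapped in a helper. `tensor_nbytes` is a hypothetical name, and `encoding` here is a stand-in tensor of shape (1, 32) with the default float32 dtype.

```python
import torch

def tensor_nbytes(t: torch.Tensor) -> int:
    """Bytes occupied by the tensor's elements (excludes Python object overhead)."""
    return t.nelement() * t.element_size()

encoding = torch.zeros(1, 32)        # float32 by default
print(tensor_nbytes(encoding))       # 128 = 32 elements * 4 bytes
```

For a view, this counts only the elements the view addresses, not the full underlying storage.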

Stack Overflow

stackoverflow.com › questions › 76125524 › get-the-memory-needed-to-store-a-tensor-in-pytorch

Get the memory needed to store a tensor in PyTorch - Stack Overflow

Top answer

1 of 1

Per the PyTorch discussion forum:

sys.getsizeof() will return the size of the Python object. It will be the same for all tensors, as all tensors are a Python object containing a tensor. For each tensor, you have a method element_size() that will give you the size of one element in bytes. And a function nelement() that returns the number of elements. So the size of a tensor a in memory (CPU memory for a CPU tensor and GPU memory for a GPU tensor) is a.element_size() * a.nelement().

Therefore, the (simple) answer appears to be as follows, and holds true when calculating expected memory usage and release in various scenarios:

a.element_size() * a.nelement()

This is supported by experimenting with the following:

torch.tensor(1, dtype=torch.int8).element_size()   # 1 (byte)
torch.tensor(1, dtype=torch.int16).element_size()  # 2 (bytes)
torch.tensor(1, dtype=torch.int32).element_size()  # 4 (bytes)
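The same experiment can be extended to other dtypes. The sketch below builds a lookup of bytes per element; the choice of dtypes is arbitrary and only meant as an illustration.

```python
import torch

dtypes = [torch.bool, torch.uint8, torch.float16, torch.bfloat16,
          torch.float32, torch.float64, torch.int64]

# element_size() reports bytes per element for a tensor of each dtype
sizes = {dt: torch.empty((), dtype=dt).element_size() for dt in dtypes}

for dt, size in sizes.items():
    print(f"{str(dt):16s} {size} byte(s)")
```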

Additionally, this post offers a more detailed way to calculate the memory usage.

PyTorch Forums

discuss.pytorch.org › memory format

Tensor type memory usage - Memory Format - PyTorch Forums

Top answer

1 of 3

Oh yeah, nbytes was introduced in PyTorch 1.11, I think. So if you want to use it, you need to update your version with !pip install --upgrade torch !pip install --upgrade torchvision # Or however you need to update it (if you want to do it like this, but you also got an alternative) And if I find…

2 of 3

With Storage you can get some information about what is happening. For example, data_ptr() will give you the address of the first element.

a = torch.ones([10000], dtype=torch.uint8, device='cuda:0')
b = torch.ones([10000], dtype=torch.float32, device='cuda:0')
print(a.storage().data_ptr())
print(b…

PyTorch Forums

discuss.pytorch.org › t › confusion-on-tensors-memory-usage › 162753

Confusion on tensor's memory usage - PyTorch Forums

October 4, 2022 - I have a float tensor containing 3 elements. Upon moving this tensor to GPU, the nvidia-smi reports 1089MiB memory consumption. Please see below ipython notebook more details: $ ipython Python 3.9.13 (main, Aug 25 ...

PyTorch Forums

discuss.pytorch.org › t › list-all-the-tensors-and-their-memory-allocation › 144108

List all the tensors and their memory allocation - PyTorch Forums

February 14, 2022 - I run out of GPU memory when I start to infer a trained model (not training at all in this code). So I would like to know where is the bottleneck. What uses up all the memory. Edit: My saved models are more than 700MB in size on disk

PyTorch

pytorch.org › blog › tensor-memory-format-matters

Efficient PyTorch: Tensor Memory Format Matters – PyTorch

December 15, 2021 - ChannelsLast3d: For 3d tensors (video tensors), the memory is laid out in THWC (Time, Height, Width, Channels) or NTHWC (N: batch, T: time, H: height, W: width, C: channels) format. The dimensions could be permuted in any order. The reason that ChannelsLast is preferred for vision models is because XNNPACK (kernel acceleration library) used by PyTorch expects all inputs to be in Channels Last format, so if the input to the model isn’t channels last, then it must first be converted to channels last, which is an additional operation.
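A memory format changes how elements are ordered in memory, not how many bytes they occupy. The sketch below uses the 4-D `torch.channels_last` format (rather than the 3-D variant mentioned in the excerpt) on a small tensor, and compares strides and byte counts.

```python
import torch

x = torch.randn(2, 3, 4, 5)                          # N, C, H, W (contiguous)
y = x.contiguous(memory_format=torch.channels_last)  # same shape, NHWC layout

print(x.stride())   # (60, 20, 5, 1)
print(y.stride())   # (60, 1, 15, 3)

same_bytes = (x.nelement() * x.element_size()) == (y.nelement() * y.element_size())
print(same_bytes)   # True
```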

PyPI

pypi.org › project › pytorch-memlab

pytorch-memlab · PyPI

Element type Size Used MEM ... import MemReporter linear = torch.nn.Linear(1024, 1024).cuda() inp = torch.Tensor(512, 1024).cuda() # pass in a model to automatically infer the tensor names reporter = MemReporter(linear) ...

      » pip install pytorch-memlab

Published Jul 29, 2023

Version 0.3.0

Homepage https://github.com/Stonesjtu/pytorch_memlab

GitHub

github.com › Oldpan › Pytorch-Memory-Utils

GitHub - Oldpan/Pytorch-Memory-Utils: pytorch memory track code

... Model Sequential : params: 0.450304M Model Sequential : intermedite variables: 336.089600 M (without backward) Model Sequential : intermedite variables: 672.179200 M (with backward) ...

GitHub

gist.github.com › Stonesjtu › 368ddf5d9eb56669269ecdf9b0d21cbe

A simple Pytorch memory usages profiler · GitHub

@kkonevets, thanks, I didn't pay much attention to the scenario where the number of tensors gets high. But I don't think the in operation will take much time in this script. What do you think?

I wrote a more powerful and pip-installable tool recently; you can check it out: https://github.com/stonesjtu/pytorch_memlab

Stack Exchange

datascience.stackexchange.com › questions › 93613 › memory-issue-when-trying-to-initiate-zero-tensor-with-pytorch

machine learning - Memory issue when trying to initiate zero tensor with pytorch - Data Science Stack Exchange

Top answer

1 of 1

The reason the tensor takes up so much memory is that, by default, the tensor stores its values with the type torch.float32. This data type uses 4 bytes for each value in the tensor (check using .element_size()), which gives a total of ~48GB after multiplying by the number of zero values in your tensor (4 * 2000 * 2000 * 3200 bytes = 47.68GB). What you could try is changing the datatype from torch.float32 to something like torch.int8, which uses only 1 byte for each value, reducing the memory needed by 75% to ~12GB. But given that you only seem to have 8GB available, this will not solve your problem either. The only solution, therefore, is to use a tensor with fewer values.
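The estimate can be reproduced without allocating anything. This is a plain-Python sketch of the arithmetic, with sizes expressed in units of 2**30 bytes.

```python
num_values = 2000 * 2000 * 3200
GIB = 2**30

float32_bytes = num_values * 4   # 4 bytes per torch.float32 value
int8_bytes = num_values * 1      # 1 byte per torch.int8 value

print(f"float32: {float32_bytes / GIB:.2f} GiB")   # float32: 47.68 GiB
print(f"int8:    {int8_bytes / GIB:.2f} GiB")      # int8:    11.92 GiB
```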

PyTorch Forums

discuss.pytorch.org › t › scope-and-memory-consumption-of-tensors-created-using-self-new-api › 19667

Scope and memory consumption of tensors created using self.new_* API - PyTorch Forums

June 13, 2018 - What is even more intriguing is that the model runs fine for roughly first 2000 steps and the memory allocated as per nvidia-smi starts increasing from 14GB to 16GB gradually and then crash finally.

Stack Overflow

stackoverflow.com › questions › 62547072 › why-does-pytorch-use-so-much-gpu-memory-to-store-tensors

Why Does PyTorch Use So Much GPU Memory to Store Tensors? - Stack Overflow

I made a simple test using PyTorch which involves measuring the current GPU memory use, creating a tensor of a certain size, moving it to the GPU, and measuring the GPU memory again. According to my calculation, around 6.5k bytes were required ...

PyTorch Forums

discuss.pytorch.org › vision

How do I create Torch Tensor without any wasted storage space/baggage? - vision - PyTorch Forums

September 4, 2021 - From testing experience, the first Tensor push to GPU will roughly take up to 700-800 MiB of the GPU VRAM. You can then allocate more tensor to GPU without a shift in the VRAM until you have exceeded the pre-allocated space given from the first ...

APXML

apxml.com › courses › advanced-pytorch › chapter-1-pytorch-internals-autograd › memory-management

PyTorch Memory Management Strategies

Check model size and complexity. Insert print statements for torch.cuda.memory_allocated() at different points in your training loop to pinpoint where memory usage spikes. Use the PyTorch Profiler (covered in Chapter 4) to get a detailed breakdown of memory usage per operator. Memory Leaks: If memory usage continuously grows over training iterations without stabilizing, you might have a memory leak. This often happens when tensors with computation history are unintentionally accumulated in lists or dictionaries outside the torch.no_grad() context.
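A minimal sketch of the leak pattern described above, using a toy model chosen only for illustration. Accumulating the loss tensor itself would keep each iteration's computation history reachable. Converting it to a Python number with `.item()` avoids that.

```python
import torch

model = torch.nn.Linear(4, 1)
running_loss = 0.0            # plain Python float, holds no graph

for _ in range(3):
    inputs = torch.randn(8, 4)
    loss = model(inputs).pow(2).mean()
    assert loss.grad_fn is not None   # loss still references its graph
    # Accumulating `loss` itself would keep every iteration's graph alive;
    # .item() extracts a Python number so the graph can be freed.
    running_loss += loss.item()

print(type(running_loss))     # <class 'float'>
```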

PyTorch

docs.pytorch.org › reference api › torch.tensor

torch.Tensor — PyTorch 2.12 documentation

Current implementation of torch.Tensor introduces memory overhead, thus it might lead to unexpectedly high memory usage in the applications with many tiny tensors. If this is your case, consider using one large structure. ... There are a few main ways to create a tensor, depending on your use case. To create a tensor with pre-existing data, use torch.tensor(). To create a tensor with specific size, use torch.* tensor creation ops (see Creation Ops).

reddit.com › r/pytorch › tensor.to_sparse() memory allocation

r/pytorch on Reddit: tensor.to_sparse() Memory Allocation

April 22, 2023 -

Hi,

for some scientific research project I'm using a very large matrix with dim(M) = 4 and about 90% sparsity. Therefore I want to use a sparse tensor. Unfortunately I need to use a method calculating the entries of the matrix, which outputs a dense numpy array.

From this numpy array I create a dense PyTorch tensor, which is later converted to a sparse tensor with tensor.to_sparse(). Somehow this leads to a memory allocation error while trying to allocate approx. 300 GB of RAM. This would usually not be a problem, as the cluster I'm working on has 512 GB of RAM, but because I already hold the dense tensor, the allocation is not possible.

Does anyone of you have any idea how to prevent this or how to directly initialize the tensor from the numpy array to be sparse?

Thanks a lot

Top answer

1 of 2

If we assume your tensor dtype is 32-bit float, then for each 4-byte nonzero value you need 4 x 8-byte integer indices, which means each nonzero value in your sparse tensor costs 36B. In summary, with 90% sparsity, it would cost numel * 0.1 * 36B = numel * 3.6B to store this tensor, while the original dense tensor takes numel * 4B. You can see that the memory you save is not very significant. At this point you either have to increase the sparsity or reconsider whether you really need a sparse tensor. The only real benefit of using sparse tensors is the memory saving; for everything else, they perform WORSE than dense tensors on modern hardware.
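The cost model in this answer can be written down directly. The sketch below assumes COO storage with one 8-byte index per dimension for each nonzero value; `coo_bytes_per_element` is a hypothetical helper for the arithmetic only.

```python
def coo_bytes_per_element(density: float, ndim: int,
                          value_bytes: int = 4, index_bytes: int = 8) -> float:
    """Average bytes per (dense) element for a COO tensor with the given density."""
    per_nonzero = value_bytes + ndim * index_bytes
    return density * per_nonzero

dense_cost = 4.0                                          # float32, bytes per element
sparse_cost = coo_bytes_per_element(density=0.1, ndim=4)  # about 3.6
break_even = dense_cost / (4 + 4 * 8)                     # density of about 0.111

print(sparse_cost, break_even)
```

Under these assumptions, the sparse layout only saves memory when fewer than roughly 11% of the entries are nonzero.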

2 of 2

You can initialize the sparse torch tensor yourself. You need to specify the non-zero values and their coordinates, then use the sparse_coo_tensor method. It has no limitations as far as the number of dimensions goes, but many torch methods are not implemented for sparse tensors. But if you have a function that outputs a dense array and you can't change it, then this might not be useful for you.
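One way this could look, as a sketch only: the nonzero coordinates and values are extracted with NumPy and passed to `torch.sparse_coo_tensor`, so no dense torch tensor is created. The small `dense_np` array is a stand-in for the real data.

```python
import numpy as np
import torch

dense_np = np.zeros((2, 3, 4, 5), dtype=np.float32)   # stand-in for the real array
dense_np[0, 1, 2, 3] = 1.5
dense_np[1, 2, 0, 4] = -2.0

nz = np.nonzero(dense_np)                     # tuple of index arrays, one per dim
indices = torch.from_numpy(np.stack(nz))      # shape (ndim, nnz)
values = torch.from_numpy(dense_np[nz])       # shape (nnz,)

sparse = torch.sparse_coo_tensor(indices, values, size=dense_np.shape)
print(sparse.shape)
```

The dense NumPy array still has to fit in memory, so this avoids only the extra dense torch copy, not the original allocation.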