Does this mean that the PyTorch training is using 33 processes X 15 GB = 495 GB of memory?

Not necessarily. You have a main process with several subprocesses (the workers), and the CPU has several cores. One worker usually loads one batch at a time, so the next batch can already be loaded and ready to go by the time the main process asks for it. This is where the speedup comes from. Also, the per-process figure that tools such as top report typically includes memory shared with the parent process, so multiplying it by the number of processes can overcount.

I would guess that you should use far fewer workers (a much smaller num_workers).

It would also be interesting to know your batch size, which you can adjust for the training process as well.
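As a rough illustration of why fewer workers and a smaller batch size reduce memory: with PyTorch's DataLoader, each worker prepares up to prefetch_factor batches ahead of time (the default is 2 when num_workers > 0), so the data waiting in queues scales roughly as num_workers × prefetch_factor × batch_size × bytes per sample. The sketch below is a back-of-envelope upper bound under that assumption. The function name and the example sizes are hypothetical, and it ignores the dataset object itself, which each worker process may also hold.

```python
def prefetched_bytes(num_workers: int, batch_size: int,
                     sample_bytes: int, prefetch_factor: int = 2) -> int:
    """Rough upper bound on bytes held in batches prepared ahead of time."""
    if num_workers == 0:
        # Data is loaded in the main process, one batch at a time.
        return batch_size * sample_bytes
    return num_workers * prefetch_factor * batch_size * sample_bytes


if __name__ == "__main__":
    MiB = 1024 ** 2
    GiB = 1024 ** 3
    # Hypothetical numbers: batches of 64 samples, 4 MiB per sample.
    many = prefetched_bytes(num_workers=32, batch_size=64, sample_bytes=4 * MiB)
    few = prefetched_bytes(num_workers=4, batch_size=64, sample_bytes=4 * MiB)
    print(f"32 workers: {many / GiB:.1f} GiB in flight")
    print(f" 4 workers: {few / GiB:.1f} GiB in flight")
```

This only covers prefetched batches. If the observed usage is far above this estimate, the extra memory is more likely coming from what each worker holds in its copy of the dataset than from the batches themselves.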

Is there a more accurate way of calculating the total amount of RAM being used by the main PyTorch program and all its child DataLoader worker processes?

I searched but could not find a concrete formula. I think it comes down to a rough estimate based on how many cores your CPU has, how much memory you have, and your batch size.

The right num_workers depends on what kind of machine you are using, what kind of dataset you have, and how much on-the-fly preprocessing your data requires.
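One way to get a more accurate total on Linux is to sum PSS (proportional set size) instead of RSS over the main process and its workers. RSS counts pages shared between a parent and its forked children once per process, so adding up the per-process numbers can overstate real usage; PSS divides each shared page among the processes that map it. The sketch below uses only the standard library and reads /proc/<pid>/smaps_rollup. It assumes a Linux kernel recent enough to provide that file and that the processes belong to the same user. The helper names are made up for illustration.

```python
import os
from pathlib import Path


def parse_rollup(text: str) -> dict:
    """Extract the 'Rss' and 'Pss' totals (in kB) from smaps_rollup text."""
    totals = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("Rss", "Pss"):
            totals[key] = int(rest.split()[0])  # value looks like '  1234 kB'
    return totals


def parent_pid(pid: int) -> int:
    """Read the parent PID from /proc/<pid>/status."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("PPid:"):
            return int(line.split()[1])
    return -1


def process_tree(root: int) -> list:
    """Return root plus all of its descendants by scanning /proc."""
    parents = {}
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                parents[int(entry)] = parent_pid(int(entry))
            except OSError:
                continue  # process exited while we were scanning
    tree, frontier = [root], [root]
    while frontier:
        current = frontier.pop()
        kids = [p for p, pp in parents.items() if pp == current]
        tree.extend(kids)
        frontier.extend(kids)
    return tree


def tree_memory_kb(root: int) -> dict:
    """Sum RSS and PSS (kB) over a process and its descendants."""
    total = {"Rss": 0, "Pss": 0}
    for pid in process_tree(root):
        try:
            stats = parse_rollup(Path(f"/proc/{pid}/smaps_rollup").read_text())
        except OSError:
            continue  # exited, or not permitted to read
        for key in total:
            total[key] += stats.get(key, 0)
    return total


if __name__ == "__main__":
    if os.path.exists("/proc/self/smaps_rollup"):
        print(tree_memory_kb(os.getpid()))
    else:
        print("smaps_rollup is not available on this system")
```

In a training script, calling tree_memory_kb(os.getpid()) from the main process while the DataLoader workers are alive gives both totals. The gap between the Rss and Pss sums indicates how much of the per-process figure is shared memory counted more than once.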

HTH

Answer from j35t3r on Stack Overflow

2 of 2
-1

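Python has a standard-library module called tracemalloc that traces memory blocks allocated by Python: https://docs.python.org/3/library/tracemalloc.html

It provides:

  • Tracebacks showing where an object was allocated
  • Statistics on allocated memory per filename
  • The difference between two snapshots

import tracemalloc

def do_something_that_consumes_ram_and_releases_some():
    data = [bytes(1000) for _ in range(10_000)]  # allocate roughly 10 MB
    del data[5_000:]  # release about half of it
    return data

tracemalloc.start()
kept = do_something_that_consumes_ram_and_releases_some()
# Show how much memory the code above still holds and its peak usage
current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1e6:0.2f} MB, peak: {peak / 1e6:0.2f} MB")
tracemalloc.stop()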

https://discuss.pytorch.org/t/measuring-peak-memory-usage-tracemalloc-for-pytorch/34067
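
The snapshot comparison mentioned in the list above can be sketched as follows. Keep in mind that tracemalloc only sees allocations made through Python's memory allocator in the process where it was started. It therefore does not cover DataLoader worker processes, and it generally does not cover memory that native extensions allocate outside Python's allocator. Treat it as a tool for finding Python-level growth in the main process. The leaky_step function and the cache list are hypothetical stand-ins for a training step.

```python
import tracemalloc

cache = []  # hypothetical: something that keeps references alive


def leaky_step(n: int = 10_000) -> None:
    """Stand-in for a training step that accidentally retains Python objects."""
    cache.append([str(i) for i in range(n)])


tracemalloc.start()
before = tracemalloc.take_snapshot()

for _ in range(5):
    leaky_step()

after = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Rank source lines by how much their held memory changed between snapshots.
top_stats = after.compare_to(before, "lineno")
for stat in top_stats[:3]:
    print(stat)

print(f"current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
```

The first entries printed point at the lines whose allocations grew the most, which here should be the list comprehension inside leaky_step.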

🌐
PyTorch Forums
discuss.pytorch.org › data
DataLoader memory usage keeps increasing - data - PyTorch Forums
April 8, 2024 - Hello, i am trying to use pytorchs Dataset and DataLoader to load a large dataset of several 100GB. This is of course too large to be stored in RAM, so parallel, lazy loading is needed. I am trying to load one large HDF…
Discussions

Num_workers in DataLoader will increase memory usage?
I am using DataLoader to load my training data. According to the document, we can set num_workers to set the number of subprocess to speed up the loading process. However, when I use it like this: dataloader = DataLoa… More on discuss.pytorch.org
🌐 discuss.pytorch.org
10
1
November 1, 2018
Pytorch Dataloader Memory Leak
Hi, I noticed that while training a PyTorch model the subprocesses that are started by the dataloader workers are accumulating memory over time while loading new batches and it seems this memory is never released, ultimately resulting in a “dataloader worker does not have sufficient shared ... More on discuss.pytorch.org
🌐 discuss.pytorch.org
8
1
August 30, 2023
🌐
GitHub
github.com › pytorch › pytorch › issues › 20433
Dataloader's memory usage keeps increasing during one single epoch. · Issue #20433 · pytorch/pytorch
May 13, 2019 - During each epoch, the memory usage is about 13GB at the very beginning and keeps inscreasing and finally up to about 46Gb, like this: Although it will decrease to 13GB at the beginning of next epoch, this problem is serious to me because in ...
Author   pytorch
🌐
PyTorch Forums
discuss.pytorch.org › t › dataloader-eating-ram › 74064
Dataloader eating ram - PyTorch Forums
March 22, 2020 - I have a dataset of 9 gigs of wav files for music synthesis, and to manage batches across different files i load each file into custom WavFileDataset which i then combine in ConcatDataset to use as a dataset for dataloader. Problems begin when i try to sample from dataloader, even with batch_size = 1 and length of sequences of 100 samples: ram gets quickly filled up to 21gb, stays there and then restarts the notebook session.
🌐
Yuxin's Blog
ppwwyyxx.com › blog › 2022 › Demystify-RAM-Usage-in-Multiprocess-DataLoader
Demystify RAM Usage in Multi-Process Data Loaders - Yuxin's Blog
December 24, 2022 - The essence of the solution is to let all processes share memory through a single torch.Tensor object, which needs to be moved to Linux shared memory by PyTorch's custom pickling routine. The TLDR on how to achieve sharing is: Don't let dataloader workers access many Python objects in their parent.
🌐
PyTorch Forums
discuss.pytorch.org › nlp
Memory usage keeps increasing in Dataloader (in very simple code) - nlp - PyTorch Forums
May 13, 2019 - Hi, I create a dataloader to load features from local files by their file pathes. The dataloader can be simplfied as: (P.S. All codes were tested on Pytorch 1.0.0 and Pytorch 1.0.1. Memory capacity of my machine is 256Gb) import numpy as np import torch import torch.utils.data as data import time class MyDataSet(data.Dataset): def __init__(self): super(MyDataSet, self).__init__() # Assume that the self.infoset here contains the description information about the dataset.
Find elsewhere
🌐
PyTorch Forums
discuss.pytorch.org › t › dataloader-increases-ram-usage-every-iteration › 6636
DataLoader increases RAM usage every iteration - PyTorch Forums
August 24, 2017 - Hello, I’m running into troubles while training a U-net like model. I defined my own dataset class as follows: class CARVANA(Dataset): """ CARVANA dataset that contains car images as .jpg. Each car has 16 i…
🌐
CodeGenes
codegenes.net › blog › pytorch-dataloader-memory
PyTorch Dataloader Memory: A Comprehensive Guide — codegenes.net
The num_workers parameter can significantly affect memory usage. If you set it too high, it may lead to excessive memory consumption due to multiple processes loading data simultaneously.
🌐
PyTorch Forums
discuss.pytorch.org › t › dataloader-memory-leak › 44860
DataLoader Memory Leak? - PyTorch Forums
May 9, 2019 - I have the following situation, I’m trying to train a Unet Learner using fastai’s Library. My data is stored as float16 tensor saved by using torch.save and loaded via a custom load function. In fastai, you create a Lear…
🌐
PyTorch Forums
discuss.pytorch.org › t › cuda-out-of-memory-while-increase-num-workers-in-dataloader › 101877
CUDA out of memory while increase num_workers in DataLoader - PyTorch Forums
November 6, 2020 - My GPU: RTX 3090 Pytorch version: 1.8.0.dev20201104 - pytorch-nightly Python version: 3.7.9 Operating system: Windows CUDA version: 10.2 This case consumes 19.5GB GPU VRAM. train_dataloader = DataLoader(dataset = train_dataset, batch_size = 16, \ ...
🌐
GitHub
github.com › mosaicml › streaming › issues › 652
Out of Memory when using Streaming Dataloader · Issue #652 · mosaicml/streaming
April 13, 2024 - Below is the dataset and dataloader implementation. Each sample is roughly 10 MB. With 16 workers, a prefetch factor of 4, and a batch size of 32, the total memory usage should be, at max, 20 GB.
Author   mosaicml
🌐
PyTorch Forums
discuss.pytorch.org › t › solved-dataloader-runs-out-of-ram-for-a-small-dataset › 21718
[SOLVED] Dataloader: runs out of RAM for a small dataset - PyTorch Forums
July 25, 2018 - I have a dataset with 100 images which occupy around 120 MB and their masks occupy around 4.2 MB. I have written a dataloader. When I try to plot some examples using the following code, the process is killed because of running out of memory. import torchvision import matplotlib.pyplot as plt ...
🌐
Hugging Face
discuss.huggingface.co › 🤗transformers
Trainer + Datasets + Pytorch Dataloader Workers - how to manage memory usage? - 🤗Transformers - Hugging Face Forums
April 28, 2025 - I’m trying to train a model on a large (600GB on disk) dataset of pre-computed embedding vectors. Training is i/o bound (loading embeddings from disk). I’m trying to increase the speed of dataloading by using multiple w…
🌐
GitHub
github.com › pytorch › pytorch › issues › 97432
suspicious memory leak when increase DataLoader's prefetch_factor and enable pin_memory · Issue #97432 · pytorch/pytorch
March 23, 2023 - train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=8, sampler=train_sampler, num_workers=16, pin_memory=True, prefetch_factor=16, persistent_workers=True)
Author   pytorch