Does this mean that the PyTorch training is using 33 processes X 15 GB = 495 GB of memory?
Not necessary. You have a worker process (with several subprocesses - workers) and the CPU has several cores. One worker usually loads one batch. The next batch can already be loaded and ready to go by the time the main process is ready for another batch. This is the secret for the speeding up.
I guess, you should use far less num_workers.
It would be interesting to know your batch size too, which you can adapt for the training process as well.
Is there a more accurate way of calculating the total amount of RAM being used by the main PyTorch program and all its child DataLoader worker processes?
I was googling but could not find a concrete formula. I think that it is a rough estimation of how many cores has your CPU and Memory and Batch Size.
To choose the num_workers depends on what kind of computer you are using, what kind of dataset you are taking, and how much on-the-fly pre-processing your data requires.
HTH
Answer from j35t3r on Stack OverflowDoes this mean that the PyTorch training is using 33 processes X 15 GB = 495 GB of memory?
Not necessary. You have a worker process (with several subprocesses - workers) and the CPU has several cores. One worker usually loads one batch. The next batch can already be loaded and ready to go by the time the main process is ready for another batch. This is the secret for the speeding up.
I guess, you should use far less num_workers.
It would be interesting to know your batch size too, which you can adapt for the training process as well.
Is there a more accurate way of calculating the total amount of RAM being used by the main PyTorch program and all its child DataLoader worker processes?
I was googling but could not find a concrete formula. I think that it is a rough estimation of how many cores has your CPU and Memory and Batch Size.
To choose the num_workers depends on what kind of computer you are using, what kind of dataset you are taking, and how much on-the-fly pre-processing your data requires.
HTH
There is a python function called tracemalloc which is used to trace memory blocks allocated to python. https://docs.python.org/3/library/tracemalloc.html
- Tracebacks
- Statics on memory per filename
- Compute the diff between snapshots
import tracemalloc
tracemalloc.start()
do_someting_that_consumes_ram_and releases_some()
# show how much RAM the above code allocated and the peak usage
current, peak = tracemalloc.get_traced_memory()
print(f"{current:0.2f}, {peak:0.2f}")
tracemalloc.stop()
https://discuss.pytorch.org/t/measuring-peak-memory-usage-tracemalloc-for-pytorch/34067