pytorch dataloader workers

How does the "number of workers" parameter in PyTorch dataloader actually work?

stackoverflow.com › questions › 53998282 › how-does-the-number-of-workers-parameter-in-pytorch-dataloader-actually-work

When num_workers>0, only these workers retrieve data; the main process does not. So with num_workers=2 you have at most 2 workers simultaneously putting data into RAM, not 3.
A CPU can usually run on the order of 100 processes without trouble, and these worker processes aren't special in any way, so having more workers than CPU cores is OK. But is it efficient? That depends on how busy your CPU cores are with other tasks, the speed of the CPU, the speed of your hard disk, etc. In short, it's complicated, so setting the number of workers to the number of cores is a good rule of thumb, nothing more.
Nope. Remember that DataLoader doesn't just return whatever happens to be available in RAM right now; it uses batch_sampler to decide which batch to return next. Each batch is assigned to a worker, and the main process waits until the assigned worker has retrieved the desired batch.
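The point about batches being assigned to workers can be checked with a small experiment. This is a minimal sketch, not the answer author's code: `IndexDataset` and `load_batches` are hypothetical names, and the `-1` marker for "loaded in the main process" is a convention chosen only for this example. `get_worker_info()` is the `torch.utils.data` call that returns `None` in the main process and worker metadata inside a worker.

```python
import torch
from torch.utils.data import DataLoader, Dataset, get_worker_info


class IndexDataset(Dataset):
    """Hypothetical toy dataset: item i is (i, id of the worker that loaded it)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        info = get_worker_info()  # None when loading happens in the main process
        worker_id = info.id if info is not None else -1
        return idx, worker_id


def load_batches(num_workers, size=8, batch_size=2):
    """Return a list of (indices, worker_ids) pairs, one per batch."""
    loader = DataLoader(IndexDataset(size), batch_size=batch_size,
                        shuffle=False, num_workers=num_workers)
    return [(idx.tolist(), wid.tolist()) for idx, wid in loader]


if __name__ == "__main__":
    # Each batch is produced by a single worker, and batches still arrive
    # in the order the sampler decided, whichever worker finishes first.
    for indices, worker_ids in load_batches(num_workers=2):
        print(indices, "loaded by worker", worker_ids[0])
```

With `num_workers=0` every worker id is `-1`, which matches the statement that the main process only loads data when there are no workers.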

Lastly, to clarify: it isn't DataLoader's job to send anything directly to the GPU; you explicitly call cuda() for that.

EDIT: Don't call cuda() inside the Dataset's __getitem__() method; please look at @psarka's comment for the reasoning.
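A minimal sketch of what that advice looks like in practice, assuming a toy `TensorDataset` (the data and variable names are made up for illustration). The dataset only ever returns CPU tensors, and the transfer happens in the loop that consumes the batches. The PyTorch data-loading docs likewise advise against returning CUDA tensors from worker processes, though this sketch does not try to reproduce the comment referenced above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Falls back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy data; the dataset hands out CPU tensors only.
features = torch.randn(16, 4)
labels = torch.randint(0, 2, (16,))
loader = DataLoader(TensorDataset(features, labels), batch_size=4)

seen_devices = set()
for x, y in loader:
    # The transfer happens here, in the main process, not in __getitem__.
    x, y = x.to(device), y.to(device)
    seen_devices.add(x.device.type)
```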

Answer from Shihab Shahriar Khan on Stack Overflow


PyTorch Forums

discuss.pytorch.org › t › guidelines-for-assigning-num-workers-to-dataloader › 813

Guidelines for assigning num_workers to DataLoader - PyTorch Forums

March 1, 2017 - I realize that to some extent this comes down to experimentation, but are there any general guidelines on how to choose the num_workers for a DataLoader object? Should num_workers be equal to the batch size? Or the nu…
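As a starting point for that experimentation, one option is to derive an initial value from the CPUs the process is allowed to use and tune from there. This is only a sketch: `suggest_num_workers` is a hypothetical helper, and the cap of 8 is an arbitrary assumption, not a recommendation from the thread.

```python
import os


def suggest_num_workers(max_workers=8):
    """Hypothetical helper: a starting value for num_workers, not a tuned one."""
    try:
        # CPUs this process may actually use (Linux only); respects affinity limits.
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        # Fallback for platforms without sched_getaffinity (e.g. macOS, Windows).
        available = os.cpu_count() or 1
    return min(available, max_workers)


if __name__ == "__main__":
    print("starting point for num_workers:", suggest_num_workers())
```

The result is meant to be passed as `num_workers=` and then adjusted after timing a few epochs.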

Discussions

A clear explanation of what num_workers=0 means for a DataLoader

Hello, the PyTorch documentation says that setting num_workers=0 for a DataLoader causes it to be handled by the "main process". From the PyTorch doc: "0 means that the data will be loaded in the main process." Maybe I'm wrong, but usually I find that the PyTorch doc gives often (but ...

discuss.pytorch.org

April 15, 2023
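One way to see what "loaded in the main process" means is to have the dataset report the PID of whichever process runs `__getitem__`. A minimal sketch, where `PidDataset` and `loading_pids` are hypothetical names used only for this illustration:

```python
import os

from torch.utils.data import DataLoader, Dataset


class PidDataset(Dataset):
    """Hypothetical dataset whose items are the PID of the loading process."""

    def __len__(self):
        return 4

    def __getitem__(self, idx):
        return os.getpid()


def loading_pids(num_workers):
    """Return the set of process IDs that executed __getitem__."""
    loader = DataLoader(PidDataset(), batch_size=2, num_workers=num_workers)
    return {pid for batch in loader for pid in batch.tolist()}


if __name__ == "__main__":
    print("main process:", os.getpid())
    print("num_workers=0 loads in:", loading_pids(0))
    print("num_workers=2 loads in:", loading_pids(2))
```

With `num_workers=0` the only PID reported is the main process's own; with workers enabled the PIDs belong to separate subprocesses.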


Communicating with Dataloader workers

Hey, I am having some issues with how the dataloader works when multiple workers are used. In my dataset, I resize the images to the input dimensions of the network. I am training a fully convolutional network and I can thus change the input dimension of my network in order to make it more ...

discuss.pytorch.org

December 22, 2017

number of workers of data loader for reading data from HDD

The workers are the processes used to "get the minibatches ready" for your training loop. With multiple workers, minibatches can be loaded in parallel. So this has nothing to do with your model's accuracy, only with the time your model needs to train. Since the workers have to be coordinated, too many workers will actually slow you down. This probably depends on your individual setup. In my experience, 4-7 workers are fine, but you can just test this by timing your training for a few epochs.

r/pytorch

August 28, 2024
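The "just time it" advice can be sketched as a small sweep over candidate worker counts. This is an illustration under assumptions: `time_one_pass` is a hypothetical helper, and the in-memory toy data will not show the disk effects an HDD-backed dataset would, so treat the numbers as meaningful only on your real dataset.

```python
import time

import torch
from torch.utils.data import DataLoader, TensorDataset


def time_one_pass(dataset, num_workers, batch_size=64):
    """Return the seconds needed to iterate once over the dataset."""
    loader = DataLoader(dataset, batch_size=batch_size, num_workers=num_workers)
    start = time.perf_counter()
    for _ in loader:
        pass  # a real benchmark would run the training step here
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical in-memory data; real numbers depend on your disk and transforms.
    data = TensorDataset(torch.randn(2048, 32), torch.randint(0, 10, (2048,)))
    for workers in (0, 2, 4):
        print(f"num_workers={workers}: {time_one_pass(data, workers):.3f}s")
```

For cheap datasets like this one, worker startup and coordination overhead can make `num_workers=0` the fastest option, which is the slowdown the answer warns about.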


GeeksforGeeks

geeksforgeeks.org › deep learning › how-the-number-of-workers-parameter-in-pytorch-dataloader-actually-works

How the "Number of Workers" Parameter in PyTorch DataLoader Actually Works - GeeksforGeeks

July 23, 2025 - Set num_workers=0 to load data in the main process. b. Set num_workers>0 to load data in parallel worker subprocesses (DataLoader workers are processes, not threads). 4. Initialize model and optimizer. 5. Start training loop: a. For each epoch: i. Iterate over DataLoader to fetch batches of data. ii. Pass data to the model for training.
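The steps in that outline can be sketched as a short script. It is a minimal illustration rather than the article's code: the toy regression data, the `train` function, and the choice of `nn.Linear`, SGD, and MSE loss are all assumptions made to keep the example small.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train(num_workers=0, epochs=2):
    """Run a tiny training loop and return the mean loss of each epoch."""
    torch.manual_seed(0)
    # Hypothetical toy regression data standing in for a real dataset.
    x = torch.randn(64, 3)
    y = x.sum(dim=1, keepdim=True)
    loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True,
                        num_workers=num_workers)

    # Initialize model and optimizer.
    model = nn.Linear(3, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.MSELoss()

    losses = []
    for _ in range(epochs):                      # for each epoch
        epoch_loss = 0.0
        for batch_x, batch_y in loader:          # fetch batches from the DataLoader
            optimizer.zero_grad()
            loss = loss_fn(model(batch_x), batch_y)  # pass data to the model
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        losses.append(epoch_loss / len(loader))
    return losses


if __name__ == "__main__":
    print(train(num_workers=2))
```

Changing `num_workers` affects only how batches are produced; the loop body is the same either way.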


Medium

chtalhaanwar.medium.com › pytorch-num-workers-a-tip-for-speedy-training-ed127d825db7

PyTorch num_workers, a tip for speedy training | by Talha Anwar | Medium

September 23, 2021 - There is a huge debate about what the optimal num_workers for your dataloader should be. num_workers tells the data loader instance how many subprocesses to use for data loading. If num_workers is zero (the default), the GPU has to wait for the CPU to ...

AWS

docs.aws.amazon.com › codeguru › detector-library › python › pytorch-data-loader-with-multiple-workers

Pytorch data loader with multiple workers | Amazon Q, Detector Library

Using DataLoader with num_workers greater than 0 can cause increased memory consumption over time when iterating over native Python objects such as list or dict. PyTorch uses multiprocessing in this scenario, placing the data in shared memory. However, reference counting triggers copy-on-writes ...
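A commonly suggested workaround, which is not taken from the truncated snippet above, is to keep the samples in one NumPy array or tensor instead of a list of Python objects, so that reading an item does not touch a reference count for every element. A minimal sketch, with `ArrayBackedDataset` as a hypothetical name:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ArrayBackedDataset(Dataset):
    """Hypothetical dataset that keeps its samples in one NumPy array."""

    def __init__(self, values):
        # One contiguous buffer instead of many small Python objects, so
        # reading an item does not update per-object reference counts.
        self.values = np.asarray(values, dtype=np.float32)

    def __len__(self):
        return len(self.values)

    def __getitem__(self, idx):
        # Copies one row out of the array into a fresh tensor.
        return torch.tensor(self.values[idx])


if __name__ == "__main__":
    dataset = ArrayBackedDataset(np.random.rand(1000, 16))
    for batch in DataLoader(dataset, batch_size=100, num_workers=2):
        pass  # memory use should stay flat across iterations
```

Whether this removes the growth entirely depends on what else the dataset object holds, so it is worth watching memory on the real workload.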

PyTorch Forums

discuss.pytorch.org › data

DataLoader persistent_workers Usage - data - PyTorch Forums

October 3, 2023 - Hello, I’m trying to better understand the operation of the persistent_workers option for DataLoader. My understanding is that the dataloader will not stop the worker processes that have been consuming the dataset after you stop consuming from it. To me this implies that it will save the state of the Dataloader instance and when you come back to consume more batches it will pick up where it left off.
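A minimal sketch of the option in use, with `make_loader` as a hypothetical helper. As far as I understand it, `persistent_workers=True` keeps the worker processes alive between epochs so they do not have to be restarted; it does not by itself resume a half-consumed epoch, because each new `for` loop over the loader creates a fresh iterator that starts from the beginning of the sampler.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(num_workers):
    """Build a small loader; workers persist only when there are workers."""
    data = TensorDataset(torch.arange(8).float())
    return DataLoader(
        data,
        batch_size=4,
        num_workers=num_workers,
        # persistent_workers=True with num_workers=0 raises a ValueError.
        persistent_workers=num_workers > 0,
    )


if __name__ == "__main__":
    loader = make_loader(num_workers=2)
    for epoch in range(3):
        # Workers are started for the first epoch and reused afterwards.
        for (batch,) in loader:
            print(epoch, batch.tolist())
```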


Lightning AI

lightning.ai › docs › pytorch › stable › advanced › speed.html

Speed Up Model Training — PyTorch Lightning 2.6.1 documentation

In this case, setting persistent_workers=True in your dataloader will significantly speed up the worker startup time across epochs. GPUs of the generation Ampere or later (A100, H100, etc.) support low-precision matrix multiplication to trade-off precision for performance: # Default used by PyTorch ...


PyTorch

docs.pytorch.org › docs › stable › data.html


Kaggle

kaggle.com › questions-and-answers › 175432

How does the “number of workers” parameter in PyTorch dataloader actually work? | Kaggle

The value of num_workers sets the number of worker subprocesses used for data loading; the operating system decides which CPU cores they run on. If you assign num_workers=0, the data is loaded in the main process. If you assign num_workers greater than the number of cores you have available, it will simply ...

PyTorch Lightning

pytorch-lightning.readthedocs.io › en › 0.10.0 › performance.html

Fast Performance — PyTorch-Lightning 0.10.0 documentation

DataLoader(dataset, num_workers=8, pin_memory=True)
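To put that one-liner in context, here is a minimal sketch with imports and a consuming loop. `build_loader` and the toy data are hypothetical, and `pin_memory` is enabled only when CUDA is available so the example also runs on CPU-only machines. Pinned (page-locked) host memory is what allows `non_blocking=True` copies to overlap with GPU work.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Hypothetical toy data standing in for a real dataset.
dataset = TensorDataset(torch.randn(32, 4), torch.randint(0, 2, (32,)))


def build_loader(dataset, num_workers=8):
    """Return a loader that pins host memory when a GPU is present."""
    return DataLoader(dataset, batch_size=8, num_workers=num_workers,
                      pin_memory=use_cuda)


if __name__ == "__main__":
    for x, y in build_loader(dataset, num_workers=2):
        # non_blocking only overlaps the copy with compute when the source is pinned.
        x = x.to(device, non_blocking=True)
        y = y.to(device, non_blocking=True)
```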

reddit.com › r/pytorch › number of workers of data loader for reading data from hdd

r/pytorch on Reddit: number of workers of data loader for reading data from HDD

August 28, 2024 -

Hello, will there be an advantage to using num_workers > 0 when reading data from an HDD during training? And is there a downside to my model's accuracy when using fewer workers? Thank you for your response.