The simple way to do this is with a queue.Queue for the work, starting the threads with for _ in range(MAXTHREADS): threading.Thread(target=f, args=(the_queue,)).start(). I find it easier to read when the worker is a subclass of Thread, however. Your mileage may vary.
import threading
import queue

class Worker(threading.Thread):
    def __init__(self, q, other_arg, *args, **kwargs):
        self.q = q
        self.other_arg = other_arg
        super().__init__(*args, **kwargs)

    def run(self):
        while True:
            try:
                work = self.q.get(timeout=3)  # 3s timeout
            except queue.Empty:
                return
            # do whatever work you have to do on work
            self.q.task_done()

q = queue.Queue()
for ptf in b:
    q.put_nowait(ptf)
for _ in range(20):
    Worker(q, otherarg).start()
q.join()  # blocks until every item has been pulled and marked task_done
If you're insistent about using a function, I'd suggest wrapping your targetFunction with something that knows how to get from the queue.
def wrapper_targetFunc(f, q, somearg):
    while True:
        try:
            work = q.get(timeout=3)  # or whatever
        except queue.Empty:
            return
        f(work, somearg)
        q.task_done()

q = queue.Queue()
for ptf in b:
    q.put_nowait(ptf)
for _ in range(20):
    threading.Thread(target=wrapper_targetFunc,
                     args=(targetFunction, q, otherarg)).start()
q.join()
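As an aside, the standard library's concurrent.futures.ThreadPoolExecutor manages this queue-and-worker wiring for you. A minimal sketch of the same fan-out, where targetFunction, otherarg, and b are placeholders standing in for your own names:

import threading
from concurrent.futures import ThreadPoolExecutor

def targetFunction(work, somearg):
    # placeholder for your real per-item function
    return (work, somearg)

otherarg = "otherarg"          # placeholder extra argument
b = ["a", "b", "c"]            # placeholder work items

# The executor owns an internal work queue and a fixed pool of threads,
# so there is no task_done/join bookkeeping to write yourself.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(lambda w: targetFunction(w, otherarg), b))

The with block blocks until all submitted work has finished, playing the role that q.join() plays above.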
Answer from Adam Smith on Stack Overflow

Beginner Question on Queue and Multithreading
I'm starting to dabble with Queue and Multithreading. Conceptually, I get when and where you would want to introduce these into your code. However, I just want to make sure I'm understanding the use case relationship between the two. Most online resources seem to treat the two as conjoined, but this seems to be under the assumption they will be used at larger scales.
My question, is Queue paired with Multithreading just so you can control the number of threads at any given moment without the program ending due to a lack of open threads? So, if I won't need more than 10 threads at a time then incorporating Queue might not be necessary. However, if I plan to have over 100 threads concurrently (assuming this is hitting some kind of processing limitation) then Queue should be incorporated?
Thanks!
Setting the threads to be daemon threads causes them to exit when the main thread is done. But yes, you are correct that your threads will run continuously for as long as there is something in the queue; otherwise they will block.
The Queue documentation explains this detail, and the Python threading documentation explains the daemon part as well:
The entire Python program exits when no alive non-daemon threads are left.
So, once the queue has been drained and queue.join resumes, the interpreter exits and the daemon threads die with it.
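A minimal sketch of that lifecycle, with placeholder work items (doubling a number stands in for real work):

import queue
import threading

results = []

def drain(q):
    # Daemon worker: loops forever; it is killed when the interpreter exits.
    while True:
        item = q.get()
        results.append(item * 2)  # stand-in for real work
        q.task_done()

q = queue.Queue()
for item in range(5):
    q.put_nowait(item)

for _ in range(2):
    threading.Thread(target=drain, args=(q,), daemon=True).start()

q.join()  # unblocks once task_done() has been called for every put() item
# main can fall off the end here; the daemon workers die with the interpreter

Without daemon=True, the two workers would keep the process alive forever, since their while True loops never return.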
EDIT: Correction on default behavior for Queue
Your script works fine for me, so I assume you are asking what is going on so you can understand it better. Yes, your subclass puts each thread in an infinite loop, waiting on something to be put in the queue. When something is found, it grabs it and does its thing. Then, the critical part, it notifies the queue that it's done with queue.task_done, and resumes waiting for another item in the queue.
While all this is going on with the worker threads, the main thread is waiting (join) until all the tasks in the queue are done, which happens when the threads have called queue.task_done the same number of times as there were messages in the queue. At that point the main thread finishes and exits. Since these are daemon threads, they close down too.
This is cool stuff, threads and queues. It's one of the really good parts of Python. You will hear all kinds of stuff about how threading in Python is screwed up because of the GIL and such. But if you know where to use them (like in this case with network I/O), they will really speed things up for you. The general rule is: if you are I/O bound, try threads; if you are CPU bound, threads are probably not a good idea, so maybe try processes instead.
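A small sketch of why that works for I/O-bound code, using time.sleep as a stand-in for a network call (the 0.2 s delay and five workers are arbitrary choices):

import threading
import time

def fake_io(results, i):
    # Sleeping releases the GIL, just like waiting on a socket does,
    # so all five "calls" can wait at the same time.
    time.sleep(0.2)
    results[i] = i

results = [None] * 5
start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(results, i)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# five 0.2 s waits finish in roughly 0.2 s total, not 1 s, because they overlap

A CPU-bound loop in each thread would show no such speedup, because the GIL lets only one thread execute Python bytecode at a time.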
good luck,
Mike