The problem is that you convert the result of ThreadPoolExecutor.map to a list. map returns a generator: if you don't build a list and instead iterate over the generator directly, the results are still yielded in the original order, but the loop can begin before all results are ready. You can test this with this example:
import time
import concurrent.futures
e = concurrent.futures.ThreadPoolExecutor(4)
s = range(10)
for i in e.map(time.sleep, s):
    print(i)
The reason the order is kept is probably that it's sometimes important to get results in the same order you passed them to map. Results are probably not wrapped in Future objects because in some situations it would simply take too long to do another pass over the list to collect them all. And in most cases it's very likely that the next value is ready before the loop has finished processing the previous one. This is demonstrated in this example:
import concurrent.futures
executor = concurrent.futures.ThreadPoolExecutor() # Or ProcessPoolExecutor
data = some_huge_list()
results = executor.map(crunch_number, data)
finals = []
for value in results:
    finals.append(do_some_stuff(value))
In this example it's likely that do_some_stuff takes longer than crunch_number, and if that is really the case, the loss of performance is small while you keep the easy usage of map.
Also, since the worker threads (or processes) start processing at the beginning of the list you submitted and work their way to the end, the results should usually be finished in the order they're yielded by the iterator. That means executor.map is just fine in most cases. But in some cases, for example when it doesn't matter in which order you process the values and the function you passed to map takes very different amounts of time to run, concurrent.futures.as_completed may be faster.
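A small runnable sketch contrasting the two (the exact completion order with as_completed isn't guaranteed, but with three workers the shortest sleep usually finishes first):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def work(delay):
    time.sleep(delay)
    return delay

delays = [0.3, 0.1, 0.2]

with ThreadPoolExecutor(max_workers=3) as e:
    # map yields results in submission order, waiting on each in turn
    ordered = list(e.map(work, delays))
    # as_completed yields each future as soon as it finishes
    futures = [e.submit(work, d) for d in delays]
    completed = [f.result() for f in as_completed(futures)]

print(ordered)    # [0.3, 0.1, 0.2]
print(completed)  # typically [0.1, 0.2, 0.3]
```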
Hi,
As the title suggests I want to map each item in a Pandas data frame to a function and the function then performs an operation on the given item from the data frame. Here is my code so far
def product_check(item, function_lock):
    if not shop.product_exists(item.sku):
        with function_lock:
            category_id = shop.category_check(item.category)
            time.sleep(3)
            shop.create_product(item, category_id)
def start(user):
    items = [*some data frame*]
    executor = concurrent.futures.ProcessPoolExecutor()
    manager = multiprocessing.Manager()
    function_lock = manager.Lock()
    with executor:
        executor.map(product_check, (items, function_lock,))
However, the with executor part works, but when it gets to the executor.map part the product_check function never seems to run. Any ideas what could be the issue? Thanks
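A likely cause (not confirmed in the thread): map iterates over the single tuple (items, function_lock), so it submits product_check(items) and product_check(function_lock), each missing its second argument; and because the generator returned by map is never consumed, the resulting TypeError is silently swallowed. A minimal sketch of the multiple-iterables fix, using a ThreadPoolExecutor and a stand-in function since shop and the data frame aren't available here:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

def product_check(item, function_lock):  # stand-in for the poster's function
    with function_lock:
        return item * 2  # placeholder for the real shop calls

items = [1, 2, 3]
lock = threading.Lock()

with ThreadPoolExecutor() as executor:
    # map zips its iterables: each call gets one item plus the (repeated) lock
    results = list(executor.map(product_check, items, repeat(lock)))

print(results)  # [2, 4, 6]
```

Consuming the results (here via list) also surfaces any exception raised inside the workers instead of hiding it.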
Below is an example of .submit() vs .map(). Both accept the jobs immediately (submitted|mapped - start), and both take the same time to complete, 11 seconds (last result time - start). However, .submit() yields results as soon as any thread in the ThreadPoolExecutor (max_workers=2) completes (unordered!), while .map() yields results in the order they were submitted.
import time
import concurrent.futures

def worker(i):
    time.sleep(i)
    return i, time.time()

e = concurrent.futures.ThreadPoolExecutor(2)
arrIn = list(range(1, 7))[::-1]
print(arrIn)

f = []
print('start submit', time.time())
for i in arrIn:
    f.append(e.submit(worker, i))
print('submitted', time.time())
for r in concurrent.futures.as_completed(f):
    print(r.result(), time.time())
print()

print('start map', time.time())
f = e.map(worker, arrIn)
print('mapped', time.time())
for r in f:
    print(r, time.time())
Output:
[6, 5, 4, 3, 2, 1]
start submit 1543473934.47
submitted 1543473934.47
(5, 1543473939.473743) 1543473939.47
(6, 1543473940.471591) 1543473940.47
(3, 1543473943.473639) 1543473943.47
(4, 1543473943.474192) 1543473943.47
(1, 1543473944.474617) 1543473944.47
(2, 1543473945.477609) 1543473945.48
start map 1543473945.48
mapped 1543473945.48
(6, 1543473951.483908) 1543473951.48
(5, 1543473950.484109) 1543473951.48
(4, 1543473954.48858) 1543473954.49
(3, 1543473954.488384) 1543473954.49
(2, 1543473956.493789) 1543473956.49
(1, 1543473955.493888) 1543473956.49
Here is the map version of your existing code. Note that the callback now accepts a tuple as its parameter. I added a try/except in the callback so that iterating the results will not raise an error. The results are ordered according to the input list.
from concurrent.futures import ThreadPoolExecutor
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://www.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the url and contents
def load_url(tt):  # (url, timeout)
    url, timeout = tt
    try:
        with urllib.request.urlopen(url, timeout=timeout) as conn:
            return (url, conn.read())
    except Exception as ex:
        print("Error:", url, ex)
        return (url, "")  # error, return empty string

with ThreadPoolExecutor(max_workers=5) as executor:
    results = executor.map(load_url, [(u, 60) for u in URLS])  # pass url and timeout as tuple to callback
    executor.shutdown(wait=True)  # wait for all to complete
print("Results:")
for r in results:  # ordered results, will throw exception here if not handled in callback
    print(' %r page is %d bytes' % (r[0], len(r[1])))
Output
Error: http://www.wsj.com/ HTTP Error 404: Not Found
Results:
'http://www.foxnews.com/' page is 320028 bytes
'http://www.cnn.com/' page is 1144916 bytes
'http://www.wsj.com/' page is 0 bytes
'http://www.bbc.co.uk/' page is 279418 bytes
'http://some-made-up-domain.com/' page is 64668 bytes
Without using the map method, you can use enumerate to build the future_to_url dict with not just the URLs as values but also their indices in the list. You can then build a second dict, keyed by index, from the future objects returned by concurrent.futures.as_completed(future_to_url), so that iterating the indices in order reads the results in the same order as the corresponding items in the original list:
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {
        executor.submit(load_url, url, 60): (i, url) for i, url in enumerate(URLS)
    }
    futures = {}
    for future in concurrent.futures.as_completed(future_to_url):
        i, url = future_to_url[future]
        futures[i] = url, future
    for i in range(len(futures)):
        url, future = futures[i]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))
If you have an iterable of sites that you want to map over and you want to pass the same search_term and pages arguments to each call, you can take advantage of the fact that Executor.map accepts multiple iterables and calls the function with one element from each; use itertools.repeat for the constant arguments:

from itertools import repeat

def func(site, search_term, pages):
    ...

executor.map(func, sites, repeat(search_term), repeat(pages))
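Here is a self-contained, runnable sketch of passing per-item and constant arguments to executor.map; the site names and the func body are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

def func(site, search_term, pages):  # hypothetical search function
    return '%s?q=%s&pages=%d' % (site, search_term, pages)

sites = ['http://a.example', 'http://b.example']

with ThreadPoolExecutor() as executor:
    # each call receives one site plus the repeated constant arguments
    results = list(executor.map(func, sites, repeat('python'), repeat(3)))

print(results)
# ['http://a.example?q=python&pages=3', 'http://b.example?q=python&pages=3']
```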
Here is a useful way of using kwargs in executor.map: wrap the call in a lambda that unpacks each dict with the **kwargs notation. (Note this works with ThreadPoolExecutor; a ProcessPoolExecutor has to pickle the callable, and lambdas can't be pickled, so for a process pool you'd use a module-level wrapper function instead.)

from concurrent.futures import ThreadPoolExecutor

def func(arg1, arg2, ...):
    ....

items = [
    {'arg1': 0, 'arg2': 3},
    {'arg1': 1, 'arg2': 4},
    {'arg1': 2, 'arg2': 5}
]

with ThreadPoolExecutor() as executor:
    result = executor.map(lambda kwargs: func(**kwargs), items)
I found this also useful with Pandas DataFrames, by creating the items with items = df.to_dict(orient='records'), or when loading data from JSON files.
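A runnable version of the kwargs pattern, with plain dicts standing in for the DataFrame records:

```python
from concurrent.futures import ThreadPoolExecutor

def func(arg1, arg2):
    return arg1 + arg2

items = [
    {'arg1': 0, 'arg2': 3},
    {'arg1': 1, 'arg2': 4},
    {'arg1': 2, 'arg2': 5},
]

with ThreadPoolExecutor() as executor:
    # the lambda unpacks each dict into keyword arguments for func
    result = list(executor.map(lambda kwargs: func(**kwargs), items))

print(result)  # [3, 5, 7]
```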