Simply create the list using a list comprehension:

[[foo(m, f) for f in F] for m in M]

Related to pre-allocation: Pre-allocating a list of None

Answer from Ashwini Chaudhary on Stack Overflow
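A minimal runnable sketch of that pattern (foo, M, and F are stand-ins here; any two-argument function and two iterables will do):

```python
# Hypothetical stand-ins for foo, M, and F from the answer above.
def foo(m, f):
    return m * 10 + f

M = [1, 2, 3]
F = [0, 1]

# Builds the whole list of lists in one pass -- no preallocation needed.
table = [[foo(m, f) for f in F] for m in M]
print(table)  # [[10, 11], [20, 21], [30, 31]]
```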
Hacker News
Pre-Allocated Lists in Python | Hacker News (news.ycombinator.com)
March 31, 2022 -

In [4]: length = 10000
In [5]: %%timeit
   ...: # preallocating list
   ...: pre = [None] * length
   ...: for i in range(length):
   ...:     pre[i] = i
   ...:
346 µs ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [6]: %%timeit
   ...: pre = np.empty(length, dtype=int)
   ...: for i in ...
Reddit
r/learnpython on Reddit: Is it really best practice to preallocate lists (reddit.com)
July 25, 2020 -

The thought of preallocating memory brings back trauma from when I had to learn C, but in a recent non-computing class that heavily uses Python I was told that preallocating lists is "best practices".

As an example, add the number c to every element of list a:

c = 20
a = [1, 2, 3, 4, 5]
b = [None] * len(a)

for i in range(len(a)):  # this was also how they iterated a lot of things
    b[i] = a[i] + c

Now I'm a comp sci student, and I don't remember ever being told that this is best practice; I've never been penalised for not preallocating, and I've never seen any of my friends preallocate lists either. From my understanding it's not "Pythonic", but it can be more efficient when working with larger data sets, as you don't have to constantly grow the list.

I argued to just use append (and a different iterator):

c = 20
a = [1, 2, 3, 4, 5]
b = []

for n in a:
    b.append(n + c)

The Python taught is relatively introductory, so they don't even mention list comprehensions.

Thoughts? Because apparently they spent 30 minutes discussing whether or not to preallocate lists before settling on teaching us to preallocate lists.
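For what it's worth, the styles being debated all produce the same result; a quick sketch comparing them:

```python
c = 20
a = [1, 2, 3, 4, 5]

# 1. Preallocate, then assign by index (the style the class taught).
b1 = [None] * len(a)
for i in range(len(a)):
    b1[i] = a[i] + c

# 2. Start empty and append (the style argued for in the post).
b2 = []
for n in a:
    b2.append(n + c)

# 3. List comprehension -- the idiomatic version the course skipped.
b3 = [n + c for n in a]

print(b1 == b2 == b3, b3)  # True [21, 22, 23, 24, 25]
```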

Discussions

To preallocate or not to preallocate lists in Python - Stack Overflow
When should and shouldn't I preallocate a list of lists in Python? For example, I have a function that takes 2 lists and creates a list of lists out of it. Quite like, but not exactly, a matrix ... (stackoverflow.com)

python - What is the preferred way to preallocate NumPy arrays? - Stack Overflow
I am new to NumPy/SciPy. From the documentation, it seems more efficient to preallocate a single array rather than call append/insert/concatenate. For example, to add a column of 1's to an array ... (stackoverflow.com)

python - Pre-allocating a list of None - Stack Overflow
Additionally, the overhead of the rest of your Python code is usually so large, that the tiny speedup that can be obtained by pre-allocating is insignificant. So in most cases, simply forget about pre-allocating, unless your profiler tells you that appending to a list is a bottleneck. The other answers show some profiling of the list preallocation ... (stackoverflow.com)

python - Should I preallocate a numpy array? - Stack Overflow
I have a class and its method. The method repeats many times during execution. This method uses a numpy array as a temporary buffer. I don't need to store values inside the buffer between method ca... (stackoverflow.com)
Quora
How to preallocate space for lists in Python - Quora (quora.com)
If mutable placeholders are used ... for lists in Python means creating a list with a fixed length (filled with placeholders) so you can assign by index without repeated append reallocations. ...
Rednafi
Pre-allocated lists in Python | Redowan's Reflections (rednafi.com)
March 27, 2022 - Understand CPython list memory allocation: how lists store pointer references, grow dynamically, and when pre-allocation with [None]*n helps.
Top answer
1 of 3
28

Preallocation mallocs all the memory you need in one call, while resizing the array (through calls to append, insert, concatenate, or resize) may require copying the array to a larger block of memory. So you are correct: preallocation is preferred over (and should be faster than) resizing.

There are a number of "preferred" ways to preallocate NumPy arrays depending on what you want to create: np.zeros, np.ones, np.empty, np.zeros_like, np.ones_like, and np.empty_like, as well as functions that create useful pre-filled arrays, such as np.linspace and np.arange.

So

ar0 = np.linspace(10, 20, 16).reshape(4, 4)

is just fine if this comes closest to the ar0 you desire.

However, to make the last column all 1's, I think the preferred way would be to just say

ar0[:,-1]=1

Since the shape of ar0[:,-1] is (4,), the 1 is broadcasted to match this shape.
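That broadcasting step is easy to verify; a small sketch reusing the ar0 above:

```python
import numpy as np

ar0 = np.linspace(10, 20, 16).reshape(4, 4)

# Assigning a scalar to the (4,)-shaped column slice broadcasts it to all rows.
ar0[:, -1] = 1
print(ar0[:, -1])  # [1. 1. 1. 1.]
```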

2 of 3
11

In cases where performance is important, np.empty and np.zeros appear to be the fastest ways to initialize numpy arrays.

Below are test results for each method and a few others. Values are in seconds.

>>> timeit("np.empty(1000000)",number=1000, globals=globals())
0.033749611208094166
>>> timeit("np.zeros(1000000)",number=1000, globals=globals())
0.03421245135849915
>>> timeit("np.arange(0,1000000,1)",number=1000, globals=globals())
1.2212416112155324
>>> timeit("np.ones(1000000)",number=1000, globals=globals())
2.2877375495381145
>>> timeit("np.linspace(0,1000000,1000000)",number=1000, globals=globals())
3.0824269766860652
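One caveat with np.empty topping that table: it skips initialization entirely, so the array holds whatever bytes happened to be in that memory, and every element must be written before any is read. A small sketch of the safe pattern:

```python
import numpy as np

# np.empty is fast precisely because it does not initialize its memory.
buf = np.empty(5, dtype=np.int64)

# Safe usage: overwrite every element before reading any of them.
for i in range(5):
    buf[i] = i * i
print(buf.tolist())  # [0, 1, 4, 9, 16]
```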
Python
Memory Management — Python 3.14.4 documentation (docs.python.org)
Memory management in Python involves a private heap containing all Python objects and data structures. The management of this private heap is ensured internally by the Python memory manager. The Python memory manager has different components which deal with various dynamic storage management aspects, like sharing, segmentation, preallocation or caching.
Python
Issue 36551: Optimize list comprehensions with preallocate size and protect against overflow - Python tracker (bugs.python.org)
This issue tracker has been migrated to GitHub, and is currently read-only. For more information, see the GitHub FAQs in the Python Developer's Guide. This issue has been migrated to GitHub: https://github.com/python/cpython/issues/80732
Top answer
1 of 3
23

When you append an item to a list, Python 'over-allocates', see the source-code of the list object. This means that for example when adding 1 item to a list of 8 items, it actually makes room for 8 new items, and uses only the first one of those. The next 7 appends are then 'for free'.

In many languages (e.g. old versions of Matlab, the newer JIT might be better) you are always told that you need to pre-allocate your vectors, since appending during a loop is very expensive. In the worst case, appending of a single item to a list of length n can cost O(n) time, since you might have to create a bigger list and copy all the existing items over. If you need to do this on every iteration, the overall cost of adding n items is O(n^2), ouch. Python's pre-allocation scheme spreads the cost of growing the array over many single appends (see amortized costs), effectively making the cost of a single append O(1) and the overall cost of adding n items O(n).
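CPython's over-allocation is easy to observe: sys.getsizeof reports the allocated size of the list object, which stays flat across many appends and jumps only when the spare slots run out (the exact growth points are an implementation detail and vary between versions):

```python
import sys

lst = []
sizes = set()
for i in range(50):
    lst.append(i)
    sizes.add(sys.getsizeof(lst))

# Only a handful of distinct allocation sizes for 50 appends:
# most appends reused spare capacity and were "for free".
print(len(sizes), "distinct sizes over 50 appends")
```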

Additionally, the overhead of the rest of your Python code is usually so large, that the tiny speedup that can be obtained by pre-allocating is insignificant. So in most cases, simply forget about pre-allocating, unless your profiler tells you that appending to a list is a bottleneck.

The other answers show some profiling of the list preallocation itself, but this is useless. The only thing that matters is profiling your complete code, with all your calculations inside your loop, with and without pre-allocation. If my prediction is right, the difference is so small that the computation time you win is dwarfed by the time spent thinking about, writing and maintaining the extra lines to pre-allocate your list.

2 of 3
21

Between those two options, the first is clearly better, as no Python-level for loop is involved:

>>> %timeit [None] * 100
1000000 loops, best of 3: 469 ns per loop
>>> %timeit [None for x in range(100)] 
100000 loops, best of 3: 4.8 us per loop

Update:

And list.append has amortized O(1) complexity as well; it can even beat pre-creating the list if you first bind the list.append method to a local variable:

>>> n = 10**3
>>> %%timeit
lis = [None]*n           
for _ in range(n):
    lis[_] = _
... 
10000 loops, best of 3: 73.2 us per loop
>>> %%timeit
lis = []                 
for _ in range(n):
    lis.append(_)
... 
10000 loops, best of 3: 92.2 us per loop
>>> %%timeit
lis = [];app = lis.append
for _ in range(n):
    app(_)
... 
10000 loops, best of 3: 59.4 us per loop

>>> n = 10**6
>>> %%timeit
lis = [None]*n
for _ in range(n):
    lis[_] = _
... 
10 loops, best of 3: 106 ms per loop
>>> %%timeit
lis = []      
for _ in range(n):
    lis.append(_)
... 
10 loops, best of 3: 122 ms per loop
>>> %%timeit
lis = [];app = lis.append
for _ in range(n):
    app(_)
... 
10 loops, best of 3: 91.8 ms per loop
MicroPython
Maximising Python Speed — MicroPython 1.9 documentation (docs.micropython.org)
May 26, 2017 - Nonetheless, memoryview is indispensable for advanced preallocated buffer management. The .readinto() method discussed above puts data at the beginning of the buffer and fills the entire buffer. What if you need to put data in the middle of an existing buffer? Just create a memoryview into the needed section of the buffer and pass it to .readinto(). ...
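The memoryview trick described there works with any object that exposes .readinto(); a sketch using io.BytesIO as a stand-in for a real file or socket:

```python
import io

buf = bytearray(8)            # preallocated buffer
stream = io.BytesIO(b"abcd")  # stand-in for a file or socket

# Read into the middle of the buffer without copying: slice a memoryview.
n = stream.readinto(memoryview(buf)[2:6])
print(n, bytes(buf))  # 4 b'\x00\x00abcd\x00\x00'
```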
IncludeHelp
Preferred way to preallocate NumPy arrays (includehelp.com)
When we resize an array with append(), insert(), concatenate(), or resize(), it may need to be copied to a larger block of memory, which is why preallocation is preferred over resizing.
DEV Community
Understanding Lists in Python: An In-Depth Overview - DEV Community (dev.to)
July 22, 2023 - This tells us about where we can use Python lists and where we should think of alternatives. Preallocate list space: if you know how many items your list will hold, preallocate space for it using the [None] * n syntax.
GitHub
A guide for memory optimization in machine learning · GitHub (gist.github.com)
By default, NumPy stores numerical values as float64, which requires 8 bytes of memory per number. It may be beneficial to specify the data type of the array if we know that our values will require less precision. This can be done by passing the dtype argument to the np.zeros function.
Top answer
1 of 4
11

Maybe you should consider using NumPy. It seems like you're doing numerical work, which is what it's made for. This is the fastest so far, not including the import statement:

import numpy
Np = 80
zeroMatrix = numpy.zeros((Np, Np))

Times:

>python -m timeit -s "import numpy; Np = 80" "zeroMatrix = numpy.zeros((Np, Np))"
100000 loops, best of 3: 4.36 usec per loop

>python -m timeit -s "Np = 80" "zeroArray = [0]*Np" "zeroMatrix = [None] * Np" "for i in range(Np):" "  zeroMatrix[i] = zeroArray[:]"
10000 loops, best of 3: 62.5 usec per loop

>python -m timeit -s "Np = 80" "zeroMatrix = [[0] * Np for i in range (Np)]"
10000 loops, best of 3: 77.5 usec per loop

>python -m timeit -s "Np = 80" "zeroMatrix = [[0 for _ in range(Np)] for _ in range(Np)]"
1000 loops, best of 3: 474 usec per loop
2 of 4
10

You could do this:

zeroMatrix = [[0] * Np for i in range(Np)]

Update: Well if we're going to make it into a race, I've found something faster (on my computer) than Omnifarious' method. This doesn't beat numpy of course; but this is all academic anyway right? I mean we're talking about microseconds here.

I think this works because it avoids append and avoids preallocating zeroMatrix.

zeroArray = [0] * Np
zeroMatrix = [zeroArray[:] for i in range(Np)]

My test results:

$ python -m timeit -s "Np = 80" "zeroMatrix = [[0] * Np for i in range(Np)]"
1000 loops, best of 3: 200 usec per loop
$ python -m timeit -s "Np = 80" "zeroArray = [0] * Np" "zeroMatrix = [None] * Np" "for i in range(Np):" "    zeroMatrix[i] = zeroArray[:]"
10000 loops, best of 3: 171 usec per loop
$ python -m timeit -s "Np = 80" "zeroArray = [0] * Np" "zeroMatrix = [zeroArray[:] for i in range(Np)]"
10000 loops, best of 3: 165 usec per loop
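One trap worth flagging next to these recipes: [[0] * Np] * Np looks like another contender, but the outer * copies references, so every row is the same list object and writing one cell appears to change a whole column:

```python
Np = 3

# Wrong: the outer * replicates references, so all rows alias one list.
bad = [[0] * Np] * Np
bad[0][0] = 1
print(bad)   # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]

# Right: the comprehension builds a fresh inner list for each row.
good = [[0] * Np for _ in range(Np)]
good[0][0] = 1
print(good)  # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
```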
sqlpey
Top 8 Ways to Preallocate a List in Python for Optimal Performance (sqlpey.com)
November 6, 2024 -

import time

def append_items(size):
    result = []
    for i in range(size):
        result.append(i)

def allocate_items(size):
    result = [None] * size
    for i in range(size):
        result[i] = i

def time_function(func, size):
    start_time = time.time()
    func(size)
    return time.time() - start_time

size = 100000
append_time = time_function(append_items, size)
allocate_time = time_function(allocate_items, size)
print(f"Appending time: {append_time:.5f} seconds")
print(f"Preallocating time: {allocate_time:.5f} seconds")

Using a list comprehension provides a more Pythonic and readable way to create lists:
Python
[Python-ideas] Length hinting and preallocation for container types (mail.python.org)
May 3, 2013 -

> A new context manager aids users with preallocation and shrinking:
>
> class LengthHint:
>     def __init__(self, container, hint):
>         self.container = container
>         self.hint = hint
>         self.instance = None
>
>     def __enter__(self):
>         self.instance = self.container()
>         self.instance.__preallocate__(self.hint)
>         return self.instance
>
>     def __exit__(self, exc_type, exc_val, exc_tb):
>         self.instance.__shrink__()
>
> with LengthHint(list, 200) as lst:
>     # lst has 200 ob_item slots but len(lst) == 0
>     for i in range(100):
>         lst.append(i*i)
>     # __exit__ shrinks ob_item to 100

The real problem is that this code is not idiomatic Python, especially if you want it to be reasonably fast:

lst = []
for i in range(100):
    lst.append(i*i)

Why not:

lst = [i*i for i in range(100)]

If the "append" pattern is complex, just "preallocate" like this:

lst = [0] * 100

And then fill it.
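The length-hinting idea did land in a limited form: PEP 424 added the __length_hint__ protocol and operator.length_hint(), which consumers such as the list() constructor can use to size their initial allocation. A small sketch:

```python
from operator import length_hint

# length_hint tries __len__ first, then __length_hint__, then a default.
print(length_hint(range(1000)))  # 1000 (range defines __len__)

class Chunks:
    """Iterable with no __len__ but an advisory length hint."""
    def __iter__(self):
        return iter(range(10))
    def __length_hint__(self):
        return 10  # an estimate; consumers may presize from it

print(length_hint(Chunks()))  # 10
```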
Top answer
1 of 2
1

I did a bit of googling and found this lovely article with some C code to do exactly what you're asking on Windows. Here's that C code translated to ctypes (written for readability):

import ctypes
import msvcrt

# https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-setfileinformationbyhandle
set_file_information = ctypes.windll.kernel32.SetFileInformationByHandle

class AllocationInfo(ctypes.Structure):
    _fields_ = [('AllocationSize', ctypes.c_longlong)]

def allocate(file, length):
    """Tell the filesystem to preallocate `length` bytes on disk for the specified `file`
    without increasing the file's length.

    In other words, advise the filesystem that you intend to write at least `length` bytes
    to the file.
    """
    allocation_info = AllocationInfo(length)
    retval = set_file_information(ctypes.c_long(msvcrt.get_osfhandle(file.fileno())),
                                  ctypes.c_long(5),  # FileAllocationInfo in the FILE_INFO_BY_HANDLE_CLASS enum
                                  ctypes.pointer(allocation_info),
                                  ctypes.sizeof(allocation_info))
    if retval != 1:
        raise OSError('SetFileInformationByHandle failed')
This changes the file's "Size on disk" (as shown in File Explorer) to the length you specify (plus a few kilobytes for metadata), but leaves "Size" unchanged.

However, in the half hour I've spent googling, I've not found a way to do that on POSIX. fallocate() actually does the exact opposite of what you're after: it sets the file's apparent length to the length you give it, but allocates it as a sparse extent on the disk, so writing to multiple files simultaneously will still result in fragmentation. Ironic, isn't it, that Windows has a file management feature that POSIX lacks?

I'd love nothing more than to be proven wrong, but I don't think it's possible.
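For completeness, Python does expose a POSIX allocation primitive: os.posix_fallocate (Unix only) guarantees that disk blocks are reserved, although unlike the Windows call above it also grows the file's reported length to offset + len if the file was shorter. A sketch assuming a Unix platform:

```python
import os
import tempfile

# os.posix_fallocate reserves disk blocks up front (Unix only).
# Note: it also extends the file to the requested length, unlike
# the Windows FileAllocationInfo approach above.
with tempfile.NamedTemporaryFile() as f:
    os.posix_fallocate(f.fileno(), 0, 4096)
    size = os.path.getsize(f.name)
print(size)  # 4096
```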

2 of 2
1
FILENAME = "somefile.bin"
SIZE = 4200000

with open(FILENAME, "wb") as file:
    file.seek(SIZE - 1)
    file.write(b"\0")

Advantages:

  1. Portable across all platforms.
  2. Very efficient if you'll be mmap-ing (memory-mapping) the files to perform writes on them (with MADV_SEQUENTIAL if sequential access is needed).