Why is removing elements from a list so slow, and is there a faster way?
python - Speed of del vs remove on list - Stack Overflow
how much time will it take to remove a key, value pair from a dictionary?
How to delete from a deque in constant time without "pointers"?
There's a technique which I call 'lazy popping' which can help here.
The idea is that you don't delete immediately from the queue. Rather, you leave deleted items in the queue, but mark them as deleted in another data structure -- usually a set. Whenever you have to pop an item to execute, keep popping until you reach an item that hasn't yet been deleted.
This gives you constant-time push, amortized constant-time pop (although you may pop multiple deleted items off the queue each time you pop an item to execute, each item only gets popped exactly once), and constant-time deletion, which is better than what you can get by maintaining a list and deleting from the start or middle.
In this case, you'd save the IDs of deleted items in the set. It looks like this (untested code):
import collections

class DeletableQueue:
    def __init__(self):
        self.deleted = set()               # ids of items marked as deleted
        self.queue = collections.deque()

    def push(self, item):
        self.queue.append(item)

    def pop(self):
        # Precondition: there is at least one non-deleted item on the queue.
        while id(self.queue[0]) in self.deleted:
            self.queue.popleft()  # Discard an already-deleted item.
        return self.queue.popleft()  # Return the actual item to pop.

    def delete(self, item_to_delete):
        self.deleted.add(id(item_to_delete))

I was trying to write a simple application that is supposed to filter a list of words down to the words of a certain length. For that I could either remove the words of the wrong length, or create a new list of words with the correct length.
I had a list of around 58000 words, and wanted to keep only the 6-letter words, of which there are around 6900.
with open('words.txt') as f:
    words = f.readlines()
for i in range(len(words)):
    words[i] = words[i].strip()
length = int(input("Desired word length "))
for i in reversed(words):
    if len(i) != length:
        words.remove(i)

This took 22 seconds.
Another way is to just create a new list with words of the correct length. I did this as follows:
with open('words.txt') as f:
    words = f.readlines()
for i in range(len(words)):
    words[i] = words[i].strip()
length = int(input("Desired word length "))
clw = []
for i in words:
    if len(i) == length:
        clw.append(i)

This only took 0.03 seconds. How can it be that creating a list of 6900 words takes 0.03 seconds, but removing 51100 words takes 22? It's only 7 times as many words, but takes 700 times as long. And is there a better and faster way to quickly remove list elements?
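One likely explanation for the gap: each `words.remove(i)` call searches the list from the front for a matching value and then shifts every later element down by one, so a single call is linear in the list length and the whole loop is roughly quadratic. Building a new list visits each word once. A minimal sketch of the single-pass approach using a list comprehension follows; the function name `keep_words_of_length` and the sample words are hypothetical, standing in for the contents of words.txt.

```python
def keep_words_of_length(words, length):
    """Return a new list with only the words of the given length."""
    # One pass over the input; nothing is searched for or shifted.
    return [word for word in words if len(word) == length]


# Hypothetical sample data standing in for words.txt.
sample = ["planet", "sun", "galaxy", "moon", "comets"]
six_letter_words = keep_words_of_length(sample, 6)
print(six_letter_words)
```

If the original list object has to be updated in place (for example because other code holds a reference to it), slice assignment such as `words[:] = keep_words_of_length(words, length)` does that while still filtering in one pass.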
If you know the index already, you'd use del.
Otherwise, remove first needs to traverse the list, find the (first) index for the element, then del it. This would, therefore, make it slower.
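To make the distinction concrete, here is a small sketch of the two operations on the same list. The list contents are made up for illustration only.

```python
letters = ["a", "b", "c", "b"]

# del removes by position: no search is needed, the index is already known.
del letters[1]
after_del = list(letters)

# remove() removes by value: it scans from the front and deletes the
# first element equal to "b", leaving any later duplicates in place.
letters.remove("b")
after_remove = list(letters)

print(after_del)
print(after_remove)
```

Note that `remove()` raises `ValueError` if the value is not in the list, while `del` raises `IndexError` for an out-of-range index.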
Without any knowledge of how del or remove() performs, we can write a test using the timeit library to determine which one is faster. In this example, I run each method 10,000 times and print the total time taken:
import timeit
num_runs = 10000
del_method = 'lst = [1, 2, 3]; del lst[i]'
del_setup = 'i = 0'
print(timeit.Timer(del_method, setup=del_setup).timeit(number=num_runs))
remove_method = 'lst = [1, 2, 3]; lst.remove(ele)'
remove_setup = 'ele = 1'
print(timeit.Timer(remove_method, setup=remove_setup).timeit(number=num_runs))
Output:
0.0005947000000000036
0.0007260000000000044
As we can see, del performs faster in this simple scenario. This makes sense knowing that remove() performs a search before removing the element. I can imagine that with an even larger list the difference between the times would only grow, at least when the element being removed is far from the front.
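The guess about larger lists can be checked with a variation of the same benchmark. The sketch below is an assumption-laden illustration, not a rigorous measurement: the list size `n` and the run count are arbitrary choices, and the element removed is the last one, so that `del` has nothing to shift and the cost of the search inside `remove()` is isolated. Absolute timings will vary by machine.

```python
import timeit

n = 50_000  # hypothetical list size, chosen only for illustration
setup = f"lst = list(range({n})); last = {n} - 1"

# Each statement removes the last element and appends it back, so the
# list has the same size and contents on every run.
del_time = timeit.timeit("del lst[-1]; lst.append(last)", setup=setup, number=200)
remove_time = timeit.timeit("lst.remove(last); lst.append(last)", setup=setup, number=200)

print(f"del:    {del_time:.6f} s total")
print(f"remove: {remove_time:.6f} s total")
```

Here `remove()` has to compare against every element before it finds the last one, while `del lst[-1]` goes straight to the position, so the gap should be far larger than with the three-element list.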