dictionary - how is the time complexity of dict.clear() O(1) in python? - Stack Overflow
How to delete from a deque in constant time without "pointers"?
There's a technique which I call 'lazy popping' which can help here.
The idea is that you don't delete immediately from the queue. Rather, you leave deleted items in the queue, but mark them as deleted in another data structure -- usually a set. Whenever you have to pop an item to execute, keep popping until you reach an item that hasn't yet been deleted.
This gives you constant-time push, amortized constant-time pop (a single pop may discard several already-deleted items, but each item is popped at most once over the queue's lifetime), and constant-time deletion, which is better than what you can get by maintaining a list and deleting from the start or middle.
In this case, you'd save the IDs of deleted items in the set. It looks like this (untested code):
import collections

class DeletableQueue:
    def __init__(self):
        self.deleted = set()              # ids of lazily-deleted items
        self.queue = collections.deque()

    def push(self, item):
        self.queue.append(item)

    def pop(self):
        # Precondition: there is at least one non-deleted item on the queue.
        while id(self.queue[0]) in self.deleted:
            # Discard an already-deleted item and forget its id, so the
            # set does not grow without bound.
            self.deleted.discard(id(self.queue.popleft()))
        return self.queue.popleft()  # Return the actual item to pop.

    def delete(self, item_to_delete):
        self.deleted.add(id(item_to_delete))
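A quick way to check the lazy-popping behavior; the class is restated here so the snippet runs on its own:

```python
import collections

class DeletableQueue:
    """Queue with O(1) push, amortized O(1) pop, and O(1) delete."""

    def __init__(self):
        self.deleted = set()              # ids of lazily-deleted items
        self.queue = collections.deque()

    def push(self, item):
        self.queue.append(item)

    def pop(self):
        # Precondition: at least one non-deleted item is on the queue.
        while id(self.queue[0]) in self.deleted:
            # Discard an already-deleted item and forget its id.
            self.deleted.discard(id(self.queue.popleft()))
        return self.queue.popleft()

    def delete(self, item):
        self.deleted.add(id(item))

a, b, c = object(), object(), object()
q = DeletableQueue()
q.push(a)
q.push(b)
q.push(c)
q.delete(a)           # O(1): only records the id
assert q.pop() is b   # the deleted item is skipped lazily during pop
assert q.pop() is c
```

Note that keying the set on `id()` means a deleted item must stay alive until it is lazily popped; otherwise its id could be reused by a new object.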
Some rationale for why it is claimed to be O(1):
The clear() method actually just assigns the internal dictionary structures to new empty values (as can be seen in the source). The seemingly O(n) part is the result of decrementing reference counts and other GC-related work. But this is purely a function of the GC approach that CPython uses (i.e. reference counting); you can envision approaches that wouldn't require explicit cleanup like this, or where the cleanup would happen much later (or even be amortized away). Since ideally the time complexity of clear() shouldn't depend on the underlying GC approach, the GC-related parts are omitted, making it "O(1)". IMO this is more a definitional argument than anything else, but it is at least some justification.
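One rough way to probe how much of clear()'s cost is that cleanup work is to time clear() on dicts of growing size (a sketch using only the standard library; absolute timings are machine-dependent and the chosen sizes are arbitrary):

```python
import timeit

# Time a single dict.clear() for several dict sizes. Under CPython,
# clear() still walks the old entries to drop their references, so
# the measured time grows with n; the "O(1)" claim only holds once
# that refcounting work is attributed to the GC rather than to clear().
for n in (10_000, 100_000, 1_000_000):
    t = timeit.timeit(
        "d.clear()",
        setup=f"d = {{i: str(i) for i in range({n})}}",
        number=1,
    )
    print(f"n={n:>9,}: clear() took {t:.6f}s")
```

Using `str(i)` rather than `i` for the values makes each value a real deallocation target, since small integers are cached by CPython.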
I first thought that dict.clear just performed some reference decrease to let the garbage collector do the dirty non-O(1) work, but looking at the source code (thanks timgeb for providing the link), it doesn't seem to be the case:
oldkeys = mp->ma_keys;
oldvalues = mp->ma_values;
if (oldvalues == empty_values)
    return;
/* Empty the dict... */
dictkeys_incref(Py_EMPTY_KEYS);
mp->ma_keys = Py_EMPTY_KEYS;
mp->ma_values = empty_values;
mp->ma_used = 0;
mp->ma_version_tag = DICT_NEXT_VERSION();
/* ...then clear the keys and values */
if (oldvalues != NULL) {
    n = oldkeys->dk_nentries;
    for (i = 0; i < n; i++)
        Py_CLEAR(oldvalues[i]);
What I see is that if the dictionary had values, a loop is performed to decrement the reference counts of those values and set the pointers to NULL. So it seems to be O(n), not O(1), since it depends on the number of values.
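Those Py_CLEAR calls are observable from Python: under CPython's reference counting, a value whose only reference was the dict is finalized during clear() itself (a sketch relying on CPython's immediate refcounting; other implementations may finalize later):

```python
finalized = []

class Tracker:
    """Records its tag when CPython finalizes it."""

    def __init__(self, tag):
        self.tag = tag

    def __del__(self):
        finalized.append(self.tag)

d = {i: Tracker(i) for i in range(3)}
assert finalized == []   # nothing collected while the dict holds them
d.clear()                # the C loop drops each value's last reference
assert sorted(finalized) == [0, 1, 2]
assert len(d) == 0
```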
When you assign a new dict like d = {}, that assignment is O(1), but the garbage collector must delete the old object once it is no longer referenced. That may not happen right at assignment time, but it will happen eventually, unless Python quits abruptly.
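The same effect can be seen for d = {} (again a CPython-specific sketch: rebinding drops the old dict's last reference, so on this implementation its contents are reclaimed immediately rather than "eventually"):

```python
collected = []

class Payload:
    """Records when CPython reclaims it."""

    def __del__(self):
        collected.append("payload freed")

d = {"key": Payload()}
d = {}  # O(1) rebind; the old dict is now unreferenced
# Under CPython's reference counting, the old dict and its values are
# reclaimed at this point already; a tracing GC might do it later.
assert collected == ["payload freed"]
```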