The complexity of `in` depends entirely on what `L` is: `e in L` becomes a call to `L.__contains__(e)`.
See this time complexity document for the complexity of several built-in types.
Here is the summary for `in`:
- list - Average: O(n)
- set/dict - Average: O(1), Worst: O(n)
The O(n) worst case for sets and dicts is very uncommon: it only arises when `__hash__` is implemented so poorly that everything in your container hashes to the same value.
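A minimal sketch of both points above: `in` is just sugar for `__contains__`, and a pathological `__hash__` (the `BadHash` class here is a made-up toy, not from any real code) forces every element into the same hash bucket, degrading set lookups toward the list-like O(n) worst case.

```python
# `in` dispatches to __contains__
nums = [1, 2, 3]
assert (2 in nums) == nums.__contains__(2)

class BadHash:
    """Toy class: every instance hashes to the same value."""
    def __init__(self, v):
        self.v = v
    def __hash__(self):
        return 42  # constant hash -> all entries collide
    def __eq__(self, other):
        return isinstance(other, BadHash) and self.v == other.v

s = {BadHash(i) for i in range(1000)}
# Membership still works, but each lookup now probes a chain of
# colliding entries instead of jumping straight to its slot.
assert BadHash(500) in s
assert BadHash(1000) not in s
```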
It depends entirely on the type of the container. Hashing containers (`dict`, `set`) use the hash and are essentially O(1). Typical sequences (`list`, `tuple`) are implemented as you would guess and are O(n). Trees would average O(log n), and so on. Each of these types has an appropriate `__contains__` method with its own big-O characteristics.
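To illustrate "each type has an appropriate `__contains__`", here is a hypothetical sorted container (not a built-in) whose membership test is a binary search, giving the O(log n) behavior mentioned for tree-like structures:

```python
import bisect

class SortedList:
    """Hypothetical sorted container: membership by binary search, O(log n)."""
    def __init__(self, items):
        self._items = sorted(items)

    def __contains__(self, value):
        # bisect_left finds the insertion point in O(log n)
        i = bisect.bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

s = SortedList([5, 1, 9, 3])
assert 9 in s        # `in` dispatches to SortedList.__contains__
assert 2 not in s
```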
What is the time complexity of the “in” operation
algorithm - Complexity of creating list with concat operator in python - Stack Overflow
I'm not the biggest Python user. But I was looking at a friend's code yesterday and they had something like:

    for x in (list of 40,000):
        for y in (list of 2.7 million):
            if x == y:
                append something

This was obviously super slow, so they changed it to something like:

    for x in (list of 2.7 million):
        if x in (list of 40,000):
            append something

This moved much faster. I get the point of one for loop being faster than two, but what is that `in` membership check doing that makes it so much faster? I always thought that checking if something exists is O(n), which shouldn't be faster. Also, this was for ML purposes, so they were likely using numpy stuff.
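A common way to make this kind of rewrite even faster (a sketch with arbitrary small sizes standing in for the 2.7M/40k lists, not code from the thread) is to convert the small list to a set, so each `in` check is an O(1) hash lookup rather than an O(n) scan:

```python
# Stand-in sizes for the question's 2.7M-element and 40k-element lists
big = list(range(20_000))
small_list = list(range(0, 20_000, 67))

# O(len(big) * len(small_list)): each `in` scans the whole small list
slow = [x for x in big if x in small_list]

# O(len(big)) after a one-time O(len(small_list)) set build:
# each `in` is a hash lookup
small_set = set(small_list)
fast = [x for x in big if x in small_set]

assert slow == fast
```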
The time complexity is not O(N)
The time complexity of the concat operation for two lists, A and B, is O(A + B). This is because you aren't adding to one list; you are creating a whole new list and populating it with elements from both A and B, which requires iterating through both.
Therefore, the operation `l = l + [i]` is O(len(l)), leaving you with N steps that each do an O(N) operation, for an overall complexity of O(N^2).
You are confusing concat with the `append` or `extend` methods, which don't create a new list but add to the original. If you used those, your time complexity would indeed be O(N).
An additional note:
The notation `l = l + [i]` can be confusing because intuitively it seems like `[i]` is simply being added to the existing `l`. This isn't true!
`l + [i]` builds an entirely new list and then rebinds `l` to point to that list.
On the other hand, `l += [i]` modifies the original list in place and behaves like `extend`.
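The rebinding-versus-mutation distinction above can be checked directly with `id()`, which tells you whether `l` still names the same list object:

```python
l = [1, 2]
orig = id(l)
l = l + [3]       # builds a brand-new list; l is rebound to it
assert id(l) != orig

l = [1, 2]
orig = id(l)
l += [3]          # in-place extend; l is still the same object
assert id(l) == orig
assert l == [1, 2, 3]
```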
Here is my thought process: I know that the concat operator is O(k), where k is the size of the list being added to the original list. Since k is always 1 in this case (we are adding one-character lists one at a time), the concat operation takes one step.
This assumption is incorrect. If you write:

    l + [i]

you construct a new list. This list will have m+1 elements, with m the number of elements in `l`. Given that a list is implemented like an array, we know that constructing such a list takes O(m) time. We then assign the new list to `l`.
So that means that the total number of steps is:

    sum_{m=0}^{n} O(m) = O(n^2)

so the time complexity is O(n^2).
You can however boost performance by using `l += [i]`, or even faster `l.append(i)`; the amortized cost of both is O(1), so the algorithm becomes O(n). `l.append(i)` will likely be a bit faster still, because it saves constructing the temporary `[i]` list.