Python is recursively checking each element of the dictionaries to ensure equality. See the C dict_equal() implementation, which checks each and every key and value (provided the dictionaries are the same length); if dictionary b has the same key, then a PyObject_RichCompareBool tests if the values match too; this is essentially a recursive call.
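As a small illustration of that recursive behaviour (a sketch with made-up data, not taken from the CPython source), comparing nested dictionaries with == descends into the values, because each value comparison calls == again:

```python
# Nested dictionaries: == compares keys and values, and comparing the
# values calls == on them in turn, so nested dicts are compared too.
a = {"x": 1, "inner": {"y": [1, 2], "z": {"deep": True}}}
b = {"inner": {"z": {"deep": True}, "y": [1, 2]}, "x": 1}  # same content, other order
c = {"x": 1, "inner": {"y": [1, 2], "z": {"deep": False}}}  # differs three levels down

same_content = a == b            # equal: every key maps to an equal value
differs_deep = a == c            # unequal: the innermost value differs
different_length = a == {"x": 1}  # unequal: the sizes differ

print(same_content, differs_deep, different_length)
```

The names a, b and c are only illustrative; any values that support == behave the same way.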
Dictionaries are not hashable: their __hash__ attribute is set to None and, more importantly, they are mutable, which is not allowed for an object used as a dictionary key.
If you were to use a dictionary as a key, and through an existing reference then change the key, then that key would no longer slot to the same position in the hash table. Using another, equal dictionary (be it equal to the unchanged dictionary or the changed dictionary) to try and retrieve the value would now no longer work because the wrong slot would be picked, or the key would no longer be equal.
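The failure described above can be reproduced with a deliberately unsafe subclass. This is only a sketch: HashableDict is a hypothetical class written for this illustration, not something Python provides, and it assumes all values are themselves hashable.

```python
# A plain dict cannot be used as a key at all.
hash_is_none = dict.__hash__ is None
try:
    {{"a": 1}: "value"}
    plain_dict_rejected = False
except TypeError:
    plain_dict_rejected = True


class HashableDict(dict):
    """Hypothetical, deliberately unsafe: hashes by current content."""

    def __hash__(self):
        return hash(frozenset(self.items()))


key = HashableDict(a=1)
table = {key: "value"}
found_before = HashableDict(a=1) in table  # works while the key is unchanged

key["b"] = 2  # mutate the key through the existing reference

# An equal-to-the-old dict hashes to the right slot, but the stored key
# no longer compares equal. An equal-to-the-new dict has a different hash
# than the one stored in the table, so the entry is not matched either.
found_old = HashableDict(a=1) in table
found_new = HashableDict(a=1, b=2) in table

print(hash_is_none, plain_dict_rejected, found_before, found_old, found_new)
print(len(table))  # the entry is still stored, just unreachable by equality
```

If a hashable stand-in for a dictionary is really needed, an immutable snapshot such as frozenset(d.items()) or a tuple of sorted items avoids the problem, because it cannot change after being inserted.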
Answer from Martijn Pieters on Stack Overflow
What does the == operator actually do on a Python dictionary? - Stack Overflow
python - When to use dictionary | (merge) vs |= (update) operator - Stack Overflow
in operator for dictionary VS. in operator for dict.keys()
python - What does `**` mean in the expression `dict(d1, **d2)`? - Stack Overflow
From docs:
Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. [5] Outcomes other than equality are resolved consistently, but are not otherwise defined. [6]
Footnote [5]:
The implementation computes this efficiently, without constructing lists or sorting.
Footnote [6]:
Earlier versions of Python used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.
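A quick check of what this means in practice, assuming Python 3 (the quoted passage describes older versions, where ordering comparisons between dictionaries were still defined; in Python 3 they raise TypeError):

```python
first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}  # same pairs, inserted in a different order

order_independent = first == second  # equality ignores insertion order
same_object = first is second        # they are still two distinct objects
empty_check = {} == dict()           # comparing against {} tests for emptiness

# Assumption: Python 3. Only == and != are defined between dictionaries.
try:
    first < second
    ordering_raises = False
except TypeError:
    ordering_raises = True

print(order_independent, same_object, empty_check, ordering_raises)
```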
Main Questions are at end of post...
I'm going over the Two Sum problem on LeetCode, and the optimal solution was to use a dictionary. This involved a "look up" to see if a dictionary key existed. Can someone explain to me how exactly Python "looks up" a dictionary key to see if it exists? Particularly in comparing these two lines of code:
# where n is a key, and d is a dictionary
if n in d:
# where n is a key, and d.keys() is a viewing object (a list??)
if n in d.keys():
I was confused because these two methods took around the same time on leetcode, and I thought the second method would have a time complexity of O(n) because (I thought) it was just a list.
I did further testing with the timeit module... my code and its results are below:
# checking time complexity of dict key look up methods
# basically I'm running both methods and checking how much time each takes as the dictionary grows
# I thought that the second method would take longer and longer as the dictionary grew, but that doesn't seem to be the case
Code:
from timeit import Timer
in_op = Timer("x in d", "from __main__ import x, d")
key_method = Timer("x in d.keys()", "from __main__ import x, d")
for i in range(1_000_000, 10_000_001, 1_000_000):
    x = i
    d = {i:0 for i in range(i)}
    d[i] = 'yes'
    in_op_time = in_op.timeit(1000)
    d = {i:0 for i in range(i)}
    d[i] = 'yes'
    key_method_time = key_method.timeit(1000)
    print(f"{in_op_time=} and {key_method_time=}")
Results:
in_op_time=3.087499999999965e-05 and key_method_time=7.250000000000312e-05
in_op_time=3.008300000001407e-05 and key_method_time=6.941599999998882e-05
in_op_time=2.9749999999939547e-05 and key_method_time=6.862500000004434e-05
in_op_time=3.0790999999918967e-05 and key_method_time=6.945800000002222e-05
in_op_time=3.0165999999942628e-05 and key_method_time=7.025000000027148e-05
in_op_time=3.0250000000009436e-05 and key_method_time=6.829099999983157e-05
in_op_time=3.00840000000413e-05 and key_method_time=6.845799999988245e-05
in_op_time=2.958299999988867e-05 and key_method_time=6.824999999999193e-05
in_op_time=2.954200000004903e-05 and key_method_time=6.891600000002995e-05
in_op_time=2.9916999999990423e-05 and key_method_time=6.833300000064213e-05
My questions are:
- How does the first case "know" that the key doesn't exist? I understand when you try to access a key that doesn't exist, python gives an error, so I'm assuming that's the same mechanism?
- If the dict.keys() doesn't return a list, then what does it return?
- How does the in operator work with dict.keys()?... Like how does it bypass the returned object and go straight to looking up the key with normal dictionary methods?
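For what it's worth, the object in question can be inspected directly. The sketch below (with a made-up dictionary d) shows that d.keys() returns a dict_keys view rather than a list: it is set-like, it reflects later changes to the dictionary, and membership tests on it are answered by a hash lookup in the underlying dictionary instead of a linear scan, which would be consistent with the flat timings above.

```python
from collections.abc import KeysView

d = {"a": 1, "b": 2}
keys = d.keys()

type_name = type(keys).__name__       # the view type, not 'list'
is_view = isinstance(keys, KeysView)  # registered as a keys view
is_list = isinstance(keys, list)

# The view is live: it holds a reference to d, not a copy of its keys.
d["c"] = 3
sees_new_key = "c" in keys

# Views are set-like, so set operations work on them.
common = keys & {"a", "zzz"}

# A membership test returns False for a missing key; only subscripting
# (d["zzz"]) raises KeyError.
missing_in_dict = "zzz" in d
missing_in_view = "zzz" in keys
try:
    d["zzz"]
    subscript_raises = False
except KeyError:
    subscript_raises = True

print(type_name, is_view, is_list, sees_new_key, common)
print(missing_in_dict, missing_in_view, subscript_raises)
```

The remaining constant-factor gap in the timings is plausibly the cost of creating the view object and making the extra method call on every iteration.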
** in argument lists has a special meaning, as covered in section 4.7 of the tutorial. The dictionary (or dictionary-like) object passed with **kwargs is expanded into keyword arguments to the callable, much like *args is expanded into separate positional arguments.
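A minimal sketch of that expansion, using a hypothetical function describe() that simply returns whatever it receives:

```python
def describe(*args, **kwargs):
    """Hypothetical helper: return the positional and keyword arguments."""
    return args, kwargs


positional = [1, 2]
named = {"c": 3, "d": 4}

# *positional is expanded into separate positional arguments and
# **named into keyword arguments, so both calls are equivalent.
expanded = describe(*positional, **named)
written_out = describe(1, 2, c=3, d=4)

print(expanded)
print(expanded == written_out)
```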
The ** turns the dictionary into keyword parameters:
>>> d1 = {'a': 1, 'b': 2}
>>> d2 = {'c': 3, 'd': 4}
>>> d3 = dict(d1, **d2)
Becomes:
>>> d3 = dict(d1, c=3, d=4)
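The equivalence can be checked directly. The sketch below also shows one limitation that follows from it: because the entries of d2 become keyword arguments, its keys must be strings, whereas unpacking inside a dictionary literal has no such restriction.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'c': 3, 'd': 4}

d3 = dict(d1, **d2)            # d2 expanded into keyword arguments
explicit = dict(d1, c=3, d=4)  # the same call written out by hand
literal = {**d1, **d2}         # unpacking inside a dict literal

# Keyword arguments must be strings, so a non-string key is rejected.
try:
    dict(d1, **{1: 'one'})
    non_string_accepted = True
except TypeError:
    non_string_accepted = False

# The literal form accepts any hashable key.
literal_with_int_key = {**d1, **{1: 'one'}}

print(d3 == explicit == literal)
print(non_string_accepted)
print(d1)  # the original dictionary is left untouched
```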