I have a script that continuously generates dictionaries with the same keys but different values. I want to write the results to a JSON file, but to do that I need to load the file and then dump back into it.
import json
with open('sample.json') as f:
    data = json.load(f)
data.update(results)
with open('sample.json', 'w') as f:
    json.dump(data, f)
The above code only overwrites the existing data; it doesn't append to it. I figured it is because the dictionaries have the same keys, because if I try it with a different dictionary template, the append does happen.
Is there a way to append similar dictionaries without overwriting?
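One way to keep every result instead of overwriting is to store a list of dictionaries in the file and append to that list. This is a minimal sketch, assuming the file holds a JSON array; the helper name append_result and its path argument are illustrative and not from the question:

```python
import json
import os


def append_result(path, result):
    """Append one dict to a JSON file that holds a list of dicts."""
    # Start from an empty list if the file does not exist yet or is empty.
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path) as f:
            data = json.load(f)
    else:
        data = []
    # Identical keys are not a problem: each dict is its own list item.
    data.append(result)
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```

Calling append_result('sample.json', results) once per generated dictionary grows the list by one entry each time. The whole file is still reread and rewritten on every call, so for a long-running script a line-per-record format may be a better fit.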
TLDR
Instead of
my_dict[word] = count
You may want to use
my_dict[word] = my_dict[word] + [count]
This is assuming:
- count is the line(s) that the word was found in
- you want each key to hold a list of counts denoting which lines the word was found on
Explanation
To assign values to a dict you assign to a specific key in the same way that you might assign a value to an index of a list. You did that in this line here:
# assign value "count" to dict "my_dict" at key "word"
my_dict[word] = count
So you're already partway there! What we want now is to add to a preexisting value. Remember that in Python, assignment replaces the value of a variable, so we have to assign the updated value. With int variables we do this by taking the previous value and adding to it (e.g. a = a + 2). Likewise, to add to a list stored under a dict key, we take the previous value and assign that plus what we want to add.
Example
my_dict = {} # {}
my_dict["a"] = [1] # {"a": [1]}
my_dict["a"] = my_dict["a"] + [2] # {"a": [1, 2]}
This works because my_dict["a"] evaluates to its current value (a list), and adding another list to it with + creates a new list with the values of both lists. For example:
list_a = [1, 2]
list_b = [3, 4]
list_a = list_a + list_b # == [1, 2, 3, 4]
Cool Tip
Since the operation of adding to an existing value is common in Python, it actually has a shorthand for this pattern.
Instead of
a = a + 2
You can also write the shorthand:
a += 2
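The same shorthand works on a list stored under a dict key. Here is a small sketch, assuming the key may not exist yet, so dict.get supplies an empty list as the starting value (the sample words and counts are made up):

```python
my_dict = {}
for word, count in [("a", 1), ("b", 1), ("a", 2)]:
    # get() returns [] for a missing key, so the first assignment cannot raise KeyError
    my_dict[word] = my_dict.get(word, []) + [count]

my_dict["a"] += [3]  # shorthand for my_dict["a"] = my_dict["a"] + [3]
print(my_dict)  # {'a': [1, 2, 3], 'b': [1]}
```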
Maybe this is what you want; I'm not sure. If the word is not in the dict yet, initialize an empty list for it, then append whenever the word is found. Use strip() to remove trailing characters.
def inverse_index():
    my_dict = {}
    with open('doc0.txt', 'r') as f:
        for count, value in enumerate(f):
            for word in value.strip().split():
                if word.strip() not in my_dict:
                    my_dict[word.strip()] = []
                print(count, word.strip())
                my_dict[word.strip()].append(count)
    print(my_dict)
Hey guys, I've been trying to merge two dictionaries using the update method, but it always overwrites one of the values if there's a shared key between the two dictionaries. How would I be able to merge two dictionaries without losing any values?
An oversimplified example would be:
Input:
d1 = {'boy': 1, 'girl': 2}
d2 = {'boy': 'tall', 'girl': 'short'}
The output I want:
d3 = {'boy': (1, 'tall'), 'girl': (2, 'short')}
Thanks in advance.
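One possible way to get that output is a dict comprehension that pairs up the two values for each shared key. This is only a sketch: it assumes keys missing from either dict should be dropped, which the question does not specify.

```python
d1 = {'boy': 1, 'girl': 2}
d2 = {'boy': 'tall', 'girl': 'short'}

# Pair up the values for every key the two dicts share.
d3 = {k: (d1[k], d2[k]) for k in d1 if k in d2}
print(d3)  # {'boy': (1, 'tall'), 'girl': (2, 'short')}
```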
Just switch the order:
z = dict(d2.items() + d1.items())
By the way, you may also be interested in the potentially faster update method.
In Python 3, you have to cast the view objects to lists first:
z = dict(list(d2.items()) + list(d1.items()))
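On Python 3.5 and later, dictionary unpacking should give the same result without building intermediate lists. A short sketch with made-up sample data; as with the concatenation above, the dict listed last wins on shared keys:

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 30}

# Later entries override earlier ones, so d1's value for 'b' is kept.
z = {**d2, **d1}
print(z)  # {'b': 2, 'c': 30, 'a': 1}
```

From Python 3.9 on, the same merge can also be written as d2 | d1.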
If you want to special-case empty strings, you can do the following:
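def mergeDictsOverwriteEmpty(d1, d2):
    res = d1.copy()  # start from d1 so its non-empty values are kept
    for k, v in d2.items():
        # take d2's value only where d1 has no value or an empty string
        if k not in d1 or d1[k] == '':
            res[k] = v
    return res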
This updates d2 with d1's key/value pairs, but only where the d1 value is truthy (so None, '', and other falsy values are skipped):
>>> d1 = dict(a=1, b=None, c=2)
>>> d2 = dict(a=None, b=2, c=1)
>>> d2.update({k: v for k, v in d1.items() if v})
>>> d2
{'a': 1, 'c': 2, 'b': 2}
(Use iteritems() instead of items() in Python 2.)
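Note that the `if v` test also filters out 0, False, and empty containers. If only None and '' should count as empty, the test can be made explicit. A sketch under that assumption; the helper name update_non_empty is hypothetical:

```python
def update_non_empty(target, source):
    """Copy items from source into target, skipping None and ''."""
    for k, v in source.items():
        if v is None or v == '':
            continue  # only these two values are treated as "empty"
        target[k] = v
    return target


d1 = dict(a=1, b=None, c=0)
d2 = dict(a=None, b=2, c=1)
print(update_non_empty(d2, d1))  # {'a': 1, 'b': 2, 'c': 0}
```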
You can subclass dict and in particular, override the __setitem__ method.
This sounds like what you want:
class SpecialDict(dict):
    def __setitem__(self, key, value):
        if key not in self:
            super(SpecialDict, self).__setitem__(key, value)
        else:
            raise Exception("Cannot overwrite key!")  # You can make your own type here if needed

x = SpecialDict()
x['a'] = 1
x['b'] = 2
x['a'] = 3  # raises Exception
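One caveat with subclassing dict directly: in CPython, dict.update() and the dict constructor do not go through an overridden __setitem__, so they can still overwrite keys. A possible workaround, sketched here with the hypothetical class name NoOverwriteDict, is to build on collections.UserDict, whose update() assigns items through __setitem__:

```python
from collections import UserDict


class NoOverwriteDict(UserDict):
    """Mapping that refuses to replace the value of an existing key."""

    def __setitem__(self, key, value):
        if key in self.data:
            raise KeyError(f"Cannot overwrite key {key!r}")
        self.data[key] = value


x = NoOverwriteDict()
x['a'] = 1
try:
    x.update({'a': 3})  # routed through __setitem__, so this raises
except KeyError as exc:
    print("rejected:", exc)
```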
Instead of subclassing dict as suggested by JacobIRR, you could also define a helper function for storing a key-value pair in a dict that throws an exception when the key already exists:
class KeyExistsException(Exception):
    pass

def my_add(the_dict, the_key, the_value):
    if the_key in the_dict:
        raise KeyExistsException("value already exists")
    the_dict[the_key] = the_value

d = {}
my_add(d, 'a', 1)
my_add(d, 'b', 2)
my_add(d, 'a', 3)  # <-- raises KeyExistsException
You need to append these names to a list. The following is a naive example and might break.
for name in file:
    if name in list:
        if param in dict:  # dict.has_key() was removed in Python 3
            dict[param].append(name)
        else:
            dict[param] = [name]
Or if you want to be a little cleaner, you can use collections.defaultdict. This is the pythonic way if you ask me.
import collections
d = collections.defaultdict(list)
for name in file:
    if name in lst:  # previously overwrote list()
        d[param].append(name)
Please do not overwrite the builtin dict() or list() function with your own variable.
You are looking for a multimap, i.e. a map or dictionary that does not have a key -> value relation but a key -> many values relation. See "Is there a 'multimap' implementation in Python?"
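A plain dict of lists is often enough to act as a multimap. This is a minimal sketch using dict.setdefault, with made-up sample pairs; it is not taken from the linked question:

```python
multimap = {}
pairs = [("fruit", "apple"), ("fruit", "pear"), ("veg", "leek")]
for key, value in pairs:
    # setdefault returns the existing list, or stores and returns a new empty one
    multimap.setdefault(key, []).append(value)
print(multimap)  # {'fruit': ['apple', 'pear'], 'veg': ['leek']}
```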
You need to make sure each value of the new dictionary is a list. Instead of
new_dict[item] = k # the value is a string
you need
new_dict[item] = [k] # the value is a list with the item
Then you can append onto that list. Full code:
latin_dic = {'apple': ['malum', 'pomum', 'popula'], 'fruit': ['baca', 'bacca', 'popum'], 'punishment': ['malum', 'multa']}
new_dict = {}
for k, v in latin_dic.items():
    for item in v:
        if item in new_dict:
            new_dict[item].append(k)
        else:
            new_dict[item] = [k]
for k, v in sorted(new_dict.items()):
    print(k, '-', v)
You can't use append because it only works on lists. In your new dictionary, the values you assign are strings (the keys of latin_dic), so when you write new_dict[item] = k, the value stored is just a string and not a list. We can fix that very easily:
latin_dic = {'apple': ['malum', 'pomum', 'popula'], 'fruit': ['baca', 'bacca', 'popum'], 'punishment': ['malum', 'multa']}
new_dict = {}
for k, v in latin_dic.items():
    for item in v:
        if item in new_dict:
            new_dict[item].append(k)
        else:
            new_dict[item] = [k]
for k, v in sorted(new_dict.items()):
    print(k, '-', v)
Now the values of new_dict are always lists, so you can use append.
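The if/else can also be avoided with collections.defaultdict, which creates the empty list on first access. This is a sketch of the same inversion; the loop variable names are chosen here for readability and are not from the original code:

```python
from collections import defaultdict

latin_dic = {'apple': ['malum', 'pomum', 'popula'],
             'fruit': ['baca', 'bacca', 'popum'],
             'punishment': ['malum', 'multa']}

new_dict = defaultdict(list)  # missing keys start out as an empty list
for english, latin_words in latin_dic.items():
    for latin in latin_words:
        new_dict[latin].append(english)

for latin, english_words in sorted(new_dict.items()):
    print(latin, '-', english_words)
```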
You'll have to decide how many levels of nested dict you want to merge, but it's very doable:
dict_a = {"2": {"2": 0, "12": "94960", "25": "61026"},
          "229": {"101": "29043", "106": "25298", "110": "48283", "112": "16219", "126": "35669", "147": "37675"}}
dict_b = {"1": {"1": 0, "2": "84543", "3": "34854", "5": "123439"},
          "229": {"2": "71355", "12": "24751", "25": "33600", "229": 0}}
for k, v in dict_b.items():
    if k in dict_a:
        for sub_k, sub_v in v.items():
            if sub_k in dict_a[k]:  # decide what to do when there's a conflict at this level
                print("conflict: keeping value from dict_a")
            else:
                dict_a[k][sub_k] = sub_v  # add missing item to dict_a[k]
    else:
        dict_a[k] = v  # add missing item to dict_a
print(dict_a)
Extending this to merge an arbitrary number of levels is a simple matter of recursion (though you have to make sure your data structure is consistent, and decide how to resolve conflicts in the data):
def recursive_merge(dict_a, dict_b, levels):
    for k, v in dict_b.items():
        if k in dict_a:
            if levels > 1 and isinstance(v, dict) and isinstance(dict_a[k], dict):
                recursive_merge(dict_a[k], dict_b[k], levels - 1)
            else:
                pass  # keep the value from dict_a
        else:
            dict_a[k] = v
    return dict_a
print(recursive_merge(dict_a, dict_b, 2))
You'll want to merge every nested dict from both dicts to produce a new nested merged dict of merged dicts. You can do that with a dictionary comprehension:
dict_a_updated = {k: {**dict_a.get(k, {}), **dict_b.get(k, {})}
                  for k in {*dict_a, *dict_b}}
This builds the set of all keys of the outer dicts with {*dict_a, *dict_b}, and from that a new dict in which each key k maps to the merger of both dicts' values for that key ({**dict_a.get(k, {}), **dict_b.get(k, {})}). Where both inner dicts share a key, the value from dict_b wins.
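To see how conflicts are resolved, here is the comprehension run on a small made-up input (the value "old" is illustrative). Because the outer keys come from a set, the order of the outer keys in the result is not guaranteed:

```python
dict_a = {"2": {"2": 0}, "229": {"101": "29043", "2": "old"}}
dict_b = {"1": {"1": 0}, "229": {"2": "71355"}}

merged = {k: {**dict_a.get(k, {}), **dict_b.get(k, {})}
          for k in {*dict_a, *dict_b}}

# The inner key "2" exists in both dicts, so dict_b's value replaces "old".
print(merged["229"])  # {'101': '29043', '2': '71355'}
```

Swapping the two unpackings inside the inner braces would make dict_a win instead.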