python difflib dictionaries

stackoverflow.com › questions › 12956957 › print-diff-of-python-dictionaries

Adapted from the cpython source:

https://github.com/python/cpython/blob/01fd68752e2d2d0a5f90ae8944ca35df0a5ddeaa/Lib/unittest/case.py#L1091

import difflib
import pprint

def compare_dicts(d1, d2):
    return ('\n' + '\n'.join(difflib.ndiff(
                   pprint.pformat(d1).splitlines(),
                   pprint.pformat(d2).splitlines())))

Answer from user2733517 on Stack Overflow
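
A minimal usage sketch of the compare_dicts helper above. The sample dictionaries and the width=1 argument (which makes pprint put one key per line, so the diff is line-oriented) are assumptions added for illustration, not part of the original answer:

```python
import difflib
import pprint

def compare_dicts(d1, d2, width=1):
    """Return an ndiff of the pretty-printed forms of two dicts.

    width=1 is an illustrative assumption: it makes pformat emit
    one key per line so that ndiff can compare key by key.
    """
    return "\n".join(difflib.ndiff(
        pprint.pformat(d1, width=width).splitlines(),
        pprint.pformat(d2, width=width).splitlines()))

# Hypothetical sample data.
old = {"host": "localhost", "port": 8080}
new = {"host": "localhost", "port": 9090}

report = compare_dicts(old, new)
print(report)
```

Lines starting with "- " come only from the first dict, lines starting with "+ " only from the second, and lines starting with "? " are ndiff's hints about where the two lines differ.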

Python

docs.python.org › 3 › library › difflib.html

difflib — Helpers for computing deltas

Source code: Lib/difflib.py This module provides classes and functions for comparing sequences. It can be used for example, for comparing files, and can produce information about file differences i...

W3Schools

w3schools.com › python › ref_module_difflib.asp

Python difflib Module

import difflib
words = ["ape", "apple", "peach", "puppy"]
print(difflib.get_close_matches("appel", words, n=1))

Videos

06:51

YouTube

An Intro to Python's difflib module - YouTube

How to Create a Simple but Effective Diff-Tool in Python - YouTube

November 9, 2021

09:32

YouTube

Python difflib | Exploring the Python 3 standard library | | Pt ...

January 18, 2019

PyPI

pypi.org › project › datadiff

datadiff · PyPI

DataDiff is a library to provide human-readable diffs of python data structures. It can handle sequence types (lists, tuples, etc), sets, and dictionaries.

      » pip install datadiff

Published Aug 27, 2023

Version 2.2.0

Homepage http://sourceforge.net/projects/datadiff/

Stack Overflow

stackoverflow.com › questions › 12956957 › print-diff-of-python-dictionaries

Print diff of Python dictionaries - Stack Overflow

Top answer

1 of 8

Adapted from the cpython source:

https://github.com/python/cpython/blob/01fd68752e2d2d0a5f90ae8944ca35df0a5ddeaa/Lib/unittest/case.py#L1091

import difflib
import pprint

def compare_dicts(d1, d2):
    return ('\n' + '\n'.join(difflib.ndiff(
                   pprint.pformat(d1).splitlines(),
                   pprint.pformat(d2).splitlines())))

2 of 8

You can use difflib, but the unittest approach seems more appropriate to me. If you do want to use difflib, suppose the following are the two dicts.

In [50]: dict1
Out[50]: {1: True, 2: False}

In [51]: dict2
Out[51]: {1: False, 2: True}

You may need to convert them to strings (or lists of strings) and then use difflib as usual.

In [43]: a = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict1.items())])
In [44]: b = '\n'.join(['%s:%s' % (key, value) for (key, value) in sorted(dict2.items())])
In [45]: print(a)
1:True
2:False
In [46]: print(b)
1:False
2:True
In [47]: for diffs in difflib.unified_diff(a.splitlines(), b.splitlines(), fromfile='dict1', tofile='dict2'):
    print(diffs)

The output would be:

--- dict1

+++ dict2

@@ -1,2 +1,2 @@

-1:True
-2:False
+1:False
+2:True
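
The steps above can be sketched as a single Python 3 function. The function name dict_unified_diff is hypothetical, and passing lineterm="" is a choice made here so the header lines carry no trailing newline (which avoids the blank lines seen in the output above). It assumes the keys of each dict can be sorted against each other:

```python
import difflib

def dict_unified_diff(d1, d2, fromfile="dict1", tofile="dict2"):
    """Return unified-diff lines for two dicts rendered as 'key:value' lines."""
    a = [f"{key}:{value}" for key, value in sorted(d1.items())]
    b = [f"{key}:{value}" for key, value in sorted(d2.items())]
    # lineterm="" keeps the control lines free of trailing newlines.
    return list(difflib.unified_diff(a, b, fromfile=fromfile,
                                     tofile=tofile, lineterm=""))

diff_lines = dict_unified_diff({1: True, 2: False}, {1: False, 2: True})
print("\n".join(diff_lines))
```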

GitHub

github.com › inveniosoftware › dictdiffer

GitHub - inveniosoftware/dictdiffer: Dictdiffer is a module that helps you to diff and patch dictionaries. · GitHub

Dictdiffer is a helper module that helps you to diff and patch dictionaries.

Starred by 849 users

Forked by 96 users

Languages Python 99.3% | Shell 0.7%

Readthedocs

dictdiffer.readthedocs.io

Dictdiffer — Dictdiffer 0.7.2.dev20180504 documentation

Dictdiffer is a helper module that helps you to diff and patch dictionaries.

ActiveState

code.activestate.com › recipes › 576644-diff-two-dictionaries

Diff Two Dictionaries « Python recipes « ActiveState Code

February 5, 2009 -

class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect
    def removed(self):
        return self.set_past - self.intersect
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
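
The same four categories can also be sketched without a class, because dictionary key views support set operations in Python 3. The function name diff_dicts and the sample data are hypothetical; this is an illustration of the idea, not the recipe's own code:

```python
def diff_dicts(current, past):
    """Classify keys as added, removed, changed, or unchanged."""
    current_keys, past_keys = current.keys(), past.keys()
    shared = current_keys & past_keys  # keys present in both dicts
    changed = {key for key in shared if current[key] != past[key]}
    return {
        "added": current_keys - past_keys,
        "removed": past_keys - current_keys,
        "changed": changed,
        "unchanged": shared - changed,
    }

# Hypothetical sample data.
result = diff_dicts({"a": 1, "b": 2, "c": 3}, {"b": 2, "c": 30, "d": 4})
print(result)
```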

GitHub

gist.github.com › z0mbiehunt3r › b117d0d5baf76a3b4b3b

python dict diff · GitHub

python dict diff. GitHub Gist: instantly share code, notes, and snippets.

Python Module of the Week

pymotw.com › 2 › difflib

difflib – Compare sequences - Python Module of the Week

While the Differ class shows all of the input lines, a unified diff only includes modified lines and a bit of context. In Python 2.3, the unified_diff() function was added to produce this sort of output:

import difflib
from difflib_data import *

diff = difflib.unified_diff(text1_lines, text2_lines, lineterm='')
print('\n'.join(diff))

Stack Overflow

stackoverflow.com › questions › 70379779 › how-to-efficiently-compare-two-dictionaries-of-lists-of-strings-using-difflib

python - how to efficiently compare two dictionaries of lists of strings using difflib? - Stack Overflow

Top answer

1 of 2

Your code seems legit. I did a couple of little tweaks that would shave off a couple of microseconds per loop:

No need for the two sorted calls, because quick_ratio compares the sequences without regard to element order (check the documentation for the difference between ratio, quick_ratio, and real_quick_ratio; quick_ratio is documented as an upper bound on ratio).
No need for the enumerate to access mat by i and j.
Removed the access of the list through index first_dict[index] and second_dict[index]

import difflib
import numpy as np

def naive_ratio_comparison(first_dict, second_dict):
    mat = []
    for second in second_dict.values():
        for first in first_dict.values():
            sm = difflib.SequenceMatcher(None, first, second)
            # quick_ratio() is an upper bound on ratio()
            mat.append(sm.quick_ratio())
    # rows follow second_dict, columns follow first_dict
    result = np.resize(mat, (len(second_dict), len(first_dict)))
    return result

2 of 2

If one dict has M entries and the other N, then you're going to have to do M*N .ratio() calls. There's no way around that, and it's going to be costly.

However, you can easily arrange to do only M+N sorts instead of (as shown) M*N sorts.

For computing .ratio(), the most valuable hint is in the docs:

SequenceMatcher computes and caches detailed information about the second sequence, so if you want to compare one sequence against many sequences, use set_seq2() to set the commonly used sequence once and call set_seq1() repeatedly, once for each of the other sequences.

Putting that all together:

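import difflib
import numpy as np

firsts = list(map(sorted, first_dict.values())) # sort these only once
mat = np.empty((len(second_dict), len(first_dict))) # rows: second_dict, columns: first_dict

sm = difflib.SequenceMatcher(None)
for i, second in enumerate(second_dict.values()):
    sm.set_seq2(sorted(second)) # expensive: seq2 details are cached
    for j, first in enumerate(firsts):
        sm.set_seq1(first) # cheap: reuses the cached seq2 details
        mat[i, j] = sm.ratio()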

That should deliver exactly the same results. To minimize the number of expensive .set_seq2() calls, it would - of course - be best to arrange for the shorter dict to be called "second_dict".
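
A self-contained sketch of the same pattern, using plain nested lists instead of a NumPy matrix so it runs with the standard library alone. The function name ratio_matrix and the two sample dictionaries are assumptions made for illustration:

```python
import difflib

def ratio_matrix(first_dict, second_dict):
    """Return rows (one per second_dict value) of SequenceMatcher ratios."""
    firsts = [sorted(values) for values in first_dict.values()]  # sort once
    sm = difflib.SequenceMatcher(None)
    rows = []
    for second in second_dict.values():
        sm.set_seq2(sorted(second))  # details about seq2 are cached here
        row = []
        for first in firsts:
            sm.set_seq1(first)       # only seq1 changes in the inner loop
            row.append(sm.ratio())
        rows.append(row)
    return rows

# Hypothetical sample data.
first_dict = {"a": ["x", "y", "z"], "b": ["p", "q"]}
second_dict = {"c": ["z", "y", "x"], "d": ["q", "r"]}
mat = ratio_matrix(first_dict, second_dict)
print(mat)
```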

Alternative

It's worth asking whether you actually want difflib at all here. What are you really trying to accomplish? Nothing here looks at the contents of the strings at all, beyond noting whether or not two strings are equal.

Perhaps what you really want is a different measure of "similarity". For example, one based on how many strings two lists have in common. If so, here's a way that doesn't use difflib:

    from collections import Counter
    cfirst = [(Counter(v), len(v)) for v in first_dict.values()]
    csecond = [(Counter(v), len(v)) for v in second_dict.values()]
    for i, (second, n2) in enumerate(csecond):
        for j, (first, n1) in enumerate(cfirst):
            mat[i, j] = sum((first & second).values()) * 2 / (n1 + n2)

That gives the same results on the specific example you gave, but is significantly cheaper to compute. The "ratio" computed here is twice the number of strings the two lists have in common, divided by the total number of strings in the two lists. That's easy to compute using Counters directly.

@Bilal Qandeel's answer suggested using difflib's .quick_ratio() instead, which happens to compute something similar under the covers. But that .quick_ratio() is order-independent is an undocumented implementation detail, and it's quicker to leave difflib out of it entirely if that is good enough.

NOTE: starting with Python 3.10,

            mat[i, j] = sum((first & second).values()) * 2 / (n1 + n2)

can be replaced by

            mat[i, j] = (first & second).total() * 2 / (n1 + n2)
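
The Counter-based measure can be sketched as a standalone function. The name overlap_ratio, the sample lists, and the choice to return 1.0 for two empty lists are assumptions added here; sum(...) is used instead of .total() so the sketch also runs on Python versions before 3.10:

```python
from collections import Counter

def overlap_ratio(first, second):
    """Return 2 * (number of shared strings) / (total number of strings)."""
    total = len(first) + len(second)
    if total == 0:
        return 1.0  # assumption: two empty lists count as identical
    shared = Counter(first) & Counter(second)  # multiset intersection
    return 2 * sum(shared.values()) / total

# Hypothetical sample data.
same_items = overlap_ratio(["x", "y", "z"], ["z", "y", "x"])
half_shared = overlap_ratio(["p", "q"], ["q", "r"])
print(same_items, half_shared)
```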

Beautiful Soup

tedboy.github.io › python_stdlib › generated › generated › difflib.Differ.html

difflib.Differ — Python Standard Library

difflib » · difflib.Differ · View page source · class difflib.Differ(linejunk=None, charjunk=None)[source]¶ · Differ is a class for comparing sequences of lines of text, and producing human-readable differences or deltas. Differ uses SequenceMatcher both to compare sequences of lines, ...

Stack Exchange

codereview.stackexchange.com › questions › 50947 › searching-a-dictionary-of-words-using-difflib

python - Searching a dictionary of words using difflib - Code Review Stack Exchange

Top answer

1 of 1

Using difflib is probably the best choice, so the problem comes down to the size of your word list. If we can pare down the number of words difflib needs to compare, the lookup should get faster.

One idea is to consider only words that are close enough in length:

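# Returns the min and max thresholds for word length.
# The parameters are named below/above so they do not shadow the
# built-in min() and max(), which the function body needs to call.
def get_thresholds(word, below=3, above=3):
    length = len(word)
    return max(1, length - below), length + above

# Returns only words whose length is within a certain range
def filter_word_list(min_len, max_len):
    return [word for word in words_list if min_len <= len(word) <= max_len]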

So the call to get_close_matches() would be:

closeMatches = difflib.get_close_matches(termL,
                       dictionaryFile.filter_word_list(*get_thresholds(termL)))
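
Put together, the length filter and the lookup might look like the sketch below. The function name close_matches_by_length, the slack parameter, and the sample word list are hypothetical; only difflib.get_close_matches is a real library call:

```python
import difflib

def close_matches_by_length(term, words, slack=3, n=3, cutoff=0.6):
    """Run get_close_matches only on words within `slack` characters of term's length."""
    shortest = max(1, len(term) - slack)
    longest = len(term) + slack
    candidates = [word for word in words if shortest <= len(word) <= longest]
    return difflib.get_close_matches(term, candidates, n=n, cutoff=cutoff)

# Hypothetical word list; the long entry is dropped before difflib sees it.
words = ["ape", "apple", "peach", "puppy", "pineapple-upside-down-cake"]
best = close_matches_by_length("appel", words, n=1)
print(best)
```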

Another idea would be to filter words that begin with a letter that is spatially related to the word's first letter on the keyboard. However, this suggestion is not as plausible due to different keyboard layouts.

As some general comments:

Take a look at your variable names. They are descriptive (for the most part) but the official Python style guide recommends using underscores_between_words instead of camelCaseNames.
Same basic idea for module names. PEP 8 recommends short, all-lowercase module names such as mymodule rather than myModule. Also, each module import gets its own line:
```
# Not `import os, sys`
import os
import sys
```

Stack Overflow

stackoverflow.com › questions › 21345952 › difflib-to-compare-two-python-dictionaries

dictionary - difflib to compare two python dictionaries - Stack Overflow

Top answer

1 of 1

You have a few issues here. First off, you're trying to use the method difflib.Differ.compare, but you're calling it as a plain function - you have not actually created a difflib.Differ object.

Second, this compare method expects you to operate upon a sequence of strings (for each of the two things being compared). Your convert function is sometimes returning strings, sometimes dicts, sometimes other stuff... in general, you're not getting back sequences of strings.

The natural way to get what you want is to just compare the actual JSON data, because that's a string. However, there are two issues there:

you want a sequence of strings (line-by-line) instead of a single string with the whole JSON document, but that's trivial - just split it up into lines with the string .splitlines method.
your input might have differences in whitespace that you want to ignore. The simple way around this is to, after loading each JSON document into an object, re-create a string for it with dumps. The idea is that for both documents that you're comparing, you will dump with the same whitespace settings. You need to read the documentation and decide what settings you want to use.
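
The approach described above can be sketched as follows. The function name json_diff and the dumps settings (indent=2, sort_keys=True) are assumptions chosen for illustration; the answer leaves the choice of settings to the reader:

```python
import difflib
import json

def json_diff(text_a, text_b):
    """Diff two JSON documents line by line, ignoring their original whitespace."""
    def normalize(text):
        # Re-serialize with identical settings so only real differences remain.
        return json.dumps(json.loads(text), indent=2, sort_keys=True).splitlines()

    differ = difflib.Differ()  # compare() is a method, so create an instance first
    return list(differ.compare(normalize(text_a), normalize(text_b)))

# Hypothetical inputs that differ in whitespace, key order, and one value.
result = json_diff('{"b": 1, "a": [1, 2]}', '{"a":[1,2],   "b": 2}')
print("\n".join(result))
```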

Stack Overflow

stackoverflow.com › questions › tagged › difflib

Highest scored 'difflib' questions - Stack Overflow

How can I tell difflib.get_close_matches() to ignore case? I have a dictionary which has a defined format which includes capitalisation. However, the test string might have full capitalisation or no ... ... Using Python, I'd like to output the difference between two strings as a unified diff (-u) while, optionally, ignoring blank lines (-B) and spaces (-w).

Beautiful Soup

tedboy.github.io › python_stdlib › generated › difflib.html

difflib — Python Standard Library

Module difflib – helpers for computing deltas between objects.

HexDocs

hexdocs.pm › difflib

Difflib v0.1.0 — Documentation

GitHub

github.com › python › cpython › blob › main › Lib › difflib.py

cpython/Lib/difflib.py at main · python/cpython

Module difflib -- helpers for computing deltas between objects.

Author python

Medium

medium.com › boring-tech › python-programming-the-standard-library-difflib-28ffaf5c1155

Python Programming | difflib. This module in the python standard… | by rnab | Boring Tech | Medium

January 19, 2019 - from difflib import get_close_matches word_list = ['acdefgh', 'abcd','adef','cdea'] str1 = 'abcd' matches = get_close_matches(str1, word_list, n=2, cutoff=0.3) print(matches)

PyPI

pypi.org › project › deepdiff

deepdiff · PyPI

Deep Difference and Search of any Python object/data. Recreate objects by adding deltas to each other. ... DeepDiff is now part of Qluster. If you're building workflows around data validation and correction, Qluster gives your team a structured way to manage rules, review failures, approve fixes, and reuse decisions without building the entire system from scratch. DeepDiff: Deep Difference of dictionaries, iterables, strings, and ANY other object.

      » pip install deepdiff

Published Mar 30, 2026

Version 9.0.0

Homepage https://zepworks.com/deepdiff/