This is actually a reasonably challenging problem: what counts as a "difference" is often in the eye of the beholder, because two documents can carry semantically "equivalent" information that you probably don't want marked as differences.

You could try using xmldiff, which is based on work in the paper Change Detection in Hierarchically Structured Information.

Answer from Nick Bastin on Stack Overflow
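To make the "semantically equivalent" point concrete, here is a minimal sketch using only the standard library (not xmldiff). It assumes Python 3.8+, where xml.etree.ElementTree.canonicalize is available, and it assumes that whitespace around text is formatting rather than data. The two sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET

def canonical(xml_text):
    # C14N 2.0 output sorts attributes and expands empty-element tags;
    # strip_text=True also drops leading/trailing whitespace in text nodes
    # (an assumption: whitespace is treated as formatting, not content).
    return ET.canonicalize(xml_text, strip_text=True)

# Same information, different attribute order, indentation and empty-tag style.
a = '<note b="2" a="1"><to>Tove</to><empty/></note>'
b = '''<note a="1"   b="2">
  <to>Tove</to>
  <empty></empty>
</note>'''

same_raw = (a == b)                                # naive string comparison
same_canonical = (canonical(a) == canonical(b))    # comparison after normalization
print(same_raw, same_canonical)
```

A plain string or line diff reports these two documents as different, while the canonical forms are identical. Whether that is the right notion of equality depends on the data.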

2 of 4
7

My approach was to parse each XML document into an xml.etree.ElementTree tree and compare the two trees recursively, level by level. I also added the option to ignore a list of attribute names (or child tags) during the comparison.

The first block of code holds the class used:

import xml.etree.ElementTree as ET
import logging

class XmlTree():

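    def __init__(self):
        # xml_compare() reports mismatches through self.logger, so create it here
        self.logger = logging.getLogger('xml_matcher')
        self.logger.setLevel(logging.DEBUG)
        self.hdlr = logging.FileHandler('xml-comparison.log')
        self.formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
        self.hdlr.setFormatter(self.formatter)
        self.logger.addHandler(self.hdlr)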

    @staticmethod
    def convert_string_to_tree(xmlString):
        return ET.fromstring(xmlString)

    def xml_compare(self, x1, x2, excludes=()):
        """
        Compares two xml etrees
        :param x1: the first tree
        :param x2: the second tree
        :param excludes: attribute names and child element tags to exclude from comparison
        :return:
            True if both trees match
        """

        if x1.tag != x2.tag:
            self.logger.debug('Tags do not match: %s and %s' % (x1.tag, x2.tag))
            return False
        for name, value in x1.attrib.items():
            if name not in excludes:
                if x2.attrib.get(name) != value:
                    self.logger.debug('Attributes do not match: %s=%r, %s=%r'
                                 % (name, value, name, x2.attrib.get(name)))
                    return False
        for name in x2.attrib.keys():
            if name not in excludes:
                if name not in x1.attrib:
                    self.logger.debug('x2 has an attribute x1 is missing: %s'
                                 % name)
                    return False
        if not self.text_compare(x1.text, x2.text):
            self.logger.debug('text: %r != %r' % (x1.text, x2.text))
            return False
        if not self.text_compare(x1.tail, x2.tail):
            self.logger.debug('tail: %r != %r' % (x1.tail, x2.tail))
            return False
        # Element.getchildren() was removed in Python 3.9; list(element) returns the children
        cl1 = list(x1)
        cl2 = list(x2)
        if len(cl1) != len(cl2):
            self.logger.debug('children length differs, %i != %i'
                         % (len(cl1), len(cl2)))
            return False
        for i, (c1, c2) in enumerate(zip(cl1, cl2), start=1):
            if c1.tag not in excludes:
                if not self.xml_compare(c1, c2, excludes):
                    self.logger.debug('children %i do not match: %s'
                                 % (i, c1.tag))
                    return False
        return True

    def text_compare(self, t1, t2):
        """
        Compare two text strings
        :param t1: text one
        :param t2: text two
        :return:
            True if a match
        """
        if not t1 and not t2:
            return True
        if t1 == '*' or t2 == '*':
            return True
        return (t1 or '').strip() == (t2 or '').strip()

The second block of code holds a couple of XML examples and their comparison:

xml1 = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

xml2 = "<note><to>Tove</to><from>Daniel</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

tree1 = XmlTree.convert_string_to_tree(xml1)
tree2 = XmlTree.convert_string_to_tree(xml2)

comparator = XmlTree()

if comparator.xml_compare(tree1, tree2, ["from"]):
    print("XMLs match")
else:
    print("XMLs don't match")

Most of the credit for this code must be given to syawar

🌐
PyPI
pypi.org › project › xmldiff
xmldiff · PyPI
result = main.patch_file('file.diff', 'file1.xml') ... Easier to maintain, the code is less complex and more Pythonic, and uses more custom classes instead of just nesting lists and dicts.
Install with: pip install xmldiff
Published   May 13, 2024
Version   2.7.0
Discussions

Best ways to compare xml files with python
What are your goals? We work with XML trees over RPC using the Python ElementTree XML API. Also, the Eclipse IDE has a nice XML editor. However, it seems that everything is moving to JSON, and Javascript is the way to process it. More on reddit.com
🌐 r/Python
March 14, 2019
bash - how to compare two xml files having same data in different lines? - Unix & Linux Stack Exchange
You could try xmldiff, but I think ... is to use an XML parser & generator to put each file in a canonical order and format, then use xmldiff or diff. A job for your favorite scripting language (Perl, Ruby, Python, etc.).... More on unix.stackexchange.com
🌐 unix.stackexchange.com
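The "canonical order and format, then diff" idea from the snippet above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the method from that answer: it assumes data-oriented XML (no mixed content, so .tail text can be discarded), treats sibling order as irrelevant, and needs Python 3.9+ for ET.indent. The helper names normalize and canonical_lines and the sample documents are hypothetical.

```python
import difflib
import xml.etree.ElementTree as ET

def normalize(elem):
    # Assumption: whitespace around text is formatting and tails carry no data.
    elem.text = (elem.text or '').strip() or None
    elem.tail = None
    # Re-insert attributes in sorted order so serialization is stable.
    attrs = sorted(elem.attrib.items())
    elem.attrib.clear()
    elem.attrib.update(attrs)
    for child in elem:
        normalize(child)
    # Order siblings by their serialized form so sibling order stops mattering.
    elem[:] = sorted(elem, key=lambda e: ET.tostring(e, encoding='unicode'))

def canonical_lines(xml_text):
    root = ET.fromstring(xml_text)
    normalize(root)
    ET.indent(root)  # one element per line, two-space indentation
    return ET.tostring(root, encoding='unicode').splitlines()

old = '<cfg><item id="2">b</item><item id="1">a</item></cfg>'
new = '''<cfg>
  <item id="1">a</item>
  <item id="2">changed</item>
</cfg>'''

diff = list(difflib.unified_diff(canonical_lines(old), canonical_lines(new),
                                 'old', 'new', lineterm=''))
print('\n'.join(diff))
```

After normalization only the changed text shows up in the diff. The reordering of the two item elements and the different indentation do not.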
Question on parsing and comparing two different XML docs.
ESPError: Not enough information in post You might want to post an example of the data contained in the xml files and what you would expect from the output. It's not clear exactly what you're after. If you're just looking to parse xml files, then have a look at the xml modules in the standard library. Personally, I'd use xml.etree, as I find it simple to use. More on reddit.com
🌐 r/Python
February 10, 2012
Comparing xml/html files in Beautiful Soup

At a very high level, this seems like a use case where sets would be helpful. Create one set of saved events and one set of live events, and then use the difference or symmetric_difference functions to pull out the changes.

More on reddit.com
🌐 r/learnpython
May 29, 2011
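A rough sketch of the set-based idea, using xml.etree.ElementTree rather than Beautiful Soup. The element_facts helper and the sample "saved" and "live" documents are hypothetical. Note that a set discards duplicates and ordering, so this only suits data where each (path, attributes, text) combination is meaningful on its own.

```python
import xml.etree.ElementTree as ET

def element_facts(xml_text):
    # Flatten a document into a set of hashable (path, attributes, text) tuples.
    facts = set()

    def walk(elem, path):
        here = f'{path}/{elem.tag}'
        facts.add((here, tuple(sorted(elem.attrib.items())), (elem.text or '').strip()))
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), '')
    return facts

saved = '<events><event id="1">Concert</event><event id="2">Fair</event></events>'
live = '<events><event id="2">Fair</event><event id="3">Market</event></events>'

saved_facts, live_facts = element_facts(saved), element_facts(live)
removed = saved_facts - live_facts        # only in the saved document
added = live_facts - saved_facts          # only in the live document
changed = saved_facts ^ live_facts        # symmetric difference: either side
print(removed, added)
```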
🌐
GeeksforGeeks
geeksforgeeks.org › python › compare-two-xml-files-in-python
Compare Two Xml Files in Python - GeeksforGeeks
July 23, 2025 - In this article, we will see how we can compare two XML files in Python.
🌐
Reddit
reddit.com › r/python › best ways to compare xml files with python
r/Python on Reddit: Best ways to compare xml files with python
March 14, 2019 - We work with XML trees over RPC using the Python ElementTree XML API. Also, the Eclipse IDE has a nice XML editor. However, it seems that everything is moving to JSON, and Javascript is the way to process it. ... a few weeks ago I started a GitHub project for comparing xml files.
🌐
sourcehut
sr.ht › ~nolda › xdiff
xdiff: A Python script for comparing XML files for structural or textual differences.
xdiff.py is a Python 3 script for comparing XML files. It outputs structural and textual differences -- i.e.
🌐
GitHub
github.com › MartinPetkov › XMLFileCompare
GitHub - MartinPetkov/XMLFileCompare: Python script for comparing two XML files for content, regardless of order · GitHub
Compares two XML files and returns true if they have the same elements with the same data and attributes, but not necessarily in the same order
Author   MartinPetkov
🌐
Quora
quora.com › If-you-need-to-compare-two-XML-files-and-generate-a-third-containing-subtraction-between-values-in-the-two-XML-files-on-a-Windows-machine-Had-you-used-Powershell-or-Python
If you need to compare two XML files and generate a third containing subtraction between values in the two XML files on a Windows machine. Had you used Powershell or Python? - Quora
Answer (1 of 3): PowerShell is built-in to all modern versions of Windows so the obvious solution starts with using it t...
🌐
dale lane
dalelane.co.uk › blog
Comparing XML files ignoring order of attributes and child elements « dale lane
October 6, 2014 - On Mac, I run: $ python xmldiff.py diffmerge testA.xml testB.xml ... The source showing how this works is available in a gist at gist.github.com/dalelane. It’s a quick hack to let me compare a handful of files, so it’s not been rigorously tested.
🌐
Complianceascode
complianceascode.github.io › template › 2022 › 10 › 24 › xmldiff-unit-tests.html
Using xmldiff in Python unit tests - ComplianceAsCode Blog
October 24, 2022 - That method isn’t sensitive to whitespace or formatting of the XML files, so we can save them in a pretty format. So we reworked our tests so that the test first saved the output of the tested method to a temporary file and then we called xmldiff.main.diff_files() to compare this temporary file with our static file in test data.
🌐
Readthedocs
xmldiff.readthedocs.io › en › stable › api.html
Python API — xmldiff documentation
By default xmldiff will compare each node from one tree with all nodes from the other tree.
🌐
GitHub
github.com › JoshData › xml_diff
GitHub - JoshData/xml_diff: Compares two XML documents by diffing their text. · GitHub
The comparison is completely blind to the structure of the two XML documents. It does a word-by-word comparison on the text content only, and then it goes back into the original documents and wraps changed text in new <del> and <ins> wrapper elements. The documents are then concatenated to form a new document and the new document is printed on standard output. Or use this as a library and call compare yourself with two lxml.etree.Element nodes (the roots of your documents). The script is written in Python 3.
Starred by 44 users
Forked by 10 users
Languages   Python
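The structure-blind, word-by-word comparison described above can be approximated with difflib. This is a simplified sketch and not the xml_diff library's code: it returns a plain marked-up string instead of wrapping the changes back into the original documents, and the words and mark_changes helpers are hypothetical names.

```python
import difflib
import xml.etree.ElementTree as ET

def words(xml_text):
    # Ignore structure entirely: gather every text node and split on whitespace.
    return ' '.join(ET.fromstring(xml_text).itertext()).split()

def mark_changes(old_xml, new_xml):
    old, new = words(old_xml), words(new_xml)
    out = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == 'equal':
            out.extend(old[i1:i2])
        if op in ('delete', 'replace'):
            out.append('<del>' + ' '.join(old[i1:i2]) + '</del>')
        if op in ('insert', 'replace'):
            out.append('<ins>' + ' '.join(new[j1:j2]) + '</ins>')
    return ' '.join(out)

# Different markup, one changed word.
old = "<p>Don't forget <b>me</b> this weekend!</p>"
new = "<div><p>Don't forget me next weekend!</p></div>"
marked = mark_changes(old, new)
print(marked)
```

Because only the text is compared, the extra div wrapper and the removed b element produce no differences. Only the changed word is marked.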
🌐
Nick Janetakis
nickjanetakis.com › blog › how-i-used-the-lxml-library-to-parse-xml-20x-faster-in-python
How I Used the lxml Library to Parse XML 20x Faster in Python — Nick Janetakis
August 20, 2019 - This article will have a few code snippets and if you plan to follow along you will need to install the xmltodict library as well as the lxml library so we can compare both libraries. It doesn’t matter where you create this directory but we will be creating a few Python files, an XML file ...
🌐
Python Forum
python-forum.io › thread-36089.html
XML compare
I'm looking for a script that can compare XML and works similarly to the compare function in Notepad++. Is there a script like that?
🌐
GitHub
gist.github.com › guillaumevincent › 74e5a9551ee14a774e5e
compare two XML in python · GitHub
compare two XML in python · test_xmldiff.py
🌐
>> Decalage
decalage.info › posts › xfl - a python module to create and compare file lists in xml
xfl - a Python module to create and compare file lists in XML | >> Decalage
September 7, 2008 - xfl is a simple Python module to store and compare lists of files and complete directory trees in XML. It uses the ElementTree module to provide a pythonic interface to XML.
🌐
GitHub
gist.github.com › dalelane › a0514b2e283a882d9ef3
Comparing XML files ignoring order of attributes and elements - see http://dalelane.co.uk/blog/?p=3225 for background · GitHub
parser = le.XMLParser(remove_comments=True) # parse the XML file and get a pointer to the top xmldoc = le.parse(original, parser=parser)
🌐
Blogger
qfyilyi.blogspot.com › 2019 › 03 › compare-xml-files-using-python.html
compare xml files using python
March 10, 2019 - For comparing differences in general I use WinMerge, so if you don't need to do it in python, it's a pretty handy tool. But if you must, it seems the output already tells you the difference exactly? (That the second type tag under ngs_sample/...prelim_st/ was deleted). Did you mean you wanted to see the values being deleted? – Idlehands Nov 22 at 14:33 · Yes I want to see what has been deleted, i.e. what is the difference between the two xmls.
🌐
Adoclib
adoclib.com › blog › comparing-two-xml-files-in-python.html
Comparing Two Xml Files In Python
A library and command line utility for diffing xml - Shoobx/xmldiff. A traditional diff on such a format would tell you line by line the differences, but this would not be readable by a human. xmldiff is both a command-line tool and a Python library.