This is a reasonably challenging problem, because what counts as a "difference" is often in the eye of the beholder: two documents can carry semantically equivalent information (for example, the same attributes in a different order, or different insignificant whitespace) that you probably don't want flagged as differences.
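To make "semantically equivalent" concrete, here is a minimal sketch using only the standard library (not xmldiff). It assumes Python 3.8 or later, where `xml.etree.ElementTree.canonicalize` is available; the two sample documents and the `canonical` helper are made up for illustration. The documents differ as strings but compare equal once attribute order and insignificant whitespace are normalized.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents: same content, different attribute order and layout.
a = '<note id="1" lang="en"><to>Tove</to></note>'
b = '''<note lang="en" id="1">
    <to>Tove</to>
</note>'''

def canonical(xml_text):
    # C14N 2.0 output sorts attributes; strip_text=True drops
    # leading/trailing whitespace around text content.
    return ET.canonicalize(xml_text, strip_text=True)

print(a == b)                        # raw strings differ
print(canonical(a) == canonical(b))  # canonical forms agree
```

Whether whitespace is really insignificant depends on the document type, so `strip_text=True` is a choice made for this example rather than a general rule.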

You could try using xmldiff, which is based on work in the paper Change Detection in Hierarchically Structured Information.

Answer from Nick Bastin on Stack Overflow
2 of 4
7

My approach was to parse each XML document into an xml.etree.ElementTree element and walk both trees level by level. I also added the option to ignore a list of attribute names (and child tags) during the comparison.

The first block of code holds the class used:

import xml.etree.ElementTree as ET
import logging

class XmlTree():

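    def __init__(self):
        # xml_compare reports mismatches through self.logger, so create it here.
        self.logger = logging.getLogger('xml-comparison')
        self.logger.setLevel(logging.DEBUG)
        if not self.logger.handlers:  # avoid duplicate handlers on repeated instantiation
            hdlr = logging.FileHandler('xml-comparison.log')
            hdlr.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
            self.logger.addHandler(hdlr)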

    @staticmethod
    def convert_string_to_tree(xmlString):
        # Parse an XML string and return its root element.
        return ET.fromstring(xmlString)

    def xml_compare(self, x1, x2, excludes=()):
        """
        Compares two xml etrees
        :param x1: the first tree
        :param x2: the second tree
        :param excludes: attribute names and child tags to exclude from the comparison
        :return:
            True if both trees match
        """

        if x1.tag != x2.tag:
            self.logger.debug('Tags do not match: %s and %s' % (x1.tag, x2.tag))
            return False
        for name, value in x1.attrib.items():
            if name not in excludes:
                if x2.attrib.get(name) != value:
                    self.logger.debug('Attributes do not match: %s=%r, %s=%r'
                                 % (name, value, name, x2.attrib.get(name)))
                    return False
        for name in x2.attrib.keys():
            if name not in excludes:
                if name not in x1.attrib:
                    self.logger.debug('x2 has an attribute x1 is missing: %s'
                                 % name)
                    return False
        if not self.text_compare(x1.text, x2.text):
            self.logger.debug('text: %r != %r' % (x1.text, x2.text))
            return False
        if not self.text_compare(x1.tail, x2.tail):
            self.logger.debug('tail: %r != %r' % (x1.tail, x2.tail))
            return False
        # Element.getchildren() was removed in Python 3.9; list(element) returns the children.
        cl1 = list(x1)
        cl2 = list(x2)
        if len(cl1) != len(cl2):
            self.logger.debug('children length differs, %i != %i'
                         % (len(cl1), len(cl2)))
            return False
        for i, (c1, c2) in enumerate(zip(cl1, cl2), start=1):
            if c1.tag not in excludes:
                if not self.xml_compare(c1, c2, excludes):
                    self.logger.debug('children %i do not match: %s'
                                 % (i, c1.tag))
                    return False
        return True

    def text_compare(self, t1, t2):
        """
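        Compare two text strings, ignoring surrounding whitespace.
        A value of '*' on either side acts as a wildcard and matches anything.
        :param t1: text one
        :param t2: text two
        :return:
            True if a match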
        """
        if not t1 and not t2:
            return True
        if t1 == '*' or t2 == '*':
            return True
        return (t1 or '').strip() == (t2 or '').strip()

The second block of code holds a couple of XML examples and their comparison:

xml1 = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

xml2 = "<note><to>Tove</to><from>Daniel</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

tree1 = XmlTree.convert_string_to_tree(xml1)
tree2 = XmlTree.convert_string_to_tree(xml2)

comparator = XmlTree()

if comparator.xml_compare(tree1, tree2, ["from"]):
    print("XMLs match")
else:
    print("XMLs don't match")

Most of the credit for this code must be given to syawar
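The class above only returns True or False and writes the details to a log file. As a rough sketch of a variation (not the original author's code), the same layer-by-layer walk can collect every mismatch together with its path. The function name `collect_differences` and the shortened sample documents are assumptions made for this example; unlike the class above, it does not compare `tail` text or treat '*' as a wildcard.

```python
import xml.etree.ElementTree as ET

def collect_differences(e1, e2, excludes=(), path=""):
    """Return a list of strings describing where two elements differ."""
    path = f"{path}/{e1.tag}"
    if e1.tag != e2.tag:
        return [f"{path}: tag {e1.tag!r} != {e2.tag!r}"]
    diffs = []
    # Compare attributes from both sides, skipping excluded names.
    for name in sorted(set(e1.attrib) | set(e2.attrib)):
        v1, v2 = e1.attrib.get(name), e2.attrib.get(name)
        if name not in excludes and v1 != v2:
            diffs.append(f"{path}/@{name}: {v1!r} != {v2!r}")
    # Compare text with surrounding whitespace ignored.
    t1, t2 = (e1.text or "").strip(), (e2.text or "").strip()
    if t1 != t2:
        diffs.append(f"{path}: text {t1!r} != {t2!r}")
    c1, c2 = list(e1), list(e2)
    if len(c1) != len(c2):
        diffs.append(f"{path}: {len(c1)} children != {len(c2)} children")
    # Recurse into children pairwise, skipping excluded tags.
    for a, b in zip(c1, c2):
        if a.tag not in excludes:
            diffs.extend(collect_differences(a, b, excludes, path))
    return diffs

xml1 = "<note><to>Tove</to><from>Jani</from></note>"
xml2 = "<note><to>Tove</to><from>Daniel</from></note>"
root1, root2 = ET.fromstring(xml1), ET.fromstring(xml2)

all_diffs = collect_differences(root1, root2)
ignoring_from = collect_differences(root1, root2, excludes=("from",))
print(all_diffs)
print(ignoring_from)
```

Returning a list instead of stopping at the first mismatch makes it easier to see everything that changed in one pass, at the cost of more work on large trees.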
