Use the xmldiff library to perform this exact task.

main.py

from xmldiff import main
diff = main.diff_files("file1.xml", "file2.xml")
print(diff)

output

[DeleteNode(node='/ngs_sample/results/gastro_prelim_st/type[2]')]
Answer from Victor 'Chris' Cabral on Stack Overflow
PyPI
pypi.org › project › xmldiff
xmldiff · PyPI
A nice, easy to use Python API for using it as a library. Adds support for showing the diffs in different formats, mainly one where differences are marked up in the XML, useful for making human readable diffs. These formats can show text differences in a semantically meaningful way. An output format compatible with 0.6/1.x is also available. 2.0 is currently significantly slower than xmldiff ...
      » pip install xmldiff
    
Published   May 13, 2024
Version   2.7.0
Readthedocs
xmldiff.readthedocs.io › en › stable › api.html
Python API — xmldiff documentation - Read the Docs
It will return a string with the edit script printed out, one action per line. Each line is enclosed in brackets and consists of a string describing the action, and the actions arguments. This is the output format of xmldiff 0.6/1.x, however, the actions and arguments are not the same, so the ...
Complianceascode
complianceascode.github.io › template › 2022 › 10 › 24 › xmldiff-unit-tests.html
Using xmldiff in Python unit tests - ComplianceAsCode Blog
October 24, 2022 - In the example above, we can see there are two differences between the two XML files. First is that the attribute idref on element described by XPath expression /ns0:Rule/ns0:platform[1] is changed to virtual. Second is that the text of the element described by XPath expression /ns0:Rule/ns0:ident[1] is changed to 777777. In a Python script, you can call xmldiff this way:
Inl
mooseframework.inl.gov › python › testers › XMLDiff.html
XMLDiff | MOOSE
[Tests]
  issues = '#14634'
  design = 'XMLOutput.md'
  [parallel]
    requirement = "The system shall support XML output for vector data that is"
    [replicated]
      type = XMLDiff
      input = xml.i
      xmldiff = xml_out.xml
      min_parallel = 3
      max_parallel = 3
      detail = "replicated or "
    []
    [distributed]
      type = XMLDiff
      input = xml.i
      cli_args = 'VectorPostprocessors/distributed/parallel_type=DISTRIBUTED Outputs/file_base=xml_distributed_out'
      xmldiff = 'xml_distributed_out.xml xml_distributed_out.xml.1 xml_distributed_out.xml.2'
      min_parallel = 3
      max_parallel = 3
      detail = "distributed in parallel."
Readthedocs
xmldiff.readthedocs.io › en › stable › advanced.html
Advanced Usage — xmldiff documentation - Read the Docs
>>> from xmldiff import main, formatting
>>> left = '<body><p>Old Content</p></body>'
>>> right = '<body><p>New Content</p></body>'
>>> main.diff_texts(left, right)
[UpdateTextIn(node='/body/p[1]', text='New Content')]
GitHub
github.com › Shoobx › xmldiff
GitHub - Shoobx/xmldiff: A library and command line utility for diffing xml
Starred by 226 users
Forked by 53 users
Languages   Python 98.7%
sourcehut
sr.ht › ~nolda › xdiff
xdiff: A Python script for comparing XML files for structural or textual differences.
The script is released 'as is' with no warranty under the GNU General Public License, version 2.0 · The output of xdiff.py mimics the unified format of GNU diff, with three context lines by default. The -a option outputs all context lines, while the -n option outputs none.
Readthedocs
xmldiff.readthedocs.io
xmldiff — xmldiff documentation
Python API · Main diffing API · Unique Attributes · Using Formatters · The Edit Script · The patching API · Advanced Usage · Diffing Formatted Text · Making a Visual Diff · Performance Options · Contributing to xmldiff · Setting Up a Development Environment ·
Read the Docs
media.readthedocs.org › pdf › xmldiff › latest › xmldiff.pdf
xmldiff Documentation Lennart Regebro May 21, 2023
May 21, 2023 - This formatter works like the ... to the xmldiff output ... This formatter returns XML with tags describing the changes. These tags are designed so they can easily be changed into something that will render nicely, for example with XSLT replacing the tags with the format you need. ... The default result of the diffing methods is to return an edit script, which is a list of Python objects called ...
GitHub
gist.github.com › guillaumevincent › 74e5a9551ee14a774e5e
compare two XML in python · GitHub
March 21, 2023 - compare two XML in python · test_xmldiff.py
Top answer
1 of 4
12

This is actually a reasonably challenging problem, because what counts as a "difference" is often in the eye of the beholder: two documents can carry semantically equivalent information written in different ways, and you probably don't want that flagged as a difference.

You could try using xmldiff, which is based on work in the paper Change Detection in Hierarchically Structured Information.
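
To illustrate what "semantically equivalent" can mean, here is a minimal sketch using only the standard library rather than xmldiff. It assumes Python 3.8 or later, where xml.etree.ElementTree.canonicalize is available; the two documents and the canonical helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents: same data, different attribute order and indentation
doc_a = '<note id="1" lang="en"><to>Tove</to></note>'
doc_b = '<note lang="en" id="1">\n  <to>Tove</to>\n</note>'


def canonical(xml_text):
    # C14N 2.0 writes attributes in sorted order; strip_text=True trims
    # whitespace around text content, so indentation-only text disappears
    return ET.canonicalize(xml_text, strip_text=True)


print(doc_a == doc_b)                        # raw strings differ
print(canonical(doc_a) == canonical(doc_b))  # canonical forms are equal
```

A plain string comparison reports these two documents as different, while their canonical forms compare equal. Whether whitespace and attribute order should be ignored depends on your data, so treat this as one possible definition of equivalence, not the only one.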

2 of 4
7

My approach to the problem was to parse each XML document into an xml.etree.ElementTree element and walk through the two trees level by level. I also added the ability to ignore a list of attribute names (and child tags) during the comparison.

The first block of code holds the class used:

import xml.etree.ElementTree as ET
import logging

class XmlTree():

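    def __init__(self):
        # Create the logger used by xml_compare; mismatch details go to a log file
        self.logger = logging.getLogger('xml-comparison')
        self.logger.setLevel(logging.DEBUG)
        self.hdlr = logging.FileHandler('xml-comparison.log')
        self.formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')
        self.hdlr.setFormatter(self.formatter)
        # Attach the handler only once, even if several XmlTree objects are created
        if not self.logger.handlers:
            self.logger.addHandler(self.hdlr)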

    @staticmethod
    def convert_string_to_tree(xmlString):
        return ET.fromstring(xmlString)

    def xml_compare(self, x1, x2, excludes=()):
        """
        Compares two xml etrees
        :param x1: the first tree
        :param x2: the second tree
        :param excludes: list of string of attributes to exclude from comparison
        :return:
            True if both files match
        """

        if x1.tag != x2.tag:
            self.logger.debug('Tags do not match: %s and %s' % (x1.tag, x2.tag))
            return False
        for name, value in x1.attrib.items():
            if name not in excludes:
                if x2.attrib.get(name) != value:
                    self.logger.debug('Attributes do not match: %s=%r, %s=%r'
                                 % (name, value, name, x2.attrib.get(name)))
                    return False
        for name in x2.attrib.keys():
            if name not in excludes:
                if name not in x1.attrib:
                    self.logger.debug('x2 has an attribute x1 is missing: %s'
                                 % name)
                    return False
        if not self.text_compare(x1.text, x2.text):
            self.logger.debug('text: %r != %r' % (x1.text, x2.text))
            return False
        if not self.text_compare(x1.tail, x2.tail):
            self.logger.debug('tail: %r != %r' % (x1.tail, x2.tail))
            return False
        # Element.getchildren() was removed in Python 3.9; list(element) is the replacement
        cl1 = list(x1)
        cl2 = list(x2)
        if len(cl1) != len(cl2):
            self.logger.debug('children length differs, %i != %i'
                         % (len(cl1), len(cl2)))
            return False
        i = 0
        for c1, c2 in zip(cl1, cl2):
            i += 1
            if c1.tag not in excludes:
                if not self.xml_compare(c1, c2, excludes):
                    self.logger.debug('children %i do not match: %s'
                                 % (i, c1.tag))
                    return False
        return True

    def text_compare(self, t1, t2):
        """
        Compare two text strings
        :param t1: text one
        :param t2: text two
        :return:
            True if a match
        """
        if not t1 and not t2:
            return True
        if t1 == '*' or t2 == '*':
            return True
        return (t1 or '').strip() == (t2 or '').strip()

The second block of code holds a couple of XML examples and their comparison:

xml1 = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

xml2 = "<note><to>Tove</to><from>Daniel</from><heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"

tree1 = XmlTree.convert_string_to_tree(xml1)
tree2 = XmlTree.convert_string_to_tree(xml2)

comparator = XmlTree()

# "from" is excluded, so the differing <from> elements are not compared
if comparator.xml_compare(tree1, tree2, ["from"]):
    print("XMLs match")
else:
    print("XMLs don't match")

Most of the credit for this code must be given to syawar
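
If you want to know where the documents differ instead of getting only True or False, the same tree walk can collect messages as it goes. The following is a rough sketch of that idea, not part of the original class: find_differences and its message format are hypothetical, and unlike the class above it does not compare tail text or treat '*' as a wildcard.

```python
import xml.etree.ElementTree as ET


def find_differences(e1, e2, excludes=(), path=""):
    """Return a list of readable differences between two elements."""
    path = f"{path}/{e1.tag}"
    if e1.tag != e2.tag:
        return [f"{path}: tag {e1.tag!r} != {e2.tag!r}"]
    diffs = []
    # Check attributes present on either side, skipping excluded names
    for name in sorted(set(e1.attrib) | set(e2.attrib)):
        if name in excludes:
            continue
        v1, v2 = e1.attrib.get(name), e2.attrib.get(name)
        if v1 != v2:
            diffs.append(f"{path}: attribute {name!r} {v1!r} != {v2!r}")
    # Compare text, ignoring surrounding whitespace
    if (e1.text or "").strip() != (e2.text or "").strip():
        diffs.append(f"{path}: text {e1.text!r} != {e2.text!r}")
    children1, children2 = list(e1), list(e2)
    if len(children1) != len(children2):
        diffs.append(f"{path}: {len(children1)} children != {len(children2)}")
    # Recurse into child pairs in document order, skipping excluded tags
    for c1, c2 in zip(children1, children2):
        if c1.tag not in excludes:
            diffs.extend(find_differences(c1, c2, excludes, path))
    return diffs


xml1 = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading></note>"
xml2 = "<note><to>Tove</to><from>Daniel</from><heading>Reminder</heading></note>"

for line in find_differences(ET.fromstring(xml1), ET.fromstring(xml2)):
    print(line)
```

For the two notes above this reports a single text difference under /note/from; passing excludes=("from",) returns an empty list, which mirrors the behaviour of the excludes argument in the class.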

PyPI
pypi.org › project › xmldiff › 1.1.0
xmldiff 1.1.0
GitHub
github.com › JoshData › xml_diff
GitHub - JoshData/xml_diff: Compares two XML documents by diffing their text. · GitHub
For really fast comparisons, get Google's Diff Match Patch library <https://code.google.com/p/google-diff-match-patch/>, as re-written and sped-up by @leutloff <https://github.com/leutloff/diff-match-patch-cpp-stl> and then turned into a Python extension module by me <https://github.com/JoshData/diff_match_patch-python>:
Starred by 44 users
Forked by 10 users
Languages   Python
GitHub
github.com › Shoobx › xmldiff › blob › master › xmldiff › formatting.py
xmldiff/xmldiff/formatting.py at master · Shoobx/xmldiff
from xmldiff.diff_match_patch import diff_match_patch
from xmldiff import utils

DIFF_NS = "http://namespaces.shoobx.com/diff"
DIFF_PREFIX = "diff"
INSERT_NAME = "{%s}insert" % DIFF_NS
DELETE_NAME = "{%s}delete" % DIFF_NS
...
Author   Shoobx
GitHub
github.com › AG060982 › XmlDiff
GitHub - AG060982/XmlDiff: Simple Python program to compare XML files ignoring order of attributes and elements with x-path details
December 10, 2009 - Simple Python program to compare XML files ignoring order of attributes and elements with x-path details - AG060982/XmlDiff
Author   AG060982
GitHub
github.com › Shoobx › xmldiff › blob › master › xmldiff › main.py
xmldiff/xmldiff/main.py at master · Shoobx/xmldiff
November 8, 2021 -
from xmldiff import diff, formatting, patch

__version__ = metadata.version("xmldiff")

FORMATTERS = {
    "diff": formatting.DiffFormatter,
    "xml": formatting.XMLFormatter,
    "old": formatting.XmlDiffFormatter,
}
Author   Shoobx
GitHub
github.com › joh › xmldiffs
GitHub - joh/xmldiffs: Compare two XML files, ignoring element and attribute order.
February 12, 2018 - Compare two XML files, ignoring element and attribute order. - joh/xmldiffs
Starred by 96 users
Forked by 27 users
Languages   Python 100.0%
dale lane
dalelane.co.uk › blog
Comparing XML files ignoring order of attributes and child elements « dale lane
October 6, 2014 - On Mac, I run: $ python xmldiff.py diffmerge testA.xml testB.xml
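
Several of the results above compare XML while ignoring the order of attributes and child elements. One way that idea could be sketched with the standard library is shown below. This is not the code from any of the linked projects: normalized and same_ignoring_order are hypothetical helpers, and the sketch ignores tail text, comments and namespaces handling beyond what ElementTree does by default.

```python
import xml.etree.ElementTree as ET


def normalized(elem):
    """Build a nested tuple that does not depend on attribute or child order."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),                  # attribute order removed
        (elem.text or "").strip(),                           # surrounding whitespace removed
        tuple(sorted(normalized(child) for child in elem)),  # child order removed
    )


def same_ignoring_order(xml_a, xml_b):
    return normalized(ET.fromstring(xml_a)) == normalized(ET.fromstring(xml_b))


a = '<r><x a="1" b="2"/><y>t</y></r>'
b = '<r><y>t</y><x b="2" a="1"/></r>'
print(same_ignoring_order(a, b))
```

Sorting the children means documents where order is significant (for example, paragraphs of text) will be reported as equal even when reordered, so this only suits data where sibling order carries no meaning.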