I wrote a simple python tool for this called xmldiffs:
Compare two XML files, ignoring element and attribute order.
Usage:
xmldiffs [OPTION] FILE1 FILE2
Any extra options are passed to the diff command.
Get it at https://github.com/joh/xmldiffs
Answer from joh on Stack Overflow
In Beyond Compare, you can enable the XML Sort conversion in the File Formats settings. With this option, the XML children are sorted before the diff.
A trial / portable version of Beyond Compare is available.
Two approaches that I use are (a) to canonicalize both XML files and then compare their serializations, and (b) to use the XPath 2.0 deep-equal() function. Both approaches are OK for telling you whether the files are the same, but not very good at telling you where they differ.
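As a rough illustration of the "same or not" style of check, here is a minimal sketch that uses only the JDK's DOM API rather than XPath 2.0. It swaps deep-equal() for Node.isEqualNode(), which is a different function with similar intent. The class XmlDeepEqual and its methods are hypothetical names, and dropping whitespace-only text nodes and comments is an assumption that will not suit mixed-content documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class XmlDeepEqual {

    // Parse a string and return its root element with blank text nodes removed.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setIgnoringComments(true); // assumption: comments do not matter
            factory.setCoalescing(true);       // fold CDATA sections into plain text
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            stripBlankText(root);
            return root;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Remove whitespace-only text nodes so that indentation is not a difference.
    static void stripBlankText(Node node) {
        NodeList children = node.getChildNodes();
        // Iterate backwards because the NodeList is live and shrinks on removal.
        for (int i = children.getLength() - 1; i >= 0; i--) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripBlankText(child);
            }
        }
    }

    // True when both trees are deeply equal; child order still matters here.
    static boolean sameXml(String left, String right) {
        return parseRoot(left).isEqualNode(parseRoot(right));
    }
}
```

As with the two approaches above, this only answers yes or no. Attribute order is ignored, because DOM attributes are an unordered map, but reordered child elements still count as a difference.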
A commercial tool that specializes in this problem is DeltaXML.
If you have things that you consider equivalent, but which aren't equivalent at the XML level - for example, elements in a different order - then you may have to be prepared to do a transformation to normalize the documents before comparison.
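One possible normalizing transformation, sketched with the JDK's DOM API only: render each element with its attributes sorted by name and its children sorted by their own rendered form, then compare or diff the resulting text. XmlOrderNormalizer and its methods are hypothetical names. The output is not re-escaped, so treat it as a comparison key rather than as valid XML. Trimming text and ignoring comments and namespaces are simplifying assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class XmlOrderNormalizer {

    // Parse the XML and return an order-independent, indented text form.
    static String normalize(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return render(root, "");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static String render(Element element, String indent) {
        // A TreeMap keeps the attributes sorted by name.
        Map<String, String> attributes = new TreeMap<>();
        NamedNodeMap raw = element.getAttributes();
        for (int i = 0; i < raw.getLength(); i++) {
            attributes.put(raw.item(i).getNodeName(), raw.item(i).getNodeValue());
        }

        StringBuilder out = new StringBuilder(indent);
        out.append('<').append(element.getNodeName());
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            out.append(' ').append(attribute.getKey())
               .append("=\"").append(attribute.getValue()).append('"');
        }
        out.append(">\n");

        // Render every child first, then sort the rendered forms.
        List<String> children = new ArrayList<>();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            short type = child.getNodeType();
            if (type == Node.ELEMENT_NODE) {
                children.add(render((Element) child, indent + "  "));
            } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                String text = child.getNodeValue().trim();
                if (!text.isEmpty()) {
                    children.add(indent + "  " + text + "\n");
                }
            }
        }
        Collections.sort(children);
        for (String child : children) {
            out.append(child);
        }
        return out.append(indent).append("</").append(element.getNodeName()).append(">\n").toString();
    }
}
```

Two documents that differ only in element or attribute order produce the same string from normalize(), so the two results can be compared with equals() or written to files and passed to diff. Sorting also reorders mixed content, so this is only suitable for data-style XML.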
Good answer here:
Question: How can I diff two XML files? | Super User
Answer: How can I diff two XML files? | Super User
$ xmllint --format --exc-c14n one.xml > 1.xml
$ xmllint --format --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml
Apologies for any failure to adhere to serverfault conventions ... I'm sure someone will let me know and I will amend appropriately.
For XMLUnit 2.0 (which is what I was looking for), this is now done by using DefaultNodeMatcher:
Diff diff = DiffBuilder.compare(Input.fromFile(control))
    .withTest(Input.fromFile(test))
    .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
    .build();
Hope this helps other people googling...
My original answer is outdated. If I had to build it again, I would use xmlunit 2 and xmlunit-matchers. Please note that XMLUnit always treats a different order as 'similar', not 'equal'.
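import static org.hamcrest.MatcherAssert.assertThat;
import static org.xmlunit.matchers.CompareMatcher.isSimilarTo;

import org.junit.Test; // JUnit 4 assumed; JUnit 5 uses org.junit.jupiter.api.Test
import org.xmlunit.diff.DefaultNodeMatcher;
import org.xmlunit.diff.ElementSelectors;

@Test
public void testXmlUnit() {
    String myControlXML = "<test><elem>a</elem><elem>b</elem></test>";
    String expected = "<test><elem>b</elem><elem>a</elem></test>";
    // Match elements by name and text so that reordered siblings count as similar.
    assertThat(myControlXML, isSimilarTo(expected)
        .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText)));
    // In case you want to ignore whitespace, add ignoreWhitespace().normalizeWhitespace()
    assertThat(myControlXML, isSimilarTo(expected)
        .ignoreWhitespace()
        .normalizeWhitespace()
        .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText)));
}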
If somebody still wants to use a pure Java implementation, here it is. This implementation extracts the content from the XML and compares the list, ignoring order.
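import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.containsInAnyOrder;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.junit.Test; // JUnit 4 assumed
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public static Document loadXMLFromString(String xml) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    InputSource is = new InputSource(new StringReader(xml));
    return builder.parse(is);
}

@Test
public void test() throws Exception {
    Document doc = loadXMLFromString("<test>\n" +
        " <elem>b</elem>\n" +
        " <elem>a</elem>\n" +
        "</test>");
    XPathFactory xPathfactory = XPathFactory.newInstance();
    XPath xpath = xPathfactory.newXPath();
    XPathExpression expr = xpath.compile("//test//elem");
    NodeList all = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    // Collect the text of every matched element.
    List<String> values = new ArrayList<>();
    if (all != null && all.getLength() > 0) {
        for (int i = 0; i < all.getLength(); i++) {
            values.add(all.item(i).getTextContent());
        }
    }
    Set<String> expected = new HashSet<>(Arrays.asList("a", "b"));
    assertThat("List equality without order",
        values, containsInAnyOrder(expected.toArray()));
}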
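If Hamcrest is not available, the same idea can be sketched with the JDK alone: extract the text values from both documents, sort them, and compare the lists. ElemValues and its methods are hypothetical names, and the XPath expression is whatever selects the elements you care about. This only compares the selected text values, not the rest of the tree.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class ElemValues {

    // Return the text of every node matched by the XPath expression, sorted.
    static List<String> sortedValues(String xml, String xpathExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            Collections.sort(values);
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate " + xpathExpression, e);
        }
    }

    // True when both documents yield the same values, ignoring their order.
    static boolean sameValues(String left, String right, String xpathExpression) {
        return sortedValues(left, xpathExpression).equals(sortedValues(right, xpathExpression));
    }
}
```

Sorted lists are used instead of sets so that duplicates still count: a document with two "a" elements does not match a document with only one.";
String b = "<test>\n <elem>b</elem>\n <elem>a</elem>\n</test>";
if (!ElemValues.sameValues(a, b, "//test//elem")) {
    throw new AssertionError("same values in a different order should match");
}
if (!ElemValues.sortedValues(b, "//test//elem").equals(java.util.Arrays.asList("a", "b"))) {
    throw new AssertionError("values should come back sorted");
}
if (ElemValues.sameValues("<test><elem>a</elem><elem>a</elem></test>", "<test><elem>a</elem></test>", "//test//elem")) {
    throw new AssertionError("duplicates should count");
}
</test>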
One approach would be to first turn both XML files into Canonical XML, and compare the results using diff. For example, xmllint can be used to canonicalize XML.
$ xmllint --c14n one.xml > 1.xml
$ xmllint --c14n two.xml > 2.xml
$ diff 1.xml 2.xml
Or as a one-liner.
$ diff <(xmllint --c14n one.xml) <(xmllint --c14n two.xml)
Jukka's answer did not work for me, but it did point me to Canonical XML. Neither --c14n nor --c14n11 sorted the attributes, but I found that the --exc-c14n switch did. --exc-c14n is not listed in the man page, but the command-line help describes it as "W3C exclusive canonical format".
$ xmllint --exc-c14n one.xml > 1.xml
$ xmllint --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml
$ xmllint | grep c14
--c14n : save in W3C canonical format v1.0 (with comments)
--c14n11 : save in W3C canonical format v1.1 (with comments)
--exc-c14n : save in W3C exclusive canonical format (with comments)
$ rpm -qf /usr/bin/xmllint
libxml2-2.7.6-14.el6.x86_64
libxml2-2.7.6-14.el6.i686
$ cat /etc/system-release
CentOS release 6.5 (Final)
Warning: --exc-c14n strips out the XML header, whereas --c14n prepends the XML header if it is not there.
One difference that may need to become clearer in the 2.x documentation is the default ElementSelector - roughly what used to be ElementQualifier in 1.x. Where 1.x defaults to match elements by name, 2.x defaults to match elements in order. Maybe this is a bad idea.
Your Diff should work if you switch to matching on element names.
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byName))
You might need to add something along the lines of
.withDifferenceEvaluator((comparison, outcome) -> {
    if (outcome == ComparisonResult.DIFFERENT &&
            comparison.getType() == ComparisonType.CHILD_NODELIST_SEQUENCE) {
        return ComparisonResult.EQUAL;
    }
    return outcome;
}).build();
to your Diff builder
For me, the solution mentioned above does not work, because compareNodeLists has this hardcoded in DOMDifferenceEngine.compareNodes():
new Comparison(ComparisonType.CHILD_NODELIST_SEQUENCE...
I have raised a new ticket for this, though bear in mind it could just be my lack of understanding :-)
https://github.com/xmlunit/xmlunit/issues/258