I wrote a simple Python tool for this called xmldiffs:
Compare two XML files, ignoring element and attribute order.
Usage:
xmldiffs [OPTION] FILE1 FILE2
Any extra options are passed to the diff command.
Get it at https://github.com/joh/xmldiffs
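To illustrate the general idea of an order-insensitive comparison (this is not how xmldiffs itself is implemented, which I have not checked), here is a minimal Java sketch that builds a comparison key for each element with its attributes and child elements sorted. The class and method names (`UnorderedXmlSketch`, `canonical`, `sameIgnoringOrder`) are hypothetical, and the sketch assumes that text can be concatenated per element, so mixed content loses its position.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of xmldiffs or any library.
final class UnorderedXmlSketch {

    // Builds a comparison key in which attributes and child elements
    // appear in sorted order, so their order in the input does not matter.
    // The key is only meant for equality checks; it is not valid XML.
    static String canonical(Element element) {
        Map<String, String> attributes = new TreeMap<>();
        NamedNodeMap map = element.getAttributes();
        for (int i = 0; i < map.getLength(); i++) {
            attributes.put(map.item(i).getNodeName(), map.item(i).getNodeValue());
        }
        List<String> children = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                children.add(canonical((Element) n));
            } else if (n.getNodeType() == Node.TEXT_NODE
                    || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                // Assumption: text is collected per element, ignoring its position.
                text.append(n.getNodeValue());
            }
        }
        Collections.sort(children);
        String name = element.getNodeName();
        return "<" + name + attributes + ">" + text.toString().trim()
                + String.join("", children) + "</" + name + ">";
    }

    static boolean sameIgnoringOrder(String xml1, String xml2) {
        return canonical(parseRoot(xml1)).equals(canonical(parseRoot(xml2)));
    }

    private static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Calling `UnorderedXmlSketch.sameIgnoringOrder(a, b)` only tells you whether the two documents match; to see where they differ you would still want to write out a sorted form of each and run a regular diff on the results.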
Answer from joh on Stack Overflow
With Beyond Compare, you can enable the XML Sort conversion in the File Formats settings. With this option, the XML children are sorted before the diff.
A trial / portable version of Beyond Compare is available.
For xmlunit 2.0 (I was looking for this), this is now done by using DefaultNodeMatcher:
Diff diff = DiffBuilder.compare(Input.fromFile(control))
    .withTest(Input.fromFile(test))
    .withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
    .build();
Hope this helps other people googling...
My original answer is outdated. If I had to build it again, I would use xmlunit 2 and xmlunit-matchers. Please note that xmlunit always treats a different element order as 'similar', not 'equal'.
@Test
public void testXmlUnit() {
String myControlXML = "<test><elem>a</elem><elem>b</elem></test>";
String expected = "<test><elem>b</elem><elem>a</elem></test>";
assertThat(myControlXML, isSimilarTo(expected)
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText)));
// In case you want to ignore whitespace, add ignoreWhitespace().normalizeWhitespace()
assertThat(myControlXML, isSimilarTo(expected)
.ignoreWhitespace()
.normalizeWhitespace()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText)));
}
If somebody still wants to use a pure Java implementation, here it is. This implementation extracts the content from the XML and compares the resulting list, ignoring order.
public static Document loadXMLFromString(String xml) throws Exception {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
return builder.parse(is);
}
@Test
public void test() throws Exception {
Document doc = loadXMLFromString("<test>\n" +
" <elem>b</elem>\n" +
" <elem>a</elem>\n" +
"</test>");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//test//elem");
NodeList all = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
List<String> values = new ArrayList<>();
if (all != null && all.getLength() > 0) {
for (int i = 0; i < all.getLength(); i++) {
values.add(all.item(i).getTextContent());
}
}
Set<String> expected = new HashSet<>(Arrays.asList("a", "b"));
assertThat("List equality without order",
values, containsInAnyOrder(expected.toArray()));
}
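One caveat with the test above is that the expected values go through a Set, so repeated values collapse. A variant that compares two documents directly, and counts duplicates, can be sketched as follows. It uses only the JDK; the class name `XPathValuesSketch` and its methods are hypothetical helpers, and it assumes that comparing the text content of the selected nodes is enough for your data.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
final class XPathValuesSketch {

    // Returns the text content of every node selected by the XPath
    // expression, sorted so that document order no longer matters.
    static List<String> sortedValues(String xml, String xpathExpression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            Collections.sort(values);
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    // Two documents match when the selected values are the same multiset.
    static boolean sameValues(String xml1, String xml2, String xpathExpression) {
        return sortedValues(xml1, xpathExpression)
                .equals(sortedValues(xml2, xpathExpression));
    }
}
```

Sorting both lists keeps duplicates significant: a document with two `<elem>a</elem>` entries does not match one that has only a single such entry.", "<test>\n <elem>b</elem>\n <elem>a</elem>\n</test>", "//test//elem")) throw new AssertionError("reordered values should match");
if (XPathValuesSketch.sameValues("<test><elem>a</elem><elem>a</elem><elem>b</elem></test>", "<test><elem>a</elem><elem>b</elem></test>", "//test//elem")) throw new AssertionError("duplicates should count");
if (!XPathValuesSketch.sortedValues("<test><elem>b</elem><elem>a</elem></test>", "//test//elem").toString().equals("[a, b]")) throw new AssertionError("values should be sorted");
</test>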
Comparing XML files ignoring elements attribute Order
One difference that may need to become clearer in the 2.x documentation is the default ElementSelector - roughly what used to be ElementQualifier in 1.x. Where 1.x defaults to match elements by name, 2.x defaults to match elements in order. Maybe this is a bad idea.
Your Diff should work if you switch to matching on element names.
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byName))
You might need to add something along the lines of
.withDifferenceEvaluator(((comparison, outcome) -> {
if (outcome == ComparisonResult.DIFFERENT &&
comparison.getType() == ComparisonType.CHILD_NODELIST_SEQUENCE) {
return ComparisonResult.EQUAL;
}
return outcome;
})).build();
to your Diff builder
For me, the solution mentioned above does not work, because compareNodeLists has this hardcoded in DOMDifferenceEngine.compareNodes():
new Comparison(ComparisonType.CHILD_NODELIST_SEQUENCE...
I have raised a new ticket for this, though bear in mind it could just be my lack of understanding :-)
https://github.com/xmlunit/xmlunit/issues/258
Two approaches that I use are (a) to canonicalize both XML files and then compare their serializations, and (b) to use the XPath 2.0 deep-equal() function. Both approaches are OK for telling you whether the files are the same, but not very good at telling you where they differ.
A commercial tool that specializes in this problem is DeltaXML.
If you have things that you consider equivalent, but which aren't equivalent at the XML level - for example, elements in a different order - then you may have to be prepared to do a transformation to normalize the documents before comparison.
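The XPath engine that ships with the JDK implements XPath 1.0, so deep-equal() is not available there without a third-party processor. As a rough stand-in (a swapped-in technique, not the same function), the DOM Level 3 method `Node.isEqualNode` gives a similar yes/no answer: it ignores attribute order but is sensitive to element order. The sketch below assumes whitespace-only text nodes are insignificant and strips them first; `DeepEqualSketch` and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
final class DeepEqualSketch {

    static boolean deepEqual(String xml1, String xml2) {
        return parse(xml1).getDocumentElement()
                .isEqualNode(parse(xml2).getDocumentElement());
    }

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // compare namespace URIs, not just prefixes
            factory.setCoalescing(true);       // turn CDATA sections into plain text
            factory.setIgnoringComments(true); // assumption: comments are not significant
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            doc.normalize();                   // merge adjacent text nodes
            stripBlankText(doc);
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Removes whitespace-only text nodes so that indentation is ignored.
    private static void stripBlankText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
            child = next;
        }
    }
}
```

Because the check is order-sensitive for elements, documents that differ only in element order still need the normalizing transformation described above before they compare as equal.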
Good answer here:
Question: How can I diff two XML files? | Super User
Answer: How can I diff two XML files? | Super User
$ xmllint --format --exc-c14n one.xml > 1.xml
$ xmllint --format --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml
Apologies for any failure to adhere to serverfault conventions ... I'm sure someone will let me know and I will amend appropriately.
I had a similar problem and I eventually found: https://superuser.com/questions/79920/how-can-i-diff-two-xml-files
That post suggests doing a canonical XML sort (canonicalization puts attributes into a fixed order, but does not reorder child elements) and then doing a diff. Since you are on Linux, this should work cleanly for you. It worked for me on my Mac, and should work for people on Windows if they have something like Cygwin installed:
$ xmllint --c14n a.xml > sortedA.xml
$ xmllint --c14n b.xml > sortedB.xml
$ diff sortedA.xml sortedB.xml
You're requesting a sort based on the sequence of attributes in the elements being sorted. But your top-level tag elements here have only one attribute: name. If you want multiple tag elements with name="BBB" to sort differently, you need to give them distinct sort keys.
In your example, I'd try something like select="concat(name(), @name, name(*[1]), *[1]/@name)" -- but this is a very shallow key. It uses values from the first child in the input, but the children may shift position during the process. You may be able (knowing your data better than I do) to calculate a good key for each element in a single pass, or you may just need several passes.
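To try the concatenated sort key end to end, here is a minimal Java sketch that runs an identity transform with that key through the JDK's built-in XSLT 1.0 processor. The stylesheet around the key (identity template, strip-space) and the class name `XsltSortSketch` are my assumptions, not taken from the question. As noted above, the key is shallow and runs in a single pass, so deeply nested ties may need several passes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class for illustration only.
final class XsltSortSketch {

    // Identity transform that sorts the children of every node by a key
    // built from the element name, its name attribute, and the name and
    // name attribute of its first child element (in input order).
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
        + "<xsl:strip-space elements='*'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy>"
        + "<xsl:apply-templates select='@*'/>"
        + "<xsl:apply-templates select='node()'>"
        + "<xsl:sort select='concat(name(), @name, name(*[1]), *[1]/@name)'/>"
        + "</xsl:apply-templates>"
        + "</xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String sort(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not transform XML", e);
        }
    }
}
```

Running `XsltSortSketch.sort` on both inputs and diffing the results gives an order-insensitive comparison, as long as the key separates every pair of siblings you care about.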