Here's how to ignore text differences in files in the Folder Compare:
- Load two folders in Beyond Compare's Folder Compare.
- Double click to view a pair of XML files in the Text Compare.
- Click the Rules toolbar button (referee icon).
- Click Edit Grammar.
- Click New.
- Name it MyElement.
- Select Delimited as the category.
- Text from: <myElement> to: </myElement>.
- Click OK.
- Click OK.
- Uncheck myElement to make it unimportant.
- Change the dropdown at the bottom of the dialog from "Use for this view only" to either "Use for all files within parent session" or "Update session defaults".
- Close the Text Compare tab.
- In the Folder Compare, click the Rules toolbar button (referee icon).
- Check Compare Contents and select Rules-based comparison.
- Click OK.
- Make sure View > Ignore Unimportant Differences is turned on.
The default settings in the Folder Compare use file size and modified date for comparison. Rules-based comparison uses the same content comparison method as double clicking to view file contents.
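Outside Beyond Compare, the same idea (treating everything between <myElement> and </myElement> as unimportant) can be approximated with a short script. The following is only a sketch in Python 3.8+, not part of the original answer: the function name equal_ignoring is made up for illustration, and it relies on the exclude_tags option of the standard library's xml.etree.ElementTree.canonicalize to drop the ignored element before comparing.

```python
from xml.etree.ElementTree import canonicalize


def equal_ignoring(xml_a, xml_b, ignored_tag="myElement"):
    """Return True if two XML strings match once `ignored_tag` elements are dropped.

    Hypothetical helper for illustration. Both documents are canonicalized
    (attributes sorted, quoting normalized), surrounding whitespace in text
    is stripped, and every element named `ignored_tag` is left out together
    with its content.
    """
    options = {"strip_text": True, "exclude_tags": {ignored_tag}}
    return canonicalize(xml_a, **options) == canonicalize(xml_b, **options)


if __name__ == "__main__":
    left = "<doc><name>x</name><myElement>2020-01-01</myElement></doc>"
    right = "<doc><name>x</name><myElement>2024-06-30</myElement></doc>"
    # The documents differ only inside <myElement>, so they count as equal.
    print(equal_ignoring(left, right))
```

This compares whole documents only. It does not reproduce Beyond Compare's side-by-side view, where unimportant differences are still shown but do not affect the comparison result.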
See also the article Define Unimportant Text in Beyond Compare. It describes ignoring differences when viewing a pair of files in the Text Compare.
Answer from Chris Kennedy on Stack Overflow
One approach would be to first turn both XML files into Canonical XML, and compare the results using diff. For example, xmllint can be used to canonicalize XML.
$ xmllint --c14n one.xml > 1.xml
$ xmllint --c14n two.xml > 2.xml
$ diff 1.xml 2.xml
Or as a one-liner, using process substitution:
$ diff <(xmllint --c14n one.xml) <(xmllint --c14n two.xml)
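Where xmllint is not available, the canonicalize-then-diff approach can be sketched with Python's standard library. This is an illustration under assumptions rather than a drop-in replacement: it requires Python 3.8+, the function name canonical_diff is hypothetical, and xml.etree.ElementTree.canonicalize produces C14N 2.0 output, which may differ in detail from what xmllint --c14n writes.

```python
import difflib
from xml.etree.ElementTree import canonicalize


def canonical_diff(xml_a, xml_b):
    """Return unified-diff lines between the canonical forms of two XML strings.

    Canonicalization sorts attributes and normalizes quoting and empty-element
    syntax, so differences that are purely cosmetic disappear before diffing.
    """
    canon_a = canonicalize(xml_a).splitlines()
    canon_b = canonicalize(xml_b).splitlines()
    return list(difflib.unified_diff(canon_a, canon_b, "one.xml", "two.xml", lineterm=""))


if __name__ == "__main__":
    one = '<root><item b="2" a="1">x</item></root>'
    two = "<root><item a='1' b='2'>x</item></root>"
    # Only attribute order and quote style differ, so no diff lines are printed.
    for line in canonical_diff(one, two):
        print(line)
```

To compare files instead of strings, canonicalize also accepts a from_file argument. Line breaks and indentation inside the documents are preserved, so files that are laid out differently will still show whitespace differences.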
Jukka's answer did not work for me, but it did point me to Canonical XML. Neither --c14n nor --c14n11 sorted the attributes, but I found that the --exc-c14n switch did. --exc-c14n is not listed in the man page, but the command-line help describes it as "W3C exclusive canonical format".
$ xmllint --exc-c14n one.xml > 1.xml
$ xmllint --exc-c14n two.xml > 2.xml
$ diff 1.xml 2.xml
$ xmllint | grep c14
--c14n : save in W3C canonical format v1.0 (with comments)
--c14n11 : save in W3C canonical format v1.1 (with comments)
--exc-c14n : save in W3C exclusive canonical format (with comments)
$ rpm -qf /usr/bin/xmllint
libxml2-2.7.6-14.el6.x86_64
libxml2-2.7.6-14.el6.i686
$ cat /etc/system-release
CentOS release 6.5 (Final)
Warning: --exc-c14n strips out the XML header, whereas --c14n prepends the XML header if it is not already there.
I've got a few XML files, each a couple thousand lines long, that I need to compare to see which values in file A are missing from file B.
Does anyone have an efficient way to diff these? They're not formatted the same at all, so normal text diff programs won't work. I just need to find which "<value203>...</value203>" elements aren't in the other file.
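One possible way to approach this, sketched here under assumptions, is to ignore formatting entirely: parse both files, collect each leaf element as a (tag, text) pair, and take the set difference. The function names leaf_values and missing_from_b are invented for this example, and it assumes the values of interest are leaf elements such as <value203> whose text can be compared after trimming whitespace.

```python
import xml.etree.ElementTree as ET


def leaf_values(xml_text):
    """Collect (tag, text) pairs for every element that has no child elements."""
    root = ET.fromstring(xml_text)
    values = set()
    for elem in root.iter():
        if len(elem) == 0:  # no children, so this is a leaf element
            values.add((elem.tag, (elem.text or "").strip()))
    return values


def missing_from_b(xml_a, xml_b):
    """Return the sorted leaf values that occur in A but not in B."""
    return sorted(leaf_values(xml_a) - leaf_values(xml_b))


if __name__ == "__main__":
    file_a = "<r><value203>x</value203><value7>y</value7></r>"
    file_b = "<r>\n  <value7> y </value7>\n</r>"
    # value7 matches despite the different layout; value203 is reported.
    for tag, text in missing_from_b(file_a, file_b):
        print(tag, text)
```

Because sets are used, position and duplicates are ignored: an element counts as present in B if the same tag and text occur anywhere in it. For real files, ET.parse(path).getroot() can replace ET.fromstring.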