You could try xmllint
The xmllint program parses one or more XML files, specified on the command line as xmlfile. It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.
It lets you select elements in the XML document by XPath, using the --pattern option (which accepts a subset of XPath) or, in newer versions, the --xpath option.
On Mac OS X (Yosemite), it is installed by default.
On Ubuntu, if it is not already installed, you can run apt-get install libxml2-utils
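To make the "detecting errors in XML code" part concrete, here is a minimal sketch of a well-formedness check. It assumes xmllint is on the PATH; the file names and the check_xml helper are made up for the illustration.

```shell
#!/bin/bash
# Sketch: check well-formedness with xmllint --noout.
# good.xml, bad.xml and check_xml are hypothetical names for this example.
workdir=$(mktemp -d)
printf '%s\n' '<note><to>Ann</to></note>' > "$workdir/good.xml"
printf '%s\n' '<note><to>Ann</note>'      > "$workdir/bad.xml"   # mismatched tag

check_xml() {
    # --noout suppresses the serialized document, so only the exit status
    # (and any parse errors on stderr) tell us whether the file parsed.
    if xmllint --noout "$1" 2>/dev/null; then
        echo "ok"
    else
        echo "malformed"
    fi
}

good_result=$(check_xml "$workdir/good.xml")
bad_result=$(check_xml "$workdir/bad.xml")
echo "good.xml: $good_result"
echo "bad.xml: $bad_result"
rm -rf "$workdir"
```

Drop the 2>/dev/null if you want to see the parser's error messages with line numbers.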
Here's a fully working example.
If it's only extracting email addresses you could do something like:
Suppose the XML file spam.xml looks like this:
<spam>
  <victims>
    <victim>
      <name>The Pope</name>
      <email>[email protected]</email>
      <is_satan>0</is_satan>
    </victim>
    <victim>
      <name>George Bush</name>
      <email>[email protected]</email>
      <is_satan>1</is_satan>
    </victim>
    <victim>
      <name>George Bush Jr</name>
      <email>[email protected]</email>
      <is_satan>0</is_satan>
    </victim>
  </victims>
</spam>
You can get the emails and process them with this short bash code:
#!/bin/bash
emails=($(grep -oP '(?<=email>)[^<]+' "/my_path/spam.xml"))
for i in ${!emails[*]}
do
  echo "$i" "${emails[$i]}"
  # instead of echo use the values to send emails, etc
done
The result of this example is:
0 [email protected]
1 [email protected]
2 [email protected]
Important note:
Don't use this for serious matters. This is OK for playing around, getting quick results, learning to grep, etc. but you should definitely look for, learn and use an XML parser for production (see Micha's comment below).
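For comparison, here is a rough sketch of the same loop done with a real parser instead of grep. It assumes an xmllint new enough to have --xpath; the email addresses are placeholders invented for the example, not the ones from the original file.

```shell
#!/bin/bash
# Sketch: collect the email values with xmllint instead of grep.
# The addresses below are placeholders, not taken from the original spam.xml.
xml_file=$(mktemp)
cat > "$xml_file" <<'EOF'
<spam>
  <victims>
    <victim><name>The Pope</name><email>pope@example.com</email></victim>
    <victim><name>George Bush</name><email>bush@example.com</email></victim>
  </victims>
</spam>
EOF

# Ask the parser how many <victim> elements exist, then read each email by
# position. string() returns plain text, so no tag stripping is needed.
count=$(xmllint --xpath 'count(/spam/victims/victim)' "$xml_file")
emails=()
for ((i = 1; i <= count; i++)); do
    emails+=("$(xmllint --xpath "string(/spam/victims/victim[$i]/email)" "$xml_file")")
done

for i in "${!emails[@]}"; do
    echo "$i ${emails[$i]}"
done
rm -f "$xml_file"
```

Reading one node per call is slower than a single grep, but it keeps working when the XML is reformatted, since it does not depend on how the tags are laid out on lines.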
Using bash and xmllint (as given by the tags):
xmllint --version # xmllint: using libxml version 20703
# Note: Newer versions of libxml / xmllint have a --xpath option which
# makes it possible to use xpath expressions directly as arguments.
# --xpath also enables precise output in contrast to the --shell & sed approaches below.
#xmllint --help 2>&1 | grep -i 'xpath'
{
# the given XML is in file.xml
host="$(echo "cat /config/global/resources/default_setup/connection/host/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
username="$(echo "cat /config/global/resources/default_setup/connection/username/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
password="$(echo "cat /config/global/resources/default_setup/connection/password/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
dbname="$(echo "cat /config/global/resources/default_setup/connection/dbname/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
}
# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb
In case there is just an XML string and the use of a temporary file is to be avoided, file descriptors are the way to go with xmllint (which is given /dev/fd/3 as a file argument here):
set +H
{
xmlstr='<?xml version="1.0"?>
<config>
<global>
<install>
<date><![CDATA[Tue, 11 Dec 2012 12:31:25 +0000]]></date>
</install>
<crypt>
<key><![CDATA[70e75d7969b900b696785f2f81ecb430]]></key>
</crypt>
<disable_local_modules>false</disable_local_modules>
<resources>
<db>
<table_prefix><![CDATA[]]></table_prefix>
</db>
<default_setup>
<connection>
<host><![CDATA[localhost]]></host>
<username><![CDATA[root]]></username>
<password><![CDATA[pass123]]></password>
<dbname><![CDATA[testdb]]></dbname>
<initStatements><![CDATA[SET NAMES utf8]]></initStatements>
<model><![CDATA[mysql4]]></model>
<type><![CDATA[pdo_mysql]]></type>
<pdoType><![CDATA[]]></pdoType>
<active>1</active>
</connection>
</default_setup>
</resources>
<session_save><![CDATA[files]]></session_save>
</global>
<admin>
<routers>
<adminhtml>
<args>
<frontName><![CDATA[admin]]></frontName>
</args>
</adminhtml>
</routers>
</admin>
</config>
'
# exec issue
#exec 3<&- 3<<<"$xmlstr"
#exec 3<&- 3< <(printf '%s' "$xmlstr")
exec 3<&- 3<<EOF
$(printf '%s' "$xmlstr")
EOF
{ read -r host; read -r username; read -r password; read -r dbname; } < <(
echo "cat /config/global/resources/default_setup/connection/*[self::host or self::username or self::password or self::dbname]/text()" |
xmllint --nocdata --shell /dev/fd/3 |
sed -e '1d; $d; /^ -*$/d'
)
printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
exec 3<&-
}
set -H
# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb
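If your xmllint has the --xpath option, a simpler way to avoid the temporary file is to pass - as the file name so the XML string is read from standard input. A minimal sketch, with a shortened version of the XML string and a helper variable conn that is only there for the example:

```shell
#!/bin/bash
# Sketch: feed an XML string to xmllint on stdin ("-" as the file argument).
# xmlstr is a shortened stand-in for the full config shown above.
xmlstr='<config><global><resources><default_setup><connection>
<host><![CDATA[localhost]]></host>
<dbname><![CDATA[testdb]]></dbname>
</connection></default_setup></resources></global></config>'

# conn is a hypothetical helper variable holding the common XPath prefix.
conn=/config/global/resources/default_setup/connection

# string() yields the element's text content, CDATA included.
host=$(printf '%s' "$xmlstr" | xmllint --xpath "string($conn/host)" -)
dbname=$(printf '%s' "$xmlstr" | xmllint --xpath "string($conn/dbname)" -)

printf '%s\n' "host: $host" "dbname: $dbname"
```

This parses the string once per value, which is fine for a handful of settings.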
Using xmllint and the --xpath option, it is very easy. You can simply do this:
XML_FILE=/path/to/file.xml
HOST=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/host)' "$XML_FILE")
USERNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/username)' "$XML_FILE")
PASSWORD=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/password)' "$XML_FILE")
DBNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/dbname)' "$XML_FILE")
If you need to get to an element's attribute, that's also easy using XPath. Imagine you have the file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<addon id="screensaver.turnoff"
name="Turn Off"
version="0.10.0"
provider-name="Dag Wieërs">
..snip..
</addon>
The needed shell statements would be:
VERSION=$(xmllint --xpath 'string(/addon/@version)' $ADDON_XML)
AUTHOR=$(xmllint --xpath 'string(/addon/@provider-name)' $ADDON_XML)
@echo off
set "xml_file=test.xml"
set /p search_for=Enter name:
for /f "skip=2 tokens=3,9 delims=;= " %%a in ('find """%search_for%""" "%xml_file%"') do (
set "name=%%~a"
set "pass=%%b"
)
echo name : %name%
echo pass : %pass%
This works if all connectionStrings are on separate lines and each string is on a single line. Change the location of xml_file as needed.
You can also try xpath.bat (the better option, in my opinion), a small script that lets you get XML values by XPath expression without using external binaries:
call xpath.bat connection.xml "//add[@name = 'name1']/@connectionString"
Since you indicated (in comments on the question) that PowerShell is also okay, put the following code in a script file (let's say Foo.ps1):
param
(
[Parameter(Mandatory=$true)]
[string] $ConfigFilePath,
[Parameter(Mandatory=$true)]
[string] $Name
)
([xml](Get-Content -Path $ConfigFilePath)).connectionStrings.add |
Where-Object {$_.name -eq $name} |
ForEach-Object {($_.connectionString -split ' ')[1] -split ';'}
and then run the script with parameters to get the output.
Peter's answer is correct, but it outputs a trailing line feed.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="root">
<xsl:for-each select="myel">
<xsl:value-of select="@name"/>
<xsl:text>,</xsl:text>
<xsl:if test="not(position() = last())">
<xsl:text>
</xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Just run e.g.
xsltproc stylesheet.xsl source.xml
to generate the CSV results into standard output.
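Here is an end-to-end sketch of that run. The original source.xml is not shown, so the input below (a root element with two myel children) is an assumption, and the stylesheet is a copy of the one above with the line feed written as &#10;.

```shell
#!/bin/bash
# Sketch: run xsltproc on a made-up source.xml. The input document is an
# assumption; only the element and attribute names come from the stylesheet.
workdir=$(mktemp -d)

cat > "$workdir/source.xml" <<'EOF'
<root>
  <myel name="alpha"/>
  <myel name="beta"/>
</root>
EOF

cat > "$workdir/stylesheet.xsl" <<'EOF'
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>
  <xsl:template match="root">
    <xsl:for-each select="myel">
      <xsl:value-of select="@name"/>
      <xsl:text>,</xsl:text>
      <!-- emit a line feed between rows, but not after the last one -->
      <xsl:if test="not(position() = last())">
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

csv=$(xsltproc "$workdir/stylesheet.xsl" "$workdir/source.xml")
printf '%s\n' "$csv"
rm -rf "$workdir"
```

Each row ends with a comma because the stylesheet writes one after every value; move the xsl:text inside a position() test if you want to drop it.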
Use a command-line XSLT processor such as xsltproc, saxon or xalan to parse the XML and generate CSV. For your case, the stylesheet would be:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="root">
<xsl:apply-templates select="myel"/>
</xsl:template>
<xsl:template match="myel">
<xsl:for-each select="@*">
<xsl:value-of select="."/>
<xsl:value-of select="','"/>
</xsl:for-each>
<xsl:text>&#10;</xsl:text>
</xsl:template>
</xsl:stylesheet>
I'm looking for something similar to jq, but for processing XML and other similar formats. I know about yq, but I'd prefer something for XML that isn't a wrapper and doesn't change the output to JSON.
The following Linux command uses XPath to access specified values within the XML files:
for xml in `find . -name "*.xml"`
do
echo $xml `xmllint --xpath "/param-value/value/text()" $xml`| awk 'NF>1'
done
Example output for matching XML files:
./test1.xml asdf
./test4.xml 1234
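The backtick loop above splits file names on whitespace. A variant that copes with spaces in names could look like the sketch below. It assumes bash, a find that supports -print0, and an xmllint with --xpath; the sample files are invented for the demonstration.

```shell
#!/bin/bash
# Sketch: whitespace-safe version of the find/xmllint loop.
# The two sample files are made up for the example.
workdir=$(mktemp -d)
printf '%s\n' '<param-value><value>asdf</value></param-value>' > "$workdir/test 1.xml"
printf '%s\n' '<other/>' > "$workdir/test2.xml"

results=()
# NUL-delimited names survive spaces and newlines in paths.
while IFS= read -r -d '' xml; do
    # string() returns an empty string when the node is absent,
    # so non-matching files are simply skipped.
    value=$(xmllint --xpath 'string(/param-value/value)' "$xml" 2>/dev/null)
    if [ -n "$value" ]; then
        results+=("$xml $value")
    fi
done < <(find "$workdir" -name '*.xml' -print0)

printf '%s\n' "${results[@]}"
rm -rf "$workdir"
```

Using string() instead of text() also removes the need for the awk 'NF>1' filter, since files without the node produce no output line at all.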
I worked out a couple of solutions using basic perl/awk functionality (basically a poor man's parsing of the tags). If you see any improvements using only basic perl/awk functionality, let me know. I avoided dealing with multiline regular expressions by setting a flag when I see a particular tag. Kind of clumsy, but it works.
perl:
perl -ne '$h = 1 if m/Host/; $r = 1 if m/Role/; if ($h && m/<value>/) { $h = 0; print "hosts: ", $_ =~ /<value>(.*)</, "\n"}; if ($r && m/<value>/) { $r = 0; print "\nrole: ", $_ =~ /<value>(.*)</, "\n" }'
awk:
awk '/Host/ {h = 1} /Role/ {r = 1} h && /<value>/ {h = 0; match($0, "<value>(.*)<", a); print "hosts: " a[1]} r && /<value>/ {r = 0; match($0, "<value>(.*)<", a); print "\nrole: " a[1]}'
I've been using jq (https://stedolan.github.io/jq/) for a long time now to parse and reconstruct JSON from the command line (Unix). It's a great tool for searching within JSON, extracting parts, and constructing JSON.
I'm usually doing stuff like curl http://server/endpoit.json | jq to get a pretty representation of the JSON response, for example.
Lately I've been working with XML and I miss the power of jq.
What command-line tools do you use for XML parsing and filtering?