libxml2 has a number of advantages:

  1. Compliance to the spec
  2. Active development and a community participation
  3. Speed. This is really a python wrapper around a C implementation.
  4. Ubiquity. The libxml2 library is pervasive and thus well tested.

Downsides include:

  1. Compliance to the spec. It's strict. Things like default namespace handling are easier in other libraries.
  2. Use of native code. This can be a pain depending on your how your application is distributed / deployed. RPMs are available that ease some of this pain.
  3. Manual resource handling. Note in the sample below the calls to freeDoc() and xpathFreeContext(). This is not very Pythonic.

If you are doing simple path selection, stick with ElementTree ( which is included in Python 2.5 ). If you need full spec compliance or raw speed and can cope with the distribution of native code, go with libxml2.

Sample of libxml2 XPath Use


import libxml2

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print "xpath query: wrong node set size"
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print "xpath query: wrong node set value"
    sys.exit(1)
doc.freeDoc()
ctxt.xpathFreeContext()

Sample of ElementTree XPath Use


from elementtree.ElementTree import ElementTree
mydoc = ElementTree(file='tst.xml')
for e in mydoc.findall('/foo/bar'):
    print e.get('title').text

Answer from Ryan Cox on Stack Overflow
๐ŸŒ
Python
docs.python.org โ€บ 3 โ€บ library โ€บ xml.etree.elementtree.html
xml.etree.ElementTree โ€” The ElementTree XML API
January 29, 2026 - More sophisticated specification of which elements to look for is possible by using XPath. ElementTree provides a simple way to build XML documents and write them to files.
Top answer
1 of 11
137

libxml2 has a number of advantages:

  1. Compliance to the spec
  2. Active development and a community participation
  3. Speed. This is really a python wrapper around a C implementation.
  4. Ubiquity. The libxml2 library is pervasive and thus well tested.

Downsides include:

  1. Compliance to the spec. It's strict. Things like default namespace handling are easier in other libraries.
  2. Use of native code. This can be a pain depending on your how your application is distributed / deployed. RPMs are available that ease some of this pain.
  3. Manual resource handling. Note in the sample below the calls to freeDoc() and xpathFreeContext(). This is not very Pythonic.

If you are doing simple path selection, stick with ElementTree ( which is included in Python 2.5 ). If you need full spec compliance or raw speed and can cope with the distribution of native code, go with libxml2.

Sample of libxml2 XPath Use


import libxml2

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print "xpath query: wrong node set size"
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print "xpath query: wrong node set value"
    sys.exit(1)
doc.freeDoc()
ctxt.xpathFreeContext()

Sample of ElementTree XPath Use


from elementtree.ElementTree import ElementTree
mydoc = ElementTree(file='tst.xml')
for e in mydoc.findall('/foo/bar'):
    print e.get('title').text

2 of 11
87

The lxml package supports xpath. It seems to work pretty well, although I've had some trouble with the self:: axis. There's also Amara, but I haven't used it personally.

๐ŸŒ
DataCamp
datacamp.com โ€บ tutorial โ€บ python-xml-elementtree
Python XML Tutorial: Element Tree Parse & Read | DataCamp
December 10, 2024 - XPath expressions are query languages for selecting nodes from an XML document. They are useful for navigating through elements and attributes in XML documents, allowing for precise data retrieval and manipulation. XML documents can be validated using an XML Schema Definition (XSD) or a Document ...
๐ŸŒ
lxml
lxml.de โ€บ xpathxslt.html
XPath and XSLT with lxml
The prefixes you choose here are ... the XML document. The document may define whatever prefixes it likes, including the empty prefix, without breaking the above code. Note that XPath does not have a notion of a default namespace. The empty prefix is therefore undefined for XPath and cannot be used in namespace prefix mappings. There is also an optional extensions argument which is used to define custom extension functions in Python that are local ...
๐ŸŒ
W3Schools
w3schools.com โ€บ xml โ€บ xml_xpath.asp
XML and XPath
XPath expressions can be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.
๐ŸŒ
Rip Tutorial
riptutorial.com โ€บ searching the xml with xpath
Python Language Tutorial => Searching the XML with XPath
Starting with version 2.7 ElementTree has a better support for XPath queries. XPath is a syntax to enable you to navigate through an xml like SQL is used to search through a database. Both find and findall functions support ...
๐ŸŒ
SomeBeans
ianhopkinson.org.uk โ€บ 2015 โ€บ 11 โ€บ parsing-xml-and-html-using-xpath-and-lxml-in-python
Parsing XML and HTML using xpath and lxml in Python โ€“ SomeBeans
January 1, 2018 - Xpath queries are designed to extract a set of elements or attributes from an XML/HTML document by the name of the element, the value of an attribute on an element, by the relationship an element has with another element or by the content of an element. Quite often xpath will return elements or lists of elements which, when printed in Python, donโ€™t show you the content you want to see.
๐ŸŒ
GeeksforGeeks
geeksforgeeks.org โ€บ python โ€บ web-scraping-using-lxml-and-xpath-in-python
Web Scraping using lxml and XPath in Python - GeeksforGeeks
July 23, 2025 - We create the correct XPath query and use the lxml xpath function to get the required element. ... Below is a program based on the above approach which uses a particular URL. ... # Import required modules from lxml import html import requests # Request the page page = requests.get('http://econpy.pythonanywhere.com/ex/001.html') # Parsing the page # (We need to use page.content rather than # page.text because html.fromstring implicitly # expects bytes as input.) tree = html.fromstring(page.content) # Get element using XPath buyers = tree.xpath('//div[@title="buyer-name"]/text()') print(buyers)
Find elsewhere
๐ŸŒ
W3Schools
w3schools.com โ€บ xml โ€บ xpath_intro.asp
XPath Tutorial
Today XPath expressions can also be used in JavaScript, Java, XML Schema, PHP, Python, C and C++, and lots of other languages.
๐ŸŒ
GitHub
gist.github.com โ€บ IanHopkinson โ€บ ad45831a2fb73f537a79
Examples of xpath queries using lxml in python ยท GitHub
Experimenting with lxml.etree, I found that the default, unnamed namespace in the XML is available in the tree's data in nsmap[None]. See my lxml-test-etree.py, line 11โ€ฆ ... I found that naming it _ makes it convenient to refer to it in the XPath statement, as on line 23โ€ฆ
๐ŸŒ
ScrapingDog
scrapingdog.com โ€บ blog โ€บ web-scraping-with-xpath-and-python
A Comprehensive Guide on Web Scraping with XPath (Python)
September 12, 2024 - We will use lxml library to create ... support Xpath. It is a third-party library that can help you to pass HTML documents or any kind of XML document and then you can search any node in it using the Xpath syntax....
๐ŸŒ
GitHub
github.com โ€บ VolkanSah โ€บ Python-XPath-Tutorial
GitHub - VolkanSah/Python-XPath-Tutorial: XPath is a query language used for selecting nodes in an XML or HTML document. Python supports XPath queries through various libraries such as BeautifulSoup, lxml, and more. In this tutorial, we will use BeautifulSoup to demonstrate how XPath works with Python. ยท GitHub
# Select all elements with the class "header" headers = tree.xpath('//*[contains(@class, "header")]') # Select the first element with the id "title" title = tree.xpath('//*[@id="title"][1]') # Select all paragraphs inside elements with the class "main" paragraphs = tree.xpath('//div[contains(@class, "main")]//p')
Author ย  VolkanSah
๐ŸŒ
Zyte
zyte.com โ€บ learn โ€บ a-practical-guide-to-xml-parsing-with-python
A Practical Guide to Python XML Parsing
XML is hierarchical, meaning each element can have children, attributes, and text content. You can easily access the child elements of any node in the tree: ... This loop goes through the children of the root node, printing their tags (element names) and text (element content). For more deeply nested elements, you can chain multiple loops or use XPath-like syntax:
๐ŸŒ
Java Code Geeks
examples.javacodegeeks.com โ€บ home โ€บ web development โ€บ python
How to use XPath in Python - Java Code Geeks
September 30, 2021 - Learn how to use the XPath with XSLT in Python to parse and transform an XML File. XPath stands for XML Path Language.
๐ŸŒ
Oxylabs
oxylabs.io โ€บ home โ€บ resources โ€บ web scraping faq โ€บ python โ€บ xpath examples
How to Use XPath in Python?
Use the contains() function to match elements based on partial attribute values or text, making your XPath queries more flexible and robust.
๐ŸŒ
Martiandefense
book.martiandefense.llc โ€บ notes โ€บ coding-programming โ€บ python โ€บ xml-basics-with-python
XML Basics with Python | Martian Defense NoteBook
This ensures there is no name collision between the two table elements. XPath and XQuery are languages used to select nodes from an XML document. XPath stands for XML Path Language, while XQuery stands for XML Query Language.