python parse xml from url elementtree


stackoverflow.com › questions › 32261323 › parsing-a-xml-file-from-a-specific-web-address-using-elementtree-in-python

Parsing a xml file from a specific web address using ElementTree in python - Stack Overflow

datacamp.com › tutorial › python-xml-elementtree

1 of 1

You can use a combination of ElementTree's fromstring() method and the requests module's requests.get() to accomplish this.

https://docs.python.org/2/library/xml.etree.elementtree.html#parsing-xml

fromstring() parses XML from a string directly into an Element, which is the root element of the parsed tree.

Install the requests module:

pip install requests

Use the requests.get() to get your xml file from the url as a string. Pass that into the fromstring() function.

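import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
import requests

# Fetch the feed and parse the response text into the root Element
response = requests.get('http://synd.cricbuzz.com/j2me/1.0/livematches.xml')
root = ET.fromstring(response.text)
for child in root:
    print("%s - %s" % (child.get('srs'), child.get('mchDesc')))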

Results:

None - None
India tour of Sri Lanka, 2015 - Cricbuzz Cup - SL vs IND
Australia tour of Ireland, 2015 - IRE vs AUS
New Zealand tour of South Africa, 2015 - RSA vs NZ
Royal London One-Day Cup, 2015 - SUR vs KENT
Royal London One-Day Cup, 2015 - ESS vs YORKS
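
The "None - None" row above suggests that some children of the root do not carry the srs and mchDesc attributes. A minimal sketch of a more defensive variant follows. It uses urllib from the standard library instead of requests so that it runs without extra packages. The helper names fetch_xml and summarize_matches, the timeout value, and the inline sample document are assumptions made for illustration; only the attribute names come from the answer.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch_xml(url, timeout=10):
    """Download a URL and return the raw response bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def summarize_matches(xml_bytes):
    """Return 'srs - mchDesc' strings for children that carry both attributes."""
    root = ET.fromstring(xml_bytes)  # bytes let the parser honor the declared encoding
    lines = []
    for child in root:
        srs, desc = child.get('srs'), child.get('mchDesc')
        if srs is None or desc is None:
            continue  # skip elements that lack either attribute
        lines.append("%s - %s" % (srs, desc))
    return lines


# Made-up sample standing in for the downloaded feed
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<mchdata>
  <links/>
  <match srs="Series A, 2015" mchDesc="AAA vs BBB"/>
</mchdata>"""

print(summarize_matches(SAMPLE))
```

With a live feed, the call would look like summarize_matches(fetch_xml(url)).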

DataCamp

Python XML Tutorial: Element Tree Parse & Read | DataCamp

December 10, 2024 - The XML file provided describes a basic collection of movies. The only problem is that the data is a mess! There have been many different curators of this collection, and everyone has their own way of entering data into the file. The main goal in this tutorial will be to read and understand the file with Python and then fix the problems. First, you need to read the file with ElementTree.

tree = ET.parse('movies.xml')
root = tree.getroot()

stackoverflow.com › questions › 61667873 › python-parsing-xml-data-with-elementtree

Python - Parsing XML data with ElementTree - Stack Overflow

The code below uses the built-in urllib module to parse XML from a URL:

from urllib.request import urlopen
import xml.etree.ElementTree as ET

def vatbook_parse(url):
    with urlopen(url) as f:
        tree = ET.parse(f)
        root = tree.getroot()

    # CONDITIONALLY SET SEARCH PATH
    path = './/atcs/booking' if tree.find('atc') is None else './/atc'
    for atcs in root.iterfind(path):
        callsign = atcs.find('callsign')
        name = atcs.find('name')
        time_start = atcs.find('time_start')
        time_end = atcs.find('time_end')
        if callsign is not None:
            print(f"{name.text} booked {callsign.text} from {time_start.text} to {time_end.text}")

stackoverflow.com › questions › 53212932 › trying-to-parse-xml-directly-from-a-url

python - Trying to parse XML directly from a URL - Stack Overflow

1 of 1

Inside the response element there's a row element, so your for loop should be in root[0] instead of root

Here's an example from your snippet, hope it helps you understand the issue

import xml.etree.ElementTree as ET
tree = ET.parse('rows.xml')
root = tree.getroot()

for _id in root[0].findall('row'):
    rank = _id.find('ethcty').text
    name = _id.find('cnt').text
    print(name, rank)

Also, the argument to findall should be the name of the node you want.
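
To make the root versus root[0] point concrete, here is a small sketch on a made-up document that mirrors the response > row > row nesting described above. The element names ethcty and cnt come from the answer; the values are invented.

```python
import xml.etree.ElementTree as ET

# Made-up sample mirroring the response > row > row nesting
SAMPLE = """<response>
  <row>
    <row><ethcty>ASIAN</ethcty><cnt>10</cnt></row>
    <row><ethcty>HISPANIC</ethcty><cnt>7</cnt></row>
  </row>
</response>"""

root = ET.fromstring(SAMPLE)

# findall() with a bare tag name only searches direct children
direct = root.findall('row')      # only the outer wrapper element
nested = root[0].findall('row')   # the data rows inside the wrapper

pairs = [(r.find('cnt').text, r.find('ethcty').text) for r in nested]
print(len(direct), len(nested), pairs)
```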

As for loading directly from the url you should use the urllib as follows:

from urllib.request import urlopen
import xml.etree.ElementTree as ET

with urlopen('https://data.cityofnewyork.us/api/views/25th-nujf/rows.xml') as f:
    tree = ET.parse(f)
    root = tree.getroot()

    for _id in root[0].findall('row'):
        rank = _id.find('ethcty').text
        name = _id.find('cnt').text
        print(name, rank)
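
ET.parse() accepts any object with a read() method, which is why the response returned by urlopen() can be passed to it directly. The sketch below shows this offline, using io.BytesIO as a stand-in for the network response; the helper name parse_stream and the sample bytes are assumptions.

```python
import io
import xml.etree.ElementTree as ET


def parse_stream(stream):
    """Parse XML from any file-like object and return the root Element."""
    return ET.parse(stream).getroot()


# BytesIO plays the role of the object returned by urlopen()
fake_response = io.BytesIO(b"<response><row><row><cnt>3</cnt></row></row></response>")
root = parse_stream(fake_response)
print(root.tag, root[0][0].find('cnt').text)
```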

I edited the latter code because I forgot about the loading-from-URL part of your question. Sorry about that.

stackoverflow.com › questions › 21179272 › parsing-a-url-xml-with-the-elementtree-xml-api › 21179544

python - Parsing a URL XML with the ElementTree XML API - Stack Overflow

python-forum.io › thread-14239.html

1 of 1

You can use urllib2 (urllib.request in Python 3) to download and parse the file in the same way. For example, the first few lines would change to:

import xml.etree.ElementTree as ET
from urllib.request import urlopen

for i in range(3):
    # urlopen() returns a file-like object that ET.parse() can read directly
    with urlopen('http://www.trion%i.com:6060/stat.xml' % i) as response:
        tree = ET.parse(response)

    root = tree.getroot()
    print(root.tag, root.attrib)

    # Rest of your code goes here....


Python Forum

XML parsing from URL

Hello I started my trek into Python a few days ago. I am receiving the following error: Quote:Please enter an XML URL to parse: http://py4e-data.dr-chuck.net/comments_42.xml Traceback (most recent call last): File '/home/lamidotijjo/Documents/Pyth...

stackoverflow.com › questions › 47280129 › parsing-xml-using-python-elementtree

Parsing XML using Python ElementTree - Stack Overflow

1 of 3

From ElementTree docs:

We can import this data by reading from a file:

import xml.etree.ElementTree as ET

tree = ET.parse('country_data.xml')
root = tree.getroot()

Or directly from a string:

root = ET.fromstring(country_data_as_string)

and later in the same page, 20.5.1.4. Finding interesting elements:

for neighbor in root.iter('neighbor'):
    print(neighbor.attrib)

Which translates to:

import xml.etree.ElementTree as ET

root = ET.fromstring("""
<root>
<H D="14/11/2017">
<FC>
    <F LV="0">The quick</F>
    <F LV="1">brown</F>
    <F LV="2">fox</F>
</FC>
</H>
<H D="14/11/2017">
<FC>
    <F LV="0">The lazy</F>
    <F LV="1">fox</F>
</FC>
</H>
</root>""")
# root = tree.getroot()
for h in root.iter("H"):
    print (h.attrib["D"])
for f in root.iter("F"):
    print (f.attrib, f.text)

output:

14/11/2017
14/11/2017
{'LV': '0'} The quick
{'LV': '1'} brown
{'LV': '2'} fox
{'LV': '0'} The lazy
{'LV': '1'} fox

2 of 3

You did not specify exactly what you want to use, so I recommend lxml for Python. There are several possibilities for getting the values you want:

With a loop:

from lxml import etree
tree = etree.parse('XmlTest.xml')
root = tree.getroot()
text = []
for element in root:
    text.append(element.get('D', None))
    for child in element:
        for grandchild in child:
            text.append(grandchild.text)
print(text)

Output: ['14/11/2017', 'The quick', 'brown', 'fox', '14/11/2017', 'The lazy', 'fox']

With xpath:

from lxml import etree
tree = etree.parse('XmlTest.xml')
root = tree.getroot() 
D = root.xpath("./H")
F = root.xpath(".//F")

for each in D:
  print(each.get('D',None))

for each in F:
  print(each.text)

Output: 14/11/2017 14/11/2017 The quick brown fox The lazy fox

Both have their own advantages, and either gives you a good starting point. I recommend XPath, since it gives you more freedom when values are missing.
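
If lxml is not available, missing values can also be handled with the standard-library ElementTree, which is a different technique from the XPath approach recommended above. This sketch uses findtext() and get() with defaults; the sample document (with a deliberately incomplete second H element) and the fallback strings are assumptions.

```python
import xml.etree.ElementTree as ET

# Second <H> deliberately lacks the D attribute and any <F> children
SAMPLE = """<root>
<H D="14/11/2017"><FC><F LV="0">The quick</F></FC></H>
<H><FC></FC></H>
</root>"""

root = ET.fromstring(SAMPLE)
rows = []
for h in root.findall('H'):
    date = h.get('D', 'unknown')                  # attribute value, or a fallback
    first = h.findtext('FC/F', default='(none)')  # text of first match, or a fallback
    rows.append((date, first))
print(rows)
```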

stackoverflow.com › questions › 1786476 › parsing-xml-in-python-using-elementtree-example

Parsing XML in Python using ElementTree example - Stack Overflow

gis.stackexchange.com › questions › 21319 › parse-xml-files-in-python-elementtree

1 of 2

So I have ElementTree 1.2.6 on my box now, and ran the following code against the XML chunk you posted:

import xml.etree.ElementTree as ET  # standard-library home of the elementtree package

tree = ET.parse("test.xml")
doc = tree.getroot()
thingy = doc.find('timeSeries')

print(thingy.attrib)

and got the following back:

{'name': 'NWIS Time Series Instantaneous Values'}

It appears to have found the timeSeries element without needing to use numerical indices.

What would be useful now is knowing what you mean when you say "it doesn't work." Since it works for me given the same input, it is unlikely that ElementTree is broken in some obvious way. Update your question with any error messages, backtraces, or anything you can provide to help us help you.

2 of 2

If I understand your question correctly:

for elem in doc.findall('timeSeries/values/value'):
    print(elem.get('dateTime'), elem.text)

or, if you prefer (and if there is only one occurrence of timeSeries/values):

values = doc.find('timeSeries/values')
for value in values:
    print(value.get('dateTime'), value.text)

The findall() method returns a list of all matching elements, whereas find() returns only the first matching element. The first example loops over all the found elements, the second loops over the child elements of the values element, in this case leading to the same result.
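
A short sketch of the find() versus findall() difference on a made-up document. The element names follow the question; the attribute values and readings are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample shaped like the time-series response in the question
SAMPLE = """<timeSeriesResponse>
  <timeSeries name="demo">
    <values>
      <value dateTime="2012-03-01T00:00">1.5</value>
      <value dateTime="2012-03-01T00:15">1.7</value>
    </values>
  </timeSeries>
</timeSeriesResponse>"""

doc = ET.fromstring(SAMPLE)

all_values = doc.findall('timeSeries/values/value')  # list of every match
first_value = doc.find('timeSeries/values/value')    # first match only
missing = doc.find('timeSeries/nothing')             # None when nothing matches

print(len(all_values), first_value.text, missing)
```

Because find() returns None when nothing matches, it is worth checking the result before reading .text or .attrib.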

I don't see where the problem with not finding timeSeries comes from however. Maybe you just forgot the getroot() call? (note that you don't really need it because you can work from the elementtree itself too, if you change the path expression to for example /timeSeriesResponse/timeSeries/values or //timeSeries/values)

Stack Exchange

Parse XML files in Python (ElementTree) - Geographic Information Systems Stack Exchange

edureka.co › blog › python-xml-parser-tutorial

1 of 4

Before I try to answer, a tip. Your exception handler covers up the nature of the problem. Just let the original exception rise up and you'll have more information to share with people who are interested in helping you.

I like to use feedparser to parse Atom feeds. It does indeed give you dict-like objects. I submitted a patch to feedparser 4.1 to parse the GeoRSS elements into GeoJSON style dicts. See https://code.google.com/p/feedparser/issues/detail?id=62 and blog post at http://sgillies.net/blog/566/georss-patch-for-universal-feedparser/. You'd use it like this:

>>> import feedparser
>>> feed = feedparser.parse("http://earthquake.usgs.gov/earthquakes/catalogs/1hour-M1.xml")
>>> feed.entries[0]['where']
{'type': 'Point', 'coordinates': (-122.8282, 38.844700000000003)}

My patched version of 4.1 is in my Dropbox and you can get it using pip.

$ pip install http://dl.dropbox.com/u/10325831/feedparser-4.1-georss.tar.gz

Or just download and install with "python setup.py install".

2 of 4

It's more comfortable to use lxml for XML processing. Here is an example that fetches the feed and prints earthquake titles and coordinates:

import lxml.etree

feed_url = 'http://earthquake.usgs.gov/earthquakes/catalogs/1hour-M1.xml'
ns = {
    'atom': 'http://www.w3.org/2005/Atom',
    'georss': 'http://www.georss.org/georss',
}

def main():
    doc = lxml.etree.parse(feed_url)
    for entry in doc.xpath('//atom:entry', namespaces=ns):
        [title] = entry.xpath('./atom:title', namespaces=ns)
        [point] = entry.xpath('./georss:point', namespaces=ns)
        print(point.text, title.text)

if __name__ == '__main__':
    main()
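
The same namespace mapping also works with the standard-library ElementTree, which may be enough when lxml is not installed. This is a sketch on an inline, made-up Atom entry rather than the live feed; the title and coordinates are invented.

```python
import xml.etree.ElementTree as ET

ns = {
    'atom': 'http://www.w3.org/2005/Atom',
    'georss': 'http://www.georss.org/georss',
}

# Made-up feed fragment with a default namespace and a prefixed one
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <title>M 1.2, Northern California</title>
    <georss:point>38.8447 -122.8282</georss:point>
  </entry>
</feed>"""

root = ET.fromstring(SAMPLE)
results = []
for entry in root.findall('atom:entry', ns):
    title = entry.findtext('atom:title', namespaces=ns)
    point = entry.findtext('georss:point', namespaces=ns)
    results.append((point, title))
print(results)
```

Without the ns mapping, findall('entry') would return nothing here, because every element in the feed belongs to the Atom namespace.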

Edureka

Python XML Parser Tutorial | ElementTree and Minidom Parsing | Edureka

December 5, 2024 - In this Python XML Parser Tutorial, you will learn how to parse, read, modify and find elements from XML files in Python using ElementTree and Minidom.

stackoverflow.com › questions › 647071 › python-xml-elementtree-from-a-string-source

Python xml ElementTree from a string source? - Stack Overflow

github.com › luminati-io › parsing-xml-with-python

1 of 4

361

You can parse the text as a string, which creates an Element, and create an ElementTree using that Element.

import xml.etree.ElementTree as ET
tree = ET.ElementTree(ET.fromstring(xmlstring))

I just came across this issue and the documentation, while complete, is not very straightforward on the difference in usage between the parse() and fromstring() methods.

2 of 4

117

If you're using xml.etree.ElementTree.parse to parse from a file, then you can use xml.etree.ElementTree.fromstring to get the root Element of the document. Often you don't actually need an ElementTree.
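
The difference between the two entry points can be checked directly. In this sketch the sample string is an assumption, and io.StringIO stands in for a file on disk.

```python
import io
import xml.etree.ElementTree as ET

xmlstring = "<data><item>1</item></data>"

root = ET.fromstring(xmlstring)          # returns the root Element
tree = ET.parse(io.StringIO(xmlstring))  # returns an ElementTree wrapper
wrapped = ET.ElementTree(root)           # wrap an Element when a tree is needed

print(type(root).__name__, type(tree).__name__)
print(tree.getroot().tag, wrapped.getroot() is root)
```

An ElementTree is mainly needed for whole-document operations such as write(); for searching and reading values, the root Element is usually sufficient.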

See xml.etree.ElementTree

GitHub

GitHub - luminati-io/parsing-xml-with-python: Parse XML in Python using ElementTree, lxml, SAX, and more for efficient data processing and structured data integration. · GitHub

Since it’s part of Python’s standard library, there’s no need for any additional installation. For instance, you can use the findall() method to retrieve all url elements from the root and print the text content of each loc element, like so:
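
The snippet itself is not reproduced in this excerpt, so the following is only a sketch of what such a lookup might look like, not the repository's code. It assumes a standard sitemap document, which declares the sitemaps.org namespace; the sample URLs are invented.

```python
import xml.etree.ElementTree as ET

# Made-up sitemap using the standard sitemaps.org namespace
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

ns = {'sm': 'http://www.sitemaps.org/schemas/sitemap/0.9'}
root = ET.fromstring(SITEMAP)

# Collect the text of each <loc> inside every <url> element
locs = [url.findtext('sm:loc', namespaces=ns) for url in root.findall('sm:url', ns)]
for loc in locs:
    print(loc)
```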

Author luminati-io

reddit.com › r › learnpython › comments › 412rd3 › reading_xml_with_python3

r/learnpython - reading xml with python3

January 15, 2016 -

I am trying to parse XML from a URL with Python 3, but I always end up with:

xml.etree.ElementTree.ParseError: not well-formed (invalid token):

the code looks like this:

import requests
import urllib
from urllib.request import urlopen
import xml.etree.ElementTree as etree

response = urllib.request.urlopen("http://regnskaber.virk.dk/32673592/eGJybHN0b3JlOi8vWC1GNzY5MUY0Ny0yMDE0MDMyOV8xMzQxNThfMTc5L3hicmw.xml")
tree = etree.parse(response)
root = tree.getroot()

What am I missing?

gist.github.com › MichelleDalalJian › f587530b6e0a72357541f39b2022aa55

1 of 2

The xml is gzip compressed - requests handles this automatically for you which you could use instead of urllib.

response = requests.get(url)
tree = etree.fromstring(response.content)

http://stackoverflow.com/a/26435241 discusses solutions for doing it with urllib

2 of 2

As mentioned, the XML is compressed. You could switch to Requests, which decompresses gzip-encoded responses automatically. With urllib you have to do that step yourself: read the bytes with response.read(), decompress them (for example with gzip.decompress()), and pass the result to etree.fromstring(). Note that etree.parse() expects a filename or a file object, not the data itself. fromstring() accepts bytes as well as str, so an explicit .decode('utf-8') is usually unnecessary.
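
A minimal offline sketch of handling a gzip-compressed body with the standard library alone, assuming the downloaded bytes arrive still compressed. The helper name parse_possibly_gzipped and the sample payload are assumptions; the check relies on the two-byte gzip magic number.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_possibly_gzipped(data):
    """Decompress gzip data if needed, then parse it into a root Element."""
    if data[:2] == b'\x1f\x8b':  # gzip streams start with these two bytes
        data = gzip.decompress(data)
    return ET.fromstring(data)


# Made-up payload, once plain and once compressed
raw = b"<report><profit>42</profit></report>"
compressed = gzip.compress(raw)

print(parse_possibly_gzipped(raw).findtext('profit'))
print(parse_possibly_gzipped(compressed).findtext('profit'))
```

With urllib, the input would be the result of response.read().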

GitHub

Extracting Data from XML: The program will prompt for a URL, read the XML data from that URL using urllib and then parse and extract the comment counts from the XML data, compute the sum of the numbers in the file. · GitHub

import urllib.request, urllib.parse, urllib.error
import ssl
import xml.etree.ElementTree as ET

# Ignore SSL certificate errors
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

Value = input('Enter location: ')
print('Retrieving', Value)
uh = urllib.request.urlopen(Value, context=ctx)
data = uh.read()
data = data.decode()
tree = ET.fromstring(data)
counts = tree.findall('.//count')
print('Retrieved', len(data), 'characters')

counter = 0
total = 0
for elements in counts:
    counter += 1
    # Each item is already a <count> element; calling find('count') on it
    # returns None, hence "'NoneType' object has no attribute 'text'"
    total = int(elements.text) + total
print(counter)
print(total)

GeeksforGeeks

geeksforgeeks.org › xml-parsing-python

XML parsing in Python - GeeksforGeeks

June 28, 2022 - In this article, we will discuss how to scrape paragraphs from HTML using Beautiful Soup. Method 1: using bs4 and urllib. Module Needed: bs4: Beautiful Soup (bs4) is a Python library for pulling data out of HTML and XML files.