The server employs User-Agent sniffing; it looks at the User-Agent header and if it doesn't like what it sees it'll return an empty response.
You can set the header yourself:
Copyimport urllib2
import shutil
headers = {'User-Agent': 'Mozilla'}
urlfile = "http://www.equibase.com/premium/eqbLateChangeXMLDownload.cfm"
request = urllib2.Request(urlfile, headers=headers)
response = urllib2.urlopen(request)
with open("c:\\test.xml", 'wb') as outfile:
shutil.copyfileobj(response, outfile)
The 'Mozilla' User-Agent string is apparently enough to convince the server to serve the file.
I used a combination of urllib2 (an updated version of the urllib library) and shutil.copyfileobj() to handle setting additional headers, then copying the response data to a file. urllib.urlretrieve() does not support adding headers, and urllib2 has no urlretrieve() equivalent.
Building a URL that downloads an XML File
How to download from the XML url using python - Stack Overflow
python - XML download from a url - Stack Overflow
xml parsing - Downloading XML file with Python 3 - Stack Overflow
Hi!
I'm trying to figure out how to set up a URL that when hit downloads an XML file. I think I would use something like Flask to set this up, but am not quite sure if that's correct
Would love some help figuring out exactly how I can set this up. Thanks!
First suggestion: do what even the official urllib docs says and don't use urllib, use requests instead.
Your problem is that you use .writelines() and it expects a list of lines, not a bytes objects (for once in Python the error message is not very helpful). Use .write() instead
import requests
resp = requests.get('URL')
with open('file.xml', 'wb') as foutput:
foutput.write(resp.content)
I found a solution :
from urllib.request import urlopen
xml = open("import.xml", "r+")
xml.write(urlopen('URL').read().decode('utf-8'))
xml.close()
Thanks for your help.
This should help. The url you mentioned gives a zip. You can download that and extract it to get your XML.
EX:
import requests
import zipfile
import StringIO
URL='http://oasis.caiso.com/oasisapi/SingleZip?queryname=PRC_FUEL&fuel_region_id=ALL&startdatetime=20130919T07:00-0000&enddatetime=20130928T07:00-0000&version=1'
r = requests.get(URL, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()
You could also use urllib
import urllib
urllib.urlretrieve(
"http://oasis.caiso.com/oasisapi/SingleZip?queryname=PRC_FUEL&fuel_region_id=ALL&startdatetime=20130919T07:00-0000&enddatetime=20130928T07:00-0000&version=1",
"oasis.zip")
I'd use xmltodict to make a python dictionary out of the XML data structure and pass this dictionary to the template inside the context:
import urllib2
import xmltodict
def homepage(request):
file = urllib2.urlopen('https://www.goodreads.com/review/list/20990068.xml?key=nGvCqaQ6tn9w4HNpW8kquw&v=2&shelf=toread')
data = file.read()
file.close()
data = xmltodict.parse(data)
return render_to_response('my_template.html', {'data': data})
xmltodict using requests
import requests
import xmltodict
url = "https://yoursite/your.xml"
response = requests.get(url)
data = xmltodict.parse(response.content)
You don't need to write line-by-line. Just write the whole thing in one go:
>>> import urllib2
>>> url = "webservice"
>>> s = urllib2.urlopen(url)
>>> contents = s.read()
>>> file = open("export.xml", 'w')
>>> file.write(contents)
>>> file.close()
You can store it in a string:
content = s.read()
or a StringIO if you need file-like interface
content = cStringIO.StringIO()
content.write(s.read)