XML parsing from web response

Tags:

python

xml

web-services

nominatim

I'm trying to get responses from Nominatim to geocode a few thousand cities.

import os
import requests
import xml.etree.ElementTree as ET

txt = open('input.txt', 'r').readlines()
for line in txt:
 lp, region, district, municipality, city = line.split('\t')
 baseUrl = 'http://nominatim.openstreetmap.org/search/gb/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml' 
 # eg. http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml
 resp = requests.get(baseUrl)
 resp.encoding = 'UTF-8' # special diacritics
 msg = resp.text
 # parse response to get lat & long
 tree = ET.parse(msg)
 root = tree.getroot()
 print tree

but the result is:

Traceback (most recent call last):
File "geo_miasta.py", line 17, in <module>
    tree = ET.parse(msg)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse
    source = open(source, "rb")    
IOError: [Errno 2] No such file or directory: u'<?xml version="1.0" encoding="UTF-8" ?>\n<searchresults timestamp=\'Tue, 11 Feb 14 21:13:50 +0000\' attribution=\'Data \xa9 OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright\' querystring=\'\u015awierczyna, Drzewica, opoczy\u0144ski, \u0142\xf3dzkie, gb\' polygon=\'false\' more_url=\'http://nominatim.openstreetmap.org/search?format=xml&amp;exclude_place_ids=&amp;q=%C5%9Awierczyna%2C+Drzewica%2C+opoczy%C5%84ski%2C+%C5%82%C3%B3dzkie%2C+gb\'>\n</searchresults>'

What is wrong with this?

Edit: Thanks to @rob, my solution is:

#! /usr/bin/env python2.7
# -*- coding: utf-8 -*-

import os
import requests
import xml.etree.ElementTree as ET

txt = open('input.txt', 'r').read().split('\n')

for line in txt:
    if not line.strip():
        continue  # skip blank lines, e.g. the empty string after the final newline
    lp, region, district, municipality, city = line.split('\t')
    baseUrl = 'http://nominatim.openstreetmap.org/search/pl/'+region+'/'+district+'/'+municipality+'/'+city+'/?format=xml'
    resp = requests.get(baseUrl)
    msg = resp.content  # raw bytes; the XML parser reads the encoding declaration itself
    tree = ET.fromstring(msg)
    for place in tree.findall('place'):
        location = '{:5f}\t{:5f}'.format(
            float(place.get('lat')),
            float(place.get('lon')))

        # append one tab-separated line per matching place
        f = open('result.txt', 'a')
        f.write(location+'\t'+region+'\t'+district+'\t'+municipality+'\t'+city+'\n')
        f.close()

asked Feb 11 '14 21:02

m93

1 Answer

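You are using xml.etree.ElementTree.parse(), which takes a filename or a file object as its argument. You are passing it neither: msg is a unicode string that holds the XML document itself, so parse() tries to open that string as a file name and fails with the IOError shown in your traceback.

Try xml.etree.ElementTree.fromstring(text), which parses XML directly from a string and returns the root element.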

Like this:

 tree = ET.fromstring(msg)
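
To make the difference between the two calls concrete, here is a minimal offline sketch. The SAMPLE document and the two helper functions are made up for illustration (a real Nominatim response carries more attributes); only ET.fromstring(), ET.parse() and io.BytesIO are standard library calls.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down response body used only for this illustration.
SAMPLE = (
    b'<?xml version="1.0" encoding="UTF-8" ?>\n'
    b'<searchresults>\n'
    b'  <place display_name="Example A" lat="50.5" lon="22.0"/>\n'
    b'  <place display_name="Example B" lat="51.25" lon="20.5"/>\n'
    b'</searchresults>'
)


def coordinates_from_string(xml_bytes):
    """Parse XML held in memory; fromstring() returns the root element."""
    root = ET.fromstring(xml_bytes)
    return [(float(p.get('lat')), float(p.get('lon')))
            for p in root.findall('place')]


def coordinates_from_file(fileobj):
    """Parse XML from a file-like object; parse() returns an ElementTree."""
    root = ET.parse(fileobj).getroot()
    return [(float(p.get('lat')), float(p.get('lon')))
            for p in root.findall('place')]


coords = coordinates_from_string(SAMPLE)
# If you really want parse(), wrap the bytes in a file-like object first.
same = coordinates_from_file(io.BytesIO(SAMPLE))
```

Both helpers give the same list of (lat, lon) pairs; the only difference is whether the parser is handed the data itself or something it can read the data from.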

Here is a complete sample program:

import os
import requests
import xml.etree.ElementTree as ET

baseUrl = 'http://nominatim.openstreetmap.org/search/pl/podkarpackie/stalowowolski/Bojan%C3%B3w/Zapu%C5%9Bcie/?format=xml'
resp = requests.get(baseUrl)
msg = resp.content
tree = ET.fromstring(msg)
for place in tree.findall('place'):
  print u'{:s}: {:+.2f}, {:+.2f}'.format(
    place.get('display_name'),
    float(place.get('lon')),
    float(place.get('lat'))).encode('utf-8')
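
When the URL is built from a tab-separated input file rather than hard-coded, the fields may need percent-encoding and the trailing newline left by readlines() should be stripped. The following is only a rough sketch under a few assumptions: it targets Python 3 (urllib.parse.quote), build_url is a hypothetical helper, and the URL shape is copied from the question as-is rather than taken from the Nominatim documentation.

```python
from urllib.parse import quote

# URL prefix taken from the question; treat it as an assumption, not a documented endpoint.
BASE = 'http://nominatim.openstreetmap.org/search/pl/'


def build_url(line):
    """Turn one 'lp<TAB>region<TAB>district<TAB>municipality<TAB>city' line into a URL.

    Returns None for blank or malformed lines instead of raising.
    """
    fields = line.rstrip('\r\n').split('\t')
    if len(fields) != 5:
        return None
    lp, region, district, municipality, city = fields
    # Percent-encode each path segment as UTF-8; safe='' also encodes '/'.
    path = '/'.join(quote(part, safe='')
                    for part in (region, district, municipality, city))
    return BASE + path + '/?format=xml'


sample_line = '1\tpodkarpackie\tstalowowolski\tBojan\u00f3w\tZapu\u015bcie\n'
url = build_url(sample_line)
print(url)
```

For the sample line this yields the same URL as the hard-coded one in the program above, and blank lines are skipped by checking for None before calling requests.get().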

answered Sep 22 '22 10:09

Robᵩ
