
BeautifulSoup finding xml tags

I've got some OSM data of fast food restaurants which I retrieved using the Xapi, and here are some sample results:

<osm version="0.6" generator="Osmosis SNAPSHOT-r26564">
   <node id="486275964" version="4" timestamp="2010-05-03T08:21:42Z" uid="12055" user="aude" changeset="4592597" lat="38.8959533" lon="-77.0212458">
      <tag k="name" v="Potato Valley Cafe"/>
      <tag k="amenity" v="fast_food"/>
   </node>
   <node id="486275966" version="4" timestamp="2010-08-06T16:44:13Z" uid="207745" user="NE2" changeset="5418228" lat="38.8959399" lon="-77.0196338">
      <tag k="cuisine" v="burger"/>
      <tag k="name" v="McDonald's"/>
      <tag k="amenity" v="fast_food"/>
   </node>
   <node id="612190923" version="1" timestamp="2010-01-12T14:01:27Z" uid="111209" user="cov" changeset="3603297" lat="38.893683" lon="-77.0292732">
      <tag k="level" v="-1"/>
      <tag k="cuisine" v="sandwich"/>
      <tag k="name" v="Quizno's"/>
      <tag k="amenity" v="fast_food"/>
   </node> 
</osm>

I'm trying to use BeautifulSoup in Python to extract the lat, lon, name and cuisine from this. I can get the lat and lon with no problem using this code:

soup = BeautifulSoup(results)
takeaways = soup.findAll('node')

for eachtakeaway in takeaways:
    longitude = str(eachtakeaway['lon'])
    latitude = str(eachtakeaway['lat'])

But I can't get the name:

name = str(eachtakeaway['name'])

Which throws up the error:

TypeError: 'NoneType' object is not callable

Can you tell me what to do? Thanks.

eamon1234 asked Feb 19 '23 12:02


1 Answer

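The problem is that square brackets retrieve attributes of the tag itself, i.e. lat and lon on each node. The name, however, is not an attribute of node: it is stored in a child tag element, as the v attribute of the one whose k attribute is "name". Try something like this: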

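from bs4 import BeautifulSoup

# 'html.parser' is built in; pass 'xml' instead if lxml is installed
soup = BeautifulSoup(results, 'html.parser')
takeaways = soup.find_all('node')

for eachtakeaway in takeaways:
    # Calling a tag is shorthand for find_all(): every child <tag> element
    another_tag = eachtakeaway('tag')
    for tag_attrs in another_tag:
        # Each <tag> stores its key in 'k' and its value in 'v'
        if tag_attrs['k'] == 'cuisine':
            print(tag_attrs['v'])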

This prints the cuisine value for each node that has one. The same approach works for the name: compare tag_attrs['k'] against 'name' instead of 'cuisine'.

*Untested
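
For reference, here is a minimal sketch of the same idea that gathers all four fields for each node in one pass. It assumes Python 3 and the bs4 package (BeautifulSoup 4); SAMPLE and extract_takeaways are hypothetical names introduced for illustration, and SAMPLE is a trimmed copy of the data in the question.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample data from the question (assumption: some
# node attributes were dropped to keep the example short).
SAMPLE = """<osm version="0.6" generator="Osmosis SNAPSHOT-r26564">
  <node id="486275964" lat="38.8959533" lon="-77.0212458">
    <tag k="name" v="Potato Valley Cafe"/>
    <tag k="amenity" v="fast_food"/>
  </node>
  <node id="486275966" lat="38.8959399" lon="-77.0196338">
    <tag k="cuisine" v="burger"/>
    <tag k="name" v="McDonald's"/>
    <tag k="amenity" v="fast_food"/>
  </node>
</osm>"""


def extract_takeaways(xml_text):
    """Return one dict per <node> with lat, lon, name and cuisine."""
    # "html.parser" ships with Python; "xml" is an alternative if lxml is installed.
    soup = BeautifulSoup(xml_text, "html.parser")
    takeaways = []
    for node in soup.find_all("node"):
        # Collect the child <tag k="..." v="..."/> pairs into a plain dict.
        tags = {tag["k"]: tag["v"] for tag in node.find_all("tag")}
        takeaways.append({
            "lat": float(node["lat"]),    # attributes of the node itself
            "lon": float(node["lon"]),
            "name": tags.get("name"),        # None when the key is missing
            "cuisine": tags.get("cuisine"),
        })
    return takeaways


for takeaway in extract_takeaways(SAMPLE):
    print(takeaway)
```

Building a dict from the k/v pairs avoids a separate loop per field, and .get() returns None for nodes that lack a key, such as the first node here, which has no cuisine tag.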

That1Guy answered Feb 21 '23 01:02