
return data from HTMLParser handle_starttag

My question is a simpler version of this

I have a youtube iframe:

<iframe width="560" height="315" src="//www.youtube.com/embed/fY9UhIxitYM" frameborder="0" allowfullscreen></iframe>

I'm working on a small web app and need to extract the video ID from the src attribute (fY9UhIxitYM in this case). I want to use the standard library rather than importing Beautiful Soup.

from HTMLParser import HTMLParser

class YoutubeLinkParser(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.data = []

    def handle_starttag(self, tag, attrs):
        data = attrs[2][1].split('/')[-1]
        self.data.append(data)

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
linkCode = parser.feed(iframe)

The examples I have found use handle_data(self, data), but I need the value of an attribute on the opening tag. I can print the value inside the method, but when I try to capture it as a return value from feed(), linkCode is None.

What am I missing? Thanks!

McPedr0 asked Jun 16 '14 20:06


1 Answer

The feed() method doesn't return anything, which is why you are getting None. Instead, store the value on the parser inside handle_starttag() and read the data attribute after calling feed():

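from HTMLParser import HTMLParser  # Python 2 module name

class YoutubeLinkParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; index 2 is src in this iframe
        self.data = attrs[2][1].split('/')[-1]

iframe = open('iframe.html').read()
parser = YoutubeLinkParser()
parser.feed(iframe)  # feed() returns None; the result lives on the parser
print parser.data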

Prints:

fY9UhIxitYM
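
Indexing attrs[2] only works while src happens to be the third attribute, and handle_starttag() fires for every tag in the input, not just the iframe. Below is a minimal sketch of a sturdier variant, assuming Python 3 (where the module is html.parser rather than HTMLParser). The video_ids attribute and the extract_video_ids() helper are hypothetical names chosen for illustration, and the iframe markup is inlined instead of being read from iframe.html:

```python
from html.parser import HTMLParser  # Python 3 name of the HTMLParser module


class YoutubeLinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.video_ids = []  # hypothetical attribute name; collects one ID per iframe

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, so skip anything that is not an iframe.
        if tag != "iframe":
            return
        # attrs is a list of (name, value) pairs; look src up by name, not position.
        src = dict(attrs).get("src")
        if src:
            # Drop any query string and trailing slash, then keep the last path segment.
            path = src.split("?", 1)[0].rstrip("/")
            self.video_ids.append(path.rsplit("/", 1)[-1])


def extract_video_ids(html):
    """Hypothetical helper: feed() returns None, so hand back the collected list."""
    parser = YoutubeLinkParser()
    parser.feed(html)
    parser.close()
    return parser.video_ids


iframe = ('<iframe width="560" height="315" '
          'src="//www.youtube.com/embed/fY9UhIxitYM" '
          'frameborder="0" allowfullscreen></iframe>')
print(extract_video_ids(iframe))  # ['fY9UhIxitYM']
```

Wrapping the parser in a function gives the "return value" the question was after, while the state still lives on the parser instance.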
alecxe answered Sep 19 '22 14:09