I'm parsing HTML and I need to get only the tags that match a selector like div.content.
For parsing I'm using HTMLParser. So far I can get the list of a tag's attributes.
It looks something like this:
[('class', 'content'), ('title', 'source')]
The problem is that I don't know how to check that the list contains the attribute class with the value content.
I know this is an easy question, but I'm quite new to Python as well. Thanks in advance for any advice!
When looping through your elements:
if ('class', 'content') in element_attributes:
    pass  # do stuff
l = [('class', 'content'), ('title', 'source')]
('class', 'content') in l
returns True, because the list contains a tuple with 'class' as its first element and 'content' as its second.
You can now use it:
if ('class', 'content') in l:
    pass  # do something
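Note that the tuple test only matches when the class attribute is exactly 'content'; a tag written as class="content wide" would be missed, and the test does not look at the tag name at all. Here is a minimal sketch of a check that handles both, assuming Python 3 (where HTMLParser lives in html.parser). The names ContentDivFinder and matches are made up for illustration.

```python
from html.parser import HTMLParser


class ContentDivFinder(HTMLParser):
    """Collects the attribute lists of <div> tags whose classes include 'content'."""

    def __init__(self):
        super().__init__()
        self.matches = []  # hypothetical container for the matching tags' attributes

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples,
        # e.g. [('class', 'content'), ('title', 'source')]
        attr_map = dict(attrs)
        # A class attribute can hold several space-separated names,
        # and its value is None when the attribute has no value.
        classes = (attr_map.get('class') or '').split()
        if tag == 'div' and 'content' in classes:
            self.matches.append(attrs)


finder = ContentDivFinder()
finder.feed('<div class="content" title="source">a</div>'
            '<div class="content wide">b</div>'
            '<p class="content">c</p>')
print(finder.matches)
```

With this input, the two div tags are collected and the p tag is skipped, because the tag name is checked alongside the class.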