
Searching through webpage

Hey, I'm working on a Python project that needs to search through a webpage. I want to look for a specific piece of text: if it is found, the script prints something out; if not, it prints an error message. I've already tried different modules such as libxml, but I can't figure out how to do it.

Could anybody lend some help?

AustinM asked Mar 22 '26 16:03


2 Answers

You could do something simple like:


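import re
import urllib.request  # Python 3 replacement for urllib2

# Download the page and decode the bytes to text
# (UTF-8 is assumed here; adjust if the site uses another encoding)
with urllib.request.urlopen('http://www.domain.com') as response:
    html_content = response.read().decode('utf-8', errors='replace')

# re.search returns None when the pattern does not occur in the page
match = re.search('regex of string to find', html_content)

if match is None:
    print('I did not find anything')
else:
    print('My string is in the html')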
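
If the text you are looking for is a literal string rather than a pattern, characters such as ".", "$" or "(" have a special meaning in a regex and would need escaping. A minimal sketch of that, assuming a hypothetical helper named contains_text (not part of the original answer) and a case-insensitive match:

```python
import re


def contains_text(html_content, text):
    """Return True if the literal string `text` occurs in the page source.

    re.escape neutralises regex metacharacters so the search is literal;
    re.IGNORECASE is an assumption about what "finding the text" should mean.
    """
    return re.search(re.escape(text), html_content, flags=re.IGNORECASE) is not None


# Hypothetical page content used only for illustration
page = "<p>Price: $5.00 (approx.)</p>"

if contains_text(page, "$5.00 (approx.)"):
    print("My string is in the html")
else:
    print("I did not find anything")
```

For a case-sensitive literal search, the plain "text in html_content" test does the same job without the re module.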
dplouffe answered Mar 25 '26 04:03


lxml is awesome: http://lxml.de/parsing.html

I use it regularly with XPath for extracting data from HTML.

Another option is BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/), which is great as well.
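
If installing a third-party parser is not an option, the same parse-then-search idea can be sketched with the standard library's html.parser module. This is a swapped-in technique, not lxml or BeautifulSoup, and the names TextExtractor and visible_text are hypothetical. The sketch searches only the visible text of the page, skipping script and style contents:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a script/style element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html_content):
    """Return the page's visible text with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html_content)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical page content used only for illustration
sample = (
    "<html><head><script>var hidden = 'needle';</script></head>"
    "<body><p>Hello <b>world</b></p></body></html>"
)

if "Hello world" in visible_text(sample):
    print("My string is in the html")
else:
    print("I did not find anything")
```

Joining text nodes with a space keeps neighbouring paragraphs apart, but it also splits a word that is interrupted by inline markup, so a search for such a word would miss. lxml or BeautifulSoup handle malformed markup more robustly than this sketch.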

Bassdread answered Mar 25 '26 05:03


