 

Python: Scraping Facebook comments from a website

I have been trying to scrape Facebook comments from the web page below using Beautiful Soup.

import BeautifulSoup
import urllib2
import re

url = 'http://techcrunch.com/2012/05/15/facebook-lightbox/'

fd = urllib2.urlopen(url)

soup = BeautifulSoup.BeautifulSoup(fd)

# findAll returns a list of matching tags; a list has no .find method
fb_comments = soup.findAll("div", {"class": "postText"})

print fb_comments

The output is an empty result. However, when I use Inspect Element on the TechCrunch page, I can clearly see the Facebook comments inside those tags. I am a little new to Python: is this approach correct, and where am I going wrong?

Jay Setti asked Apr 21 '26 05:04



1 Answer

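As Christopher and Thiefmaster said, it is all because of JavaScript. The comments are added to the page by a script after it loads in the browser, so they are not in the HTML that urllib2 downloads, even though Inspect Element shows them.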
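To illustrate the point, here is a minimal sketch in Python 3 with the bs4 package (an assumption; the question uses the older BeautifulSoup 3 on Python 2). The HTML string is a made-up stand-in for a page whose comments are injected by a script, not the real TechCrunch markup.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the comment div only exists after the script runs in a browser.
STATIC_HTML = """
<html><body>
<div id="comments"></div>
<script>
  document.getElementById("comments").innerHTML =
      '<div class="postText">Nice article!</div>';
</script>
</body></html>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# The parser treats the script body as plain text and never executes it,
# so no div with class "postText" exists in the parsed tree.
matches = soup.find_all("div", class_="postText")
print(matches)
```

The class name appears in the downloaded text, but only inside the script, which is why a search for the tag comes back empty.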

But if you really need that information, you can still retrieve it: load the page in a real browser with Selenium (http://seleniumhq.org) so the scripts run, then use BeautifulSoup on the rendered output.
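A rough sketch of that approach, assuming Python 3 with the selenium and bs4 packages plus Firefox and its driver installed. The helper names `render_page` and `comment_texts` are made up for illustration, and the `postText` class is taken from the question. If the comments are rendered inside an iframe, as embedded widgets often are, the wait below will time out and you would need `driver.switch_to.frame(...)` before reading the page source.

```python
from bs4 import BeautifulSoup


def render_page(url, css_class="postText", timeout=15):
    """Load url in a real browser and return the HTML after scripts have run."""
    # Imported here so comment_texts() below works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox and geckodriver are available
    try:
        driver.get(url)
        # Wait until at least one element with the class has been added by the scripts.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, css_class))
        )
        return driver.page_source
    finally:
        driver.quit()


def comment_texts(html, css_class="postText"):
    """Return the stripped text of every div with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    return [div.get_text(strip=True) for div in soup.find_all("div", class_=css_class)]


# Typical use (needs a browser, so it is left commented out):
# html = render_page("http://techcrunch.com/2012/05/15/facebook-lightbox/")
# print(comment_texts(html))
```

Keeping the parsing step separate from the browser step means you can check it against saved HTML without launching a browser.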

Lynx-Lab answered Apr 23 '26 18:04
