Python Scraping JavaScript using Selenium and Beautiful Soup

I'm trying to scrape a JavaScript-enabled page using Beautiful Soup and Selenium. I have the following code so far, but it doesn't seem to pick up the JavaScript-generated content and returns an empty result. In this case I'm trying to scrape the Facebook comments at the bottom of the page. (Inspect Element shows their class as postText.)
Thanks for the help!

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup.BeautifulSoup(html_source)  
comments = soup("div", {"class":"postText"})  
print comments

asked Jan 25 '13 20:01

Jay Setti

1 Answer

There are some mistakes in your code, which are fixed below. Note, however, that the class "postText" does not appear in the page's original source code, so the elements that carry it must come from somewhere else. I tested the revised version on several websites and it works.

from selenium import webdriver  
from selenium.common.exceptions import NoSuchElementException  
from selenium.webdriver.common.keys import Keys  
from bs4 import BeautifulSoup

browser = webdriver.Firefox()  
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')  
html_source = browser.page_source  
browser.quit()

soup = BeautifulSoup(html_source, 'html.parser')
# The class "postText" is not defined in the page's source code, so this may return an empty list
comments = soup.find_all('div', {'class': 'postText'})
print(comments)
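
One possible reason the list stays empty is timing: `page_source` is read right after `get()` returns, which may be before the page's scripts have inserted the comments. The following is a minimal sketch, not a tested fix for this particular page, of polling the page source until a marker string shows up. The helper name `wait_for_marker` and its parameters are hypothetical; Selenium also ships its own `WebDriverWait` for this purpose. This hand-rolled version only assumes the driver exposes `page_source`. If the comments are rendered inside an iframe, the top-level page source will never contain them, and the driver would need to switch into that frame first.

```python
import time


def wait_for_marker(driver, marker, timeout=10.0, poll=0.5):
    """Poll driver.page_source until `marker` appears or `timeout` seconds pass.

    Returns the page source that contains the marker, or None on timeout.
    `driver` is any object with a `page_source` attribute, such as a
    Selenium WebDriver instance. (Hypothetical helper, for illustration.)
    """
    deadline = time.monotonic() + timeout
    while True:
        source = driver.page_source  # re-read the current DOM serialization
        if marker in source:
            return source
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)  # give the page's scripts time to run
```

With a real browser this would be called as `html_source = wait_for_marker(browser, 'postText')` before `browser.quit()`, and the result checked for `None` before it is passed to `BeautifulSoup`.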

answered Sep 20 '22 19:09

user3186527

Tags: python, beautifulsoup, selenium, screen-scraping
