
Scraping websites with Javascript enabled?

I'm trying to scrape and submit information to websites that rely heavily on JavaScript for most of their functionality. The websites won't even work when I disable JavaScript in my browser.

I've searched for solutions on Google and SO, and someone suggested I should reverse engineer the JavaScript, but I have no idea how to do that.

So far I've been using Mechanize, and it works on websites that don't require JavaScript.

Is there any way to access websites that use JavaScript with urllib2 or something similar? I'm also willing to learn JavaScript, if that's what it takes.

asked Jul 29 '10 by user216171

People also ask

Is web scraping possible with JavaScript?

Yes. Gathering data from different sources for analysis can easily be automated with web scraping in JavaScript, and the collected data can be used for testing and training machine learning models.

Can BeautifulSoup scrape JavaScript?

Beautiful Soup is a very powerful library that makes web scraping easier by traversing the DOM (Document Object Model), but it only does static scraping. Static scraping ignores JavaScript: it fetches web pages from the server without the help of a browser.

Which is better for web scraping JavaScript or Python?

Python is your best bet. Libraries such as Requests or HTTPX make it very easy to scrape websites that don't require JavaScript to work correctly. Python offers a lot of simple-to-use HTTP clients, and once you have the response, it's also very easy to parse the HTML, for example with BeautifulSoup.
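As a minimal sketch of that static approach, here the fetch step is skipped and BeautifulSoup parses an HTML string directly (the sample markup and the `#articles` selector are made up for the example; in practice the string would come from a Requests or HTTPX response body):

```python
from bs4 import BeautifulSoup

# Sample HTML standing in for a response body fetched with Requests/HTTPX.
html = """
<html><body>
  <ul id="articles">
    <li><a href="/post/1">First post</a></li>
    <li><a href="/post/2">Second post</a></li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
# Collect (link text, href) pairs from the list.
links = [(a.get_text(), a["href"]) for a in soup.select("#articles a")]
print(links)  # [('First post', '/post/1'), ('Second post', '/post/2')]
```

This works only when the markup is already present in the server's response; for JavaScript-rendered pages, see the answers below.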

Can a website block you from web scraping?

If you send repetitive requests from the same IP, the website owners can detect your footprint in their server log files and may block your web scraper. To avoid this, you can use rotating proxies: a rotating proxy server allocates a new IP address from a pool of proxies for each request.
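A minimal sketch of the rotation itself, assuming a hypothetical pool of proxy URLs (the addresses below are placeholders; a real pool would come from a proxy provider):

```python
from itertools import cycle

# Hypothetical proxy pool -- replace with real proxy URLs from a provider.
PROXY_POOL = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]

_proxies = cycle(PROXY_POOL)

def next_proxy():
    """Return proxy settings for the next request, cycling through the pool."""
    p = next(_proxies)
    return {"http": p, "https": p}
```

Each call returns the next address in the pool, so consecutive requests go out through different IPs; with Requests you would pass the result as the `proxies=` argument.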


2 Answers

I wrote a small tutorial on this subject which might help:

http://koaning.io.s3-website.eu-west-2.amazonaws.com/dynamic-scraping-with-python.html

Basically, you have the Selenium library drive a real Firefox browser; the browser waits for the page to load (so the JavaScript has run) before passing you the final HTML string. Once you have this string, you can parse it with BeautifulSoup.
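A sketch of that pattern (the function names and the `h2` selector are illustrative; running the fetch step assumes Selenium and a matching Firefox driver are installed):

```python
from bs4 import BeautifulSoup

def fetch_rendered_html(url):
    """Load a page in a real Firefox instance so its JavaScript executes,
    then return the resulting HTML. Requires selenium + a Firefox driver."""
    from selenium import webdriver  # imported here so the parser below works without it
    driver = webdriver.Firefox()
    try:
        driver.get(url)              # blocks until the page has loaded
        return driver.page_source    # HTML after JavaScript has run
    finally:
        driver.quit()

def extract_headlines(html):
    """Parse the rendered HTML with BeautifulSoup; h2 tags as an example target."""
    soup = BeautifulSoup(html, "html.parser")
    return [h.get_text(strip=True) for h in soup.find_all("h2")]
```

`extract_headlines(fetch_rendered_html("https://example.com"))` would then return the headlines from the fully rendered page rather than from the bare server response.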

answered Oct 04 '22 by cantdutchthis


I've had exactly the same problem. It is not simple at all, but I finally found a great solution using PyQt4.QtWebKit.

You will find the explanations on this webpage: http://blog.motane.lu/2009/07/07/downloading-a-pages-content-with-python-and-webkit/

I've tested it, I currently use it, and it's great!

Its great advantage is that it can run on a server with only an X display (for example a virtual one), without a full graphical environment.

answered Oct 04 '22 by Guillaume Lebourgeois