I am writing a web scraper for a particular webpage using "urllib2.Request(MyURL)" and "BeautifulSoup". The problem is that the page at MyURL is paginated, and the next page loads (at the same MyURL) when a link is clicked. Behind this link is a JavaScript call written as
{ javascript:__doPostBack('rptPagingBottom$ctl01$btnPage','') }.
Without executing this JavaScript function from Python, I cannot get the complete page listing. How can I call this JavaScript method from Python so that I can get all pages of that listing?
I found one related question here where it is suggested to use (Rhino, V8, SeaMonkey), but I did not understand it at all. I need some example code, if possible.
Try Selenium for this kind of dirty work (inline JS, AJAX page loading). Driven from Python through a browser driver, it can do exactly what a real browser does, including running the page's JavaScript.
You can find more about using it as a crawler by searching Google for 'selenium crawler'.
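As a rough illustration of the Selenium approach, here is a minimal sketch, not a tested solution for your page. It assumes Firefox and geckodriver are installed, that the paging links all call __doPostBack with a target starting with "rptPagingBottom" (taken from the link in the question), and that each postback triggers a full page reload rather than a partial AJAX update. The helper names parse_postback and scrape_all_pages are made up for this example.

```python
import re

# Matches hrefs such as:
#   javascript:__doPostBack('rptPagingBottom$ctl01$btnPage','')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)'\s*,\s*'([^']*)'\)")


def parse_postback(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = POSTBACK_RE.search(href or "")
    return match.groups() if match else None


def scrape_all_pages(url, pager_prefix="rptPagingBottom", timeout=10):
    """Open url in a real browser and return the HTML of every listing page.

    Assumption: every paging link is visible on the first page and each
    postback reloads the whole document.
    """
    # Imported here so parse_postback() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumes geckodriver is on the PATH
    try:
        driver.get(url)
        pages = [driver.page_source]

        # Collect the postback targets of the paging links once, up front.
        links = driver.find_elements(By.CSS_SELECTOR, "a[href*='__doPostBack']")
        hrefs = [link.get_attribute("href") for link in links]
        targets = [t for t in map(parse_postback, hrefs)
                   if t and t[0].startswith(pager_prefix)]

        for event_target, event_argument in targets:
            old_body = driver.find_element(By.TAG_NAME, "body")
            # Call the page's own JavaScript function, as a click would.
            driver.execute_script(
                "__doPostBack(arguments[0], arguments[1]);",
                event_target, event_argument)
            # Wait until the old document has been replaced by the new page.
            WebDriverWait(driver, timeout).until(EC.staleness_of(old_body))
            pages.append(driver.page_source)
        return pages
    finally:
        driver.quit()
```

Each string returned by scrape_all_pages(MyURL) can then be passed to BeautifulSoup as before. If the pager also contains a link for the page you start on, that page will appear twice, so you may want to skip it or de-duplicate the parsed results.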