Why does urllib.urlopen().read() not match the page source?

I'm trying to fetch the following webpage:

import urllib
urllib.urlopen("http://www.gallimard-jeunesse.fr/searchjeunesse/advanced/(order)/author?catalog[0]=1&SearchAction=1").read()

The result does not match what I see when I inspect the source code of the webpage in, for example, Google Chrome.

Could you tell me why this happens and how I could improve my code to overcome the problem?

Thank you for your help.

Nikolay Nikolov asked Sep 17 '12 20:09


1 Answer

You can use Selenium for Python to solve this issue. Here is some example code to look at:

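from time import sleep
from selenium import webdriver
from selenium.webdriver.common.by import By

url = "http://www.gallimard-jeunesse.fr/searchjeunesse/advanced/(order)/author?catalog[0]=1&SearchAction=1"
browser = webdriver.Firefox()  # opens a real Firefox window
browser.get(url)
sleep(10)  # crude fixed wait so the page's scripts can finish
# Selenium 4 replaced find_element_by_id() with find_element(By.ID, ...)
all_body_id_html = browser.find_element(By.ID, 'body')  # the element with id="body"
full_html = browser.page_source  # you can also get all the HTML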
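
The fixed sleep(10) above either waits longer than needed or not long enough on a slow connection. Selenium ships its own WebDriverWait class for this; the block below is only a library-free sketch of the same polling idea. The name wait_until and its parameters are hypothetical, not part of Selenium.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Hypothetical use with the browser from the answer (needs Selenium and By):
# elements = wait_until(lambda: browser.find_elements(By.ID, "body"))
```

find_elements returns an empty list when nothing matches, so the loop keeps polling until at least one element is present and then returns immediately.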

Then do the rest of your work as you see fit. Here is another example using the same browser instance:

from selenium.webdriver.common.by import By

def login(user='ssdf', password="cisin123"):
    # The id and selectors below are examples; adapt them to the target page
    content = browser.find_element(By.ID, 'content')
    content.find_element(By.XPATH, './/tbody/tr[2]//input[contains(@class,"textbox")]').send_keys(user)
    content.find_element(By.XPATH, './/tbody/tr[3]//input[contains(@class,"textbox")]').send_keys(password)
    content.find_element(By.CSS_SELECTOR, ".button").click()
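
Before driving a full browser, it may be worth checking whether the difference comes from the request itself. The question does not say how the two results differ, so this is only a way to narrow things down, not a diagnosis. The sketch below uses Python 3's urllib.request (the question uses the Python 2 urllib API) and sends a browser-like User-Agent, since some servers answer differently to script clients. The helper names build_request and fetch_html and the "Mozilla/5.0" value are assumptions made for illustration.

```python
import urllib.request

URL = ("http://www.gallimard-jeunesse.fr/searchjeunesse/advanced/(order)/author"
       "?catalog[0]=1&SearchAction=1")


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that identifies itself with a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=30):
    """Download url and decode the body with the charset the server declares."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        # Fall back to UTF-8 when the Content-Type header names no charset
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical use (performs a network request):
# html = fetch_html(URL)
```

If the text returned this way still lacks content that is visible in Chrome, that content is probably added by JavaScript after the page loads. Chrome's Inspect panel shows the live DOM rather than the raw response, and in that case a real browser such as the Selenium setup above is the more suitable tool.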
Yogesh dwivedi Geitpl answered Oct 13 '22 03:10