There are many good resources already on stackoverflow but I'm still having an issue. I've visited these sources:
I'm attempting to visit http://www.latax.state.la.us/Menu_ParishTaxRolls/TaxRolls.aspx and select a Parish. I believe this forces a post and allows me to select a year, which posts again, and allows for yet more selection. I've written my script a few different ways following the above sources and haven't successfully been able to submit the site to allow me to enter a year.
My current code
import urllib
from bs4 import BeautifulSoup
import mechanize
headers = [
('Accept','text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
('Origin', 'http://www.indiapost.gov.in'),
('User-Agent', 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17'),
('Content-Type', 'application/x-www-form-urlencoded'),
('Referer', 'http://www.latax.state.la.us/Menu_ParishTaxRolls/TaxRolls.aspx'),
('Accept-Encoding', 'gzip,deflate,sdch'),
('Accept-Language', 'en-US,en;q=0.8'),
]
br = mechanize.Browser()
br.addheaders = headers
url = 'http://www.latax.state.la.us/Menu_ParishTaxRolls/TaxRolls.aspx'
response = br.open(url)
# first HTTP request without form data
soup = BeautifulSoup(response)
# parse and retrieve two vital form values
viewstate = soup.findAll("input", {"type": "hidden", "name": "__VIEWSTATE"})
eventvalidation = soup.findAll("input", {"type": "hidden", "name": "__EVENTVALIDATION"})
formData = (
('__EVENTVALIDATION', eventvalidation[0]['value']),
('__VIEWSTATE', viewstate[0]['value']),
('__VIEWSTATEENCRYPTED',''),
)
try:
fout = open('C:\\GIS\\tmp.htm', 'w')
except:
print('Could not open output file\n')
fout.writelines(response.readlines())
fout.close()
I've also attempted this in the shell and what I entered plus what I received (modified to cut down on the bulk) can be found http://pastebin.com/KAW5VtXp
Anyway I try to change the value in the Parish dropdown list and post I get taken to a webmaster login page.
Am I approaching this the correct way? Any thoughts would be extremely helpful.
Thanks!
I ended up using selenium.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()
driver.get("http://www.latax.state.la.us/Menu_ParishTaxRolls/TaxRolls.aspx")
elem = driver.find_element_by_name("ctl00$ContentPlaceHolderMain$ddParish")
elem.send_keys("TERREBONNE PARISH")
elem.send_keys(Keys.RETURN)
elem = driver.find_element_by_name("ctl00$ContentPlaceHolderMain$ddYear")
elem.send_keys("2013")
elem.send_keys(Keys.RETURN)
elem = driver.find_element_by_id("ctl00_ContentPlaceHolderMain_rbSearchField_1")
elem.click()
APN = 'APN # here'
elem = driver.find_element_by_name("ctl00$ContentPlaceHolderMain$txtSearch")
elem.send_keys(APN)
elem.send_keys(Keys.RETURN)
# Access the PDF
elem = driver.find_element_by_link_text('Generate Report')
elem.click()
elements = driver.find_elements_by_tag_name('a')
elements[1].click()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With