I'm trying to log in to a website from this URL: "https://pollev.com/login". Since I'm using a school email, the portal redirects to the school's login portal and uses that portal to authenticate the login. It shows up when you type in a uw.edu email (example: [email protected]). After logging in, UW sends a POST request callback to https://www.polleverywhere.com/auth/washington/callback with a SAMLResponse header like this. I think I need to simulate the GET request from pollev's login page and then send the login headers to the UW login page, but what I'm doing right now isn't working.
Here's my code:
import requests
with requests.session() as s:
header_data = {
'user - agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
'(KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
'referer': 'https://pollev.com/login'
}
login_data = {
'j_username' : 'username',
'j_password' : 'password',
'_eventId_proceed' : 'Sign in'
}
r = s.get('https://idp.u.washington.edu/idp/profile/SAML2/Redirect/SSO?execution=e2s1',
headers=header_data, data=login_data)
print(r.text)
Right now, r.text shows a NoSuchFlowExecutionException html page. What am I missing? Logging into the website normally requires a login, password, Referrer, and X-CSRF token which I was able to do, but I don't know how to navigate a redirect for authentication.
Old question but I had nearly identical needs and carried on until I solved it. In my case, which may still be the case of the OP, I have the required credentials. I am certain this could be made more efficient / pythonic and would greatly appreciate those tips / corrections.
import re
import requests
# start HTTP request session
s = requests.Session()
# Prepare for first request - This is the ultimate target URL
url1 = '/URL/needing/shibbolethSAML/authentication'
header_data = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.82 Safari/537.36'}
# Make first request
r1 = s.get(url1, headers = header_data)
# Prepare for second request - extract URL action for next POST from response, append header, and add login credentials
ss1 = re.search('action="', r1.text)
ss2 = re.search('" autocomplete', r1.text)
url2 = 'https://idp.u.washington.edu' + r1.text[ss1.span(0)[1]:ss2.span(0)[0]]
header_data.update({'Accept-Encoding': 'gzip, deflate, br', 'Content-Type': 'application/x-www-form-urlencoded'})
cred = {'j_username': 'username', 'j_password':'password', '_eventId_proceed' : 'Sign in'}
# Make second request
r2 = s.post(url2, data = cred)
# Prepare for third request - format and extract URL, RelayState, and SAMLResponse
ss3 = re.search('<form action="',r2.text) # expect only one instance of this pattern in string
ss4 = re.search('" method="post">',r2.text) # expect only one instance of this pattern in string
url3 = r2.text[ss3.span(0)[1]:ss4.span(0)[0]].replace(':',':').replace('/','/')
ss4 = re.search('name="RelayState" value="', r2.text) # expect only one instance of this pattern in string
ss5 = re.search('"/>', r2.text)
relaystate_value = r2.text[ss4.span(0)[1]:ss5.span(0)[0]].replace(':',':')
ss6 = re.search('name="SAMLResponse" value="', r2.text)
ss7 = [m.span for m in re.finditer('"/>',r2.text)] # expect multiple matches with the second match being desired
saml_value = r2.text[ss6.span(0)[1]:ss7[1](0)[0]]
data = {'RelayState': relaystate_value, 'SAMLResponse': [saml_value, 'Continue']}
header_data.update({'Host': 'training.ehs.washington.edu', 'Referer': 'https://idp.u.washington.edu/', 'Connection': 'keep-alive'})
# Make third request
r3 = s.post(url3, headers=header_data, data = data)
# You should now be at the intended URL
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With