Authenticating on ADFS with Python script

Tags: python, parsing, adfs

I need to parse a site that sits behind an ADFS service, and I am struggling to authenticate to it.

Are there any options for getting in?

From what I can see, most solutions are for backend applications or for "system users" (with app_id and app_secret). In my case I can't use those; I only have a login and password.

Example of the problem: in Chrome I open www.example.com and it redirects me to https://login.microsoftonline.com/ and then to https://federation-sts.example.com/adfs/ls/?blabla, which shows a login and password form.

How can I get access to it with Python 3?


asked Feb 27 '19 09:02

Psychozoic

2 Answers

ADFS uses complicated redirection and CSRF protection techniques, so it is better to use a browser automation tool to perform the authentication and then parse the webpage. I recommend Selenium with its Python bindings. Here is a working example:

import time

from selenium import webdriver
from selenium.webdriver.common.by import By


def MS_login(usrname, passwd):  # call this with username and password
    driver = webdriver.Edge()   # change to your browser (Firefox, Chrome, ... are also supported)
    try:
        driver.delete_all_cookies()  # clean up the prior login sessions
        driver.get('https://login.microsoftonline.com/')  # change the url to your website
        time.sleep(5)  # wait for redirection and rendering

        # type the username and move on to the password step
        driver.find_element(By.XPATH, "//input[@name='loginfmt']").send_keys(usrname)
        driver.find_element(By.XPATH, "//input[@type='submit']").click()
        time.sleep(5)

        # type the password, tick the KMSI checkbox and submit
        driver.find_element(By.XPATH, "//input[@name='passwd']").send_keys(passwd)
        driver.find_element(By.XPATH, "//input[@name='KMSI' and @type='checkbox']").click()
        driver.find_element(By.XPATH, "//input[@type='submit']").click()
        time.sleep(5)

        # confirm the final prompt
        driver.find_element(By.XPATH, "//input[@type='submit']").click()

        # Successfully logged in

        # parse the site ...

        # hand back the HTML while the browser is still open;
        # a driver returned after closing would no longer be usable
        return driver.page_source
    finally:
        driver.quit()  # close the browser, even if a step above failed

This script launches Microsoft Edge to open the website. It injects the username and password into the correct DOM elements and then lets the browser handle the rest. It has been tested on the webpage "https://login.microsoftonline.com". You may need to modify it to suit your website.
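The fixed five-second sleeps in the script are a guess at how long each redirect takes. A minimal sketch of a polling helper that waits only as long as needed is shown here; `wait_until` and its parameters are hypothetical names introduced for illustration, not part of Selenium (Selenium ships its own `WebDriverWait` for the same purpose).

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without a truthy result.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)  # pause briefly before checking again
```

With the script above, a sleep could be swapped for something like `wait_until(lambda: driver.find_elements(By.XPATH, "//input[@name='passwd']"))`, since `find_elements` returns an empty list until the element exists.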


answered Sep 18 '22 15:09

gdlmx

To answer your question of how to get in with Python, I am assuming you want to perform some web scraping on pages that are secured by Azure AD authentication.

In this kind of scenario, you have to go through the following steps.

For this script we will only need to import the following:

import requests
from lxml import html

First, we would like to create our session object. This object will allow us to persist the login session across all our requests.

session_requests = requests.session()

Second, we would like to extract the CSRF token from the web page; this token is used during login. For this example we are using lxml and XPath, but we could have used a regular expression or any other method that extracts this data.

login_url = "https://bitbucket.org/account/signin/?next=/"
result = session_requests.get(login_url)

tree = html.fromstring(result.text)
authenticity_token = list(set(tree.xpath("//input[@name='csrfmiddlewaretoken']/@value")))[0]
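As one of the "other methods", here is a minimal sketch that collects every hidden input on a login page using only the standard library. The names `HiddenInputParser` and `extract_hidden_fields` are made up for this illustration, and which hidden fields a given login form really carries is something to check on the actual page.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pairs of every <input type="hidden">."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            # a missing value attribute is stored as an empty string
            self.fields[attributes["name"]] = attributes.get("value") or ""


def extract_hidden_fields(page_html):
    """Return a dict of hidden form fields found in the given HTML text."""
    parser = HiddenInputParser()
    parser.feed(page_html)
    return parser.fields
```

Called as `extract_hidden_fields(result.text)`, this would pick up the CSRF token together with any other hidden fields the form expects to be posted back.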

Next, we perform the login itself. In this phase, we send a POST request to the login URL, with the payload (shown after the request below) as the data. We also send a header with the request, adding a referer key set to the same URL.

result = session_requests.post(
    login_url, 
    data = payload, 
    headers = dict(referer=login_url)
)

The payload is a dictionary holding the user name, the password, and the CSRF token extracted earlier.

payload = {
    "username": "<USER NAME>", 
    "password": "<PASSWORD>", 
    "csrfmiddlewaretoken": "<CSRF_TOKEN>"
}
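Rather than pasting the token into a placeholder by hand, the payload can be assembled from whatever was scraped off the login page. The sketch below is only an illustration: `build_payload` is a hypothetical helper, and the default field names are the ones from the Bitbucket example above, so an ADFS form will most likely need different ones (check the input names in the browser's developer tools).

```python
def build_payload(hidden_fields, username, password,
                  user_field="username", password_field="password"):
    """Merge scraped hidden form fields with the credentials.

    `hidden_fields` is a dict such as {"csrfmiddlewaretoken": "..."}.
    The field names are assumptions and must match the real form.
    """
    payload = dict(hidden_fields)  # copy so the caller's dict stays unchanged
    payload[user_field] = username
    payload[password_field] = password
    return payload
```

For the example above this would be called as `build_payload({"csrfmiddlewaretoken": authenticity_token}, "<USER NAME>", "<PASSWORD>")` before the POST request is sent.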

Note: this is just an example.

Step 2:

Scrape content

Now that we have logged in successfully, we can perform the actual scraping.

url = 'https://bitbucket.org/dashboard/overview'
result = session_requests.get(
    url, 
    headers = dict(referer = url)
)

In other words, you need to work out the request payload that the Azure AD login form expects, log in through a session object so that the session stays authenticated, and then do the scraping with that same session.
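A failed login often does not raise an error; the request is simply redirected back to the sign-in form. A rough check on the final URL can catch that. This is a sketch under assumptions: the host and path prefix are taken from the redirect chain described in the question, and `landed_on_login_page` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

# Assumed from the question: the Microsoft login host and the ADFS sign-in path.
LOGIN_HOSTS = {"login.microsoftonline.com"}
LOGIN_PATH_PREFIXES = ("/adfs/ls",)


def landed_on_login_page(final_url):
    """Return True if the URL looks like a sign-in page rather than content."""
    parts = urlsplit(final_url)
    host = (parts.hostname or "").lower()
    path = parts.path.lower()
    return host in LOGIN_HOSTS or path.startswith(LOGIN_PATH_PREFIXES)
```

With requests, the URL reached after all redirects is available as `result.url`, so `landed_on_login_page(result.url)` returning True would suggest that the session is not authenticated yet.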

Here is a very good example of Web scraping of a secured website.

Hope it helps.

answered Sep 22 '22 15:09

Mohit Verma
