 

How to get Salesforce data into Python pandas DataFrames

Currently we export Salesforce data to a CSV file and read that file into pandas using the read_csv and to_csv methods. Is there any other way to get data from Salesforce into a pandas DataFrame?

DaraRamu asked Aug 31 '18 14:08



3 Answers

In Python, you can install a package called Simple Salesforce and write SOQL queries to return the data.

https://github.com/simple-salesforce/simple-salesforce

Here's an example of how to do this:

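import pandas as pd
from simple_salesforce import Salesforce

# Log in with username, password and security token
sf = Salesforce(username='<enter username>', password='<enter password>',
                security_token='<enter your access token from your profile>')

# Run a SOQL query and load the returned records into a DataFrame
a_query = pd.DataFrame(sf.query(
    "SELECT Name, CreatedDate FROM User")['records'])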
like image 50
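One thing to watch with the query above: as far as I know, a single sf.query call returns only the first batch of a large result set, and the response carries 'done' and 'nextRecordsUrl' fields for fetching the rest. Below is a minimal sketch of following those pages by hand. FakeSalesforce and fetch_all_records are hypothetical names made up for this illustration so the snippet runs offline; with a real connection you would pass your sf object instead, and simple-salesforce also has a query_all method intended for the same purpose.

```python
class FakeSalesforce:
    """Hypothetical stand-in for a simple_salesforce.Salesforce client (offline demo only)."""

    def __init__(self, pages):
        self._pages = pages

    def query(self, soql):
        # The first call returns the first page of results
        return self._pages[0]

    def query_more(self, next_records_identifier, identifier_is_url=False):
        # The fake URL ends in "-<page index>", e.g. "/fake/query/page-1"
        index = int(next_records_identifier.rsplit('-', 1)[1])
        return self._pages[index]


def fetch_all_records(sf, soql):
    """Collect the records from every page of a SOQL query result."""
    result = sf.query(soql)
    records = list(result['records'])
    while not result['done']:
        # Follow the link to the next page until the server reports done
        result = sf.query_more(result['nextRecordsUrl'], identifier_is_url=True)
        records.extend(result['records'])
    return records


# Two fake pages shaped like the query response
page0 = {'totalSize': 3, 'done': False, 'nextRecordsUrl': '/fake/query/page-1',
         'records': [{'Name': 'Ann'}, {'Name': 'Bob'}]}
page1 = {'totalSize': 3, 'done': True, 'records': [{'Name': 'Cy'}]}

all_records = fetch_all_records(FakeSalesforce([page0, page1]), "SELECT Name FROM User")
```

The resulting list can be passed to pd.DataFrame in the same way as the 'records' list above.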
ShapelyOwl answered Oct 06 '22 16:10



In my case, to display the information as a dataframe I had to use the following code:

# Import libraries
import pandas
import simple_salesforce as ssf

# Create the connection (production login is the default, so the sandbox=False
# argument accepted by older releases is omitted here)
session_id, instance = ssf.SalesforceLogin(username='<username>', password='<password>', security_token='<token>')
sf_ = ssf.Salesforce(instance=instance, session_id=session_id)

# SOQL query to execute
sql_code = "SELECT id, name FROM main_table"

# Run the query and store the result as a dataframe, dropping the per-record metadata column
information = sf_.query(query=sql_code)
table = pandas.DataFrame(information['records']).drop(columns='attributes')
João answered Oct 06 '22 16:10



Building on the original answer, the function below also handles simple joins, i.e. queries that pull fields from a related object.

import pandas as pd


def sf_results_to_dataframe(results, drop_index=True) -> pd.DataFrame:
    """Turn a simple-salesforce query result into a DataFrame indexed by Id.

    The query must select Id. Fields from a joined Account, Contact, Lead or
    Opportunity are flattened into columns such as AccountName.
    """
    records = results['records']
    df = pd.DataFrame(records)
    df.drop('attributes', axis=1, inplace=True)  # clean up from technical info
    df.set_index('Id', drop=drop_index, inplace=True)

    for table in ['Account', 'Contact', 'Lead', 'Opportunity']:
        # first non-empty related record, used to detect a JOIN and its fields
        sample = next((r[table] for r in records if r.get(table)), None)
        if sample is None:
            continue

        local_keys = [key for key in sample.keys() if key != 'attributes']  # keys from the joined table
        global_keys = [table + key for key in local_keys]  # names for the fields in the output table

        # fields of the joined table and the record index; an empty lookup yields None
        table_records = [{'Id': record['Id'],
                          **{global_key: (record[table] or {}).get(local_key)
                             for global_key, local_key in zip(global_keys, local_keys)}}
                         for record in records]
        df_extra = pd.DataFrame(table_records)
        df_extra.set_index('Id', drop=True, inplace=True)  # match index
        df.drop(table, axis=1, inplace=True)  # drop duplicated info
        df = df.merge(df_extra, left_index=True, right_index=True)  # merge on index

    return df

Example:

import pandas as pd
from simple_salesforce import Salesforce

SALESFORCE_EMAIL = '...'
SALESFORCE_TOKEN = '...'
SALESFORCE_PASSWORD = '...'

sf = Salesforce(username=SALESFORCE_EMAIL, password=SALESFORCE_PASSWORD, security_token=SALESFORCE_TOKEN)

query = """SELECT Id, Name, Account.Name
FROM Contact
LIMIT 1
"""

results = sf.query(query)
df = sf_results_to_dataframe(results)
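
If the related object is not one of the four hard-coded names, a more generic option is to flatten every nested record before building the DataFrame. The sketch below is only an illustration under my own assumptions: flatten_record is a hypothetical helper, the dotted column names (Account.Name) are my choice rather than anything the library produces, and sample_records is made-up data shaped like the 'records' list of a query result.

```python
def flatten_record(record, prefix=''):
    """Flatten one query record into a single-level dict, skipping 'attributes' metadata."""
    flat = {}
    for key, value in record.items():
        if key == 'attributes':
            continue  # technical info, not a queried field
        if isinstance(value, dict):
            # Nested related record: recurse and prefix its fields, e.g. Account.Name
            flat.update(flatten_record(value, prefix + key + '.'))
        else:
            flat[prefix + key] = value
    return flat


# Made-up records shaped like results['records'] for
# "SELECT Id, Name, Account.Name FROM Contact"
sample_records = [
    {'attributes': {'type': 'Contact'}, 'Id': '003A', 'Name': 'Ann',
     'Account': {'attributes': {'type': 'Account'}, 'Name': 'Acme'}},
    {'attributes': {'type': 'Contact'}, 'Id': '003B', 'Name': 'Bob',
     'Account': None},  # contact without an account
]

rows = [flatten_record(record) for record in sample_records]
```

Passing rows to pd.DataFrame then gives one column per flattened field. Note that a record with an empty lookup keeps a plain Account key set to None instead of Account.Name, so you may want to drop that column afterwards.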
icemtel answered Oct 06 '22 16:10
