
Returning results from a Python script to a variable in a Jupyter notebook

I have a Python script that returns a pandas DataFrame. I want to run the script from a Jupyter notebook and save the result to a variable.

The data are in a file called data.csv. A shortened version of dataframe.py, the script whose results I want to access in the notebook, is:

# dataframe.py
import pandas as pd
import sys

def return_dataframe(file):
    df = pd.read_csv(file)
    return df

if __name__ == '__main__':
    return_dataframe(sys.argv[1])

I tried running:

data = !python dataframe.py data.csv

in my Jupyter notebook, but data does not contain the DataFrame that dataframe.py is supposed to return.

tshwizz asked Mar 06 '26 10:03



1 Answer

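The ! syntax captures what the script prints to standard output, not what its function returns, so the script has to write the DataFrame out itself. This is how I did it: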

# dataframe.py 
import pandas as pd
import sys

def return_dataframe(f):  # avoid the name `file`, which was a built-in in Python 2
    df = pd.read_csv(f)
    return df

if __name__ == '__main__':
    return_dataframe(sys.argv[1]).to_csv(sys.stdout, index=False)

Then, in the notebook, the captured output arrives as an 'IPython.utils.text.SList' (a list of the printed lines), which you need to convert into a DataFrame, as shown in the comments to the question "Convert SList to Dataframe":

import pandas as pd

data = !python3 dataframe.py data.csv
df = pd.DataFrame(data=data)[0].str.split(',', expand=True)
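
Note that splitting on commas keeps the header line as the first data row and leaves every value as a string. A minimal sketch of an alternative, assuming the script prints CSV to standard output as above, is to hand the captured text to pd.read_csv. Here subprocess.run stands in for the notebook's ! so the snippet also runs outside IPython; the inline child script and its sample rows are made up for illustration.

```python
import io
import subprocess
import sys

import pandas as pd

# Stand-in for `python3 dataframe.py data.csv`: a child process that prints
# CSV text to standard output (the sample rows are hypothetical).
child_code = "print('a,b'); print('1,2'); print('3,4')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
)

# Parse the captured text as CSV, so the first line becomes the column
# names and numeric columns get numeric dtypes.
df = pd.read_csv(io.StringIO(result.stdout))
```

In the notebook, the same idea should work with the SList directly: pd.read_csv(io.StringIO(data.n)), where .n is the SList attribute that joins the captured lines with newlines.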

If the data is already in CSV format anyway, you could skip the script and simply read the file in the notebook:

df = pd.read_csv('data.csv')
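
If the script does more than read a CSV, another option is to import its function rather than run it as a subprocess, so the return value arrives as an ordinary DataFrame. The sketch below is self-contained: it writes a throwaway copy of the module and a tiny CSV to a temporary directory, both hypothetical stand-ins for the real dataframe.py and data.csv.

```python
import sys
import tempfile
from pathlib import Path

# Throwaway directory holding a copy of the module and a tiny CSV
# (hypothetical stand-ins for the real dataframe.py and data.csv).
workdir = Path(tempfile.mkdtemp())
(workdir / "dataframe.py").write_text(
    "import pandas as pd\n"
    "\n"
    "def return_dataframe(f):\n"
    "    return pd.read_csv(f)\n"
)
(workdir / "data.csv").write_text("a,b\n1,2\n3,4\n")

# Make the directory importable, as the notebook's own folder would be.
sys.path.insert(0, str(workdir))
from dataframe import return_dataframe

# The function's return value is a DataFrame; no text parsing is needed.
df = return_dataframe(workdir / "data.csv")
```

With dataframe.py sitting next to the notebook, this reduces to the import line followed by df = return_dataframe('data.csv'). The if __name__ == '__main__' guard in the script keeps the command-line code from running on import.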
mechanical_meat answered Mar 08 '26 21:03
