
How do I pass a Python pandas.DataFrame object as an argument to a Celery task?

I would like to pass a pandas DataFrame object as an argument to a Celery task. Is there a way to achieve this? I understand that DataFrame objects are not JSON serializable and therefore cannot be passed as arguments with my current setup.

Shetty asked Jan 29 '18 22:01


2 Answers

It looks like I can use the pandas.DataFrame.to_json() method [1] to convert a given DataFrame to JSON. Once I pass the JSON string to my Celery task, I can use pd.read_json() [2] to get the pandas.DataFrame object back.
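A minimal sketch of that round trip might look like the following. The helper names, the "value" column, and process_frame are hypothetical; process_frame is a plain function standing in for a task body, and the choice of orient="split" is an assumption rather than something the answer specifies.

```python
import io

import pandas as pd


def dataframe_to_payload(df: pd.DataFrame) -> str:
    # orient="split" stores columns, index, and data as separate JSON keys.
    return df.to_json(orient="split")


def payload_to_dataframe(payload: str) -> pd.DataFrame:
    # read_json accepts file-like objects, so the string is wrapped in
    # StringIO; newer pandas versions warn about literal JSON strings.
    return pd.read_json(io.StringIO(payload), orient="split")


def process_frame(payload: str) -> int:
    # Hypothetical task body. In a Celery app this function would carry the
    # task decorator and receive the JSON string as its argument.
    df = payload_to_dataframe(payload)
    return int(df["value"].sum())


df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})
payload = dataframe_to_payload(df)  # plain str, safe for a JSON serializer
total = process_frame(payload)
```

Only the string crosses the task boundary here, so the default JSON serializer never sees a DataFrame.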

[1] https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html

[2] https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html

Shetty answered Sep 28 '22 01:09


to_json didn't work for me: strings were automatically parsed as datetimes, so my result was invalid.

I'm trying to serialize and deserialize a Celery result, which is a similar problem. to_dict works for me.

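import pandas as pd
# Convert to a plain dict before sending, then rebuild the DataFrame afterwards.
df_as_dict = df.to_dict()
df = pd.DataFrame.from_dict(df_as_dict)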
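As a rough illustration of the dict approach, the sketch below pushes the dict through json.dumps and json.loads to imitate what a JSON serializer does in transit. The sample data and variable names are made up, and orient="list" is an assumption that differs from the default used above: with the default orientation, integer index labels become string keys once they pass through JSON, whereas orient="list" simply drops the index.

```python
import json

import pandas as pd

# Hypothetical sample data with date-like strings that should stay strings.
df = pd.DataFrame({"code": ["2018-01", "2018-02"], "count": [5, 7]})

# orient="list" yields {column: [values, ...]} built from plain Python types.
payload = df.to_dict(orient="list")

# Imitate the trip through a JSON serializer and back.
wire = json.loads(json.dumps(payload))

# Rebuild the DataFrame; the index is a fresh default range index.
restored = pd.DataFrame.from_dict(wire)
```

Because no JSON-to-DataFrame parser is involved, the "code" strings come back exactly as they went in.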
Tom Wojcik answered Sep 28 '22 01:09