I am dealing with a huge DataFrame and would like to avoid pickling it between user queries. Can I save the DataFrame in the Flask session and read it back from there, so that no pickling is needed?
I wrote the code below, but I get the error: [17578 rows x 319 columns] is not JSON serializable
#=====================================================================================
#=====================================================================================
@app.route('/start', methods=['GET', 'POST'])
def index():
    if 'catalogueDF' in session:
        if request.method == 'POST':
            query = request.get_json('query') # Read user query
            df = session['catalogueDF']
            result = str(list(set(df['brandname']))[2])
        else:
            query = request.args.get('query')
            result = 'User query: '+str(query)
    else:
        df = pd.read_excel('errorfree.xlsx', sheetname='Sheet1').fillna('NA')
        df = pd.DataFrame([df[col].astype(str, na=False).str.lower() for col in df]).transpose()
        session['catalogueDF'] = df
        result = 'no query posted yet'
    response = app.response_class(
        response=json.dumps(result),
        status=200,
        mimetype='application/json'
    )
    return response
# Flask start of app
if __name__ == '__main__':
    app.secret_key = os.urandom(24) # Sessions need encryption
    app.run(debug = True)
Just to clarify, it looks like you want to store a DataFrame in the Flask session.
The value stored in the session, i.e. session['my_object_name'], has to be serializable. By default Flask serializes session data to JSON, which is why a DataFrame is rejected.
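To see why the assignment fails, the check below is a minimal sketch that mimics what a JSON-based session has to do with each value. The helper name is_json_serializable and the sample data are made up for illustration; they are not part of Flask or pandas.

```python
import json

import pandas as pd


def is_json_serializable(value):
    """Return True if json.dumps accepts the value, False otherwise."""
    try:
        json.dumps(value)
    except TypeError:
        # json.dumps raises TypeError for objects it does not know how to encode
        return False
    return True


# Hypothetical sample data standing in for the real catalogue
df = pd.DataFrame({'brandname': ['acme', 'globex']})
plain = {'brandname': ['acme', 'globex']}

df_ok = is_json_serializable(df)        # a DataFrame is not a JSON type
plain_ok = is_json_serializable(plain)  # dicts, lists and strings are
```

Anything that passes this check is built only from dicts, lists, strings, numbers, booleans and None, which is the shape the session value needs to have.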
I find it easiest to convert it into a dictionary before saving it in the session object:
dict_obj = df.to_dict('list')
session['data'] = dict_obj
To retrieve the session object in another function as a dataframe, convert the dictionary back to the original dataframe:
dict_obj = session.get('data', {})  # fall back to an empty dict, which gives an empty DataFrame
df = pd.DataFrame(dict_obj)
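The round trip described above can be tried outside Flask. This is only a sketch: it simulates the session with json.dumps/json.loads, the helper names are hypothetical, and it assumes all columns hold strings (as in the question) and that the index does not need to be preserved, since to_dict('list') drops it.

```python
import json

import pandas as pd


def df_to_session_value(df):
    """Turn a DataFrame into a dict of column name -> list of values."""
    return df.to_dict('list')


def session_value_to_df(value):
    """Rebuild a DataFrame; a missing value gives an empty DataFrame."""
    return pd.DataFrame(value or {})


# Hypothetical sample data standing in for the real catalogue
df = pd.DataFrame({'brandname': ['acme', 'globex'], 'price': ['10', '20']})

# Simulate what a JSON-backed session does between two requests
stored = json.loads(json.dumps(df_to_session_value(df)))
restored = session_value_to_df(stored)
```

Columns containing timestamps or other non-JSON types would need to be converted to strings before storing them this way.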
Note: this method only works with pandas 0.24.2 or lower. to_msgpack was deprecated in pandas 0.25 and removed in 1.0.
If I understand your question, you need to store a DataFrame in the Flask session. Unfortunately, Flask sessions don't understand pandas DataFrames.
However, if you really need to keep it there, you can store it as binary data using MessagePack.
data = df.to_msgpack()
session['data'] = data
To read it back from MessagePack:
df1 = pd.read_msgpack(session['data'])
Another idea: write the DataFrame to a StringIO buffer and store the resulting text in the session.
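A minimal sketch of the StringIO idea, assuming CSV as the text format and string-only columns as in the question. The helper names are hypothetical. keep_default_na=False is used so that literal values such as 'na' are not turned into NaN when reading back.

```python
from io import StringIO

import pandas as pd


def df_to_text(df):
    """Write the DataFrame as CSV into an in-memory buffer and return the text."""
    buffer = StringIO()
    df.to_csv(buffer, index=False)
    return buffer.getvalue()


def text_to_df(text):
    """Read the CSV text back, keeping every column as strings."""
    return pd.read_csv(StringIO(text), dtype=str, keep_default_na=False)


# Hypothetical sample data standing in for the real catalogue
df = pd.DataFrame({'brandname': ['acme', 'na'], 'price': ['10', '20']})

text = df_to_text(df)      # a plain string, which a JSON session can hold
restored = text_to_df(text)
```

In a view, the string would be stored with something like session['data'] = df_to_text(df) and rebuilt with text_to_df(session['data']).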
PS. Before you decide to use sessions, check the size of the data first. Flask's default session lives in a browser cookie, and browsers limit a cookie to about 4 KB.
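One rough way to check the size is to measure the JSON text of the value before putting it in the session. This is only an estimate under stated assumptions: the real cookie is also signed and encoded, the 4000-byte limit is an illustrative figure rather than a Flask constant, and the helper names are made up.

```python
import json

import pandas as pd

ASSUMED_COOKIE_LIMIT = 4000  # illustrative figure, not taken from Flask


def payload_size(value):
    """Size in bytes of the value once dumped to JSON."""
    return len(json.dumps(value).encode('utf-8'))


def fits_in_cookie(value, limit=ASSUMED_COOKIE_LIMIT):
    """True if the JSON payload stays within the assumed limit."""
    return payload_size(value) <= limit


# Hypothetical data: a tiny frame and one with a few thousand rows
small = pd.DataFrame({'brandname': ['acme', 'globex']}).to_dict('list')
large = pd.DataFrame({'brandname': ['acme'] * 5000}).to_dict('list')

small_fits = fits_in_cookie(small)
large_fits = fits_in_cookie(large)
```

A frame of the size mentioned in the question (17578 rows by 319 columns) is far beyond what a cookie can hold, so it would need server-side storage rather than the default session.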