
Flask - Store values in memory between requests

I have a single-page application, AngularJS on the front end and Flask on the back end, which lets the user upload a file (xlsx, csv, ...) and then interactively analyze/query it.

Essentially, the user loads the file into memory on the first upload, and subsequent ajax calls tap into this file in memory. I'm not sure how to keep the file in memory between those subsequent ajax requests.

The g variable is erased after each request and, if I understand correctly, is meant for sharing values within a single request (usually set in before_request and available throughout the views).

The request context is local to the request. I did manage to set the value on current_app and was then able to access it in my subsequent ajax calls:

# On the first file upload, I load the file into memory
# and set it as an attribute on current_app:

from flask import current_app

@app.route('/upload', methods=['POST'])
def upload():
    ...
    # load the uploaded file into memory
    ...
    current_app.file = file_in_memory


@app.route('/subsequent_call')
def subsequent():
    # I'm able to access the file in memory through
    # the current_app.file attribute which I set earlier.
    # Wrapped in str() because a Flask view cannot return a bare int.
    return str(current_app.file.number_of_lines())

This method of storing the file in memory on current_app just doesn't seem right and feels too dirty/hackish. Would this scale at all?

I could pickle the file after every request and load it back on each request. But storing/pickling the file and re-reading it into memory every time, while a user is interactively querying the data, seems too heavy/inefficient.

Is there a more elegant/correct way to do this (app_context, werkzeug locals, etc.)? Or am I thinking about it all wrong?

Shankar ARUL asked Jul 14 '15 13:07




2 Answers

Storing the file this way is not going to work if your web server spawns multiple processes (workers) to handle requests, which is how most production servers work.

Furthermore, keeping the file object in memory is not going to scale as your server load increases. Instead, you can save the file to the file system and initialize the pandas object on every request. You can compare this with loading a pickled object and see which is faster. You will also have to consider the overhead of pickling, not just unpickling.

EDIT: explanation of why it won't work in production

Gunicorn and similar web servers are likely to spawn multiple workers unless you restrict this in the config. A worker is essentially a separate process, and each process has its own Python execution environment. So let's say your first request hits worker1 and you set current_app.file = file_in_memory in that process. Your second request could then hit worker2, which has its own Python execution environment where that variable is not available, because variables are not shared across processes. In fact, there might be a value in that variable, but one that belongs to a different user's request.
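
The save-to-disk approach described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: it uses the standard-library csv module in place of pandas so that it runs without extra packages, and the names STORAGE_DIR, save_upload and load_rows are hypothetical. In a Flask view, the token would be returned to the client on upload and sent back with each later ajax call.

```python
import csv
import tempfile
import uuid
from pathlib import Path

# Hypothetical storage location; a real deployment would choose a directory
# that every worker process can read and write.
STORAGE_DIR = Path(tempfile.gettempdir()) / "uploads_demo"


def save_upload(data):
    """Write the uploaded bytes to disk and return a token identifying them."""
    STORAGE_DIR.mkdir(parents=True, exist_ok=True)
    token = uuid.uuid4().hex
    (STORAGE_DIR / f"{token}.csv").write_bytes(data)
    return token


def load_rows(token):
    """Re-read the stored file for one request; any worker process can do this."""
    if not token.isalnum():
        # Reject tokens that could escape STORAGE_DIR (e.g. "../secret").
        raise ValueError("invalid token")
    path = STORAGE_DIR / f"{token}.csv"
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Simulate one upload followed by one later query.
token = save_upload(b"name,score\nalice,3\nbob,5\n")
rows = load_rows(token)
print(len(rows), rows[0]["name"])
```

Because the state lives on disk rather than in one process, it does not matter which worker handles the follow-up request. The cost is re-parsing the file each time, which is the overhead the answer suggests measuring.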

So all in all:

  1. It does not guarantee that the same object is available across requests
  2. It could get overwritten by another user who is using your app at the same time
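
The second point can be seen even inside a single process. The sketch below is an illustration under stated assumptions: app_state is a SimpleNamespace standing in for current_app, and upload_shared / upload_keyed are hypothetical helpers, not Flask APIs.

```python
from types import SimpleNamespace

# Stand-in for Flask's current_app inside ONE worker process (an assumption
# made so the sketch runs without a web server).
app_state = SimpleNamespace(file=None, files_by_user={})


def upload_shared(user_id, contents):
    # Mirrors `current_app.file = file_in_memory`: a single slot for everyone,
    # so user_id is deliberately ignored.
    app_state.file = contents


def upload_keyed(user_id, contents):
    # Keyed by user, so one user's upload does not replace another's.
    app_state.files_by_user[user_id] = contents


upload_shared("alice", "alice.csv contents")
upload_shared("bob", "bob.csv contents")
print(app_state.file)  # alice's next query would now see bob's upload

upload_keyed("alice", "alice.csv contents")
upload_keyed("bob", "bob.csv contents")
print(app_state.files_by_user["alice"])
```

Keying by user only addresses the second point. The first point still applies, because the dictionary lives in one worker's memory and other workers never see it.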
shreyas answered Oct 18 '22 01:10


Although I am answering this question very late, it may still help others. To share values between requests, you should rely on a cache. Caching in Flask is simple to implement using flask-cache (since succeeded by the Flask-Caching fork) with Redis as the backing store. It would be the most efficient and reliable way of doing it, since every worker process talks to the same cache. Read any article on using Redis as a cache database in Flask for further reference.
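
As a rough sketch of the idea, assuming a cache client that offers get(name) and set(name, value, ex=seconds) in the style of a Redis client: the DictBackend class below is an in-memory stand-in so the example runs without a Redis server, and store_parsed / fetch_parsed are hypothetical helper names. This is not the flask-cache API itself.

```python
import pickle


class DictBackend:
    """In-memory stand-in exposing a get/set subset similar to a Redis client."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, ex=None):
        # `ex` (expiry in seconds) is accepted but ignored by this stand-in.
        self._data[name] = value

    def get(self, name):
        return self._data.get(name)


def store_parsed(backend, session_id, parsed, ttl_seconds=3600):
    """Serialize the parsed upload and cache it under a per-session key."""
    backend.set(f"upload:{session_id}", pickle.dumps(parsed), ex=ttl_seconds)


def fetch_parsed(backend, session_id):
    """Return the cached object, or None if nothing is stored for the session."""
    raw = backend.get(f"upload:{session_id}")
    return None if raw is None else pickle.loads(raw)


backend = DictBackend()
store_parsed(backend, "session-a", {"rows": [[1, 2], [3, 4]]})
print(fetch_parsed(backend, "session-a"))
print(fetch_parsed(backend, "session-b"))
```

With a real shared cache in place of DictBackend, every worker reads the same data, and the per-session key keeps users from overwriting each other. Pickle should only be used on data the application itself wrote, since unpickling untrusted bytes can execute arbitrary code.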

Mousam Singh answered Oct 18 '22 03:10