I have been googling for a while and couldn't figure out a way to do this. I have a simple Flask app which takes a CSV file, reads it into a Pandas dataframe, converts it, and outputs the result as a new CSV file. I have managed to upload and convert it successfully with this HTML:
<div class="container">
  <form method="POST" action="/convert" enctype="multipart/form-data">
    <div class="form-group">
      <br />
      <input type="file" name="file">
      <input type="submit" name="upload"/>
    </div>
  </form>
</div>
where after I click submit, it runs the conversion in the background for a while and automatically triggers a download once it's done. The code that takes the result_df and triggers download looks like
from flask import make_response, request
import pandas as pd

@app.route('/convert', methods=["POST"])
def convert():
    if request.method == 'POST':
        # Read uploaded file to df
        input_csv_f = request.files['file']
        input_df = pd.read_csv(input_csv_f)
        # TODO: Add progress bar for pd_convert
        result_df = pd_convert(input_df)
        if result_df is not None:
            # Return the converted CSV as a file download
            resp = make_response(result_df.to_csv())
            resp.headers["Content-Disposition"] = "attachment; filename=export.csv"
            resp.headers["Content-Type"] = "text/csv"
            return resp
I'd like to add a progress bar to pd_convert, which is essentially a pandas apply operation. I found that tqdm works with pandas now and has a progress_apply method to use instead of apply. But I'm not sure if it is relevant for making a progress bar on a web page. I guess it should be, since it works in Jupyter notebooks. How do I add a progress bar for pd_convert() here?
The ultimate result I want is:
1 and 2 are done now. Then the next question is how to trigger the download. For now, my convert function triggers the download with no problem because the response is formed with a file. If I want to render the page, I form a response with return render_template(...). Since I can only have one response, is it possible to have 3 and 4 with only one call to /convert?
Not a web developer, still learning about the basics. Thanks in advance!
====EDIT====
I tried the example here with some modifications. I get the progress from the row index in a for loop over the dataframe and put it in Redis. The client reads the progress from Redis through an event stream served by a new endpoint, /progress. Something like
@app.route('/progress')
def progress():
    """Get percentage progress for the dataframe process"""
    r = redis.StrictRedis(
        host=redis_host, port=redis_port, password=redis_password, decode_responses=True)
    r.set("progress", str(0))
    # TODO: Problem, 2nd submit doesn't clear progress to 0%. How to make independent progress for each client and clear to 0% on each submit
    def get_progress():
        p = int(r.get("progress"))
        while p <= 100:
            p = int(r.get("progress"))
            p_msg = "data:" + str(p) + "\n\n"
            yield p_msg
            logging.info(p_msg)
            if p == 100:
                r.set("progress", str(0))
            time.sleep(1)
    return Response(get_progress(), mimetype='text/event-stream')
It is currently working but with some issues. The reason is definitely my lack of understanding in this solution.
Issues:
• The progress is not reset to 0 when the submit button is pressed. I tried several places to reset it to 0 but haven't found a working version yet. It's definitely related to my lack of understanding of how the stream works. Now it only resets when I refresh the page.
• The progress is not independent for each client. I'm thinking of using a job_id for each submit event and making it the key in Redis. Since I don't need the entry after each job is done, I will just delete the entry after it's done.
I feel the part I'm missing is an understanding of text/event-stream. I feel I'm close to a working solution. Please share your opinion on what the "proper" way to do this is. I'm just guessing and trying to put together something that works with my very limited understanding.
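One way the job_id idea above might be sketched: each submit gets its own key, and the stream generator reads only that key and deletes it once the job reaches 100. The names new_job_id, progress_key, and sse_progress are hypothetical, and store is assumed to be any object with redis-py style get/set/delete methods (a redis.StrictRedis client created with decode_responses=True should fit). An in-memory stand-in is used here so the sketch runs without a Redis server.

```python
import time
import uuid


class DictStore:
    """In-memory stand-in for the get/set/delete subset of a Redis client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = str(value)

    def delete(self, key):
        self._data.pop(key, None)


def new_job_id():
    """Hypothetical helper: one random id per submit."""
    return uuid.uuid4().hex


def progress_key(job_id):
    """Hypothetical key layout: one progress entry per job."""
    return "progress:" + job_id


def sse_progress(store, job_id, poll_seconds=1.0):
    """Yield text/event-stream messages for one job until it reaches 100."""
    key = progress_key(job_id)
    while True:
        raw = store.get(key)
        p = int(raw) if raw is not None else 0  # missing key: job not started yet
        done = p >= 100
        if done:
            store.delete(key)  # the entry is not needed once the job is done
        yield "data:" + str(p) + "\n\n"  # a blank line terminates each event
        if done:
            break
        time.sleep(poll_seconds)
```

In Flask, the generator could be returned as Response(sse_progress(r, job_id), mimetype='text/event-stream') from a route that receives the job_id (for example in the URL, which is an assumption here); the page would need to know the same job_id when it opens its EventSource.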
OK, I narrowed down the problems I was missing and figured it out. The concepts I needed include
Backend
• A /progress endpoint for an event stream (HTML5)
• A text/event-stream MIME type response
Frontend
The sample HTML
<script>
  function getProgress() {
    var source = new EventSource("/progress");
    source.onmessage = function(event) {
      $('.progress-bar').css('width', event.data+'%').attr('aria-valuenow', event.data);
      $('.progress-bar-label').text(event.data+'%');
      // Event source closed after hitting 100%
      if(event.data == 100){
        source.close()
      }
    }
  }
</script>
<body>
  <div class="container">
    ...
    <form method="POST" action="/autoattr" enctype="multipart/form-data">
      <div class="form-group">
        ...
        <input type="file" name="file">
        <input type="submit" name="upload" onclick="getProgress()" />
      </div>
    </form>
    <div class="progress" style="width: 80%; margin: 50px;">
      <div class="progress-bar progress-bar-striped active"
           role="progressbar" aria-valuenow="0" aria-valuemin="0" aria-valuemax="100" style="width: 0%">
        <span class="progress-bar-label">0%</span>
      </div>
    </div>
  </div>
</body>
Sample backend Flask code
import time

import redis
from flask import Response

redis_host = "localhost"
redis_port = 6379
redis_password = ""
r = redis.StrictRedis(
    host=redis_host, port=redis_port, password=redis_password, decode_responses=True)

@app.route('/progress')
def progress():
    """Get percentage progress for auto attribute process"""
    r.set("progress", str(0))
    def progress_stream():
        p = int(r.get("progress"))
        while p < 100:
            p = int(r.get("progress"))
            p_msg = "data:" + str(p) + "\n\n"
            yield p_msg
            # Client closes EventSource on 100%, gets reopened when `submit` is pressed
            if p == 100:
                r.set("progress", str(0))
            time.sleep(1)
    return Response(progress_stream(), mimetype='text/event-stream')
The rest is the code for Pandas for loop writing to Redis.
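That loop is not shown in this answer, so here is only a rough sketch of what it might look like, assuming the conversion is a per-row function. convert_with_progress and row_func are hypothetical names, and store is assumed to be anything with a redis-py style set method, such as the r client above. A small recording stand-in is included so the sketch runs without Redis.

```python
import pandas as pd


class RecordingStore:
    """Stand-in for a Redis client: remembers every value passed to set()."""

    def __init__(self):
        self.values = []

    def set(self, key, value):
        self.values.append(value)


def convert_with_progress(df, row_func, store, key="progress"):
    """Apply row_func to each row, writing integer percent progress to the store."""
    total = len(df)
    results = []
    last_pct = -1
    for i, (_, row) in enumerate(df.iterrows(), start=1):
        results.append(row_func(row))
        pct = i * 100 // total
        if pct != last_pct:  # write only when the percentage actually changes
            store.set(key, str(pct))
            last_pct = pct
    if total == 0:
        store.set(key, "100")  # nothing to process, so report completion
    return pd.Series(results, index=df.index)
```

Writing only when the integer percentage changes keeps the number of Redis writes at or below 101 per job, however many rows the dataframe has.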
I pieced together a lot of the results from hours of Googling so I feel it's best to document here for people who also need this basic feature: add a progress bar in a Flask web app for Pandas dataframe processing.
Some useful references
• https://medium.com/code-zen/python-generator-and-html-server-sent-events-3cdf14140e56
• https://codeburst.io/polling-vs-sse-vs-websocket-how-to-choose-the-right-one-1859e4e13bd9
• What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?