How to use tqdm with pandas in a jupyter notebook?

I'm doing some analysis with pandas in a Jupyter notebook, and since my apply function takes a long time, I would like to see a progress bar. Through another post I found the tqdm library, which provides a simple progress bar for pandas operations. There is also a Jupyter integration that provides a really nice progress bar where the bar itself changes over time.

However, I would like to combine the two and don't quite get how to do that. Let's just take the same example as in the documentation:

import pandas as pd
import numpy as np
from tqdm import tqdm

df = pd.DataFrame(np.random.randint(0, 100, (100000, 6)))

# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")

# Now you can use `progress_apply` instead of `apply`
# and `progress_map` instead of `map`
df.progress_apply(lambda x: x**2)
# can also groupby:
# df.groupby(0).progress_apply(lambda x: x**2)

The comment even says "can use `tqdm_notebook`", but I can't find a way to do it. I've tried a few things like

tqdm_notebook(tqdm.pandas(desc="my bar!")) 

or

tqdm_notebook.pandas 

but they don't work. In the definition it looks to me like

tqdm.pandas(tqdm_notebook(desc="my bar!")) 

should work, but the bar doesn't properly show the progress and there is still additional output.

Any other ideas?

grinsbaeckchen asked Nov 07 '16 23:11




2 Answers

My working solution (copied from the documentation):

from tqdm.auto import tqdm
tqdm.pandas()
Vincenzo Lavorini answered Sep 17 '22 20:09
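
For readers who want to see the two lines from this answer in context, here is a minimal sketch that combines them with the example from the question. It assumes a tqdm version recent enough to ship `tqdm.auto`; the smaller frame size and the `squared` variable are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd
from tqdm.auto import tqdm  # meant to pick the notebook bar in Jupyter, the text bar elsewhere

# Smaller frame than in the question so the sketch runs quickly (illustrative size)
df = pd.DataFrame(np.random.randint(0, 100, (1000, 6)))

# Register progress_apply / progress_map on pandas objects
tqdm.pandas(desc="my bar!")

# Same call as in the question, using whichever bar tqdm.auto selected
squared = df.progress_apply(lambda x: x**2)
```

Inside Jupyter this should render the widget-style bar, provided the notebook widgets it relies on are installed; in a plain terminal it is expected to fall back to the text bar.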



You can use:

tqdm_notebook().pandas(*args, **kwargs) 

This is because tqdm_notebook has a delayer adapter, so it's necessary to instantiate it before accessing its methods (including class methods).

In the future (>v5.1), you should be able to use a more uniform API:

tqdm_pandas(tqdm_notebook, *args, **kwargs) 
gaborous answered Sep 19 '22 20:09
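
The point about instantiating before accessing methods can be illustrated with a toy model. This is only a sketch under the assumption that the adapter is a plain function that defers to the real class when called; `Bar` and `bar_notebook` are hypothetical names and are not tqdm's actual source.

```python
class Bar:
    """Hypothetical stand-in for a progress bar class."""

    def __init__(self, desc=None):
        self.desc = desc

    @classmethod
    def pandas(cls, **kwargs):
        # A real implementation would patch pandas; here we only report the call
        return (cls.__name__, kwargs)


def bar_notebook(*args, **kwargs):
    # Delayed adapter (assumed shape): a plain function that defers to the class
    return Bar(*args, **kwargs)


# The adapter is a function, so the class method is not reachable on it
has_method = hasattr(bar_notebook, "pandas")

# Calling the adapter first yields an instance, and class methods can be
# reached through an instance
result = bar_notebook().pandas(desc="my bar!")
```

In this model `bar_notebook.pandas` fails with an AttributeError, while `bar_notebook().pandas(...)` works, which mirrors the pattern described in the answer above.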
