Measure execution time in Jupyter Notebook: %timeit, %%timeit. In Jupyter Notebook (IPython), you can use the magic commands %timeit and %%timeit to measure the execution time of your code. There is no need to import the timeit module yourself.
%%time reports how long a single run of the cell took, printing both CPU and wall-clock time.
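For example (the summing expression here is just an arbitrary placeholder workload):

%timeit sum(range(1000))    # line magic: reruns the statement many times and reports an average

%%time
total = sum(range(1000))    # cell magic: runs the cell once and prints CPU and wall-clock time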
To measure time elapsed during a program's execution, use the time.time() or time.perf_counter() functions. (time.clock(), which the older Python docs recommended for benchmarking, was deprecated and removed in Python 3.8; perf_counter() is its replacement.)
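A minimal sketch with time.perf_counter(), where the sum is a stand-in for your own code:

import time

start = time.perf_counter()
total = sum(i * i for i in range(10**6))   # placeholder workload
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.4f} s")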
The only way I found to work around this problem is to wrap the last statement of the cell in a print call.
Do not forget that cell magic starts with %% and line magic starts with %.
%%time
from sklearn import tree  # assumes X_train, y_train and X_test are already defined

clf = tree.DecisionTreeRegressor().fit(X_train, y_train)
res = clf.predict(X_test)
print(res)
Notice that this caveat applies to %%timeit: because the cell body is run repeatedly inside the timing machinery, any variables assigned in it are not available in subsequent cells, which is counter-intuitive when you are building a pipeline. With %%time the cell runs once in the normal namespace, so its assignments do persist.
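A quick sketch of the difference, using a made-up variable name:

%%timeit
squares = [i * i for i in range(1000)]   # assigned inside the timing loop

# In the next cell, this raises NameError, because %%timeit ran the body in a
# temporary function scope; under %%time the assignment would still be visible:
print(squares)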
An easier way is to use the ExecuteTime plugin from the jupyter_contrib_nbextensions package:
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user
jupyter nbextension enable execute_time/ExecuteTime
%time and %timeit are now part of IPython's built-in magic commands.
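For example, on an arbitrary one-liner:

%time sum(range(10**6))     # runs the statement once and prints CPU times and Wall time
%timeit sum(range(10**6))   # runs it many times and prints a mean and standard deviation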
Use cell magic and this project on GitHub by Phillip Cloud:
Load it by putting this at the top of your notebook or put it in your config file if you always want to load it by default:
%install_ext https://raw.github.com/cpcloud/ipython-autotime/master/autotime.py
%load_ext autotime
Once loaded, the output of every subsequent cell execution will include the time it took to run.
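Note that %install_ext has been removed from recent IPython releases; assuming the ipython-autotime package on PyPI, the equivalent modern steps are:

pip install ipython-autotime

%load_ext autotime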
import time

start = time.time()
# the code you want to test goes here
end = time.time()
print(end - start)
You can use the %timeit magic function for that.
%timeit CODE_LINE
Or on the cell
%%timeit
SOME_CELL_CODE
Check more IPython magic functions at https://nbviewer.jupyter.org/github/ipython/ipython/blob/1.x/examples/notebooks/Cell%20Magics.ipynb
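As a concrete sketch, with a throwaway snippet standing in for CODE_LINE and SOME_CELL_CODE:

%timeit -n 100 sum(range(1000))   # -n sets the number of loops per timing run

%%timeit -r 3
data = [i * i for i in range(1000)]   # -r sets the number of repeats
max(data)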
I simply added %%time at the beginning of the cell and got the time. You can do the same on a Jupyter Spark cluster or in a virtual environment: just add %%time at the top of the cell and you will get the output. On a Spark cluster with Jupyter, I added it to the top of the cell and got output like the one below:
%%time
import pandas as pd
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
import numpy as np
.... code ....
Output:
CPU times: user 59.8 s, sys: 4.97 s, total: 1min 4s
Wall time: 1min 18s