Seaborn lineplot high cpu; very slow compared to matplotlib

Tags: performance, python, pandas, matplotlib, seaborn

I have the following dataframe.

In [12]: dfFinal
Out[12]: 
           module                                            vectime                                           vecvalue
1906  client1.tcp  [1.1007512, 1.1015024, 1.1022536, 1.1030048, 1...  [0.0007512, 0.0007512, 0.0007512, 0.0007512, 0...
1912  client2.tcp  [1.10079784, 1.10159568, 1.10239352, 1.1031913...  [0.00079784, 0.00079784, 0.00079784, 0.0007978...
1918  client3.tcp  [1.10084448, 1.10168896, 1.10258008, 1.1036111...  [0.00084448, 0.00084448, 0.00089112, 0.0010310...

I want to plot the time series vecvalue vs. vectime for each module.

The result is the following: [image: https://i.stack.imgur.com/EydIw.png]

To do so I can do as follows:

1) Matplotlib

import datetime
import matplotlib.pyplot as plt

start = datetime.datetime.now()

for row in dfFinal.itertuples():
    t = row.vectime
    x = row.vecvalue
    x = runningAvg(x)  # runningAvg is defined elsewhere
    plt.plot(t, x)

total = (datetime.datetime.now() - start).total_seconds()
print("Total time: ", total)

This takes 0.07005 seconds.

2) Seaborn

import datetime
import pandas as pd
import seaborn as sns

start = datetime.datetime.now()

for row in dfFinal.itertuples():
    t = row.vectime
    x = row.vecvalue
    x = runningAvg(x)  # runningAvg is defined elsewhere
    DF = pd.DataFrame({'x': x, 't': t})
    sns.lineplot(x='t', y='x', data=DF)

total = (datetime.datetime.now() - start).total_seconds()
print("Total time: ", total)

This takes 19.157463 seconds.

Why is there such a huge difference? What am I doing wrong that makes such a small DataFrame take so long to plot?


asked May 16 '19 14:05

Lucas Aimaretto

1 Answer

Set ci=None in the call to lineplot; otherwise, confidence intervals are computed for every plot, which triggers some expensive (and unnecessary) df.groupby calls.

An aside: the snakeviz module is a great tool for quickly finding computational bottlenecks.
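
As a rough illustration of that workflow: snakeviz visualizes profile files written by the standard-library cProfile module. In the sketch below, slow_step and the file name plot.prof are hypothetical placeholders; in practice the profiled code would be the plotting loop.

```python
import cProfile
import os
import pstats
import tempfile


def slow_step():
    """Hypothetical stand-in for the code being investigated."""
    return sum(i * i for i in range(200_000))


# Collect profiling data only around the code of interest.
profiler = cProfile.Profile()
profiler.enable()
result = slow_step()
profiler.disable()

# Save the stats to a file; it can then be opened with `snakeviz <path>`.
prof_path = os.path.join(tempfile.gettempdir(), "plot.prof")
profiler.dump_stats(prof_path)

# The same data can be inspected in the terminal, sorted by cumulative time.
stats = pstats.Stats(prof_path)
stats.sort_stats("cumulative").print_stats(5)
```

Running snakeviz on the saved file from a shell opens an interactive view in the browser, where the most expensive calls are the widest blocks.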


answered Sep 20 '22 15:09

Ross B.
