How to plot the difference of two distributions in seaborn?

Tags: python, matplotlib, seaborn

I have the following code to compare two distributions:

import seaborn as sns

sns.kdeplot(df['term'][df['outcome'] == 0], shade=True, color='red')
sns.kdeplot(df['term'][df['outcome'] == 1], shade=True, color='green')

It looks like this:

[Image: the two shaded KDE curves, outcome 0 in red and outcome 1 in green]

How do I plot just the difference between the two distributions (disA - disB)? Of course, it could contain negative values.

asked Mar 26 '18 09:03

mllamazares

1 Answer

Since the difference between two KDE curves is not itself a KDE curve, you cannot use kdeplot to plot it.

A KDE is easily calculated with scipy.stats.gaussian_kde, and the result is easily plotted with pyplot.

import numpy as np
import matplotlib.pyplot as plt
import scipy.stats

np.random.seed(0)  # make the sample data reproducible

# Two sample datasets drawn from different Gumbel distributions
a = np.random.gumbel(80, 25, 1000)
b = np.random.gumbel(90, 46, 4000)

# Fit a Gaussian kernel density estimate to each sample
kdea = scipy.stats.gaussian_kde(a)
kdeb = scipy.stats.gaussian_kde(b)

# Evaluate both estimates on a common grid
grid = np.linspace(0, 500, 501)

plt.plot(grid, kdea(grid), label="kde A")
plt.plot(grid, kdeb(grid), label="kde B")
plt.plot(grid, kdea(grid) - kdeb(grid), label="difference")

plt.legend()
plt.show()

[Image: plot of kde A, kde B, and their difference]

Note that the result is just the pointwise difference between the two curves, as asked for; it has no statistical relevance at all.
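
For the DataFrame in the question, the same approach could look roughly like the sketch below. The real data is not shown, so the df here is a synthetic stand-in (an assumption: 'term' is numeric and 'outcome' is 0 or 1), and kde_difference is just a helper name made up for this example, not a seaborn or scipy function.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde


def kde_difference(a, b, grid):
    """Return kde(a) - kde(b), evaluated pointwise on grid (hypothetical helper)."""
    return gaussian_kde(a)(grid) - gaussian_kde(b)(grid)


# Synthetic stand-in for the question's DataFrame (assumption: the real
# 'term' and 'outcome' columns are not shown in the question).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "term": np.concatenate([rng.normal(36, 8, 500), rng.normal(48, 10, 500)]),
    "outcome": np.repeat([0, 1], 500),
})

term0 = df.loc[df["outcome"] == 0, "term"].to_numpy()
term1 = df.loc[df["outcome"] == 1, "term"].to_numpy()

# Pad the grid beyond the data range so the tails of both curves are covered.
pad = 3 * df["term"].std()
grid = np.linspace(df["term"].min() - pad, df["term"].max() + pad, 500)

diff = kde_difference(term0, term1, grid)

fig, ax = plt.subplots()
ax.plot(grid, diff, color="black", label="kde(outcome 0) - kde(outcome 1)")
ax.axhline(0, color="grey", linewidth=0.8)  # zero line makes sign changes visible
ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to see the figure. Because each KDE integrates to 1, the difference curve integrates to roughly 0 over a grid that covers both tails, so it necessarily dips below zero somewhere unless the two curves coincide.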

answered Oct 06 '22 06:10

ImportanceOfBeingErnest
