How to locate the median in a (seaborn) KDE plot?

I am trying to make a kernel density estimation (KDE) plot with seaborn and mark the median on it. The code looks something like this:

import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

sns.set_palette("hls", 1)
data = np.random.randn(30)
sns.kdeplot(data, shade=True)

# x_median, y_median = magic_function()
# plt.vlines(x_median, 0, y_median)

plt.show()

As you can see, I need a magic_function() to fetch the median x and y values from the kdeplot. I would then like to plot them with e.g. vlines, but I can't figure out how to do that. The result should look something like this (obviously the black median bar is wrong here):

[image: KDE plot with a black vertical bar marking the median]

I guess my question is not strictly related to seaborn and also applies to other kinds of matplotlib plots. Any ideas are greatly appreciated.

asked Mar 10 '15 05:03

n1000

1 Answer

You need to:

1. Extract the x and y data of the KDE line
2. Integrate it to obtain the cumulative distribution function (CDF)
3. Find the x value at which the CDF equals 1/2; that is the median

import numpy as np
import scipy.integrate  # import the submodule explicitly; a bare "import scipy" may not load it
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_palette("hls", 1)
data = np.random.randn(30)
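# draw only the line so that get_lines() has a curve to read;
# depending on the seaborn version, a filled plot may not add a line artist
p = sns.kdeplot(data)

x, y = p.get_lines()[0].get_data()

# shade the area under the curve by hand
p.fill_between(x, y, alpha=0.25)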

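# mind the argument order: y comes first, then x
# initial=0 prepends a 0 so the result has the same length as x
# (cumulative_trapezoid is the current name of the older cumtrapz)
cdf = scipy.integrate.cumulative_trapezoid(y, x, initial=0)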

nearest_05 = np.abs(cdf-0.5).argmin()

x_median = x[nearest_05]
y_median = y[nearest_05]

plt.vlines(x_median, 0, y_median)
plt.show()
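
If you would rather not depend on how seaborn stores the curve, the same three steps can be sketched directly with scipy. This is only a sketch under some assumptions: kde_median, grid_size and pad are made-up names for this example, and it uses scipy.stats.gaussian_kde with its default bandwidth, which should be close to (but is not guaranteed to be identical to) the curve seaborn draws. It also interpolates the CDF instead of picking the nearest grid point, so the median does not snap to the grid.

```python
import numpy as np
from scipy import integrate, stats


def kde_median(samples, grid_size=512, pad=3.0):
    """Return (x_median, y_median) for a Gaussian KDE fitted to `samples`.

    Hypothetical helper for illustration; not part of seaborn or scipy.
    """
    samples = np.asarray(samples, dtype=float)
    kde = stats.gaussian_kde(samples)  # default bandwidth (Scott's rule)

    # Evaluate the density on a grid that extends `pad` kernel widths
    # beyond the data, so that the tails are included in the integral.
    bw = np.sqrt(kde.covariance[0, 0])
    x = np.linspace(samples.min() - pad * bw, samples.max() + pad * bw, grid_size)
    y = kde(x)

    # Step 2: integrate the density to get the CDF, then rescale so it
    # ends at exactly 1 despite the truncated tails.
    cdf = integrate.cumulative_trapezoid(y, x, initial=0)
    cdf /= cdf[-1]

    # Step 3: interpolate the x value at which the CDF crosses 0.5.
    x_median = np.interp(0.5, cdf, x)
    y_median = kde(x_median)[0]
    return x_median, y_median


rng = np.random.default_rng(0)
data = rng.normal(size=30)
x_median, y_median = kde_median(data)
```

The returned pair can be passed to plt.vlines(x_median, 0, y_median) exactly as in the code above.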

answered Sep 19 '22 08:09

agomcas
