How do I find out what bin width was used when doing a distplot in Seaborn? I have two datasets that I would like to share bin widths, but I don't know how to return the default value used for the first dataset. For something like the simple example below, how would I find out the bin width used?
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
f, axs = plt.subplots(1, 1)
distribution = np.random.rand(1000)
sns.distplot(distribution, hist=True, kde_kws={"shade": True}, ax=axs)
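Seaborn uses the Freedman-Diaconis rule to calculate the bin width if the bins
parameter is not specified in the function seaborn.distplot().
The equation is as follows (from Wikipedia):
$$\text{Bin width} = 2\,\frac{\operatorname{IQR}(x)}{\sqrt[3]{n}}$$
where IQR(x) is the interquartile range of the data and n is the number of observations.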
We can calculate IQR and the cube-root of n with the following code.
Q1 = np.quantile(distribution, 0.25)
Q3 = np.quantile(distribution, 0.75)
IQR = Q3 - Q1
cube = np.cbrt(len(distribution))
The bin width is:
In[] : 2*IQR/cube
Out[]: 0.10163947994817446
Finally, we can now calculate the number of bins.
In[] : 1/(2*IQR/cube) # '1' is the range of the array for this example
Out[]: 9.838696543015526
Rounding the result up gives 10, which is our number of bins. We can now pass the bins
parameter to get the same number of bins (or the same bin width for the same range).
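The calculation above can be wrapped into a small helper that works for any data range, not only a range of 1. This is only a sketch: the function name, the cap of 50 bins, and the square-root fallback for a zero IQR are assumptions about how Seaborn 0.9 behaved internally, not something stated above. Because the bins are spread evenly over the data range, the width that actually gets drawn is the range divided by the rounded-up bin count, which can be slightly smaller than the raw Freedman-Diaconis width.

```python
import numpy as np

def freedman_diaconis_bins(values, max_bins=50):
    """Return (number_of_bins, drawn_bin_width) for a 1-D sample."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.quantile(values, [0.25, 0.75])
    width = 2 * (q3 - q1) / np.cbrt(values.size)  # Freedman-Diaconis width
    data_range = values.max() - values.min()
    if width == 0:
        # Zero IQR: fall back to sqrt(n) bins (assumed fallback)
        n_bins = int(np.sqrt(values.size))
    else:
        n_bins = int(np.ceil(data_range / width))
    # Keep the count between 1 and max_bins (the cap is an assumption)
    n_bins = max(1, min(n_bins, max_bins))
    # Equal-width bins over [min, max], so each bar is range / n_bins wide
    return n_bins, data_range / n_bins

rng = np.random.default_rng(0)
sample = rng.random(1000)
n_bins, bin_width = freedman_diaconis_bins(sample)
print(n_bins, bin_width)
```

For a uniform sample of 1000 points the count should come out close to the 10 bins found above.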
Graph w/o specifying bins:
f, axs = plt.subplots(1, 1)
# reuse the sample from above so the computed bin width still applies
sns.distplot(distribution, hist=True, kde_kws={"shade": True}, ax=axs)
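The width can also be read back from the plot itself instead of being recomputed. The sketch below uses plain matplotlib as a stand-in, on the assumption that distplot draws its bars with matplotlib's hist on the axes it is given; if so, the same `patches[0].get_width()` call on `axs` after the distplot call should return the width that was used.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
sample = rng.random(1000)

fig, ax = plt.subplots(1, 1)
ax.hist(sample, bins=10, density=True)  # stand-in for the bars distplot draws

# Each histogram bar is a Rectangle in ax.patches; equal-width bins share one width
drawn_width = ax.patches[0].get_width()
print(drawn_width)
```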
Graph w/ specifying the parameter bins=10:
f, axs = plt.subplots(1, 1)
sns.distplot(distribution, bins=10, hist=True, kde_kws={"shade": True}, ax=axs)
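To make two datasets share the same bin width, one option is to build the bin edges once and pass the same array as bins to both plots. The sketch below is a suggestion rather than Seaborn's own method: it uses NumPy's `histogram_bin_edges` with the `"fd"` (Freedman-Diaconis) estimator on the first dataset, and the two datasets are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
first = rng.random(1000)             # hypothetical first dataset
second = rng.normal(0.5, 0.2, 500)   # hypothetical second dataset

# Width from the first dataset, using NumPy's Freedman-Diaconis estimator
edges_first = np.histogram_bin_edges(first, bins="fd")
width = edges_first[1] - edges_first[0]

# Build edges of that same width that cover both datasets
low = min(first.min(), second.min())
high = max(first.max(), second.max())
n_bins = int(np.ceil((high - low) / width))
shared_edges = low + width * np.arange(n_bins + 1)

print(width, n_bins)
```

The resulting `shared_edges` array can then be passed as `bins=shared_edges` in both distplot calls, since bins accepts explicit edges as well as a count.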
Update:
The Seaborn 0.9 documentation mentioned the Freedman-Diaconis rule as the way the bin size is calculated:
Specification of hist bins, or None to use Freedman-Diaconis rule.
The description changed in version 0.10 as follows:
Specification of hist bins. If unspecified, as reference rule is used that tries to find a useful default.