
Dot-boxplots from DataFrames

DataFrames in pandas have a boxplot method, but is there a way to create dot-boxplots in pandas, or otherwise with seaborn?

By a dot-boxplot, I mean a boxplot that shows the actual data points (or a relevant sample of them) inside the plot, e.g. like the example below (obtained in R).

[Image: example dot-boxplot obtained in R]

Amelio Vazquez-Reina asked May 07 '14 13:05


2 Answers

For a more precise answer related to OP's question (with Pandas):

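import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": np.random.normal(0.8, 0.2, 20),
                     "B": np.random.normal(0.8, 0.1, 20),
                     "C": np.random.normal(0.9, 0.1, 20)})

# Draw the boxes; pandas places the box for column number i at x = i + 1.
data.boxplot()

colors = ["orange", "blue", "yellow"]
for i, col in enumerate(data):
    y = data[col]
    # Jitter the x positions around the box centre so the dots do not overlap.
    x = np.random.normal(i + 1, 0.04, len(y))
    plt.plot(x, y, mfc=colors[i], mec="k", ms=7, marker="o", linestyle="None")

# Dashed horizontal reference line at y = 1.
plt.hlines(1, 0, 4, linestyle="--")
plt.show()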
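
If the overlay is needed for more than one DataFrame, the loop above can be wrapped in a small helper. This is only a sketch: `dot_boxplot` and its `jitter`, `seed` and `ax` parameters are hypothetical names introduced here, not part of pandas or matplotlib, and it assumes every column is numeric (otherwise the box positions and the column index would no longer line up).

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def dot_boxplot(df, jitter=0.04, seed=None, ax=None):
    """Draw a boxplot of every column of df and overlay the raw points.

    Hypothetical helper (not a pandas API); assumes all columns are numeric.
    """
    if ax is None:
        _, ax = plt.subplots()
    rng = np.random.default_rng(seed)

    # DataFrame.boxplot places the box for column number i at x = i + 1.
    df.boxplot(ax=ax)

    for i, col in enumerate(df.columns):
        y = df[col].dropna()
        # Horizontal jitter so that points do not hide each other.
        x = rng.normal(i + 1, jitter, len(y))
        ax.plot(x, y, marker="o", linestyle="None", ms=5, mec="k", alpha=0.6)
    return ax


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = pd.DataFrame({"A": rng.normal(0.8, 0.2, 20),
                         "B": rng.normal(0.8, 0.1, 20),
                         "C": rng.normal(0.9, 0.1, 20)})
    dot_boxplot(data, seed=0)
    plt.show()
```

Passing a fixed `seed` makes the jitter reproducible, and passing `ax` lets the plot go into an existing subplot.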

[Image: boxplot]


Old version (more generic):

With matplotlib:

import matplotlib.pyplot as plt
import numpy as np

a = np.random.normal(0, 2, 1000)
b = np.random.normal(-2, 7, 100)
data = [a, b]

plt.boxplot(data)  # or use the boxplot method from pandas

# Boxes are drawn at x = 1, 2, ...; scatter each sample around its box.
for i, y in enumerate(data, start=1):
    x = np.random.normal(i, 0.02, len(y))
    plt.plot(x, y, "r.", alpha=0.2)

plt.show()

Which gives this: [Image: dot-boxplot]

Inspired by this tutorial.

Hope this helps!

jrjc answered Sep 20 '22 08:09


This will be possible with seaborn version 0.6 (currently in the master branch on GitHub) using the stripplot function. Here's an example:

import seaborn as sns
tips = sns.load_dataset("tips")
sns.boxplot(x="day", y="total_bill", data=tips)
sns.stripplot(x="day", y="total_bill", data=tips,
              size=4, jitter=True, edgecolor="gray")


mwaskom answered Sep 21 '22 08:09