
Color seaborn boxplot based on DataFrame column name

I'd like to create a list of boxplots with the color of the box dependent on the name of the pandas.DataFrame column I use as input.

The column names contain strings that indicate an experimental condition based on which I want the box of the boxplot colored.

I do this to make the boxplots:

sns.boxplot(data = data.dropna(), orient="h")
plt.show()

This creates a beautiful list of boxplots with correct names. Now I want to give every boxplot that has 'prog +, DMSO+' in its name a red color, leaving the rest as blue.

I tried creating a dictionary with column names as keys and colors as values:

color = {}
for column in data.columns:
    if 'prog +, DMSO+' in column:
        color[column] = 'red'
    else:
        color[column] = 'blue'

And then using the dictionary as color:

sns.boxplot(data = data.dropna(), orient="h", color=color[column])
plt.show()

This does not work, understandably (there is no loop to go through the dictionary). So I make a loop:

for column in data.columns:
    sns.boxplot(data = data[column], orient='h', color=color[column])
plt.show()

This does make boxplots of different colors, but they are all drawn on top of each other and without the correct labels. If I could somehow put these boxplots in one plot, one below the other, I'd be almost at what I want. Or is there a better way?

Freek asked Nov 05 '15 12:11


2 Answers

You should use the palette parameter, which handles multiple colors, rather than color, which handles a single one. You can give palette a name, an ordered list, or a dictionary. A dictionary seems best suited to your question:

import seaborn as sns
sns.set_color_codes()
tips = sns.load_dataset("tips")
pal = {day: "r" if day == "Sat" else "b" for day in tips.day.unique()}
sns.boxplot(x="day", y="total_bill", data=tips, palette=pal)

mwaskom answered Oct 21 '22 13:10


You can also set the facecolor of individual boxes after plotting them all in one go, using ax.artists[i].set_facecolor('r').

For example:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    [[2, 4, 5, 6, 1],
     [4, 5, 6, 7, 2],
     [5, 4, 5, 5, 1],
     [10, 4, 7, 8, 2],
     [9, 3, 4, 6, 2],
     [3, 3, 4, 4, 1]],
    columns=['bar', 'prog +, DMSO+ 1', 'foo', 'something', 'prog +, DMSO+ 2'])

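ax = sns.boxplot(data=df, orient='h')

# Older matplotlib keeps the boxes in ax.artists; from matplotlib 3.5 on
# that list is empty and the boxes are found in ax.patches instead.
boxes = ax.artists or ax.patches

# Boxes are drawn in column order, so index i matches df.columns[i].
for i, box in enumerate(boxes):
    if 'prog +, DMSO+' in df.columns[i]:
        box.set_facecolor('r')
    else:
        box.set_facecolor('b')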

plt.tight_layout()
plt.show()

tmdavison answered Oct 21 '22 14:10