Matplotlib/seaborn histogram using different colors for grouped bins

I have this code, using a pandas df:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os


path_to = r'Data\2017-04\MonthlyData\q1analysis\Energy Usage'  # where to save

df = pd.read_csv('April2017NEW.csv', index_col=1)


df1 = df.loc['Output Energy, (Wh/h)'].copy()  # choose index value; copy so a column can be added safely
df1['Average'] = df1.mean(axis=1)  # row-wise average
print(df1)
print(df1['Average'].describe())

def hist():
    # Note: distplot is deprecated since seaborn 0.11; histplot is its replacement
    sns.distplot(df1['Average'], kde=False, bins=25).set(xlim=(0, 100))
    plt.xlabel('Watt hours')
    plt.ylabel('Households')

    plt.show()

which returns:

[image: histogram produced by the code above]

I would like to use three different colors (low, medium, high) for ranges of values along the x-axis, with a legend, like this:

[image: example histogram with three colored ranges and a legend]

EDIT1:

I found this example: here, so I am trying to use this.

I've come up with this:

[image: histogram from my attempt]

which is almost there. How does one split the range into three parts, each with a different color?

warrenfitzhenry asked May 07 '17 11:05


4 Answers

Solution:

N, bins, patches = plt.hist(df1['Average'], 30)

# Pick three colors from the 'jet' colormap
cmap = plt.get_cmap('jet')
low = cmap(0.5)
medium = cmap(0.2)
high = cmap(0.7)

# Color the 30 bars by bin index; the ranges are contiguous so no bar is skipped
for i in range(0, 4):
    patches[i].set_facecolor(low)
for i in range(4, 14):
    patches[i].set_facecolor(medium)
for i in range(14, 30):
    patches[i].set_facecolor(high)

plt.xlabel("Watt Hours", fontsize=16)
plt.ylabel("Households", fontsize=16)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
ax = plt.gca()  # current axes, i.e. the one plt.hist drew on
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

plt.show()

output:

[image: resulting histogram]

warrenfitzhenry answered Oct 06 '22 07:10


If you want to color specific divisions with specific colors and label them accordingly, you can use the following code:

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns; sns.set(color_codes=True)

number_of_bins = 20
N, bins, patches = plt.hist(np.random.rand(1000), number_of_bins, rwidth=0.8)

# Define the colors for your patches (any matplotlib color format works):
colors = [(0, 0, 0), "b", "#ffff00", "red"]
# Define the bin-index ranges of your patches:
divisions = [range(1), range(1, 9), range(9, 14), range(14, 20)]
# If you want to label the regions/divisions:
labels = ["Black", "Blue", "Yellow", "Red"]

# For each division, color its patches and label the first one for the legend:
for division, color, label in zip(divisions, colors, labels):
    patches[division[0]].set_label(label)
    for i in division:
        patches[i].set_color(color)

plt.title("Plot Title")
plt.xlabel("X label")
plt.ylabel("Y label")
plt.legend(title="Legend Title")
plt.show()

[image: resulting histogram]

Ștefan answered Oct 06 '22 06:10


I recommend calling plt.hist() three times, each with a different color. You can restrict each histogram to its own interval with the range parameter of the function. The legend is generated by passing the label parameter to each call and then calling plt.legend().
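
The three-call approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the data is synthetic (a stand-in for df1['Average']), and the group boundaries, colors, and bin width are hypothetical values chosen so that every group uses bins of the same width.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df1['Average'] (assumption: values between 0 and 100)
rng = np.random.default_rng(0)
values = rng.uniform(0, 100, size=500)

# Hypothetical groups: (legend label, (lower, upper) interval, color)
groups = [
    ("Low", (0, 20), "green"),
    ("Medium", (20, 60), "blue"),
    ("High", (60, 100), "orange"),
]
bin_width = 5  # same width in every group so the bars line up

fig, ax = plt.subplots()
counts_per_group = []
for label, (lower, upper), color in groups:
    n_bins = int((upper - lower) / bin_width)
    # range= limits this call to one interval; values outside it are ignored
    counts, edges, _ = ax.hist(values, bins=n_bins, range=(lower, upper),
                               color=color, label=label)
    counts_per_group.append(counts)

ax.set_xlabel("Watt hours")
ax.set_ylabel("Households")
ax.legend()
```

Call plt.show() (with an interactive backend) or fig.savefig(...) to view the result. One caveat: the last bin of each interval includes its upper edge, so a value lying exactly on a boundary would be counted in both neighbouring groups.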

Padix Key answered Oct 06 '22 07:10


A more readable, general solution that does not use a colormap, since you want just three colors for specific intervals:

n, bins, patches = plt.hist(df1['Average'], 30)

# bins holds the bin edges; zip pairs each bar with its left edge (in data units)
for c, p in zip(bins, patches):
    if c < 4:
        plt.setp(p, 'facecolor', 'green')
    elif c < 14:
        plt.setp(p, 'facecolor', 'blue')
    else:
        plt.setp(p, 'facecolor', 'yellow')

plt.show()
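
Since the question also asks for a legend, here is one way the threshold-based coloring above might be extended. Treat it as a sketch: the data is synthetic, the thresholds 4 and 14 are taken from the snippet above, and the labels and legend title are made up for illustration. The legend entries are built from matplotlib.patches.Patch proxy handles, because a single hist call produces only one artist group.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to get an interactive window
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import numpy as np

# Synthetic stand-in for df1['Average'] (assumption: values between 0 and 30)
rng = np.random.default_rng(1)
values = rng.uniform(0, 30, size=300)

# (exclusive upper bound of the bin's left edge, color, legend label)
levels = [
    (4, "green", "Low"),
    (14, "blue", "Medium"),
    (float("inf"), "yellow", "High"),
]

def level_for(left_edge):
    """Return (color, label) of the first level whose bound exceeds left_edge."""
    for upper, color, label in levels:
        if left_edge < upper:
            return color, label

fig, ax = plt.subplots()
n, bins, patches = ax.hist(values, bins=30)

# Color each bar according to the left edge of its bin (in data units)
for left_edge, patch in zip(bins, patches):
    color, _ = level_for(left_edge)
    patch.set_facecolor(color)

# Proxy artists: one colored square per level for the legend
handles = [Patch(facecolor=color, label=label) for _, color, label in levels]
ax.legend(handles=handles, title="Usage")
```

Keeping the thresholds in one list means the bar colors and the legend cannot drift apart when a boundary changes.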

RMS answered Oct 06 '22 08:10