 

How to plot a histogram by different groups in matplotlib?

I have a table like:

value    type
10       0
12       1
13       1
14       2

Generate some dummy data:

import numpy as np

value = np.random.randint(1, 20, 10)
type = np.random.choice([0, 1, 2], 10)

I want to do the following in Python 3 with matplotlib (v1.4):

  • plot a histogram of value
  • group by type, i.e. use different colors to differentiate types
  • the bars should be "dodged", i.e. placed side by side
  • since the range of value is small, I would use identity bins, i.e. each bin has a width of 1

The questions are:

  • How can I assign colors to bars based on the values of type, drawing the colors from a colormap (e.g. Accent or another cmap in matplotlib)? I don't want to use named colors (i.e. 'b', 'k', 'r').
  • The bars in my histogram overlap each other. How can I "dodge" them?

Note

  1. I have tried Seaborn, matplotlib and pandas.plot for two hours and failed to get the desired histogram.
  2. I read the examples and the User's Guide of matplotlib. Surprisingly, I found no tutorial on how to assign colors from a colormap.
  3. I have searched on Google but failed to find a succinct example.
  4. I guess one could accomplish the task with matplotlib.pyplot alone, without importing a bunch of modules such as matplotlib.cm and matplotlib.colors.
Zelong asked Jul 06 '15 23:07





1 Answer

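To place the bars side by side (your second question), we can create a dummy column equal to 1, and then generate counts by summing this column, grouped by value and type. Each type then becomes its own column, and a bar plot draws one bar per column at each value.

To color the bars by type (your first question), you can pass the colormap directly into plot using the colormap parameter: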

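import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import seaborn
seaborn.set()  # optional: apply seaborn's default styling

# value and type are the arrays generated in the question
df = pd.DataFrame({'value': value, 'type': type})
df['dummy'] = 1
# count rows per (value, type) pair, then pivot type into columns
ag = df.groupby(['value', 'type']).sum().unstack()
ag.columns = ag.columns.droplevel()  # drop the 'dummy' level from the column index

# one bar per type at each value, colored from the Accent colormap
ag.plot(kind='bar', colormap=cm.Accent, width=1)
plt.show()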
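
If you would rather stay within matplotlib.pyplot, as the question's note 4 suggests, here is a minimal sketch of an alternative. It is not the approach above: it relies on hist() drawing bars side by side when it is given a list of arrays, and it samples one color per group from a named colormap. The function name grouped_hist and its parameters are made up for illustration, and the sketch assumes integer values and a reasonably recent matplotlib rather than v1.4.

```python
import numpy as np
import matplotlib.pyplot as plt


def grouped_hist(values, groups, cmap_name="Accent"):
    """Hypothetical helper: side-by-side histogram of values, one color per group.

    Returns (fig, ax, counts), where counts has one row per group.
    Assumes integer-valued data and at least two distinct groups.
    """
    values = np.asarray(values)
    groups = np.asarray(groups)
    labels = np.unique(groups)

    # One array of values per group; hist() dodges the bars when given a list.
    per_group = [values[groups == g] for g in labels]

    # Edges at k - 0.5 so every bin has width 1 and is centered on an integer k.
    edges = np.arange(values.min(), values.max() + 2) - 0.5

    # Sample one RGBA color per group from the colormap.
    cmap = plt.get_cmap(cmap_name)
    colors = [cmap(x) for x in np.linspace(0, 1, len(labels))]

    fig, ax = plt.subplots()
    counts, _, _ = ax.hist(per_group, bins=edges, color=colors,
                           label=[f"type {g}" for g in labels])
    ax.set_xticks(np.arange(values.min(), values.max() + 1))
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    ax.legend()
    return fig, ax, counts
```

With the arrays from the question, calling grouped_hist(value, type) followed by plt.show() should display the figure. Passing a different cmap_name picks another colormap.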

maxymoo answered Nov 01 '22 13:11
