Pandas plot bar order categories

I have a dataset with a categorical variable that contains three unique values, "Low", "Medium" and "High":

df.CatVar.value_counts()
Out[93]: 
Medium    35832
Low       25311
High      12527
Name: CatVar, dtype: int64

I am trying to plot the count of each value as a bar plot. However, the following code gives me the bars in the order ["Medium", "Low", "High"], because value_counts() sorts by frequency:

df.CatVar.value_counts().plot(kind="bar")

How do I change the order of the bars in the plot?

ViggoTW asked May 27 '18 15:05


2 Answers

There are two possible solutions. The first is to change the order of the index before plotting, with reindex or loc:

df.CatVar.value_counts().reindex(["Low", "Medium", "High"]).plot(kind="bar")
df.CatVar.value_counts().loc[["Low", "Medium", "High"]].plot(kind="bar")
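
The two lines behave differently when one of the labels never occurs in the data. A minimal sketch, assuming pandas 1.0 or later and a made-up Series with no "High" rows (the variable names are only illustrative): reindex keeps the requested order and can fill the gap, while loc with a list raises KeyError.

```python
import pandas as pd

# Hypothetical sample in which the "High" category never occurs.
s = pd.Series(["Low", "Medium", "Low", "Medium", "Medium"], name="CatVar")
order = ["Low", "Medium", "High"]

counts = s.value_counts()

# reindex returns the labels in the requested order; fill_value=0
# replaces the NaN that a missing label would otherwise produce.
reindexed = counts.reindex(order, fill_value=0)

# loc with a list of labels raises KeyError if any label is absent
# (behaviour of pandas >= 1.0; stated here as an assumption).
try:
    counts.loc[order]
    loc_failed = False
except KeyError:
    loc_failed = True

print(reindexed)
print("loc raised KeyError:", loc_failed)
```

If every category is guaranteed to be present, both forms give the same result, and .plot(kind="bar") can be chained onto either as shown above.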

The second is to convert the column to an ordered categorical, so that value_counts(sort=False) returns the counts in the order given by the categories parameter:

df.CatVar = pd.Categorical(df.CatVar, categories=["Low", "Medium", "High"], ordered=True)
df.CatVar.value_counts(sort=False).plot(kind="bar")
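
To check the ordering without drawing anything, the counts can be inspected directly. This is only a sketch on made-up data in which "Medium" is the most frequent value; sort_index() is shown as an alternative I am adding here, not something the original code uses.

```python
import pandas as pd

# Hypothetical data: "Medium" is most frequent, so the default
# frequency ordering would put it first.
df = pd.DataFrame({"CatVar": ["Medium", "Low", "Medium", "High", "Medium", "Low"]})

df["CatVar"] = pd.Categorical(
    df["CatVar"], categories=["Low", "Medium", "High"], ordered=True
)

# sort=False skips the sort by frequency, leaving the category order.
by_category = df["CatVar"].value_counts(sort=False)

# Alternative: count with the default sorting, then sort by the index,
# which for a categorical index follows the category order.
by_index = df["CatVar"].value_counts().sort_index()

print(by_category)
print(by_index)
```

Either result can then be passed to .plot(kind="bar") to get the bars in the order Low, Medium, High.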

Sample:

import pandas as pd

df = pd.DataFrame({'CatVar':['Low','Medium','Low','Low','Medium','High']})
print(df)
   CatVar
0     Low
1  Medium
2     Low
3     Low
4  Medium
5    High

df.CatVar.value_counts().reindex(["Low", "Medium", "High"]).plot(kind="bar")

jezrael answered Oct 01 '22 10:10


The following code solved my problem:

df.CatVar.value_counts()[['Low', 'Medium', 'High']].plot(kind="bar")

ViggoTW answered Oct 01 '22 09:10