Problems assigning color to bars in Pandas v0.20 and matplotlib

I have been struggling for a while with defining colors in a bar plot using Pandas and Matplotlib. Let us imagine that we have the following dataframe:

import pandas as pd
pers1 = ["Jesús","lord",2]
pers2 = ["Mateo","apostel",1]
pers3 = ["Lucas","apostel",1]
    
dfnames = pd.DataFrame(
    [pers1,pers2, pers3],
    columns=["name","type","importance"]
)

Now, I want to create a bar plot with the importance as the numerical value, the names of the people as ticks and use the type column to assign colors. I have read other questions (for example: Define bar chart colors for Pandas/Matplotlib with defined column) but it doesn't work...

So, first I have to define colors and assign them to different values:

colors = {'apostel':'blue','lord':'green'}

And finally use the .plot() function:

dfnames.plot(
    x="name",
    y="importance",
    kind="bar",
    color = dfnames['type'].map(colors)
)

Good. The only problem is that all bars are green:

[Image: the resulting bar plot, with all bars green]

Why?? I don't know... I am testing it in Spyder and Jupyter... Any help? Thanks!

José asked Jan 02 '23 23:01


1 Answer

As per GH16822 on the pandas issue tracker, this is a regression introduced in version 0.20.3, in which only the first colour of the list passed as color is used, so every bar gets that colour. Earlier versions were not affected.
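To see why every bar ends up green in the question, it helps to look at the list of colours that actually reaches .plot(). The snippet below is only a small sketch that rebuilds the question's dataframe (the rows are inlined rather than going through pers1, pers2 and pers3) and prints the mapped colours; the name per_bar is introduced here for illustration.

```python
import pandas as pd

# Same data as in the question, with the rows inlined.
dfnames = pd.DataFrame(
    [["Jesús", "lord", 2], ["Mateo", "apostel", 1], ["Lucas", "apostel", 1]],
    columns=["name", "type", "importance"],
)
colors = {"apostel": "blue", "lord": "green"}

# One colour per row, in row order (per_bar is an illustrative name).
per_bar = dfnames["type"].map(colors).tolist()

print(pd.__version__)  # the regression is reported for 0.20.3
print(per_bar)         # ['green', 'blue', 'blue']
print(per_bar[0])      # 'green', the only colour an affected version uses
```

The first row is the "lord" row, so the first colour is green, which matches the all-green plot.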

The reason, according to one of the contributors, was this:

The problem seems to be in _get_colors. I think that BarPlot should define a _get_colors that does something like

def _get_colors(self, num_colors=None, color_kwds='color'):
    color = self.kwds.get('color')
    if color is None:
        # No explicit colours: defer to the parent class's default behaviour
        return super()._get_colors(num_colors=num_colors, color_kwds=color_kwds)
    else:
        num_colors = len(self.data)  # maybe? may not work for some cases
        return _get_standard_colors(color=color, num_colors=num_colors)

There are a couple of options for you:

  1. The most obvious choice would be to update to the latest version of pandas (currently v0.22).
  2. If you need a workaround, there's one (also mentioned in the issue tracker) whereby you convert the colours to a tuple and wrap that tuple in a list:

    dfnames.plot(x="name",
                 y="importance",
                 kind="bar",
                 color=[tuple(dfnames['type'].map(colors))])

Though, in the interest of progress, I'd recommend updating your pandas.
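If neither updating nor the tuple workaround is practical, another route is to bypass pandas' plotting wrapper and call matplotlib's Axes.bar directly, since it accepts one colour per bar. This is not from the issue thread; it is only a minimal sketch, and the names bar_colors and positions, the inlined rows, and the Agg backend (chosen so the snippet runs without a display) are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in Spyder/Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question, with the rows inlined.
dfnames = pd.DataFrame(
    [["Jesús", "lord", 2], ["Mateo", "apostel", 1], ["Lucas", "apostel", 1]],
    columns=["name", "type", "importance"],
)
colors = {"apostel": "blue", "lord": "green"}

# One colour per row, in row order.
bar_colors = dfnames["type"].map(colors).tolist()

fig, ax = plt.subplots()
positions = list(range(len(dfnames)))

# Axes.bar takes a sequence of colours and applies them bar by bar.
bars = ax.bar(positions, dfnames["importance"], color=bar_colors)

# Label the ticks with the names, as pandas would do for x="name".
ax.set_xticks(positions)
ax.set_xticklabels(dfnames["name"], rotation=90)
ax.set_ylabel("importance")
```

In an interactive session, plt.show() (or fig.savefig with a file name of your choice) displays or saves the figure. This only changes how the bars are drawn, not the pandas bug itself.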

cs95 answered Jan 12 '23 18:01