Pandas bar plot -- specify bar color by column

Tags: pandas, matplotlib

Is there a simple way to specify bar colors by column name using the pandas DataFrame.plot(kind='bar') method?

I have a script that generates multiple DataFrames from several data files in a directory. For example, it does something like this:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

df1.plot(kind='bar', ax=plt.subplot(121))
df2.plot(kind='bar', ax=plt.subplot(122))

plt.show()

With the following output:

[Image: Output, https://i.stack.imgur.com/ECWp2.png]

Unfortunately, the colors aren't consistent for each column label across the two plots. Is it possible to pass in a dictionary of (filename: color) pairs so that any particular column always has the same color? For example, I could imagine creating this by zipping the filenames with the Matplotlib color cycle:

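data_files = ['a', 'b', 'c', 'd']
# 'axes.color_cycle' was deprecated in Matplotlib 1.5 in favor of 'axes.prop_cycle'
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
# The output below is for the default style in Matplotlib 2.0 and later
print(list(zip(data_files, colors)))

[('a', '#1f77b4'), ('b', '#ff7f0e'), ('c', '#2ca02c'), ('d', '#d62728')]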
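One caveat with zipping against the color cycle: zip stops at the shorter input, so any files beyond the length of the cycle would get no color at all. A minimal sketch of one way around that, assuming a Matplotlib version that exposes the cycle as rcParams['axes.prop_cycle'] and a style whose cycle defines colors. The helper name build_color_map is hypothetical, not part of either library:

```python
from itertools import cycle

import matplotlib as mpl


def build_color_map(names):
    """Map each name to a color from the active Matplotlib cycle.

    itertools.cycle repeats the colors, so names beyond the length of
    the cycle wrap around instead of being dropped by zip.
    """
    cycle_colors = mpl.rcParams['axes.prop_cycle'].by_key()['color']
    return dict(zip(names, cycle(cycle_colors)))


# Same file labels as in the question
color_map = build_color_map(['a', 'b', 'c', 'd'])
```

Wrapped-around names share a color with an earlier name, so this only keeps labels distinguishable while there are no more names than colors in the cycle.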

I could figure out how to do this directly with Matplotlib; I just thought there might be a simpler, built-in solution.

Edit:

Below is a partial solution that works in pure Matplotlib. However, I'm using this in an IPython notebook that will be distributed to non-programmer colleagues, and I'd like to keep the plotting code to a minimum.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
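# 'axes.color_cycle' was deprecated in Matplotlib 1.5 in favor of 'axes.prop_cycle'
mpl_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']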
colors = dict(zip(data_files, mpl_colors))

def bar_plotter(df, colors, sub):
    ncols = df.shape[1]
    width = 1./(ncols+2.)
    starts = df.index.values - width*ncols/2.
    plt.subplot(120+sub)
    for n, col in enumerate(df):
        plt.bar(starts + width*n, df[col].values, color=colors[col],
                width=width, label=col)
    plt.xticks(df.index.values)
    plt.grid()
    plt.legend()

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

bar_plotter(df1, colors, 1)
bar_plotter(df2, colors, 2)

plt.show()

[Image: Desired Output, https://i.stack.imgur.com/7nZqE.png]


asked Sep 05 '14 15:09

Ryan

2 Answers

You can pass a list of colors through the color argument. Lining the list up with each DataFrame's columns takes a little manual work, unlike passing a dictionary, but it may be a less cluttered way to accomplish your goal.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

color_list = ['b', 'g', 'r', 'c']


df1.plot(kind='bar', ax=plt.subplot(121), color=color_list)
df2.plot(kind='bar', ax=plt.subplot(122), color=color_list[1:])

plt.show()

[Image: https://i.stack.imgur.com/Fgb3J.png]

EDIT: Ajean came up with a simple way to build the list of correct colors from a dictionary:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pds

data_files = ['a', 'b', 'c', 'd']
color_list = ['b', 'g', 'r', 'c']
d2c = dict(zip(data_files, color_list))

df1 = pds.DataFrame(np.random.rand(4,3), columns=data_files[:-1])
df2 = pds.DataFrame(np.random.rand(4,3), columns=data_files[1:])

# Wrap map() in list(): in Python 3 it returns a lazy iterator, not a list
df1.plot(kind='bar', ax=plt.subplot(121), color=list(map(d2c.get, df1.columns)))
df2.plot(kind='bar', ax=plt.subplot(122), color=list(map(d2c.get, df2.columns)))

plt.show()
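Note that d2c.get returns None for any column that has no entry in the dictionary, which is easy to miss. A small sketch of a helper that makes the fallback explicit. The name colors_for and the 'gray' default are illustrative assumptions, not part of pandas:

```python
import pandas as pd


def colors_for(df, color_map, default='gray'):
    """Return one color per column of df, in column order.

    Columns missing from color_map get `default` instead of None.
    """
    return [color_map.get(col, default) for col in df.columns]


d2c = {'a': 'b', 'b': 'g', 'c': 'r', 'd': 'c'}
df2 = pd.DataFrame([[0.1, 0.2, 0.3]], columns=['b', 'c', 'e'])

# 'e' is not in d2c, so it falls back to the default color
print(colors_for(df2, d2c))
```

The result can be passed straight to the plot call, for example df2.plot(kind='bar', color=colors_for(df2, d2c)).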

answered Oct 10 '22 17:10

DataSwede

pandas 1.1.0 makes this easier. You can pass a dictionary that maps column names to colors to pandas.DataFrame.plot.bar():

[Image: https://i.stack.imgur.com/m8JAQ.png]

Here is an example:

import pandas as pd

df1 = pd.DataFrame({'a': [1.2, .8, .9], 'b': [.2, .9, .7]})
df2 = pd.DataFrame({'b': [0.2, .5, .4], 'c': [.5, .6, .7], 'd': [1.1, .6, .7]})
color_dict = {'a': 'green', 'b': 'red', 'c': 'blue', 'd': 'cyan'}
# Each column is colored by name; one shared dictionary serves both DataFrames
df1.plot.bar(color=color_dict)
df2.plot.bar(color=color_dict)
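To get the side-by-side layout from the question with this approach, the same dictionary can be reused across explicit axes. The following is a sketch under the assumption of pandas 1.1.0 or later. The helper name plot_bars_side_by_side is hypothetical:

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_bars_side_by_side(frames, color_dict):
    """Draw one bar chart per DataFrame on a single row of axes.

    Assumes pandas >= 1.1.0, where `color` may be a dict keyed by
    column name, so each column keeps its color in every subplot.
    """
    fig, axes = plt.subplots(1, len(frames), sharey=True, squeeze=False)
    for ax, df in zip(axes[0], frames):
        df.plot.bar(ax=ax, color=color_dict)
    return fig, axes[0]


df1 = pd.DataFrame({'a': [1.2, .8, .9], 'b': [.2, .9, .7]})
df2 = pd.DataFrame({'b': [0.2, .5, .4], 'c': [.5, .6, .7], 'd': [1.1, .6, .7]})
color_dict = {'a': 'green', 'b': 'red', 'c': 'blue', 'd': 'cyan'}

fig, axes = plot_bars_side_by_side([df1, df2], color_dict)
```

In a script, follow this with plt.show(). In a notebook with inline plotting the figure is typically rendered automatically. Every column in the DataFrames needs an entry in color_dict.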

answered Oct 10 '22 19:10

Kumar
