Pandas Dataframe: plot colors by column name

I'm plotting a Pandas DataFrame with a few lines, each in a specific color (specified by rgb value). I'm looking for a way to make my code more readable by assigning the plot line colors directly to DataFrame column names instead of listing them in sequence.

I know I can do this:

import pandas as pd

df = pd.DataFrame(columns=['red zero line', 'blue one line'], data=[[0, 1], [0, 1]])
df.plot(colors = ['#BB0000', '#0000BB']) # red and blue

but with a lot more than two lines, I'd really like to be able to specify the colors by column header, to make the code easy to maintain. Such as this:

df.plot(colors = {'red zero line': '#FF0000', 'blue one line': '#0000FF'})

The colors keyword can't actually be a dictionary though. (Technically it's type-converted to list, which yields a list of the column labels.)
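A quick way to see why the dictionary does not work: converting a dict to a list keeps only its keys. This is a minimal sketch in plain Python; it does not reproduce what pandas does internally, only the type conversion described above.

```python
# Converting a dict to a list iterates over its keys, so the color
# values are dropped and only the column labels remain.
color_dict = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

as_list = list(color_dict)
print(as_list)  # ['red zero line', 'blue one line']
```

The resulting labels are not color specifications, which is why passing the dictionary as-is cannot give the intended result here.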

I understand that pd.DataFrame.plot builds on matplotlib.pyplot.plot, but I can't find any documentation for the colors keyword. The documentation for neither method lists it.

Joooeey asked Nov 03 '17 21:11


2 Answers

If you create a dictionary mapping the column names to colors, you can build the color list on the fly using a list comprehension where you just get the color from the column name. This also allows you to specify a default color in case you missed a column.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame([[0, 1, 2], [0, 1, 2]], 
                  columns=['red zero line', 'blue one line', 'extra'])

color_dict = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

# use get to specify dark gray as the default color.
df.plot(color=[color_dict.get(x, '#333333') for x in df.columns])
plt.show()
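If the mapping grows, it may help to wrap the lookup in a small function that also catches typos in the dictionary keys. The sketch below is only one way to do it; colors_for is a hypothetical helper name, not part of pandas or matplotlib, and it works on any iterable of column names (df.columns would be the usual argument).

```python
def colors_for(columns, color_dict, default='#333333'):
    """Return one color per column, in column order.

    Hypothetical helper: raises KeyError if color_dict names a column
    that does not exist, which usually indicates a typo.
    """
    columns = list(columns)
    unused = sorted(set(color_dict) - set(columns))
    if unused:
        raise KeyError(f"color_dict names no such column(s): {unused}")
    # Columns missing from color_dict fall back to the default color.
    return [color_dict.get(name, default) for name in columns]


# Stand-in for df.columns so the example runs without pandas.
columns = ['red zero line', 'blue one line', 'extra']
color_dict = {'red zero line': '#FF0000', 'blue one line': '#0000FF'}

line_colors = colors_for(columns, color_dict)
print(line_colors)  # ['#FF0000', '#0000FF', '#333333']
```

With a DataFrame, the call would then be df.plot(color=colors_for(df.columns, color_dict)).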


James answered Sep 19 '22 01:09


You can specify the order of the columns before plotting with df[cols]:

import pandas as pd

cols = ['red zero line', 'blue one line', 'green two line']
colors = ['#BB0000', '#0000BB', 'green']
df = pd.DataFrame(columns=cols, data=[[0, 1, 2], [0, 1, 2], [0, 1, 3]])

# color is the documented keyword; the older colors spelling was deprecated
df[cols].plot(color=colors)


If you want to be sure columns and colors are strictly paired, you can always just zip ahead of time:

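# list() so the pairs can be iterated more than once (zip is a one-shot iterator in Python 3)
columns_and_colors = list(zip(cols, colors))
df[cols].plot(color=[color for _, color in columns_and_colors])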
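The pairing can also be turned into a dictionary, so the colors follow the column names even if the plot order changes later. This is a minimal sketch in plain Python; color_by_column and plot_order are names made up for the illustration.

```python
cols = ['red zero line', 'blue one line', 'green two line']
colors = ['#BB0000', '#0000BB', 'green']

# Build the name -> color mapping once from the paired lists.
color_by_column = dict(zip(cols, colors))

# Any later reordering of the columns keeps each line's color,
# because the lookup is by name rather than by position.
plot_order = ['green two line', 'red zero line', 'blue one line']
ordered_colors = [color_by_column[name] for name in plot_order]
print(ordered_colors)  # ['green', '#BB0000', '#0000BB']

# With the DataFrame from above this would be used as:
# df[plot_order].plot(color=ordered_colors)
```

As far as I know, pandas 1.1 and later also accept a dict of the form {column name: color} for the color argument directly, so df.plot(color=color_by_column) may work on a recent installation; check the documentation for the version you have.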
andrew_reece answered Sep 20 '22 01:09