There are many questions and answers that touch upon this topic one way or another. With this contribution I'd like to clearly show why an easy approach such as marker = {'color' : 'red'} will work for plotly.graph_objects (go), but color='red' will not for plotly.express (px), although color is an argument of both px.line and px.scatter. And I'd like to demonstrate why it's awesome that it doesn't.
So, if px is supposed to be the easiest way to make a plotly figure, then why does something as apparently obvious as color='red' return the error
ValueError: Value of 'color' is not the name of a column in 'data_frame'.
In short, it's because color in px does not accept an arbitrary color name or code, but rather the name of a variable (column) in your dataset, so that a color cycle can be assigned to its unique values and each value displayed in a different color.
Let me demonstrate by applying the gapminder dataset and showing a scatterplot of life expectancy versus GDP per capita for all (at least most) countries across the world as of 2007. A basic setup like the one below will produce the following plot.
go: The color is normally set by a cycle named plotly, but is here specified using marker = {'color' : 'red'}
import plotly.graph_objects as go
import plotly.express as px  # only needed here for the bundled gapminder data
df = px.data.gapminder()
df = df.query("year==2007")
fig = go.Figure()
fig.add_traces(go.Scatter(x=df['gdpPercap'], y=df["lifeExp"],
                          mode='markers',
                          marker={'color': 'red'}
                          ))
fig.show()
So let's try this with px, and assume that color='red' would do the trick:
px:
# imports
import plotly.express as px
import pandas as pd
# dataframe
df = px.data.gapminder()
df=df.query("year==2007")
# plotly express scatter plot
px.scatter(df, x="gdpPercap", y="lifeExp",
           color='red',
           )
ValueError: Value of 'color' is not the name of a column in 'data_frame'. Expected one of ['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap', 'iso_alpha', 'iso_num'] but received: red
So what's going on here?
First, if an explanation of the broader differences between go and px is required, please take a look here and here. And if absolutely no explanations are needed, you'll find a complete code snippet at the very end of the answer which will reveal many of the powers of colors in plotly.express.
It might not seem so at first, but there are very good reasons why color='red' does not work as you might expect using px. If all you'd like to do is manually set a particular color for all markers, you can do so using .update_traces(marker=dict(color='red')), thanks to Python's method chaining. But first, let's look at the default settings:
px.scatter:
# imports
import plotly.express as px
import pandas as pd
# dataframe
df = px.data.gapminder()
df=df.query("year==2007")
# plotly express scatter plot
px.scatter(df, x="gdpPercap", y="lifeExp")
Here, as already mentioned in the question, the color is set as the first color in the default plotly sequence available through px.colors.qualitative.Plotly:
['#636EFA', # the plotly blue you can see above
'#EF553B',
'#00CC96',
'#AB63FA',
'#FFA15A',
'#19D3F3',
'#FF6692',
'#B6E880',
'#FF97FF',
'#FECB52']
And that looks pretty good. But what if you want to change things and even add more information at the same time?
As we already touched upon with px.scatter, the color argument does not take a color like red as a value. Rather, you can for example use color='continent' to easily distinguish between different categories in a dataset. But there's so much more to colors in px:
The combination of the following six methods will let you do exactly what you'd like with colors using plotly express. Bear in mind that you do not even have to choose. You can use one, some, or all of the methods below at the same time. And one particularly useful approach will reveal itself as a combination of 1 and 3. But we'll get to that in a bit. This is what you need to know:
1. Change the color sequence used by px with:
color_discrete_sequence=px.colors.qualitative.Alphabet
2. Assign different colors to different variables with the color argument:
color = 'continent'
3. Customize one or more variable colors with:
color_discrete_map={"Asia": 'red'}
4. Easily group a larger subset of your variables using a dict comprehension and color_discrete_map:
subset = {"Asia", "Africa", "Oceania"}
group_color = {i: 'red' for i in subset}
5. Set opacity using rgba() color codes:
color_discrete_map={"Asia": 'rgba(255,0,0,0.4)'}
6. Override all settings with:
.update_traces(marker=dict(color='red'))
The following snippet will produce the plot below that shows life expectancy for all continents for varying levels of GDP. The size of the markers represents different levels of population to make things more interesting right from the get-go.
import plotly.express as px
import pandas as pd
# dataframe, input
df = px.data.gapminder()
df=df.query("year==2007")
px.scatter(df, x="gdpPercap", y="lifeExp",
           color='continent',
           size='pop',
           )
To illustrate the flexibility of the methods above, let's first just change the color sequence. Since we are, for starters, only showing one category and one color, you'll have to wait for the subsequent steps to see the real effects. But here's the same plot, now with color_discrete_sequence=px.colors.qualitative.Alphabet as per step 1:
1. Change the color sequence used by px with
color_discrete_sequence=px.colors.qualitative.Alphabet
Now, let's apply the colors from the Alphabet color sequence to the different continents:
2. Assign different colors to different variables with the color argument
color = 'continent'
If you, like me, think that this particular color sequence is easy on the eye but that some of its colors are a bit hard to tell apart, you can assign a color of your choosing to one or more continents like this:
3. Customize one or more variable colors with
color_discrete_map={"Asia": 'red'}
And this is pretty awesome: Now you can change the sequence and choose any color you'd like for particularly interesting variables. But the method above can get a bit tedious if you'd like to assign a particular color to a larger subset. So here's how you can do that too with a dict comprehension:
4. Assign colors to a group using a dict comprehension and color_discrete_map
# imports
import plotly.express as px
import pandas as pd
# dataframe
df = px.data.gapminder()
df=df.query("year==2007")
subset = {"Asia", "Europe", "Oceania"}
group_color = {i: 'red' for i in subset}
# plotly express scatter plot
px.scatter(df, x="gdpPercap", y="lifeExp",
           size='pop',
           color='continent',
           color_discrete_sequence=px.colors.qualitative.Alphabet,
           color_discrete_map=group_color
           )
5. Set opacity using rgba() color codes.
Now let's take one step back. If you think red suits Asia just fine, but is perhaps a bit too strong, you can adjust the opacity using an rgba color like 'rgba(255,0,0,0.4)' to get this:
import plotly.express as px
import pandas as pd
# dataframe, input
df = px.data.gapminder()
df=df.query("year==2007")
px.scatter(df, x="gdpPercap", y="lifeExp",
           color_discrete_sequence=px.colors.qualitative.Alphabet,
           color='continent',
           size='pop',
           color_discrete_map={"Asia": 'rgba(255,0,0,0.4)'}
           )
And if you think we're getting a bit too complicated by now, you can once again override all settings like this:
6. Override all settings with:
.update_traces(marker=dict(color='red'))
And this brings us right back to where we started. I hope you'll find this useful!
# imports
import plotly.express as px
import pandas as pd
# dataframe
df = px.data.gapminder()
df=df.query("year==2007")
subset = {"Asia", "Europe", "Oceania"}
group_color = {i: 'red' for i in subset}
# plotly express scatter plot
px.scatter(df, x="gdpPercap", y="lifeExp",
           size='pop',
           color='continent',
           color_discrete_sequence=px.colors.qualitative.Alphabet,
           #color_discrete_map=group_color
           color_discrete_map={"Asia": 'rgba(255,0,0,0.4)'}
           )#.update_traces(marker=dict(color='red'))