I have a scatterplot and I want to color each point based on another value (naively assigned with np.random.random() in this case).
Is there a way to use seaborn to map a continuous value for each point (not directly associated with the data being plotted) to a color along a continuous gradient?
Here's my code to generate the data:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
import seaborn as sns; sns.set_style("whitegrid", {'axes.grid' : False})
%matplotlib inline
np.random.seed(0)
# Iris dataset
DF_data = pd.DataFrame(load_iris().data,
index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
columns = load_iris().feature_names)
Se_targets = pd.Series(load_iris().target,
index = ["iris_%d" % i for i in range(load_iris().data.shape[0])],
name = "Species")
# Scaling mean = 0, var = 1
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
index = DF_data.index,
columns = DF_data.columns)
# Sklearn for Principal Component Analysis
# Dims
m = DF_standard.shape[1]
K = 2
# PCA (How I tend to set it up)
Mod_PCA = decomposition.PCA(n_components=m)
DF_PCA = pd.DataFrame(Mod_PCA.fit_transform(DF_standard),
columns=["PC%d" % k for k in range(1,m + 1)]).iloc[:,:K]
# Plot
fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color="k")
ax.set_title("No Coloring")
Ideally, I wanted to do something like this:
# Color classes
cmap = {obsv_id:np.random.random() for obsv_id in DF_PCA.index}
# Plot
fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color=[cmap[obsv_id] for obsv_id in DF_PCA.index])
ax.set_title("With Coloring")
# ValueError: to_rgba: Invalid rgba arg "0.2965562650640299"
# to_rgb: Invalid rgb arg "0.2965562650640299"
# cannot convert argument to rgb sequence
but it didn't like the continuous value.
I want to use a color palette like:
sns.palplot(sns.cubehelix_palette(8))
I also tried something like the line below, but it can't work because it doesn't know which values I used in my cmap dictionary above:
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], cmap=sns.cubehelix_palette(as_cmap=True))
With seaborn's own scatterplot function, you can use the hue argument to color the data points based on another variable, which produces points with different colors. For example, with a gapminder data frame: g = sns.scatterplot(x="gdpPercap", y="lifeExp", hue="continent", data=gapminder)
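As a rough sketch of how hue can handle a continuous value rather than a grouping variable: passing a colormap as palette should give each point a color along the gradient. This assumes seaborn 0.9 or later (where sns.scatterplot exists); the frame df and its "value" column are hypothetical stand-ins for the PCA table and the random values in the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

rng = np.random.default_rng(0)

# Hypothetical stand-in for the PCA table in the question
df = pd.DataFrame({
    "PC1": rng.normal(size=50),
    "PC2": rng.normal(size=50),
    "value": rng.random(50),  # continuous value to color by
})

fig, ax = plt.subplots()
# A numeric hue column plus a colormap palette gives a continuous gradient
sns.scatterplot(data=df, x="PC1", y="PC2", hue="value",
                palette=sns.cubehelix_palette(as_cmap=True), ax=ax)
ax.set_title("Colored by a continuous value")
```

With a numeric hue, seaborn draws a legend with a few representative values instead of a colorbar.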
To color the points of a scatter plot in matplotlib, use the "c" argument of scatter. Unlike "color", it accepts one number per point and maps each number to a color through the given cmap:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
x, y, z = np.random.rand(3, 100)
cmap = sns.cubehelix_palette(as_cmap=True)
f, ax = plt.subplots()
points = ax.scatter(x, y, c=z, s=50, cmap=cmap)
f.colorbar(points)
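Applied to the question's setup, the same idea might look like the sketch below. The only extra step is aligning the dictionary values with the row order of the frame before passing them as c. Here DF_PCA is a small random stand-in (an assumption, not the real PCA output), and the name values replaces the question's cmap dictionary to avoid confusion with the colormap.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(0)

# Hypothetical stand-in for DF_PCA from the question
DF_PCA = pd.DataFrame(np.random.randn(20, 2), columns=["PC1", "PC2"],
                      index=["iris_%d" % i for i in range(20)])

# One continuous value per observation, keyed by index label
values = {obsv_id: np.random.random() for obsv_id in DF_PCA.index}

# Align the values with the row order of the frame
c = pd.Series(values).reindex(DF_PCA.index).to_numpy()

fig, ax = plt.subplots()
points = ax.scatter(DF_PCA["PC1"], DF_PCA["PC2"], c=c,
                    cmap=sns.cubehelix_palette(as_cmap=True))
fig.colorbar(points, ax=ax, label="value")
ax.set_title("With Coloring")
```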
Alternatively, convert each value to an RGBA color yourself with a ScalarMappable and pass the resulting colors to scatter:
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize
cmap = {obsv_id:np.random.random() for obsv_id in DF_PCA.index}
sm = ScalarMappable(norm=Normalize(vmin=min(cmap.values()), vmax=max(cmap.values())), cmap=sns.cubehelix_palette(as_cmap=True))
# Plot
fig, ax = plt.subplots()
ax.scatter(x=DF_PCA["PC1"], y=DF_PCA["PC2"], color=[sm.to_rgba(cmap[obsv_id]) for obsv_id in DF_PCA.index])
ax.set_title("With Coloring")
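Because the colors are computed by hand here, the scatter artist itself carries no value-to-color mapping, so a colorbar has to be built from the ScalarMappable instead. A minimal sketch, using hypothetical random x, y, z arrays in place of the PCA data:

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

np.random.seed(0)
x, y, z = np.random.rand(3, 30)  # hypothetical data; z is the value to color by

sm = ScalarMappable(norm=Normalize(vmin=z.min(), vmax=z.max()),
                    cmap=sns.cubehelix_palette(as_cmap=True))
sm.set_array([])  # older matplotlib releases need this before colorbar()

colors = sm.to_rgba(z)  # one RGBA row per point

fig, ax = plt.subplots()
ax.scatter(x, y, color=colors)
cbar = fig.colorbar(sm, ax=ax)  # colorbar driven by the same norm and cmap
ax.set_title("With Coloring")
```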